Git (or Hg) plugin for dealing with Microsoft Word and/or OpenOffice files

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/3292792/

Date: 2020-09-10 08:46:32  Source: igfitidea

Tags: git, version-control, plugins, mercurial, openoffice.org

Asked by JudoWill

Has anyone come across a Git or Hg plugin for "meaningful" diffs/merging/branching of OpenOffice or Microsoft Word files?

I know I can 'checkin' .doc files, but both Git and Hg treat them as binary blobs. I'd like to be able to do all (or at least many) of the normal revision-based operations on the text of the file.

And yes, I do know that I should be using LaTeX or converting files back and forth to RTF. I'm just looking for a more "native" solution since I'm trying to manage collaboration between techies and "management people".

This is related to my question on Biostar here: http://biostar.stackexchange.com/questions/1749/writing-collaboration-with-source-control-and-microsoft-word

Thanks.

Answered by aparkerlue

How about:

  1. Save your Word docs in XML.
  2. Commit your XML Word files.
  3. Diff using an external XML diff tool. For example:

    $ git difftool -t xmldiff c3d293 498571

Transforming the XML files to have one element per line should make the check-in process run efficiently and also allow the external XML diff tool to process quickly.
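
One way the "one element per line" transformation could be done is sketched below in Python, using only the standard library. This is just an illustration, not part of the answer above: the function name `one_element_per_line` is made up, and the sketch does not check whether Word will reopen a reformatted file unchanged, so treat the output as something to diff rather than something to edit.

```python
import xml.dom.minidom


def one_element_per_line(xml_text: str) -> str:
    """Re-serialize XML so that each element starts on its own line."""
    dom = xml.dom.minidom.parseString(xml_text)
    pretty = dom.toprettyxml(indent="  ")
    # toprettyxml can emit whitespace-only lines when the input already
    # contains indentation; drop them to keep line-based diffs compact.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__":
    # Hypothetical, heavily simplified stand-in for a Word XML file.
    sample = "<doc><p><r>Hello</r></p><p><r>World</r></p></doc>"
    print(one_element_per_line(sample))
```

With the markup spread over many short lines, a change to one paragraph touches only a few lines, which is what line-oriented tools handle best.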

Answered by Mark Mikofski

If you are on MS Windows, use TortoiseGit. I just had to go through this painful experience, and TGit, although inelegant, takes some of the pain out of it. A couple of other points:

  • Surprisingly, git diff and gitk both do a reasonably good job of at least visualizing diffs between .docx files (not sure about .doc, but I would assume it's the same). This is good for a quick scan of diffs when doing commits.
  • You are completely out of luck as far as fast-forward and automerging are concerned. Unfortunately I have not found a tool that can handle this (although I like the xml idea above), so you will have to do all merges manually.
  • Microsoft Word (MS Word) has a decent, if flawed, merge tool. AFAIK, it can only do 2-way merges (i.e., X0 + dX = X1), not 3-way or 2-parent merges, which are more common in version control (i.e., X0 + dX1 + dX2 = X1). You could solve merge conflicts using this tool, but there would be some legwork: checking out each branch, exporting HEAD as an untracked version, etc.

    X0 = *.BASE.docx,
    X0 + dX1 = *.LOCAL.docx and
    X0 + dX2 = *.REMOTE.docx
    
  • Luckily this is exactly what TGit (and TSVN too) do. Unfortunately, I would avoid rebase, since replaying several changes in a row can be very tiring, but merge for short documents is fine, just not great.
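
To see why the BASE version matters for a 3-way merge, here is a toy sketch in Python. It is not how Word or TortoiseGit merge documents. It assumes, unrealistically, that BASE, LOCAL and REMOTE have the same number of paragraphs so they can be compared slot by slot, and the name `merge3_paragraphs` is made up for illustration.

```python
from typing import List, Optional


def merge3_paragraphs(
    base: List[str], local: List[str], remote: List[str]
) -> List[Optional[str]]:
    """Toy three-way merge over aligned paragraphs.

    Assumes paragraph i in each version corresponds to paragraph i in the
    others. A slot that both sides changed differently becomes None,
    meaning a conflict that a person has to resolve.
    """
    if not (len(base) == len(local) == len(remote)):
        raise ValueError("this toy merge needs equally long versions")
    merged: List[Optional[str]] = []
    for b, l, r in zip(base, local, remote):
        if l == r:        # both sides agree (same edit, or no edit at all)
            merged.append(l)
        elif l == b:      # only REMOTE changed this paragraph
            merged.append(r)
        elif r == b:      # only LOCAL changed this paragraph
            merged.append(l)
        else:             # both changed it differently: conflict
            merged.append(None)
    return merged


if __name__ == "__main__":
    base = ["Intro", "Methods", "Results"]
    local = ["Intro (edited)", "Methods", "Results"]
    remote = ["Intro", "Methods", "Results v2"]
    print(merge3_paragraphs(base, local, remote))
```

Without BASE, a tool that only sees LOCAL and REMOTE cannot tell which side made a given change, which is why a 2-way compare leaves every difference for you to decide.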

Answered by rlegendi

A nice trick I was able to come up with that also works on Open Office files, PPTs, etc.:

http://xcafebabe.blogspot.hu/2012/09/sexy-comparison-of-word-documents-with.html

Here's a screenshot that demonstrates the result: [Image: screenshot of the comparison result]

Answered by Robert Cowham

Answering JudoWill's question: Workshare is probably the leading tool used by lawyers.

Answered by nachocab

I compiled instructions from multiple places here: http://bit.ly/17LaxVY

# download docx2txt by Sandeep Kumar
wget -O docx2txt.pl http://www.cs.indiana.edu/~kinzler/home/binp/docx2txt

# make a wrapper (git passes the file to convert as the first argument)
echo '#!/bin/bash
docx2txt.pl "$1" -' > docx2txt
chmod +x docx2txt docx2txt.pl

# make sure docx2txt.pl and docx2txt are in your current PATH. Here's a guide:
# http://shapeshed.com/using_custom_shell_scripts_on_osx_or_linux/
mv docx2txt docx2txt.pl ~/bin/

# set the git attributes (unfortunately I don't think this can be set by default, you have to create it for every project)
echo "*.docx diff=word" > .git/info/attributes

# add the following to ~/.gitconfig
[diff "word"]
    binary = true
    textconv = docx2txt

# add a new alias
[alias]
    wdiff = diff --color-words

# try it
git init

# create my_file.docx, add some content

git add my_file.docx

git commit -m "Initial commit"

# change something in my_file.docx

git wdiff my_file.docx

# awesome!

It works great on OSX

Answered by Marwen Trabelsi

Git 1.6.1 or later comes with the textconv feature, which allows using an arbitrary command to convert a file to text before diffing.
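
As a rough idea of what such a converter could look like, here is a minimal Python sketch that uses only the standard library. It is an assumption-laden illustration, not an existing tool: it assumes the main body lives in `word/document.xml` inside the .docx archive (the usual layout), it ignores headers, footnotes, comments and tracked changes, and the name `docx_to_text` is made up.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML vocabulary used in .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path) -> str:
    """Return the document body as plain text, one paragraph per line."""
    with zipfile.ZipFile(path) as archive:
        # Assumption: the body is stored under the conventional part name.
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join the <w:t> pieces.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Git runs a textconv command with one argument, the file to convert,
    # and reads the converted text from standard output.
    sys.stdout.write(docx_to_text(sys.argv[1]) + "\n")
```

If this were saved as, say, `docx_text.py` (a hypothetical name), the `textconv` setting for a diff driver could point at `python3 /path/to/docx_text.py` in place of an external converter.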

Also check this: https://gist.github.com/17twenty/4985374

Answered by Ry4an Brase

Law firms have extremely robust systems for doing this: ones that don't trust the revision history in the document (because it's externally sourced) and instead do their own comparisons and can provide deltas. If that's what they really need, you're better off buying that than putting a wrapper into git or mercurial that will never really be usable for them.

Sorry to sound like a pessimist, but it's more likely that the techies will use (while grumbling) the overpriced commercial tool than that the office folks will use git or mercurial to any level of satisfaction.

Answered by Christophe Muller

Using svn (not git or hg, but you could have a gateway), there is an extension for OOo that works on uncompressed XML files; see my answer about a similar question. BTW, if you ever look at the plugin code and make it hg-aware instead of svn, please let me know! ;-)