Is git good with binary files?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4697216/

Time: 2020-09-10 09:51:43  Source: igfitidea

git

Asked by

Is git good with binary files?

If I have a lot of uncompressed files being modified, and many compressed files that are never (or almost never) modified, would git handle it well? For example, if I insert or remove data in the middle of a file and insert data near the end, will it notice that as it does with text?

If git isn't good with binary files, what tool might I consider?

Answered by ndim

Out of the box, git can easily add binary files to its index, and also store them in an efficient way, unless you do frequent updates on large incompressible files.

The problems begin when git needs to generate diffs and merges: git cannot generate meaningful diffs, or merge binary files in any way that could make sense. So every merge, rebase or cherry-pick involving a change to a binary file will require you to resolve the conflict on that binary file manually.

You need to decide whether the binary file changes are rare enough that you can live with the extra manual work they cause in the normal git workflow involving merges, rebases and cherry-picks.
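
As a rough illustration of that manual step, the sketch below builds a throwaway repository, changes the same binary file on two branches, and resolves the resulting conflict by taking one side wholesale. The file name `logo.bin`, the branch name `feature`, and the identity settings are made up for the example; `git checkout --ours` and `--theirs` are standard git options.

```shell
#!/bin/sh
# Sketch: a binary file changed on two branches cannot be merged by git,
# so the conflict is resolved by choosing one whole version.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # local identity for the demo only
git config user.email "user@example.com"

printf 'base\000data' > logo.bin           # the NUL byte makes git treat it as binary
git add logo.bin
git commit -q -m "Add logo"
base=$(git symbolic-ref --short HEAD)      # whatever the default branch is called

git checkout -q -b feature
printf 'feature\000data' > logo.bin
git commit -q -am "Change logo on feature"

git checkout -q "$base"
printf 'mainline\000data' > logo.bin
git commit -q -am "Change logo on the base branch"

# The merge stops with a conflict; git leaves our version in the working tree.
git merge feature >/dev/null 2>&1 || echo "merge stopped: conflict in logo.bin"

# Manual resolution: take the incoming ("theirs") version as-is.
git checkout --theirs -- logo.bin
git add logo.bin
git commit -q --no-edit

merged_blob=$(git rev-parse HEAD:logo.bin)
feature_blob=$(git rev-parse feature:logo.bin)
echo "resolved with blob $merged_blob"
```

Using `--ours` instead keeps the version from the branch you were on. Either way the choice is all-or-nothing for the file, which is the extra manual work described above.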

Answered by Jakub Narębski

In addition to the other answers:

  • You can send a diff of a binary file using the so-called binary diff format. It is not human-readable, and it can only be applied if you have the exact preimage in your repository, i.e. without any fuzz.
    An example:

    diff --git a/gitweb/git-favicon.png b/gitweb/git-favicon.png
    index de637c0608090162a6ce6b51d5f9bfe512cf8bcf..aae35a70e70351fe6dcb3e905e2e388cf0cb0ac3 100
    GIT binary patch
    delta 85
    zcmZ3&SUf?+pEJNG#Pt9J149GD|NsBH{?u>)*{Yr{jv*Y^lOtGJcy4sCvGS>LGzvuT
    nGSco!%*slUXkjQ0+{(x>@rZKt$^5c~Kn)C@u6{1-oD!M<s|Fj6
    
    delta 135
    zcmXS3!Z<;to+rR3#Pt9J149GDe=s<ftM(tr<t*@sEM{Qf76xHPhFNnYfP!|OE{-7;
    zjI0MY3OYE5upapO?DR{I1pyyR7cx(jY7y^{FfMCvb5IaiQM`NJfeQjFwttKJyJNq@
    hveI=@x=fAo=hV3$-MIWu9%vGSr>mdKI;RB2CICA_GnfDX
    
  • You can use the textconv gitattribute to have git diff show a human-readable diff for binary files, or parts of binary files. For example, for *.jpg files it can be the difference in EXIF information; for PDF files it can be the difference between their text representations (pdf2text or something like that).
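
Both points can be tried in a scratch repository. The sketch below is only an illustration: the file name `blob.bin`, the driver name `hex`, and the choice of `od -c` as the converter are assumptions for the example (for images or PDFs you would plug in a tool that prints their metadata or text instead).

```shell
#!/bin/sh
# Sketch of both points above: a binary patch round trip, then a textconv driver.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # local identity for the demo only
git config user.email "user@example.com"

printf 'one\000two' > blob.bin
git add blob.bin
git commit -q -m "Add blob.bin"
printf 'one\000three' > blob.bin            # modify the binary file

# 1. Binary diff format: --binary emits a "GIT binary patch" section.
git diff --binary > change.patch
git checkout -- blob.bin                    # restore the exact preimage
git apply change.patch                      # applies only against that preimage
after_apply=$(git hash-object blob.bin)

# 2. textconv: diff a text rendering of the file instead of its raw bytes.
echo '*.bin diff=hex' > .gitattributes      # "hex" is an arbitrary driver name
git config diff.hex.textconv 'od -c'        # any command that prints text works
readable=$(git diff -- blob.bin)
printf '%s\n' "$readable"
```

The textconv output is for reading only; a diff produced this way cannot be applied as a patch, which is why the binary format exists separately.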

HTH.

Answered by John Gibb

If you've got really large binary files, you can use git-annex to store the data outside of the repository. Check out: http://git-annex.branchable.com/

Answered by coreyward

I don't know of any tools that try to store diffs of binary files for version control, but it's worth noting that Git doesn't do this even for text files. Git stores files as blobs, and it does a diff between them when it needs to.
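
To make the blob idea concrete, here is a small sketch. It assumes the default SHA-1 object format and a `sha1sum` binary (GNU coreutils); the file name `notes.txt` is invented for the example. It shows that a blob ID is just a hash of the whole content, that an old version can be read back in full, and that the diff is computed on request. (Packfiles may later delta-compress objects, but that is a storage detail beneath this model.)

```shell
#!/bin/sh
# Sketch: git names a file's full content by hashing "blob <size>\0<content>".
set -eu

# The ID git would assign to this content (works outside a repository too).
git_id=$(printf 'hello\n' | git hash-object --stdin)

# The same ID computed by hand, assuming the default SHA-1 object format.
manual_id=$(printf 'blob 6\000hello\n' | sha1sum | cut -d' ' -f1)

echo "git hash-object: $git_id"
echo "manual SHA-1:    $manual_id"

# Two commits of the same file: each commit points at a complete snapshot.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # local identity for the demo only
git config user.email "user@example.com"

printf 'version 1\n' > notes.txt
git add notes.txt
git commit -q -m "First version"
printf 'version 2\n' > notes.txt
git commit -q -am "Second version"

old=$(git cat-file -p HEAD~1:notes.txt)     # full old content, read straight from its blob
echo "old snapshot: $old"
git diff HEAD~1 HEAD -- notes.txt           # the diff is computed now, on demand
```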

If you're looking to do version control on something like Photoshop/Illustrator documents, GridIron Flow might do the trick for you. If you're trying to keep them in sync between machines, Dropbox or rsync can handle it, but they aren't going to do intelligent diffing.

Answered by Loïc Faure-Lacroix

Well, git is good with binaries, but it won't handle binaries like text files. It's as if you wanted to merge binary files: a diff on a JPEG will never return anything useful. Git works very well with text files, and is probably as bad as every other solution with binary files!

Answered by danfromisrael

If you want a solution for versioning, you might want to consider git-lfs, which has a lightweight pointer to your file.

It means that when you clone your repo, it doesn't download all the versions, only the one that is checked out.
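
For a feel of what that lightweight pointer is, the sketch below builds one by hand for a stand-in file. This is only an approximation for illustration: in real use git-lfs writes the pointer itself, and the sample content and the use of `sha256sum` (GNU coreutils) are assumptions of the example.

```shell
#!/bin/sh
# Sketch: what a Git LFS pointer for a large file looks like.
# Typical setup with the real tool (not run here, requires git-lfs):
#   git lfs install
#   git lfs track "*.psd"
#   git add .gitattributes
set -eu

asset=$(mktemp)
printf 'pretend this is a large binary asset' > "$asset"

oid=$(sha256sum "$asset" | cut -d' ' -f1)     # SHA-256 of the real content
size=$(wc -c < "$asset" | tr -d ' ')          # size in bytes

# The repository stores only this small text file; the content itself lives in LFS storage.
pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s' "$oid" "$size")
printf '%s\n' "$pointer"
```

Because each commit only carries a pointer of this size, a clone can fetch the real content for the checked-out revision alone.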

Here's a nice tutorial on how to use it.
