What is git actually doing when it says it is "resolving deltas"?

Question

Asked by Nik Reiman

During the first clone of a repository, git first receives the objects (which is obvious enough), and then spends about the same amount of time "resolving deltas". What's actually happening during this phase of the clone?

Answer 1

Accepted answer by Amber

Git uses delta encoding to store some of the objects in packfiles. However, you don't want to have to play back every single change ever made to a given file in order to get the current version, so Git also stores occasional snapshots of the file contents. "Resolving deltas" is the step that deals with making sure all of that stays consistent.

Here's a chapter from the "Git Internals" section of the Pro Git book, which is available online, that talks about this.

Answer 2

Answer by araqnid

The stages of git clone are:

1. Receive a "pack" file of all the objects in the repo database
2. Create an index file for the received pack
3. Check out the head revision (for a non-bare repo, obviously)

"Resolving deltas" is the message shown for the second stage, indexing the pack file ("git index-pack").

Pack files do not have the actual object IDs in them, only the object content. So to determine what the object IDs are, git has to do a decompress+SHA1 of each object in the pack to produce the object ID, which is then written into the index file.
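
To make the decompress+SHA1 step concrete, here is a minimal Python sketch. It assumes a non-delta pack entry whose type and zlib-compressed content have already been read from the pack; the function name `object_id_from_entry` is made up for illustration and is not part of git. The ID is the SHA-1 of a small header (`<type> <size>` followed by a NUL byte) plus the uncompressed content.

```python
import hashlib
import zlib


def object_id_from_entry(obj_type: str, compressed: bytes) -> str:
    """Return the hex object ID for one non-delta pack entry.

    obj_type is "blob", "tree", "commit" or "tag"; compressed is the
    zlib stream holding the object content (hypothetical inputs, as if
    already parsed out of a pack file).
    """
    content = zlib.decompress(compressed)                    # the "decompress" part
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()        # the "SHA1" part


if __name__ == "__main__":
    entry = zlib.compress(b"hello world\n")  # stand-in for bytes read from a pack
    print(object_id_from_entry("blob", entry))
    # prints 3b18e512dba79e4c8300dd08aeb37f8e728b8dad,
    # the same ID that `echo 'hello world' | git hash-object --stdin` reports
```

Repeating this for every object in the pack is the work that index-pack has to do before it can write the index file.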

An object in a pack file may be stored as a delta, i.e. a sequence of changes to make to some other object. In this case, git needs to retrieve the base object, apply the delta commands, and SHA1 the result. The base object itself might have to be derived by applying a sequence of delta commands. (Even though in the case of a clone the base object will have been encountered already, there is a limit to how many reconstructed objects are cached in memory.)
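
As a rough illustration of "applying the commands", the sketch below follows the delta layout described in Git's pack-format documentation: two variable-length sizes (base and result), then a stream of copy-from-base and insert-literal instructions. The function names are made up for this example, and this is not git's own implementation, which is written in C and also handles chains of deltas and caching.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read a little-endian base-128 size; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild an object from its base object and a delta."""
    base_size, pos = read_varint(delta, 0)
    result_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta was made against a different base")

    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:
            # Copy instruction: the low 4 bits say which offset bytes follow,
            # the next 3 bits say which size bytes follow.
            offset = size = 0
            for i in range(4):
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000
            out += base[offset:offset + size]
        elif cmd:
            # Insert instruction: the next `cmd` bytes are literal data.
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("instruction byte 0 is reserved")

    if len(out) != result_size:
        raise ValueError("result size does not match the delta header")
    return bytes(out)


if __name__ == "__main__":
    base = b"hello world\n"
    # Hand-built delta: copy "hello ", insert "brave ", copy "world\n".
    delta = bytes([0x0C, 0x12, 0x90, 0x06, 0x06]) + b"brave " + bytes([0x91, 0x06, 0x06])
    print(apply_delta(base, delta))
```

Once the full content has been rebuilt this way, its object ID is computed like that of any other object, by hashing the type-and-size header plus the content.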

In summary, the "resolving deltas" stage involves decompressing and checksumming the entire repo database, which not surprisingly takes quite a long time. Presumably decompressing and calculating SHA1s actually takes more time than applying the delta commands.

In the case of a subsequent fetch, the received pack file may contain references (as delta object bases) to other objects that the receiving git is expected to already have. In this case, the receiving git actually rewrites the received pack file to include any such referenced objects, so that any stored pack file is self-sufficient. This might be where the message "resolving deltas" originated.

Answer 3

Answer by Johan

Amber seems to be describing the object model that Mercurial or similar uses. Git does not store the deltas between subsequent versions of an object, but rather full snapshots of the object, every time. It then compresses these snapshots using delta compression, trying to find good deltas to use, regardless of where in the history these exist.
