Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/14961255/

Time: 2020-09-10 15:32:40  Source: igfitidea

How does 'git merge' work in detail?

Tags: git, merge, conflict

Asked by abyss.7

I want to know an exact algorithm (or near that) behind 'git merge'. The answers at least to these sub-questions will be helpful:

  • How does git detect the context of a particular non-conflicting change?
  • How does git find out that there is a conflict in these exact lines?
  • Which things does git auto-merge?
  • How does git perform when there is no common base for merging branches?
  • How does git perform when there are multiple common bases for merging branches?
  • What happens when I merge multiple branches at once?
  • What is the difference between merge strategies?

But the description of a whole algorithm will be much better.

Accepted answer by twalberg

You might be best off looking for a description of a 3-way merge algorithm. A high-level description would go something like this:

  1. Find a suitable merge base B - a version of the file that is an ancestor of both of the new versions (X and Y), and usually the most recent such base (although there are cases where it will have to go back further, which is one of the features of git's default recursive merge).
  2. Perform diffs of X with B and Y with B.
  3. Walk through the change blocks identified in the two diffs. If both sides introduce the same change in the same spot, accept either one; if one introduces a change and the other leaves that region alone, introduce the change in the final; if both introduce changes in a spot, but they don't match, mark a conflict to be resolved manually.
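
The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not git's actual implementation: it works on lists of lines, uses the standard-library difflib module for the two diffs, and the names three_way_merge and _matches are made up for this example.

```python
import difflib


def _matches(base, other):
    """Map each base line index to the index of its matching line in other."""
    sm = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    mapping = {}
    for a, b, size in sm.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping


def three_way_merge(base, x, y, x_name="X", y_name="Y"):
    """Merge line lists x and y against base. Returns (lines, had_conflict)."""
    mx, my = _matches(base, x), _matches(base, y)
    # Sync points: base lines that both sides kept, plus an end sentinel.
    sync = [i for i in range(len(base)) if i in mx and i in my]
    sync.append(len(base))
    mx[len(base)], my[len(base)] = len(x), len(y)

    merged, conflict = [], False
    b0 = x0 = y0 = 0
    for i in sync:
        b_chunk, x_chunk, y_chunk = base[b0:i], x[x0:mx[i]], y[y0:my[i]]
        if x_chunk == y_chunk:      # same change on both sides, or no change
            merged += x_chunk
        elif x_chunk == b_chunk:    # only Y touched this region
            merged += y_chunk
        elif y_chunk == b_chunk:    # only X touched this region
            merged += x_chunk
        else:                       # both changed it differently
            conflict = True
            merged += ["<<<<<<< " + x_name, *x_chunk, "=======",
                       *y_chunk, ">>>>>>> " + y_name]
        if i < len(base):
            merged.append(base[i])  # the unchanged line itself
        b0, x0, y0 = i + 1, mx[i] + 1, my[i] + 1
    return merged, conflict
```

For example, with base ["a", "b", "c"], one side changing "b" to "B" and the other appending "d", the two changes land in different regions and are combined without a conflict; if both sides rewrite "b" differently, that region is wrapped in conflict markers.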

The full algorithm deals with this in a lot more detail, and even has some documentation (https://github.com/git/git/blob/master/Documentation/technical/trivial-merge.txt for one, along with the git help XXX pages, where XXX is one of merge-base, merge-file, merge, merge-one-file and possibly a few others). If that's not deep enough, there's always source code...

Answer by 13ren

I'm interested too. I don't know the answer, but...

A complex system that works is invariably found to have evolved from a simple system that worked

I think git's merging is highly sophisticated and will be very difficult to understand - but one way to approach this is from its precursors, and to focus on the heart of your concern. That is, given two files that don't have a common ancestor, how does git merge work out how to merge them, and where conflicts are?

Let's try to find some precursors. From git help merge-file:

git merge-file is designed to be a minimal clone of RCS merge; that is,
       it implements all of RCS merge's functionality which is needed by
       git(1).

From wikipedia: http://en.wikipedia.org/wiki/Git_%28software%29 -> http://en.wikipedia.org/wiki/Three-way_merge#Three-way_merge -> http://en.wikipedia.org/wiki/Diff3 -> http://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf

That last link is a pdf of a paper describing the diff3 algorithm in detail. Here's a google pdf-viewer version. It's only 12 pages long, and the algorithm is only a couple of pages - but a full-on mathematical treatment. That might seem a bit too formal, but if you want to understand git's merge, you'll need to understand the simpler version first. I haven't checked yet, but with a name like diff3, you'll probably also need to understand diff (which uses a longest common subsequence algorithm). However, there may be a more intuitive explanation of diff3 out there, if you have a google...
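
To make the longest common subsequence idea concrete, here is a minimal Python sketch. It is a textbook dynamic-programming version written for illustration; real diff tools use more efficient algorithms, and the function names lcs and simple_diff are invented for this example.

```python
def lcs(a, b):
    """Return one longest common subsequence of the sequences a and b."""
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the subsequence.
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


def simple_diff(old, new):
    """Produce a line diff: lines outside the LCS are deletions or additions."""
    out, i, j = [], 0, 0
    for line in lcs(old, new):
        while old[i] != line:       # old lines skipped by the LCS were removed
            out.append("- " + old[i])
            i += 1
        while new[j] != line:       # new lines skipped by the LCS were added
            out.append("+ " + new[j])
            j += 1
        out.append("  " + line)
        i += 1
        j += 1
    out += ["- " + l for l in old[i:]] + ["+ " + l for l in new[j:]]
    return out
```

The lines that belong to the common subsequence are the unchanged context; everything else on the old side is reported as removed and everything else on the new side as added.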

Now, I just did an experiment comparing diff3 and git merge-file. They take the same three input files (version1 oldversion version2) and mark conflicts the same way, with <<<<<<< version1, =======, >>>>>>> version2 (diff3 also has ||||||| oldversion), showing their common heritage.

I used an empty file for oldversion, and near-identical files for version1 and version2 with just one extra line added to version2.

Result: git merge-file identified the single changed line as the conflict, but diff3 treated the whole two files as a conflict. Thus, sophisticated as diff3 is, git's merge is even more sophisticated, even for this simplest of cases.

Here are the actual results (I used @twalberg's answer for the text). Note the options needed (see the respective manpages).

$ git merge-file -p fun1.txt fun0.txt fun2.txt

You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
<<<<<<< fun1.txt
=======
THIS IS A BIT DIFFERENT
>>>>>>> fun2.txt

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...

$ diff3 -m fun1.txt fun0.txt fun2.txt

<<<<<<< fun1.txt
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
||||||| fun0.txt
=======
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
THIS IS A BIT DIFFERENT

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
>>>>>>> fun2.txt
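
One plausible way to get from the diff3-style result (everything is a conflict) to the narrower git merge-file result is to compare the two conflicting sides with each other and keep whatever they share outside the conflict. The Python sketch below only trims a common prefix and suffix; it is an assumption about the general idea, not a description of git's code, and narrow_conflict is a hypothetical helper name.

```python
def narrow_conflict(x_lines, y_lines):
    """Split two conflicting sides into (prefix, x_middle, y_middle, suffix).

    prefix and suffix are the lines both sides agree on; only the middles
    still need conflict markers.
    """
    limit = min(len(x_lines), len(y_lines))
    p = 0
    while p < limit and x_lines[p] == y_lines[p]:
        p += 1
    s = 0
    # Do not let the suffix overlap the prefix already claimed.
    while s < limit - p and x_lines[-1 - s] == y_lines[-1 - s]:
        s += 1
    return (x_lines[:p],
            x_lines[p:len(x_lines) - s],
            y_lines[p:len(y_lines) - s],
            x_lines[len(x_lines) - s:])


version1 = ["intro", "steps", "outro"]
version2 = ["intro", "steps", "THIS IS A BIT DIFFERENT", "outro"]
prefix, mine, theirs, suffix = narrow_conflict(version1, version2)
```

With an empty ancestor, both files are pure additions, so a plain three-way merge sees one big conflicting region. After trimming, the only part left inside the markers is an empty side for version1 and the extra line for version2, which has the same shape as the git merge-file output above.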

If you are truly interested in this, it's a bit of a rabbit hole. To me, it seems as deep as regular expressions, the longest common subsequence algorithm of diff, context free grammars, or relational algebra. If you want to get to the bottom of it, I think you can, but it will take some determined study.

Answer by aamontal

Here is the original implementation

http://git.kaarsemaker.net/git/blob/857f26d2f41e16170e48076758d974820af685ff/git-merge-recursive.py

Basically you create a list of common ancestors for two commits and then recursively merge them, either fast forwarding them, or creating virtual commits that get used for the basis of a three-way merge on the files.
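
The first part of that, building the list of common ancestors, can be sketched as follows. This is an illustrative Python model, not the linked implementation: the history is assumed to be a plain dict mapping each commit name to its list of parents, and ancestors and merge_bases are names chosen for the example.

```python
def ancestors(parents, commit):
    """Return the set of all commits reachable from commit, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    """Return the best common ancestors of a and b.

    A common ancestor is dropped when it is itself an ancestor of another
    common ancestor, so only the most recent candidates remain.
    """
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    )


# A criss-cross history: D and E each merge B and C.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["C", "B"]}
bases = merge_bases(history, "D", "E")
```

In the criss-cross example both B and C survive as candidates. That is the situation in which a recursive merge first merges the candidates with each other to build a virtual commit, and then uses that virtual commit as the base of the three-way merge.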

基本上,您为两个提交创建一个共同祖先列表,然后递归合并它们,或者快速转发它们,或者创建用于文件三向合并基础的虚拟提交。

Answer by Nevik Rehnel

How does git detect the context of a particular non-conflicting change?
How does git find out that there is a conflict in these exact lines?

If the same line has changed on both sides of the merge, it's a conflict; if it hasn't, the change from one side (if there is one) is accepted.

Which things does git auto-merge?

Changes that do not conflict (see above)

How does git perform when there are multiple common bases for merging branches?

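In the common case there is a single merge base (the latest common ancestor). A criss-cross history can have several equally good ones, though; git merge-base --all lists them, and the default recursive strategy merges them into a virtual base before doing the three-way merge.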

What happens when I merge multiple branches at once?

That depends on the merge strategy (only the octopus and ours strategies support merging more than two branches; theirs exists only as an option to the recursive strategy, not as a strategy of its own).

What is the difference between merge strategies?

This is explained in the git merge manpage.
