Notice: this page is taken from a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/2529441/

Time: 2020-09-10 08:09:13  Source: igfitidea

How to read the output from git diff?

Tags: git, diff, git-diff

Asked by poseid

The man page for git-diff is rather long, and explains many cases which don't seem to be necessary for a beginner. For example:

git diff origin/master

Answered by Jakub Narębski

Let's take a look at an example of an advanced diff from the git history (commit 1088261f in the git.git repository):

diff --git a/builtin-http-fetch.c b/http-fetch.c
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
index f3e63d7..e8f44ba 100644
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
@@ -1,8 +1,9 @@
 #include "cache.h"
 #include "walker.h"

-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+       const char *prefix;
        struct walker *walker;
        int commits_on_stdin = 0;
        int commits;
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, const char *prefix)
        int get_verbosely = 0;
        int get_recover = 0;

+       prefix = setup_git_directory();
+
        git_config(git_default_config, NULL);

        while (arg < argc && argv[arg][0] == '-') {

Let's analyze this patch line by line.

  • The first line

    diff --git a/builtin-http-fetch.c b/http-fetch.c
    is a "git diff" header in the form diff --git a/file1 b/file2. The a/ and b/ filenames are the same unless a rename or copy is involved (as in our case). The --git means that the diff is in the "git" diff format.

  • Next are one or more extended header lines. The first three

    similarity index 95%
    rename from builtin-http-fetch.c
    rename to http-fetch.c
    tell us that the file was renamed from builtin-http-fetch.c to http-fetch.c and that those two files are 95% identical (which was used to detect this rename).

    The last line in the extended diff header, which is
    index f3e63d7..e8f44ba 100644
    tells us about the mode of the given file (100644 means that it is an ordinary file and not, e.g., a symlink, and that it doesn't have the executable permission bit), and about the shortened hashes of the preimage (the version of the file before the given change) and postimage (the version of the file after the change). This line is used by git am --3way to try to do a 3-way merge if the patch cannot be applied by itself.

  • Next is the two-line unified diff header

    --- a/builtin-http-fetch.c
    +++ b/http-fetch.c
    Compared to the diff -U result, it doesn't have from-file-modification-time nor to-file-modification-time after the source (preimage) and destination (postimage) file names. If the file was created, the source is /dev/null; if the file was deleted, the target is /dev/null.
    If you set the diff.mnemonicPrefix configuration variable to true, in place of the a/ and b/ prefixes in this two-line header you can instead have c/, i/, w/ and o/ as prefixes, depending on what you compare; see git-config(1).

  • Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks start with a line like

    @@ -1,8 +1,9 @@
    or
    @@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, ...
    It is in the format @@ from-file-range to-file-range @@ [header]. The from-file-range is in the form -<start line>,<number of lines>, and to-file-range is +<start line>,<number of lines>. Both start-line and number-of-lines refer to the position and length of the hunk in the preimage and postimage, respectively. If number-of-lines is not shown, it means that it is 1.

    The optional header shows the C function where each change occurs, if it is a C file (like the -p option in GNU diff), or the equivalent, if any, for other types of files.

  • Next comes the description of where files differ. The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:

    • '+' -- A line was added here to the first file.
    • '-' -- A line was removed here from the first file.


    So, for example, the first chunk

     #include "cache.h"
     #include "walker.h"
    
    -int cmd_http_fetch(int argc, const char **argv, const char *prefix)
    +int main(int argc, const char **argv)
     {
    +       const char *prefix;
            struct walker *walker;
            int commits_on_stdin = 0;
            int commits;
    

    means that cmd_http_fetch was replaced by main, and that the const char *prefix; line was added.

    In other words, before the change, the relevant fragment of what was then the 'builtin-http-fetch.c' file looked like this:

    #include "cache.h"
    #include "walker.h"
    
    int cmd_http_fetch(int argc, const char **argv, const char *prefix)
    {
           struct walker *walker;
           int commits_on_stdin = 0;
           int commits;
    

    After the change, this fragment of what is now the 'http-fetch.c' file looks like this instead:

    #include "cache.h"
    #include "walker.h"
    
    int main(int argc, const char **argv)
    {
           const char *prefix;
           struct walker *walker;
           int commits_on_stdin = 0;
           int commits;
    
  • There might be

    \ No newline at end of file
    line present (it is not in the example diff).
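
To make the hunk header format concrete, here is a minimal Python sketch (not part of the original answer) that parses such a line. The name `parse_hunk_header` and the regular expression are illustrative assumptions; the sketch treats an omitted line count as 1, which is how GNU unified diffs abbreviate single-line ranges.

```python
import re

# Matches "@@ -<start>[,<count>] +<start>[,<count>] @@ [header]".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count, header)."""
    match = HUNK_RE.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count, header = match.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,  # omitted count means 1
        int(new_start),
        int(new_count) if new_count is not None else 1,
        header,
    )


first = parse_hunk_header("@@ -1,8 +1,9 @@")
second = parse_hunk_header(
    "@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, const char *prefix)"
)
print(first)
print(second)
```

For the two hunks of the example patch, this reports that the first covers lines 1 to 8 of the old file and lines 1 to 9 of the new one, and that the second carries the enclosing function name as its optional header.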

As Donal Fellows said, it is best to practice reading diffs on real-life examples, where you know what you have changed.

Answered by irudyak

Here's a simple example.

diff --git a/file b/file 
index 10ff2df..84d4fa2 100644
--- a/file
+++ b/file
@@ -1,5 +1,5 @@
 line1
 line2
-this line will be deleted
 line4
 line5
+this line is added

Here's an explanation (see details here).

  • --git is not a command; it means this is the git version of diff (not the Unix one)
  • a/ and b/ are directories, but they are not real; they are just a convenience when we deal with the same file (in my case a/ is in the index and b/ is in the working directory)
  • 10ff2df..84d4fa2 are the blob IDs of these 2 files
  • 100644 is the “mode bits,” indicating that this is a regular file (not executable and not a symbolic link)
  • --- a/file and +++ b/file: minus signs show lines that are in the a/ version but missing from the b/ version, and plus signs show lines that are missing in a/ but present in b/ (in my case --- means deleted lines and +++ means added lines in b/, which is the file in the working directory)
  • @@ -1,5 +1,5 @@: to understand this it's better to work with a big file; if you have two changes in different places you'll get two entries like @@ -1,5 +1,5 @@; suppose you have a file with line1 ... line100, and you deleted line10 and added a new line100; you'll get:
@@ -7,7 +7,6 @@ line6
 line7
 line8
 line9
-this line10 to be deleted
 line11
 line12
 line13
@@ -98,3 +97,4 @@ line97
 line98
 line99
 line100
+this is new line100
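
One way to check your reading of these two hunks is to count the body lines and compare them with the numbers in the @@ headers. The following is a small Python sketch, not from the original answer; `count_hunk_lines` is a made-up helper name, and the hunk bodies are copied from the output above.

```python
def count_hunk_lines(body):
    """Count how many lines a hunk body contributes to the old and new file."""
    old_count = new_count = 0
    for line in body:
        if line.startswith("-"):
            old_count += 1          # present only in the old file
        elif line.startswith("+"):
            new_count += 1          # present only in the new file
        elif line.startswith("\\"):
            continue                # "\ No newline at end of file" marker
        else:
            old_count += 1          # context line: present in both files
            new_count += 1
    return old_count, new_count


first_hunk = [
    " line7", " line8", " line9",
    "-this line10 to be deleted",
    " line11", " line12", " line13",
]
second_hunk = [" line98", " line99", " line100", "+this is new line100"]

print(count_hunk_lines(first_hunk))   # compare with "-7,7 +7,6"
print(count_hunk_lines(second_hunk))  # compare with "-98,3 +97,4"
```

The counts agree with the headers. The second hunk starts at line 98 in the old file but at line 97 in the new one because the first hunk already removed one line.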

Answered by Donal Fellows

The default output format (which originally comes from a program known as diff, if you want to look for more info) is known as a “unified diff”. It contains essentially 4 different types of lines:

  • context lines, which start with a single space,
  • insertion lines that show a line that has been inserted, which start with a +,
  • deletion lines, which start with a -, and
  • metadata lines which describe higher level things like which file this is talking about, what options were used to generate the diff, whether the file changed its permissions, etc.
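
As a rough illustration of these four line types, the Python sketch below labels each line of a git-style diff. It is an assumption-laden simplification rather than a full parser: `classify_lines` is a hypothetical name, and the sketch relies on the "diff " and "@@" lines to decide whether it is inside a hunk, so that the ---/+++ file lines are not mistaken for deletions or insertions.

```python
def classify_lines(diff_text):
    """Yield (kind, line) pairs for each line of a git-style unified diff."""
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff "):
            in_hunk = False              # a new file section begins
            yield "metadata", line
        elif line.startswith("@@"):
            in_hunk = True               # hunk header; body lines follow
            yield "metadata", line
        elif in_hunk and line.startswith("+"):
            yield "insertion", line
        elif in_hunk and line.startswith("-"):
            yield "deletion", line
        elif in_hunk and line.startswith(" "):
            yield "context", line
        else:
            yield "metadata", line       # index, ---, +++, rename lines, ...


sample = """\
diff --git a/file b/file
index 10ff2df..84d4fa2 100644
--- a/file
+++ b/file
@@ -1,5 +1,5 @@
 line1
 line2
-this line will be deleted
 line4
 line5
+this line is added
"""

kinds = [kind for kind, _ in classify_lines(sample)]
print(kinds)
```

The first five lines of the sample come out as metadata, followed by context, deletion and insertion lines in the order they appear.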

I advise that you practice reading diffs between two versions of a file where you know exactly what you changed. That way you'll recognize just what is going on when you see it.

Answered by stefanB

On my Mac:

info diff, then select: Output formats -> Context -> Unified format -> Detailed Unified:

Or the online man diff on GNU, following the same path to the same section:

File: diff.info, Node: Detailed Unified, Next: Example Unified, Up: Unified Format

Detailed Description of Unified Format
......................................

The unified output format starts with a two-line header, which looks like this:

 --- FROM-FILE FROM-FILE-MODIFICATION-TIME
 +++ TO-FILE TO-FILE-MODIFICATION-TIME

The time stamp looks like `2002-02-21 23:30:39.942229878 -0800' to indicate the date, time with fractional seconds, and time zone.

You can change the header's content with the `--label=LABEL' option; see *Note Alternate Names::.

Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks look like this:

 @@ FROM-FILE-RANGE TO-FILE-RANGE @@
  LINE-FROM-EITHER-FILE
  LINE-FROM-EITHER-FILE...

The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:

`+' A line was added here to the first file.

`-' A line was removed here from the first file.
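
If you want to produce this format yourself and compare it with the description, Python's standard difflib module can generate it. This is only a sketch for experimentation: the file names and time stamps below are invented for illustration, and the line lists reuse the simple example from an earlier answer.

```python
import difflib

old = ["line1", "line2", "this line will be deleted", "line4", "line5"]
new = ["line1", "line2", "line4", "line5", "this line is added"]

diff_lines = list(
    difflib.unified_diff(
        old,
        new,
        fromfile="old.txt",                        # hypothetical file names
        tofile="new.txt",
        fromfiledate="2002-02-21 23:30:39 -0800",  # illustrative time stamps
        tofiledate="2002-02-21 23:31:12 -0800",
        lineterm="",                               # inputs have no trailing newlines
    )
)
print("\n".join(diff_lines))
```

The output starts with the two-line ---/+++ header, including the time stamps that git leaves out, followed by a single @@ -1,5 +1,5 @@ hunk.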

Answered by Cascabel

It's unclear from your question which part of the diffs you find confusing: the actual diff, or the extra header information git prints. Just in case, here's a quick overview of the header.

The first line is something like diff --git a/path/to/file b/path/to/file. Obviously it's just telling you what file this section of the diff is for. If you set the boolean config variable diff.mnemonicPrefix, the a and b will be changed to more descriptive letters like c and w (commit and work tree).
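
A short Python sketch of reading that first line follows. It is a simplification, not how git itself parses headers: `parse_git_header` is a made-up name, and the regular expression assumes unquoted paths without spaces and a one-component prefix such as a/, b/, c/ or w/.

```python
import re

# Simplification: assumes unquoted paths that contain no spaces.
HEADER_RE = re.compile(r"^diff --git (\S+) (\S+)$")


def parse_git_header(line):
    """Return (old_path, new_path) with the leading prefix component removed."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"not a git diff header: {line!r}")
    # Drop the leading "a/", "b/", "c/", "w/", ... component from each side.
    old_path = match.group(1).split("/", 1)[1]
    new_path = match.group(2).split("/", 1)[1]
    return old_path, new_path


print(parse_git_header("diff --git a/path/to/file b/path/to/file"))
print(parse_git_header("diff --git c/path/to/file w/path/to/file"))
print(parse_git_header("diff --git a/builtin-http-fetch.c b/http-fetch.c"))
```

The two paths differ only when a rename or copy is involved, as in the last call.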

Next, there are "mode lines" - lines giving you a description of any changes that don't involve changing the content of the file. This includes new/deleted files, renamed/copied files, and permissions changes.
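
These lines can be picked apart by their leading keywords. The sketch below is illustrative only: `parse_extended_headers` is a hypothetical helper, the keyword list follows the extended header lines named in the git-diff documentation, and the input is the header from the rename example in the first answer.

```python
# Extended header keywords as listed in the git-diff documentation.
EXTENDED_HEADER_PREFIXES = (
    "old mode ", "new mode ",
    "deleted file mode ", "new file mode ",
    "copy from ", "copy to ",
    "rename from ", "rename to ",
    "similarity index ", "dissimilarity index ",
    "index ",
)


def parse_extended_headers(lines):
    """Map each extended header keyword to the rest of its line."""
    headers = {}
    for line in lines:
        for prefix in EXTENDED_HEADER_PREFIXES:
            if line.startswith(prefix):
                headers[prefix.strip()] = line[len(prefix):]
                break
    return headers


headers = parse_extended_headers([
    "similarity index 95%",
    "rename from builtin-http-fetch.c",
    "rename to http-fetch.c",
    "index f3e63d7..e8f44ba 100644",
])
print(headers)
```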

Finally, there's a line like index 789bd4..0afb621 100644. You'll probably never care about it, but those short hex numbers are the abbreviated SHA1 hashes of the old and new blobs for this file (a blob is a git object storing raw data like a file's contents). And of course, the 100644 is the file's mode: the last three digits are the permissions; the first three give extra file metadata information (SO post describing that).
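
To see what the mode digits encode, you can decode them with Python's standard stat module. This is a sketch under assumptions: `parse_index_line` is an invented helper, and it treats the mode as optional because git leaves it off the index line when the file mode itself changes.

```python
import stat


def parse_index_line(line):
    """Split "index <old>..<new> [<mode>]" into its parts."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "index":
        raise ValueError(f"not an index line: {line!r}")
    old_blob, new_blob = fields[1].split("..")
    # The mode is octal; it is absent when the change also alters the mode.
    mode = int(fields[2], 8) if len(fields) > 2 else None
    return old_blob, new_blob, mode


old_blob, new_blob, mode = parse_index_line("index 789bd4..0afb621 100644")
print(old_blob, new_blob)
print(stat.S_ISREG(mode))    # is it a regular file?
print(stat.filemode(mode))   # permissions in "ls -l" style
```

For 100644 the file type bits say "regular file" and the permission bits read as rw-r--r--, that is, not executable.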

After that, you're on to standard unified diff output (just like the classic diff -U). It's split up into hunks; a hunk is a section of the file containing changes and their context. The hunks for each file are preceded by a pair of --- and +++ lines denoting the file in question, and each hunk then shows (by default) three lines of context on either side of the - and + lines that mark the removed/added lines.
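
Since a hunk interleaves the old and new versions of one region, both versions can be recovered from it. The Python sketch below shows the idea; `split_hunk` is a hypothetical name, and the hunk body is the simple example from an earlier answer.

```python
def split_hunk(body):
    """Return (old_lines, new_lines) reconstructed from a hunk body."""
    old_lines, new_lines = [], []
    for line in body:
        marker, text = line[:1], line[1:]
        if marker == "-":
            old_lines.append(text)       # removed: old version only
        elif marker == "+":
            new_lines.append(text)       # added: new version only
        elif marker == "\\":
            continue                     # "\ No newline at end of file"
        else:
            old_lines.append(text)       # context: both versions
            new_lines.append(text)
    return old_lines, new_lines


hunk = [
    " line1",
    " line2",
    "-this line will be deleted",
    " line4",
    " line5",
    "+this line is added",
]
before, after = split_hunk(hunk)
print(before)
print(after)
```

Context lines land in both lists, which is why they let a tool (or a reader) locate the change in the file.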