bash: using comm to diff two files

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8598250/


using comm to diff two files

bash shell

Asked by user121196

I am trying to use comm to compute the difference between two sorted files, but the result doesn't make sense. What's wrong? I want to show the strings that exist in test2 but not in test1, and then show the strings that exist in test1 but not in test2.

>test1
a
b
d
g

>test2
e
g 
k
p

>comm test1 test2
a
b
d
    e
g
    g 
    k
    p

Answered by ruakh

To show the lines that exist in test2 but not in test1, write either of these:

comm -13 test1 test2
comm -23 test2 test1

(-1 hides the column with lines that exist only in the first file; -2 hides the column with lines that exist only in the second file; -3 hides the column with lines that exist in both files.)

Conversely, to show the lines that exist in test1 but not in test2, use comm -23 test1 test2 or comm -13 test2 test1.
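The two directions can be tried side by side with a small script. This is only a sketch: the scratch directory created with mktemp and the variable names are assumptions for illustration, and LC_ALL=C is set so that the sort order comm expects is predictable.

```shell
#!/bin/sh
# Sketch: recreate the question's files in a scratch directory (assumed location).
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\nb\nd\ng\n' > test1
printf 'e\ng \nk\np\n' > test2   # note the trailing space after g

# Lines only in test2: hide column 1 (only in test1) and column 3 (in both).
only_in_test2=$(comm -13 test1 test2)
# Lines only in test1: hide column 2 (only in test2) and column 3 (in both).
only_in_test1=$(comm -23 test1 test2)

printf 'only in test2:\n%s\n' "$only_in_test2"
printf 'only in test1:\n%s\n' "$only_in_test1"
```

Because "g" and "g " differ, g shows up in both lists here rather than being treated as common.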

Note that g on a line by itself is considered distinct from g with a space after it, which is why you get

g
    g 

instead of

        g
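If the trailing space is unintended, one possible fix is to strip trailing whitespace before comparing. The following is a sketch under that assumption; the file name test2.clean and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Sketch: make "g " match "g" by stripping trailing whitespace first.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\nb\nd\ng\n' > test1
printf 'e\ng \nk\np\n' > test2

# sed's l command marks each line end with $, so a trailing space becomes visible.
sed -n 'l' test2

# Strip trailing whitespace, then re-sort because comm needs sorted input.
sed 's/[[:space:]]*$//' test2 | sort > test2.clean

# -12 hides columns 1 and 2, leaving only the lines present in both files.
common=$(comm -12 test1 test2.clean)
printf 'common: %s\n' "$common"
```

With the space removed, g is reported in the third column as a line common to both files.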

Answered by shellter

Add a line that is common to both files, say 'z' at the end. You'll see that a third column appears, indicating that the value is common to both.

The output is meant to show that data in column 1 is unique to file1, while data in column 2 is unique to file2.

Finally, the comm arguments -1, -2, and -3 suppress the output of the correspondingly numbered column; for example, -1 suppresses column 1.
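The three-column layout can be seen with a quick experiment. This is a sketch that appends a z line to both of the question's files; the scratch directory and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: add a line shared by both files and look at comm's three columns.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\nb\nd\ng\nz\n' > test1
printf 'e\ng \nk\np\nz\n' > test2

# No options: column 1 has no indent, column 2 is indented by one tab,
# and column 3 (lines in both files) is indented by two tabs.
all_columns=$(comm test1 test2)
printf '%s\n' "$all_columns"

# Suppress columns 1 and 2 to keep only the shared lines.
common_only=$(comm -12 test1 test2)
printf 'common: %s\n' "$common_only"
```

The z line is the only one printed with two leading tabs, which is the third column described above.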

I hope this helps.