Original source: http://stackoverflow.com/questions/11099894/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Comparing two unsorted lists in linux, listing the unique in the second file
Asked by mvrasmussen
I have 2 files with a list of numbers (telephone numbers).
I'm looking for a way to list the numbers in the second file that are not present in the first file.
I've tried various methods:
comm (getting some weird sorting errors)
fgrep -v -x -f second-file.txt first-file.txt (unsure of the result, there should be more)
Accepted answer by Hari Menon
grep -Fxv -f first-file.txt second-file.txt
Basically, this looks for all lines in second-file.txt which don't match any line in first-file.txt. It might be slow if the files are large.
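To make the flags concrete, here is a minimal sketch with made-up sample numbers. The temporary directory and the file contents are assumptions for illustration; only the grep command comes from the answer. -F treats each pattern as a fixed string, -x requires the whole line to match, -v inverts the match, and -f reads the patterns from first-file.txt. The awk line is a commonly used alternative, not part of the original answer, that keeps the first file in a hash table and may cope better with large inputs.

```shell
# Sample data (hypothetical numbers, for illustration only).
workdir=$(mktemp -d)
printf '%s\n' 5551001 5551002 5551003 > "$workdir/first-file.txt"
printf '%s\n' 5551003 5551004 5551001 5551005 > "$workdir/second-file.txt"

# Lines of second-file.txt that match no whole line of first-file.txt.
grep_result=$(grep -Fxv -f "$workdir/first-file.txt" "$workdir/second-file.txt")
printf '%s\n' "$grep_result"

# Alternative (not from the answer): remember every line of the first file,
# then print the lines of the second file that were never seen.
awk_result=$(awk 'NR == FNR { seen[$0]; next } !($0 in seen)' \
    "$workdir/first-file.txt" "$workdir/second-file.txt")
printf '%s\n' "$awk_result"

rm -rf "$workdir"
```

For this sample data both commands print 5551004 and 5551005, in the order they appear in the second file. Neither approach needs the input to be sorted.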
Also, once you sort the files (use sort -n if they are numeric), comm should also have worked. What error does it give? Try this:
comm -23 second-file-sorted.txt first-file-sorted.txt
Answer by rush
You need to use comm:
comm -13 first.txt second.txt
will do the job.
P.S. The order of the first and second files on the command line matters.
Also, you may need to sort the files first:
comm -13 <(sort first.txt) <(sort second.txt)
If the files are numeric, add the -n option to sort.
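The point about argument order can be checked with a small sketch. The sample numbers and the temporary directory are assumptions for illustration, and temporary sorted files stand in for the <(sort ...) process substitution so that the snippet also runs in a plain POSIX shell.

```shell
export LC_ALL=C   # make sort and comm agree on byte-wise ordering
workdir=$(mktemp -d)
printf '%s\n' 300 100 200 > "$workdir/first.txt"
printf '%s\n' 500 200 400 300 > "$workdir/second.txt"

# comm expects sorted input, so sort into temporary files first.
sort "$workdir/first.txt" > "$workdir/first.sorted"
sort "$workdir/second.txt" > "$workdir/second.sorted"

# -1 hides lines found only in the first argument, -3 hides lines common to
# both, leaving the lines that appear only in the second argument.
only_in_second=$(comm -13 "$workdir/first.sorted" "$workdir/second.sorted")

# Swapping the arguments answers the opposite question.
only_in_first=$(comm -13 "$workdir/second.sorted" "$workdir/first.sorted")

# comm -23 with swapped arguments selects the same lines as the first call.
same_as_first_call=$(comm -23 "$workdir/second.sorted" "$workdir/first.sorted")

printf 'only in second:\n%s\n' "$only_in_second"
printf 'only in first:\n%s\n' "$only_in_first"
rm -rf "$workdir"
```

With this data the first call lists 400 and 500, while the swapped call lists only 100.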
Answer by Nahuel Fouilleul
This should work:
comm -13 <(sort file1) <(sort file2)
It seems sort -n (numeric) does not work with comm, which expects its input to be sorted in lexical (alphanumeric) order, as plain sort produces.
f1.txt
1
2
21
50
f2.txt
1
3
21
50
21 should appear in the third column.
#WRONG
$ comm <(sort -n f1.txt) <(sort -n f2.txt)
                1
2
21
        3
        21
                50
#OK
$ comm <(sort f1.txt) <(sort f2.txt)
                1
2
                21
        3
                50
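One way to tell in advance whether a file is in the order comm expects is sort -c, which only checks the order and reports the result through its exit status. The sketch below reuses the f1.txt and f2.txt contents from above; the temporary directory, the .numeric/.lexical file names, and the use of LC_ALL=C are assumptions for illustration.

```shell
export LC_ALL=C   # byte-wise collation for sort and comm alike
workdir=$(mktemp -d)
printf '%s\n' 1 2 21 50 > "$workdir/f1.txt"
printf '%s\n' 1 3 21 50 > "$workdir/f2.txt"

sort -n "$workdir/f2.txt" > "$workdir/f2.numeric"   # numeric order
sort    "$workdir/f2.txt" > "$workdir/f2.lexical"   # lexical order
sort    "$workdir/f1.txt" > "$workdir/f1.lexical"

# sort -c only checks the order: exit status 0 means "already sorted".
if sort -c "$workdir/f2.numeric" 2>/dev/null; then numeric_ok=yes; else numeric_ok=no; fi
if sort -c "$workdir/f2.lexical" 2>/dev/null; then lexical_ok=yes; else lexical_ok=no; fi
echo "numeric order accepted: $numeric_ok, lexical order accepted: $lexical_ok"

# With lexically sorted input, 21 is reported among the common lines (-12).
common=$(comm -12 "$workdir/f1.lexical" "$workdir/f2.lexical")
printf '%s\n' "$common"
rm -rf "$workdir"
```

Here the numerically sorted copy of f2.txt fails the check because 3 precedes 21, which is the same ordering problem that puts 21 in the wrong column above.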
Answer by tom
cat f1.txt f2.txt | sort | uniq > file3
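As written, this pipeline appears to produce the de-duplicated union of both files rather than only the lines unique to the second one. A possible variation, which is an assumption and not part of the original answer, is to list the first file twice and use uniq -u, which keeps only lines that occur exactly once. The sketch reuses the f1.txt and f2.txt sample contents from the previous answer and assumes f2.txt contains no repeated lines of its own.

```shell
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 1 2 21 50 > "$workdir/f1.txt"
printf '%s\n' 1 3 21 50 > "$workdir/f2.txt"

# The pipeline from the answer: every distinct line from either file.
union=$(cat "$workdir/f1.txt" "$workdir/f2.txt" | sort | uniq)

# Variation (not from the answer): list f1.txt twice so none of its lines
# can be unique, then keep only the lines that occur exactly once.
only_in_f2=$(cat "$workdir/f1.txt" "$workdir/f1.txt" "$workdir/f2.txt" | sort | uniq -u)

printf 'union:\n%s\n' "$union"
printf 'only in f2.txt:\n%s\n' "$only_in_f2"
rm -rf "$workdir"
```

For this data the union is 1, 2, 21, 3, 50, while the variation prints only 3.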