Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15645847/

Time: 2020-09-18 04:59:51  Source: igfitidea

Display duplicate lines in two different files

Tags: linux, bash

Asked by Chad D

I have two files and I would like to display the duplicate lines. I tried this, but it doesn't work:

cat id1.txt | while read id; do grep "$id" id2.txt; done

I am wondering if there is any other way to display the duplicate lines in the files. Both of my files contain a list of IDs. Thank you.

Answered by Jonathan Leffler

Are the files sorted? Can they be sorted?

If sorted:

comm -12 id1.txt id2.txt

If not sorted but using bash 4.x:

comm -12 <(sort id1.txt) <(sort id2.txt)

There are solutions using temporary files if you don't have bash 4.x and 'process substitution'.
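
A minimal sketch of one such temporary-file solution, assuming mktemp is available on your system. The sample IDs and the names workdir and common are made up for illustration; substitute your own id1.txt and id2.txt.

```shell
# Hypothetical sample data: two unsorted ID lists in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 317 101 205 > "$workdir/id1.txt"
printf '%s\n' 999 317 101 > "$workdir/id2.txt"

# Sort each file into a temporary copy, since comm expects sorted input.
sort "$workdir/id1.txt" > "$workdir/id1.sorted"
sort "$workdir/id2.txt" > "$workdir/id2.sorted"

# -12 suppresses the lines unique to either file, leaving the common ones.
common=$(comm -12 "$workdir/id1.sorted" "$workdir/id2.sorted")
printf '%s\n' "$common"

# Remove the scratch directory and everything in it.
rm -rf "$workdir"
```

With this sample data the script prints 101 and 317, one per line. It uses no process substitution, so it should also run under a plain POSIX sh.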

You could also use grep -F:

grep -F -f id1.txt id2.txt

This looks for the words in id1.txt that appear in id2.txt. The only problem here is ensuring that an ID 1 doesn't match every ID containing a 1 somewhere. The -w or -x options available in some versions of grep will work here.
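
To illustrate the substring problem, here is a small sketch with made-up IDs (the file contents and the variable names loose and exact are assumptions for the demo). It compares plain grep -F with grep -F -x, which only accepts whole-line matches.

```shell
# Hypothetical sample data: ID 1 is a substring of 10 and 21.
workdir=$(mktemp -d)
printf '%s\n' 1 42 > "$workdir/id1.txt"
printf '%s\n' 10 21 42 1 > "$workdir/id2.txt"

# Fixed-string matching alone: every line of id2.txt containing "1" or "42".
loose=$(grep -F -f "$workdir/id1.txt" "$workdir/id2.txt")

# Adding -x: a line of id2.txt must equal a line of id1.txt exactly.
exact=$(grep -F -x -f "$workdir/id1.txt" "$workdir/id2.txt")

printf 'loose:\n%s\nexact:\n%s\n' "$loose" "$exact"

rm -rf "$workdir"
```

Here the loose match returns all four lines of id2.txt (10, 21, 42, 1), while the exact match returns only 42 and 1. Note that the output follows the order of id2.txt, not id1.txt.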

Answered by kamituel

If by detecting duplicates you mean printing lines which are present in both files (or duplicated within one file), you can use uniq:

$ cat file1 file2 | sort | uniq -d
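
As noted, this pipeline also reports lines repeated inside a single file. If you only want lines shared between the two files, one possible variation (a sketch with made-up data, not part of the original answer) is to de-duplicate each file with sort -u first:

```shell
# Hypothetical sample data: "a" is repeated inside file1 only, "b" is in both.
workdir=$(mktemp -d)
printf '%s\n' a a b > "$workdir/file1"
printf '%s\n' b c > "$workdir/file2"

# Original pipeline: reports "a" as well, although it never appears in file2.
plain=$(cat "$workdir/file1" "$workdir/file2" | sort | uniq -d)

# De-duplicate each file first, so a repeat can only come from the other file.
sort -u "$workdir/file1" > "$workdir/file1.uniq"
sort -u "$workdir/file2" > "$workdir/file2.uniq"
cross=$(sort "$workdir/file1.uniq" "$workdir/file2.uniq" | uniq -d)

printf 'plain:\n%s\ncross:\n%s\n' "$plain" "$cross"

rm -rf "$workdir"
```

With this data the plain pipeline prints a and b, while the de-duplicated variant prints only b.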

Answered by Tuxdude

You could use the comm command instead:

sort id1.txt > id1.txt.sorted
sort id2.txt > id2.txt.sorted
comm -12 id1.txt.sorted id2.txt.sorted

If you want to do it in one command:

comm -12 <(sort id1.txt) <(sort id2.txt)

Arguments to comm:

  • The -1 argument suppresses lines unique to the first file.
  • The -2 argument suppresses lines unique to the second file.
  • If you pass a -3 argument, it suppresses the common lines.
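
The three flags can be combined to select any one of comm's output columns. A short sketch with made-up, already sorted sample files (the IDs and variable names are assumptions for illustration):

```shell
# Hypothetical sample data; both files are already sorted.
workdir=$(mktemp -d)
printf '%s\n' 101 205 317 > "$workdir/id1.txt"
printf '%s\n' 101 317 999 > "$workdir/id2.txt"

# -23 hides columns 2 and 3: only lines unique to the first file remain.
only_first=$(comm -23 "$workdir/id1.txt" "$workdir/id2.txt")

# -13 hides columns 1 and 3: only lines unique to the second file remain.
only_second=$(comm -13 "$workdir/id1.txt" "$workdir/id2.txt")

# -12 hides columns 1 and 2: only lines common to both files remain.
both=$(comm -12 "$workdir/id1.txt" "$workdir/id2.txt")

printf 'only in id1: %s\nonly in id2: %s\n' "$only_first" "$only_second"
printf 'in both:\n%s\n' "$both"

rm -rf "$workdir"
```

For this data, 205 is only in id1.txt, 999 is only in id2.txt, and 101 and 317 are in both.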