Find duplicates in a field and print them in Unix bash

Question

Asked by t28292

I have a file that contains:

apple
apple
banana
orange
apple
orange

I want a script that finds the duplicates (apple and orange) and tells the user something like: apple and orange are repeated. I tried

nawk '!x[]++' FS="," filename

to find the repeated items. How can I print them out in Unix bash?

Answer 1

Answered by devnull

To print the duplicate lines, you can run:

$ sort filename | uniq -d
apple
orange

If you want to print the count as well, supply the -c option to uniq:

$ sort filename | uniq -dc
      3 apple
      2 orange
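
The question asked for a message rather than a bare list. A minimal sketch of that, assuming bash and one item per line as in the sample file; the function name report_duplicates and the message wording are illustrative and not part of the original answer:

```shell
#!/usr/bin/env bash
# report_duplicates FILE
# Hypothetical helper: prints one line naming every item that occurs
# more than once in FILE (one item per line is assumed).
report_duplicates() {
  local dups
  # sort groups identical lines; uniq -d keeps one copy of each repeated line;
  # paste -s joins those lines into a single comma-separated string.
  dups=$(sort -- "$1" | uniq -d | paste -sd, -)
  if [ -n "$dups" ]; then
    echo "The following are repeated: $dups"
  else
    echo "No duplicates found"
  fi
}

# Example call: report_duplicates filename
```

With the sample file from the question, this should report apple and orange.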

Answer 2

Answered by Varun

+1 for devnull's answer. However, if the file uses spaces instead of newlines as the delimiter, the following would work:

tr '[:blank:]' '\n' < filename | sort | uniq -d

Answer 3

Answered by hek2mgl

Update:

The question has been changed significantly. When I originally answered, the input file looked like this:

apple apple banana orange apple orange
banana orange apple
...

The solution still works for the new input, although it may be more complicated than this particular use case needs.

The following awk script will do the job:

awk '{i=1;while(i <= NF){a[$(i++)]++}}END{for(i in a){if(a[i]>1){print i,a[i]}}}' your.file

Output:

apple 3
orange 2

It is easier to understand when written out like this:

#!/usr/bin/awk -f

{
  i=1;
  # iterate through every field
  while(i <= NF) {
    a[$(i++)]++; # count occurrences of every field
  }
}

# after all input lines have been read ...
END {
  for(i in a) {
    # ... print those fields which occurred more than 1 time
    if(a[i] > 1) {
      print i,a[i];
    }
  }
}

Save it as script.awk, make it executable, and run it with the input file name as its argument:

chmod +x script.awk
./script.awk your.file
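
The question's own attempt set FS=",", which suggests the real data may be comma-separated with the items in one column. A minimal sketch under that assumption, counting values in a single chosen field only; the function name dup_in_field and its arguments are hypothetical, not taken from any answer above:

```shell
#!/usr/bin/env bash
# dup_in_field COLUMN FILE
# Hypothetical helper: assumes FILE is comma-separated and prints each value
# of field COLUMN that occurs more than once, followed by its count.
dup_in_field() {
  awk -F, -v col="$1" '
    { count[$col]++ }                  # tally the chosen field on every line
    END {
      for (v in count)
        if (count[v] > 1) print v, count[v]
    }
  ' "$2" | sort                        # awk array order is unspecified, so sort
}

# Example call: dup_in_field 1 your.file
```

The trailing sort is there only to make the output order predictable, since awk does not guarantee the order of "for (v in count)".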
