Source: http://stackoverflow.com/questions/1092405/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
counting duplicates in a sorted sequence using command line tools
Asked by letronje
I have a command (cmd1) that greps through a log file to extract a set of numbers. The numbers are in random order, so I use sort -gr to get a reverse-sorted list of numbers. There may be duplicates within this sorted list. I need to find the count for each unique number in that list.
For example, if the output of cmd1 is:
100
100
100
99
99
26
25
24
24
I need another command that I can pipe the above output to, so that I get:
100 3
99 2
26 1
25 1
24 2
Answered by Stephen Paul Lesniewski
How about:
$ echo "100 100 100 99 99 26 25 24 24" \
| tr " " "\n" \
| sort \
| uniq -c \
| sort -k2nr \
| awk '{printf("%s\t%s\n",$2,$1)}'
The result is:
100 3
99 2
26 1
25 1
24 2
Answered by Ibrahim
uniq -c works for GNU uniq 8.23 at least, and does what you want (assuming sorted input), although it prints the count before the value.
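The "assuming sorted input" caveat matters because uniq only compares adjacent lines. A minimal sketch of that behaviour follows. The helper name count_adjacent is made up for illustration, and the awk step is only there to print the value before the count.

```shell
# count_adjacent is a hypothetical helper name, not part of the answer above.
# It reads lines on stdin and prints "value count" for each run of equal lines.
count_adjacent() {
    # uniq -c emits "<count> <value>"; awk swaps the two fields.
    uniq -c | awk '{ print $2, $1 }'
}

# Sorted input: equal values are adjacent, so each value forms one run.
printf '%s\n' 100 100 99 | count_adjacent

# Unsorted input: the two 100s are not adjacent, so they are counted separately.
printf '%s\n' 100 99 100 | count_adjacent
```

With the sorted input this should print 100 2 and 99 1. With the unsorted input the value 100 appears twice, each time with a count of 1, which is why the input has to be sorted first.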
Answered by ghostdog74
If order is not important:
# echo "100 100 100 99 99 26 25 24 24" | awk '{for(i=1;i<=NF;i++)a[$i]++}END{for(o in a) printf "%s %s ",o,a[o]}'
26 1 100 3 99 2 24 2 25 1
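The pairs above come out in no particular order because awk's for (o in a) loop visits keys in an unspecified order. One way to get the ordering back is to print one pair per line and sort afterwards. A sketch, assuming numeric values; the function name count_any_order and the array name seen are illustrative, not from the answer:

```shell
# count_any_order is a hypothetical name used only for this sketch.
count_any_order() {
    # Count every whitespace-separated field, print one "value count" pair
    # per line, then sort the pairs numerically by value, largest first.
    awk '{ for (i = 1; i <= NF; i++) seen[$i]++ }
         END { for (v in seen) print v, seen[v] }' |
        sort -k1,1nr
}

echo "100 100 100 99 99 26 25 24 24" | count_any_order
```

Since the counting happens in awk, the input does not need to be sorted beforehand; only the (much shorter) list of unique values is sorted.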
Answered by ericcurtin
Sort the numbers numerically in reverse, count the duplicates, then swap the two fields so that the value comes first. Align into columns.
printf '%d\n' 100 99 26 25 100 24 100 24 99 \
| sort -nr | uniq -c | awk '{printf "%-8s%s\n", $2, $1}'
100     3
99      2
26      1
25      1
24      2
Answered by Toby Speight
In Bash, we can use an associative array to count instances of each input value. Assuming we have the command $cmd1, e.g.
#!/bin/bash
cmd1='printf %d\n 100 99 26 25 100 24 100 24 99'
Then we can count values in the array variable a using the ++ mathematical operator on the relevant array entries:
declare -A a    # make a an associative array (needs Bash 4 or later)
while read -r i
do
    ((++a["$i"]))
done < <($cmd1)
We can print the resulting values:
for i in "${!a[@]}"
do
    echo "$i ${a[$i]}"
done
If the order of output is important, we might need an external sort of the keys:
for i in $(printf '%s\n' "${!a[@]}" | sort -nr)
do
    echo "$i ${a[$i]}"
done
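Putting the pieces above together, here is a sketch of a self-contained function that reads values on standard input. It assumes Bash 4 or later (for associative arrays) and values that contain no whitespace. The name count_values and its local variables are hypothetical, and it increments with plain arithmetic expansion rather than the ((++a[...])) form used above.

```shell
#!/bin/bash
# count_values is a hypothetical wrapper name; it needs Bash 4+ for local -A.
count_values() {
    local -A counts=()    # associative array: value -> number of occurrences
    local value
    while IFS= read -r value
    do
        [[ -n $value ]] || continue    # skip empty lines
        counts[$value]=$(( ${counts[$value]:-0} + 1 ))
    done
    # Keys come back in no particular order, so sort them before printing.
    # The unquoted $(...) relies on word splitting, hence the no-whitespace assumption.
    for value in $(printf '%s\n' "${!counts[@]}" | sort -nr)
    do
        echo "$value ${counts[$value]}"
    done
}

printf '%d\n' 100 99 26 25 100 24 100 24 99 | count_values
```

Because the function reads standard input, it can be placed at the end of an existing pipeline, e.g. after cmd1, without the input being sorted first.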