bash: generating frequency table from file

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/6044539/

Time: 2020-09-09 20:31:48  Source: igfitidea

Tags: bash, file, shell, awk

Asked by Javier

Given an input file containing one single number per line, how could I get a count of how many times an item occurred in that file?

cat input.txt
1
2
1
3
1
0

desired output (=>[1,3,1,1]):

cat output.txt
0 1
1 3
2 1
3 1

It would be great if the solution could also be extended to floating-point numbers.

Answered by Caleb

You mean you want a count of how many times each item appears in the input file? First sort it (using -n if the input is always numbers, as in your example), then count the unique results.

sort -n input.txt | uniq -c
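
The sort step matters because uniq only merges adjacent equal lines. A minimal sketch of the difference, using the question's sample data inlined through a hypothetical helper named sample (not part of the original answer):

```shell
# Hypothetical helper: emits the question's sample input, one number per line.
sample() { printf '1\n2\n1\n3\n1\n0\n'; }

# Without sorting, uniq -c reports one line per run of adjacent equal lines.
# No two equal values are adjacent here, so this prints 6 lines, each with count 1.
sample | uniq -c

# After sorting, equal values are adjacent, so each distinct value gets one line.
# This prints 4 lines with counts 1, 3, 1, 1 (count first, then the value).
sample | sort -n | uniq -c
```

The width of the padding in front of each count differs between uniq implementations, so it is safer to parse the output by field than by column position.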

Answered by glenn jackman

Another option:

awk '{n[$1]++} END {for (i in n) print i,n[i]}' input.txt | sort -n > output.txt

Answered by Mike Sherrill 'Cat Recall'

In addition to the other answers, you can use awk to make a simple graph. (But, again, it's not a histogram.)
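
The code for this answer is not present on this page. As a rough sketch of the idea only (not the author's original code), awk can count each value and then print one '#' per occurrence next to it. The function name freq_graph and the choice of '#' as the bar character are assumptions made for illustration:

```shell
# Hypothetical helper: print "value count bar" for each distinct value,
# where bar is a run of '#' characters as long as the count.
# Reads the named files, or standard input when no file is given.
freq_graph() {
  awk '{ n[$1]++ }
       END {
         for (v in n) {
           bar = ""
           for (i = 0; i < n[v]; i++) bar = bar "#"
           print v, n[v], bar
         }
       }' "$@" | sort -n
}

# Usage with the sample data from the question:
printf '1\n2\n1\n3\n1\n0\n' | freq_graph
```

For the sample data this prints:

0 1 #
1 3 ###
2 1 #
3 1 #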

Answered by pavium

At least some of that can be done with

sort output.txt | uniq -c

But the order (number, count) is reversed. This will fix that problem.

sort test.dat | uniq -c | awk '{print $2, $1}'

Answered by agc

Using maphimbu from the Debian stda package:

# use 'jot' to generate 100 random numbers between 1 and 5
# and 'maphimbu' to print sorted "histogram":
jot -r 100 1 5 | maphimbu -s 1

Output:

             1                20
             2                21
             3                20
             4                21
             5                18

maphimbu also works with floating point:

jot -r 100.0 10 15 | numprocess /%10/ | maphimbu -s 1

Output:

             1                21
           1.1                17
           1.2                14
           1.3                18
           1.4                11
           1.5                19

Answered by Chris Koknat

perl -lne '$h{$_}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' input.txt

Loop over each line with -n
Each $_ number increments hash %h
Once the END of input.txt has been reached,
sort {$a <=> $b} the hash numerically
Print the number $n and the frequency $h{$n}

Similar code which works on floating point (int truncates each value to its integer part before counting):

perl -lne '$h{int($_)}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' float.txt

float.txt

1.732
2.236
1.442
3.162
1.260
0.707

output:

0       1
1       3
2       1
3       1
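
The same truncation idea can be sketched in awk, with the bin width as a parameter. This is an illustration only and does not come from any of the answers; the function name bin_counts and its optional width argument are assumptions:

```shell
# Hypothetical helper: count values per bin of the given width (default 1).
# Each value is labelled with the lower edge of its bin.
# Note: int() truncates toward zero, so with negative input the two bins
# on either side of zero are merged into one.
bin_counts() {
  awk -v w="${1:-1}" '
    { b = int($1 / w) * w; n[b]++ }
    END { for (b in n) print b, n[b] }
  ' | sort -n
}

# Usage with the float.txt data shown above, bin width 1:
printf '1.732\n2.236\n1.442\n3.162\n1.260\n0.707\n' | bin_counts 1
```

With a width of 1 this reproduces the counts shown above (0 1, 1 3, 2 1, 3 1); passing a smaller width such as 0.5 gives finer bins.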