bash: generate a frequency table from a file

Question

Asked by Javier

Given an input file containing one single number per line, how could I get a count of how many times an item occurred in that file?


cat input.txt
1
2
1
3
1
0

desired output (=>[1,3,1,1]):


cat output.txt
0 1
1 3
2 1
3 1

It would be great if the solution could also be extended to floating-point numbers.


Answer 1

Answered by Caleb

You mean you want a count of how many times an item appears in the input file? First sort it (using -n if the input is always numbers, as in your example), then count the unique results.


sort -n input.txt | uniq -c
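
For floating-point input, one caveat is that uniq compares lines as text, so 1.5 and 1.50 would be counted as different items. The following is a minimal sketch, assuming the values should first be normalised to two decimal places; the function name freq_table_float and the chosen precision are illustrative and not part of the original answer.

```shell
# Hypothetical helper: reads one number per line on stdin and prints
# "count value" pairs, sorted numerically by value.
freq_table_float() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    # Normalise every value to two decimals so 1.5 and 1.50 compare equal,
    # then sort numerically and count adjacent duplicates.
    LC_ALL=C awk '{ printf "%.2f\n", $1 }' | LC_ALL=C sort -n | uniq -c
}

printf '1.5\n2\n1.50\n0.25\n' | freq_table_float
```

The width of the count column printed by uniq -c differs between implementations, so anything that parses the result should split on whitespace rather than on fixed columns.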

Answer 2

Answered by glenn jackman

Another option:


awk '{n[$1]++} END {for (i in n) print i,n[i]}' input.txt | sort -n > output.txt

Answer 3

Answered by Mike Sherrill 'Cat Recall'

In addition to the other answers, you can use awk to make a simple graph. (But, again, it's not a histogram.)

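
The answer's own code is not reproduced here. As a rough sketch of what such a graph could look like (not the author's code), the snippet below tallies each value in awk and prints one asterisk per occurrence; the function name freq_graph is hypothetical.

```shell
# Hypothetical helper: reads one value per line on stdin and prints
# each distinct value followed by one asterisk per occurrence.
freq_graph() {
    awk '
        { count[$1]++ }                  # tally each value
        END {
            for (v in count) {
                bar = ""
                for (i = 0; i < count[v]; i++) bar = bar "*"
                print v, bar
            }
        }
    ' | sort -n                          # awk iteration order is unspecified
}

printf '1\n2\n1\n3\n1\n0\n' | freq_graph
# 0 *
# 1 ***
# 2 *
# 3 *
```

The trailing sort -n is needed because awk's "for (v in count)" does not visit keys in any guaranteed order.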

Answer 4

Answered by pavium

At least some of that can be done with


sort output.txt | uniq -c

But the order of number and count is reversed. This will fix that problem:


sort test.dat | uniq -c | awk '{print $2, $1}'

Answer 5

Answered by agc

Using maphimbu from the Debian stda package:


# use 'jot' to generate 100 random numbers between 1 and 5
# and 'maphimbu' to print sorted "histogram":
jot -r 100 1 5 | maphimbu -s 1

Output:


             1                20
             2                21
             3                20
             4                21
             5                18

maphimbu also works with floating point:


jot -r 100.0 10 15 | numprocess /%10/ | maphimbu -s 1

Output:


             1                21
           1.1                17
           1.2                14
           1.3                18
           1.4                11
           1.5                19

Answer 6

Answered by Chris Koknat

perl -lne '$h{$_}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' input.txt

Loop over each line with -n
Each $_ number increments hash %h
Once the END of input.txt has been reached,
sort {$a <=> $b} the hash numerically
Print the number $n and the frequency $h{$n}


Similar code which works on floating point (int() truncates each value, so the counts are per integer part):


perl -lne '$h{int($_)}++; END{for $n (sort {$a <=> $b} keys %h) {print "$n\t$h{$n}"}}' float.txt

float.txt


output:

