Original question: http://stackoverflow.com/questions/15642754/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
command line for counting specific word occurrences in a file (such as numbers of keys in a json)
Asked by Mittenchops
Kind of new to command line stuff, but looking for some pointers.
I use the following quick script to count how many times a key is in a json file:
grep -wo "\"keyname\"" "filename.json" | uniq -c
1200 keyname
It works well, but gets repetitive when I want to test counts of a bunch of keys...
grep -wo "\"key1\"" "filename.json" | uniq -c
1200 key1
grep -wo "\"key2\"" "filename.json" | uniq -c
1201 key2
grep -wo "\"key3\"" "filename.json" | uniq -c
1199 key3
So, I'd like to upgrade it to take an array of keynames, stored in a textfile, rather than specify them individually in the keyname argument. If that stays a one-liner, and stays cat-free, even better.
I am not very good at one-liners, so here's what I tried instead:
(1) making a script called testkeys.sh:
#!/bin/bash
while read line
do
grep -wo $line "filename.json" | uniq -c
done
(2) making a key file called keys.txt
key1
key2
key3
(3) Then
$ ./testkeys.sh keys.txt
However, this ran without completing.
Thoughts?
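One likely reason the script never finishes: `while read line` reads from standard input, but `./testkeys.sh keys.txt` passes the file name as an argument that the script never uses, so `read` waits on the terminal. Running `./testkeys.sh < keys.txt` would feed it the keys. The following is a minimal sketch of a loop that takes the key file as an argument instead. The function name `count_keys` and the sample data are assumptions made up for illustration, and it counts matches with `wc -l` rather than `uniq -c`.

```shell
#!/bin/bash
# Hypothetical sample data, created only so the sketch runs on its own.
workdir=$(mktemp -d)
printf '%s\n' '{"key1": 1, "key2": 2}' '{"key1": 3, "key2": 4}' '{"key1": 5}' > "$workdir/filename.json"
printf '%s\n' 'key1' 'key2' > "$workdir/keys.txt"

# count_keys KEYFILE JSONFILE: print "<count> <key>" for every key in KEYFILE.
count_keys() {
    # The redirection after "done" feeds the key file to read;
    # without it, read waits on the terminal and the loop never ends.
    while IFS= read -r key; do
        [ -n "$key" ] || continue          # skip blank lines
        # -F fixed string, -w whole word, -o one output line per match
        count=$(grep -F -w -o -e "$key" "$2" | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "$key"
    done < "$1"
}

result=$(count_keys "$workdir/keys.txt" "$workdir/filename.json")
echo "$result"
rm -r "$workdir"
```

This runs grep once per key, so it rereads the JSON file for every line of the key file. That is fine for a handful of keys but slower than a single pass with `-f`.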
I was trying to find some way to make the lines of keys.txt into variables to go into a looped statement in the grep, but was unsuccessful. Desired output would be...
$ magic? | grep -wo $vars "filename.json" | uniq -c
1200 key1
1202 key2
1199 key3
UPDATE
I know that grep can use the -f flag to take a pattern file as an argument, but this still seems to require a major change of the script in ways I don't understand. So, for example...
Trying to convert...
grep -wo "\"keyname\"" "filename.json" | uniq -c
into...
grep -wo -F -f keys.txt "filename.json" | uniq -c
produces
1 key1
1 key2
1 key1
1 key2
1 key1
1 key2
... a bunch of times. It also takes /much/ longer than running the single-key command n times.
I also tried this, which I thought would have been cool:
$ cat keys.txt | xargs -0 -I keyname grep -wo keyname "filename.json" | uniq -c
But this also ran for a long time and did not aggregate beyond count = 1.
Answered by FatalError
uniq -c counts the number of consecutive occurrences. So, you're almost there, you just need a sort:
grep -wo -F -f keys.txt "filename.json" | sort | uniq -c
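To see the difference on a small input, here is a sketch that runs the pipeline with and without sort. The file contents are made-up sample data, not the asker's actual files.

```shell
#!/bin/bash
# Hypothetical sample data, created only for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' '{"key1": 1, "key2": 2}' '{"key1": 3, "key2": 4}' '{"key1": 5}' > "$workdir/filename.json"
printf '%s\n' 'key1' 'key2' > "$workdir/keys.txt"

# Without sort: matches arrive in file order (key1, key2, key1, ...),
# so uniq -c only ever sees runs of length 1.
unsorted=$(grep -wo -F -f "$workdir/keys.txt" "$workdir/filename.json" | uniq -c)

# With sort: identical matches become adjacent, so uniq -c can total them.
sorted=$(grep -wo -F -f "$workdir/keys.txt" "$workdir/filename.json" | sort | uniq -c)

echo "$unsorted"
echo "---"
echo "$sorted"
rm -r "$workdir"
```

With this sample data the unsorted pipeline prints five lines, each with a count of 1, while the sorted one reports key1 three times and key2 twice. The exact column padding of uniq -c differs between implementations.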

