Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15642754/

Time: 2020-09-18 04:59:30  Source: igfitidea

Command line for counting specific word occurrences in a file (such as the number of keys in a JSON file)

Tags: bash, shell, awk, grep, uniq

Asked by Mittenchops

Kind of new to command line stuff, but looking for some pointers.

I use the following quick script to count how many times a key is in a json file:

grep -wo "\"keyname\"" "filename.json" | uniq -c
1200 keyname

It works well, but gets repetitive when I want to test counts of a bunch of keys...

grep -wo "\"key1\"" "filename.json" | uniq -c
1200 key1
grep -wo "\"key2\"" "filename.json" | uniq -c
1201 key2
grep -wo "\"key3\"" "filename.json" | uniq -c
1199 key3

So, I'd like to upgrade it to take an array of keynames, stored in a textfile, rather than specify them individually in the keyname argument. If that stays a one-liner, and stays cat-free, even better.

I am not very good at one-liners, so here's what I tried instead:

(1) making a script called testkeys.sh:

#!/bin/bash
while read line
do
    grep -wo $line "filename.json" | uniq -c
done

(2) making a key file called keys.txt

key1
key2
key3

(3) Then

$ ./testkeys.sh keys.txt 

However, this ran without completing.
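
One likely reason for the hang, offered as a hedged guess rather than something stated in the post: `while read line` reads from standard input, and the script never looks at its `keys.txt` argument, so it sits waiting for typed input. Below is a minimal sketch that redirects the key file into the loop. The function name `count_keys` and the second (JSON file) parameter are hypothetical, and counting with `wc -l` is a swapped-in alternative to `uniq -c`.

```shell
#!/bin/bash
# count_keys KEYFILE JSONFILE: print "<count> <key>" for each key in KEYFILE.
# The function name and the JSONFILE parameter are illustrative, not from the post.
count_keys() {
    keyfile=$1
    jsonfile=$2
    # Redirecting the key file into the loop; without this, read waits on stdin.
    while IFS= read -r key; do
        [ -n "$key" ] || continue            # skip blank lines
        # -F: fixed string, -w: whole word, -o: one match per output line
        count=$(grep -woF -- "\"$key\"" "$jsonfile" | wc -l)
        # $((count)) strips any padding some wc implementations add
        echo "$((count)) $key"
    done < "$keyfile"
}

# Run only when both arguments are supplied,
# e.g. ./testkeys.sh keys.txt filename.json
if [ "$#" -eq 2 ]; then
    count_keys "$1" "$2"
fi
```

This still runs one `grep` per key, so it trades speed for simplicity; a key with no matches is reported with a count of 0.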

Thoughts?

I was trying to find some way to make the lines of keys.txt into variables to go into a looped statement in the grep, but was unsuccessful. Desired output would be...

$ magic? | grep -wo $vars "filename.json" | uniq -c
1200 key1
1202 key2
1199 key3

UPDATE

I know that grep can use the -f flag to take a pattern file as an argument, but this still seems to require a major change of the script in ways I don't understand. So, for example...

Trying to convert...

grep -wo "\"keyname\"" "filename.json" | uniq -c

into...

grep -wo -F -f keys.txt "filename.json" | uniq -c

produces

1 key1
1 key2
1 key1
1 key2
1 key1
1 key2

... a bunch of times. It also takes /much/ longer than running the individual grep once per key, n times over.

I also tried this, which I thought would have been cool:

$ cat keys.txt | xargs -0 -I keyname grep -wo keyname "filename.json" | uniq -c

But this also ran for a long time and did not aggregate beyond count = 1.

Answered by FatalError

uniq -c counts the number of consecutive occurrences. So, you're almost there, you just need a sort:

grep -wo -F -f keys.txt "filename.json" | sort | uniq -c
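
To see why the sort matters, here is a small sketch on made-up data. The files `sample.json` and `keys.txt` in a temporary directory, and their contents, are assumptions for illustration and not from the post. The first pipeline reproduces the run-length behaviour from the question; the second sorts the matches first so that identical keys sit next to each other.

```shell
#!/bin/bash
# Hypothetical sample data, invented for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' '{"key1": 1, "key2": 2}' '{"key1": 3, "key2": 4}' '{"key1": 5}' \
    > "$tmpdir/sample.json"
printf '%s\n' key1 key2 > "$tmpdir/keys.txt"

# Without sort: matches come out interleaved (key1, key2, key1, ...),
# so uniq -c sees only runs of length 1.
grep -woF -f "$tmpdir/keys.txt" "$tmpdir/sample.json" | uniq -c

# With sort: identical matches become adjacent and are counted together.
counts=$(grep -woF -f "$tmpdir/keys.txt" "$tmpdir/sample.json" | sort | uniq -c)
echo "$counts"

rm -rf "$tmpdir"
```

On this sample, the sorted pipeline reports 3 for key1 and 2 for key2; the exact column padding of `uniq -c` varies between implementations.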