bash command to count occurrences of a word in an entire file
Original source: http://stackoverflow.com/questions/21603555/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by hardy_sandy
I am trying to count the occurrences of a word in a file.
If the word occurs multiple times in a line, it is counted as 1.
The following command gives me a count, but it fails when a line contains multiple occurrences of the word:
grep -c "word" filename.txt
Is there a one-liner?
Answered by fedorqui 'SO stop harming'
You can use grep -o to print only the matched parts, one per line, and then count them:
grep -o "word" filename.txt | wc -l
Test
$ cat a
hello hello how are you
hello i am fine
but
this is another hello
$ grep -c "hello" a # Normal `grep -c` fails
3
$ grep -o "hello" a
hello
hello
hello
hello
$ grep -o "hello" a | wc -l # grep -o solves it!
4
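One thing the test above does not cover is that the pattern also matches inside longer words. The following is a minimal sketch, assuming a grep that supports both -o and -w (GNU and BSD grep do; neither flag is required by POSIX). The sample file and variable names are made up for illustration.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'hello hello how are you\nhelloworld says hello\n' > "$sample"

# Every match of the pattern, including the one inside "helloworld".
substring_count=$(grep -o "hello" "$sample" | wc -l)

# With -w, only matches that form a whole word are kept.
word_count=$(grep -ow "hello" "$sample" | wc -l)

echo "substring matches: $substring_count"
echo "whole-word matches: $word_count"
rm -f "$sample"
```

With this sample, the first count should be 4 and the second 3, because "helloworld" contributes a substring match but not a whole-word match.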
Answered by BMW
For a shorter command, set RS (the record separator) in awk:
awk 'END{print NR-1}' RS="word" file
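The RS approach depends on an awk that accepts a multi-character record separator (POSIX leaves that behavior unspecified), and it appears to rely on some text, even just a newline, following the last occurrence. As an alternative sketch, not the method from this answer, the return value of gsub can be summed instead, since gsub reports how many substitutions it made. The sample file and variable names are assumptions for illustration, and w is treated as a regular expression.

```shell
# Hypothetical sample file; note that it ends with the word and has no trailing newline.
sample=$(mktemp)
printf 'hello hello how are you\nhello i am fine\nbut\nthis is another hello' > "$sample"

# gsub(w, "&") replaces each match with itself and returns the number of matches,
# so summing the return value over all lines gives the total number of occurrences.
# n + 0 makes awk print 0 instead of an empty line when nothing matched.
gsub_count=$(awk -v w="hello" '{ n += gsub(w, "&") } END { print n + 0 }' "$sample")

echo "$gsub_count"
rm -f "$sample"
```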
Answered by anubhava
GNU awk allows it to be done in a single command, without the use of multiple piped commands:
awk -v w="word" '$0==w{n++} END{print n}' RS=' |\n' file
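For an awk that does not treat RS as a regular expression, a similar whole-token comparison can be sketched by looping over the fields of each line. This is an illustrative variant under that assumption, not the command from the answer; the sample file and variable names are hypothetical.

```shell
# Hypothetical sample file; the second line has punctuation attached to the word.
sample=$(mktemp)
printf 'hello hello how are you\nhello, i am fine\nthis is another hello\n' > "$sample"

# Compare every whitespace-separated field with w as a plain string.
# "hello," is not equal to "hello", so it is not counted.
field_count=$(awk -v w="hello" '{ for (i = 1; i <= NF; i++) if ($i == w) n++ } END { print n + 0 }' "$sample")

echo "$field_count"
rm -f "$sample"
```

Because the comparison is an exact string match on each field, the token with the trailing comma is skipped and the sample yields 3.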
Answered by atk
tr ' ' '\n' < file | grep -c word
This assumes that all words in the file are separated by spaces. If punctuation joins the word to another occurrence of itself, or there are otherwise no spaces between two occurrences on a single line, they will be counted as one.
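The caveat about punctuation can be reproduced with a small sketch. It assumes tr is used to put each token on its own line, which is one possible way to do the split; the sample text and variable names are made up for illustration.

```shell
# Hypothetical input: two occurrences joined by a comma, then a third after a space.
text='word,word word'

# Splitting on spaces only: "word,word" stays on one line, so it is counted once.
space_split=$(printf '%s\n' "$text" | tr ' ' '\n' | grep -c word)

# Turning every run of non-alphanumeric characters into a newline separates all three.
# grep -x then counts only lines that are exactly "word".
alnum_split=$(printf '%s\n' "$text" | tr -cs '[:alnum:]' '\n' | grep -cx word)

echo "split on spaces: $space_split"
echo "split on non-alphanumerics: $alnum_split"
```

Here the space-only split reports 2 while the alphanumeric split reports 3.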
Answered by Michael
grep word filename.txt | wc -l
grep prints the lines that match, then wc -l prints the number of lines matched. Like grep -c, this counts matching lines rather than individual occurrences.