Bash command to count occurrences of a word in an entire file

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/21603555/

Time: 2020-09-18 09:29:14  Source: igfitidea

command to count occurrences of word in entire file

bash, shell, grep

Asked by hardy_sandy

I am trying to count the occurrences of a word in a file.

If the word occurs multiple times in a line, every occurrence should be counted, not just 1 for that line.

The following command gives me a count, but it fails if a line has multiple occurrences of the word, because it counts matching lines rather than matches:

grep -c "word" filename.txt

Is there a one-liner?

Answered by fedorqui 'SO stop harming'

You can use grep -o to print each exact match on its own line and then count the lines:

grep -o "word" filename.txt | wc -l

Test

$ cat a
hello hello how are you
hello i am fine
but
this is another hello

$ grep -c "hello" a    # Normal `grep -c` fails
3

$ grep -o "hello" a 
hello
hello
hello
hello
$ grep -o "hello" a | wc -l   # grep -o solves it!
4
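
Note that grep -o counts substring matches, so a pattern like hello also matches inside a longer word. The following is a minimal sketch of the difference, assuming a grep that supports -o and -w (GNU and BSD grep do; neither option is in POSIX). The sample data and the token "helloworld" are hypothetical additions for illustration.

```shell
# Sample data modelled on the test file above; "helloworld" is an added,
# hypothetical token that exposes the substring problem.
sample=$(mktemp)
printf '%s\n' \
  'hello hello how are you' \
  'hello i am fine helloworld' \
  'but' \
  'this is another hello' > "$sample"

# Number of lines containing the pattern (what grep -c reports).
line_count=$(grep -c 'hello' "$sample")

# Every match, including the one inside "helloworld".
# tr -d ' ' strips the padding some wc implementations add.
substring_count=$(grep -o 'hello' "$sample" | wc -l | tr -d ' ')

# Whole-word matches only, thanks to -w.
word_count=$(grep -ow 'hello' "$sample" | wc -l | tr -d ' ')

echo "lines=$line_count substrings=$substring_count words=$word_count"
rm -f "$sample"
```

With this sample the script prints lines=3 substrings=5 words=4.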

Answered by BMW

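For a shorter one-liner, set RS (the record separator) in awk to the word, so each occurrence ends a record. A multi-character RS is treated as a regular expression by GNU awk; POSIX leaves that case unspecified.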

awk 'END{print NR-1}' RS="word" file
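
Where a multi-character RS is not available, one possible alternative (a sketch, not part of the original answer) is to let gsub() do the counting, since it returns the number of replacements it made. The variable names w and n and the sample file are assumptions for illustration; w is interpreted as a regular expression, and substring matches are counted.

```shell
# Hypothetical sample file mirroring the test data used earlier on this page.
sample=$(mktemp)
printf '%s\n' \
  'hello hello how are you' \
  'hello i am fine' \
  'but' \
  'this is another hello' > "$sample"

# gsub() returns how many matches it replaced in the current record;
# replacing each match with itself ("&") leaves the text unchanged.
# n + 0 prints 0 rather than an empty line when nothing matched.
count=$(awk -v w='hello' '{ n += gsub(w, "&") } END { print n + 0 }' "$sample")

echo "$count"
rm -f "$sample"
```

For this sample the script prints 4, and it prints 0 for a file with no matches, where NR-1 would give -1 on an empty file.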

Answered by anubhava

GNU awk allows it to be done in a single command, without the use of multiple piped commands:

awk -v w="word" '$0==w{n++} END{print n}' RS=' |\n' file
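
The same whole-word count can be sketched without a regular-expression RS by looping over the fields of each line, which should work in any POSIX awk. This is an illustrative variant, not the answer's own command; the sample file and the token "helloworld" are hypothetical.

```shell
# Hypothetical sample file; "helloworld" must not be counted as "hello".
sample=$(mktemp)
printf '%s\n' \
  'hello hello how are you' \
  'hello i am fine helloworld' \
  'but' \
  'this is another hello' > "$sample"

# Default field splitting breaks each line on blanks and tabs;
# a field is counted only when it equals the word exactly.
count=$(awk -v w='hello' '
  { for (i = 1; i <= NF; i++) if ($i == w) n++ }
  END { print n + 0 }   # n + 0 prints 0 when there are no matches
' "$sample")

echo "$count"
rm -f "$sample"
```

For this sample the script prints 4. Like the RS-based command, it treats punctuation attached to a word as part of that word.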

Answered by atk

tr ' ' '\n' < file | grep -c word

This assumes that all words in the file are separated by spaces. If punctuation joins the word to another copy of itself, or there is otherwise no space between two occurrences on the same line, they count as one.
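
One way to relax the spaces-only assumption is to turn every run of non-alphanumeric characters into a newline before counting. The following is a sketch of that idea, assuming a tr that accepts the common -cs idiom with a character class (GNU and BSD tr do); the sample text is hypothetical.

```shell
# Hypothetical sample where occurrences are joined by punctuation.
sample=$(mktemp)
printf '%s\n' \
  'hello,hello! how are you' \
  'hello-hello' \
  'say "hello".' > "$sample"

# -c complements the set (everything that is NOT alphanumeric),
# -s squeezes each run of those characters into a single newline,
# so every word ends up on its own line.
# grep -x matches whole lines only, -c counts them.
count=$(tr -cs '[:alnum:]' '\n' < "$sample" | grep -cx 'hello')

echo "$count"
rm -f "$sample"
```

For this sample the script prints 5, because each punctuation-separated hello becomes its own line.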

Answered by Michael

grep word filename.txt | wc -l

grep prints the lines that match, then wc -l prints the number of lines matched. Like grep -c, this counts a line once even if the word occurs in it several times.