Bash command to count occurrences of a word in an entire file

Question

Asked by hardy_sandy

I am trying to count the occurrences of a word in a file.

If the word occurs multiple times in a line, that line is counted as 1.

The following command gives me a count, but the count is wrong when a line contains multiple occurrences of the word:

grep -c "word" filename.txt

Is there a one-liner?

Answer 1

Answered by fedorqui 'SO stop harming'

You can use grep -o to print only the matching parts, one per line, and then count them:

grep -o "word" filename.txt | wc -l

Test

$ cat a
hello hello how are you
hello i am fine
but
this is another hello

$ grep -c "hello" a    # Normal `grep -c` fails
3

$ grep -o "hello" a 
hello
hello
hello
hello
$ grep -o "hello" a | wc -l   # grep -o solves it!
4
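
One caveat, sketched here under the assumption that you want whole words only: grep -o "hello" also matches inside longer words such as helloworld. Adding -w restricts the matches to whole words. The file name sample.txt and its contents are made up for illustration.

```shell
# Hypothetical sample file, not part of the original question
printf '%s\n' 'hello hello helloworld' 'say hello' > sample.txt

# Counts every match, including the one inside "helloworld"
substring_count=$(grep -o "hello" sample.txt | wc -l)

# -w only accepts matches that form whole words
word_count=$(grep -ow "hello" sample.txt | wc -l)

echo "substring matches: $substring_count"
echo "whole-word matches: $word_count"
```

Which of the two counts is the right one depends on whether partial matches should count as occurrences.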

Answer 2

Answered by BMW

Set RS in awk for a shorter one.

awk 'END{print NR-1}' RS="word" file
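
Why NR-1: with RS set to the word, awk splits the input at every occurrence, so n occurrences produce n+1 records as long as some text (even just a trailing newline) follows the last occurrence. A rough sketch using the sample file from Answer 1, recreated here as a. It assumes an awk that accepts a multi-character RS, such as GNU awk; POSIX leaves that case unspecified.

```shell
# Recreate the sample file from the test in Answer 1
printf '%s\n' 'hello hello how are you' 'hello i am fine' 'but' 'this is another hello' > a

# Every "hello" terminates a record, so the record count is one more
# than the number of occurrences
records=$(awk 'END{print NR}' RS="hello" a)

# Subtract the extra record to get the occurrence count
count=$(awk 'END{print NR-1}' RS="hello" a)

echo "records: $records, occurrences: $count"
```

Like grep -o "hello", this counts substring matches, not whole words.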

Answer 3

Answered by anubhava

GNU awk allows it to be done in a single command, without multiple piped commands:

awk -v w="word" '$0==w{n++} END{print n}' RS=' |\n' file

Answer 4

Answered by atk

tr ' ' '\n' < file | grep -c word   # put each word on its own line, then count matching lines

This assumes that the words in the file are separated by spaces. If punctuation joins the word to another occurrence of itself, or there is otherwise no space between two occurrences on a line, they are counted as one.

Answer 5

Answered by Michael

grep word filename.txt | wc -l

grep prints the lines that match, then wc -l prints the number of those lines.
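
Because this pipeline counts matching lines, it gives the same number as grep -c. A small comparison on the sample file from Answer 1 (recreated here as a) shows how that differs from counting individual matches; the variable names are only illustrative.

```shell
# Recreate the sample file from the test in Answer 1
printf '%s\n' 'hello hello how are you' 'hello i am fine' 'but' 'this is another hello' > a

lines_with_word=$(grep "hello" a | wc -l)   # lines containing the word
occurrences=$(grep -o "hello" a | wc -l)    # individual matches

echo "lines: $lines_with_word, occurrences: $occurrences"
```

The first line of the file contains the word twice, which is why the two numbers differ by one.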
