Linux: count occurrences of a character in a plain text file

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1603566/

Time: 2020-08-03 17:48:32  Source: igfitidea

Count occurrences of a char in plain text file

linux, count, terminal, character

Asked by cupakob

Is there any way in a Linux terminal to count how many times the character f occurs in a plain text file?


Accepted answer by Cascabel

How about this:


fgrep -o f <file> | wc -l

Note: Besides being much easier to remember, duplicate, and customize, this is about three times faster than Vereb's answer (sorry, edited: I botched the first test).
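
The pipeline counts characters rather than lines because grep -o prints every match on its own line. A minimal sketch of that idea, assuming a grep that supports -o (GNU and BSD grep do; it is not in POSIX). The function name count_char_grep and the sample text are made up for illustration, and grep -F is used here as the equivalent of fgrep:

```shell
# Hypothetical helper: count occurrences of a fixed string ($1) in a file ($2).
count_char_grep() {
    # -F: treat the pattern as a fixed string (what fgrep does)
    # -o: print every match on its own line, so wc -l counts matches
    # tr -d ' ' strips the padding some wc implementations add
    grep -F -o -- "$1" "$2" | wc -l | tr -d ' '
}

sample=$(mktemp)
printf 'fluffy\nno match here\noff\n' > "$sample"
count_char_grep f "$sample"   # prints 5 (3 in "fluffy", 2 in "off")
rm -f "$sample"
```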


Answer by Vereb

echo $(cat <file>  | wc -c) - $(cat <file>  | tr -d 'A' | wc -c) | bc

where A is the character to count


Time for this command on a 4.9 MB file with 1100000 occurrences of the searched character:


real   0m0.168s
user   0m0.059s
sys    0m0.115s
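
Vereb's command subtracts the file size without the character from the full file size. The same subtraction can probably be written with shell arithmetic instead of echo and bc. A rough sketch, assuming a single-byte character (wc -c counts bytes); count_char_diff is a hypothetical name, not part of the answer:

```shell
# Hypothetical helper: count character $1 in file $2 by subtraction.
count_char_diff() {
    total=$(wc -c < "$2")                   # bytes in the whole file
    without=$(tr -d "$1" < "$2" | wc -c)    # bytes left after deleting the character
    echo $(( total - without ))             # the difference is the number of occurrences
}

sample=$(mktemp)
printf 'banana\n' > "$sample"
count_char_diff a "$sample"   # prints 3
rm -f "$sample"
```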

Answer by Rob Hruska

tr -d '\n' < file | sed 's/A/A\n/g' | wc -l


Replace the two occurrences of "A" with your character, and "file" with your input file.


  • tr -d '\n' < file: removes newlines
  • sed 's/A/A\n/g': adds a newline after every occurrence of "A"
  • wc -l: counts the number of lines

Example:


$ cat file
abcdefgabcdefgababababbbba


1234gabca

$ tr -d '\n' < file | sed 's/a/a\n/g' | wc -l
9
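
The \n in the sed replacement is understood by GNU sed, but other sed implementations may not treat it as a newline. A variant of the same three steps that swaps sed for a second tr might look like the sketch below. This is an assumption-laden alternative, not Rob Hruska's command, and count_char_lines is a made-up name:

```shell
# Hypothetical helper: count character $1 in file $2 without relying on sed.
count_char_lines() {
    # 1. drop the existing newlines so they are not counted
    # 2. turn every occurrence of the character into a newline
    # 3. count the newlines
    tr -d '\n' < "$2" | tr "$1" '\n' | wc -l | tr -d ' '
}

example=$(mktemp)
printf 'abcdefgabcdefgababababbbba\n\n\n1234gabca\n' > "$example"
count_char_lines a "$example"   # prints 9, matching the example above
rm -f "$example"
```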

Answer by Jongo the Gibbon

If all you need to do is count the number of lines containing your character, this will work:


grep -c 'f' myfile

However, it counts multiple occurrences of 'f' on the same line as a single match.
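
The difference between counting lines and counting occurrences can be seen side by side. A small sketch with made-up sample text, assuming a grep that supports -o:

```shell
demo=$(mktemp)
printf 'fluffy\nno match here\noff\n' > "$demo"

# grep -c counts lines that contain at least one f
lines_with_f=$(grep -c 'f' "$demo")
# grep -o prints each f on its own line, so wc -l counts every occurrence
total_f=$(grep -o 'f' "$demo" | wc -l | tr -d ' ')

echo "lines containing f: $lines_with_f"   # 2
echo "occurrences of f:   $total_f"        # 5
rm -f "$demo"
```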


Answer by user1985553

Even faster:


tr -cd f < file | wc -c

Time for this command on a 4.9 MB file with 1100000 occurrences of the searched character:


real   0m0.089s
user   0m0.057s
sys    0m0.027s
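
The tr -cd pipeline works because -c complements the set and -d deletes it, so everything except the searched character is thrown away and wc -c counts what remains. A minimal sketch wrapping it in a function, assuming a single-byte character; count_char_tr is a hypothetical name:

```shell
# Hypothetical helper: count character $1 in file $2.
count_char_tr() {
    # -c: complement the set (everything except the character)
    # -d: delete those characters, leaving only the occurrences
    # wc -c counts the bytes that are left; tr -d ' ' strips wc padding
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

sample=$(mktemp)
printf 'banana\n' > "$sample"
count_char_tr a "$sample"   # prints 3
rm -f "$sample"
```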

Time for Vereb's answer with echo, cat, tr and bc for the same file:


real   0m0.168s
user   0m0.059s
sys    0m0.115s

Time for Rob Hruska's answer with tr, sed and wc for the same file:


real   0m0.465s
user   0m0.411s
sys    0m0.080s

Time for Jefromi's answer with fgrep and wc for the same file:


real   0m0.522s
user   0m0.477s
sys    0m0.023s
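
Before comparing timings on your own machine, it may be worth checking that the approaches agree on the count. The sketch below builds a small test file (much smaller than the 4.9 MB file used above, and with made-up content) and runs three of the approaches on it; the variable names are illustrative only:

```shell
bench=$(mktemp)
# 1000 lines, each containing 3 f characters: 3000 occurrences in total
i=0
while [ "$i" -lt 1000 ]; do
    echo 'fluffy text'
    i=$((i + 1))
done > "$bench"

n_tr=$(tr -cd f < "$bench" | wc -c | tr -d ' ')                      # tr -cd approach
n_grep=$(grep -F -o f "$bench" | wc -l | tr -d ' ')                   # grep -o approach
n_diff=$(( $(wc -c < "$bench") - $(tr -d f < "$bench" | wc -c) ))     # subtraction approach

echo "tr: $n_tr  grep: $n_grep  subtraction: $n_diff"
rm -f "$bench"
```

In bash, each pipeline can then be prefixed with the time keyword to get real/user/sys figures like the ones above; the absolute numbers will depend on the machine and the file.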