How can I count the occurrences of a string within a file?
Original question: http://stackoverflow.com/questions/6741967/
License: this content comes from Stack Overflow and is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors.
Asked by Leo Chan
Take this code as an example. Treating it as an HTML/text file, how can I use bash to find the total number of times that the string echo appears?
new_user()
{
    echo "Preparing to add a new user..."
    sleep 2
    adduser     # run the adduser program
}
echo "1. Add user"
echo "2. Exit"
echo "Enter your choice: "
read choice
case $choice in
    1) new_user     # call the new_user() function
       ;;
    *) exit
       ;;
esac
Answer by Dmitry
The number of string occurrences (not lines) can be obtained using grep with the -o option, which prints each match on its own line, and wc -l, which counts those lines:
$ echo "echo 1234 echo" | grep -o echo
echo
echo
$ echo "echo 1234 echo" | grep -o echo | wc -l
2
So the full solution for your problem would look like this:
$ grep -o "echo" FILE | wc -l
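The pipeline above can be wrapped in a small function. This is only a sketch: the name count_occurrences and the demo file are made up for illustration, and the -F flag (match a literal string rather than a regex) is an addition that the answer itself does not use.

```shell
#!/usr/bin/env bash
# count_occurrences is a hypothetical helper name, not part of the original answer.
# Usage: count_occurrences STRING FILE
count_occurrences() {
    local needle=$1 file=$2
    # -o prints every match on its own line; -F treats the needle as a literal
    # string rather than a regex; "--" protects needles that start with "-".
    # The arithmetic expansion strips the padding some wc implementations print.
    echo $(( $(grep -o -F -- "$needle" "$file" | wc -l) ))
}

# Demo on a throwaway file: two matches on the first line, one on the second.
demo_file=$(mktemp)
printf 'echo 1234 echo\necho "done"\n' > "$demo_file"
count_occurrences echo "$demo_file"   # prints 3
rm -f "$demo_file"
```

Dropping -F brings back regex matching, which is what the plain grep -o "echo" FILE command does.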
Answer by Manny D
This will output the number of lines that contain your search string.
grep -c "echo" FILE
This won't, however, count the number of occurrences in the file (i.e., if echo appears multiple times on one line, that line is still counted only once).
Edit:
After playing around a bit, you could get the number of occurrences using this dirty little bit of code:
sed 's/echo/echo\n/g' FILE | grep -c "echo"
This basically adds a newline following every instance of echo so they're each on their own line, allowing grep to count those lines. You can refine the regex if you only want the word "echo", as opposed to "echoing", for example.
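One way to see the difference between counting lines, counting every occurrence, and counting only the whole word is the sketch below. It assumes a grep that supports -o and -w (GNU and BSD grep both do), and the sample text is invented for the demonstration. It swaps the sed step for grep -o, so it is not the exact command from this answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: "echo" appears twice as a word and once inside "echoing".
sample=$(mktemp)
printf 'echo one; echo two\nechoing three\n' > "$sample"

# Lines containing the pattern (what grep -c reports).
matching_lines=$(grep -c 'echo' "$sample")

# Every occurrence, including the one inside "echoing".
all_hits=$(( $(grep -o 'echo' "$sample" | wc -l) ))

# Whole-word occurrences only: -w requires a non-word character (or a line
# boundary) on both sides of the match.
word_hits=$(( $(grep -o -w 'echo' "$sample" | wc -l) ))

echo "lines=$matching_lines all=$all_hits words=$word_hits"   # prints: lines=2 all=3 words=2
rm -f "$sample"
```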
Answer by James Polley
I'm taking some guesses here, because I don't quite understand what you're asking.
I think that what you want is a count of the number of lines on which the pattern 'echo' appears in the given file.
I've pasted your sample text into a file called 6741967.
First, grep finds the matches:
james@Brindle:tmp$ grep echo 6741967
echo "Preparing to add a new user..."
echo "1. Add user"
echo "2. Exit"
echo "Enter your choice: "
Second, use wc -l to count the lines:
james@Brindle:tmp$ grep echo 6741967 | wc -l
4
Answer by Timmmm
None of the existing answers worked for me with a single-line 10GB file. Grep runs out of memory even on a machine with 768 GB of RAM!
$ cat /proc/meminfo | grep MemTotal
MemTotal: 791236260 kB
$ ls -lh test.json
-rw-r--r-- 1 me all 9.2G Nov 18 15:54 test.json
$ grep -o '0,0,0,0,0,0,0,0,' test.json | wc -l
grep: memory exhausted
0
So I wrote a very simple Rust program to do it.
- Install Rust.
- cargo install count_occurences
$ count_occurences '0,0,0,0,0,0,0,0,' test.json
99094198
It's a little slow (1 minute for 10GB), but at least it doesn't run out of memory!
Answer by beginner
If you just want the number of occurrences, you can do this:
$ grep -c "string_to_count" file_name
Note that grep -c counts matching lines, so the result equals the number of occurrences only when the string appears at most once per line.