bash: using grep to count how many times a word is repeated in a file
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/21054875/
Use grep to count the number of times a word is repeated in a file
Asked by linbianxiaocao
The problem is as follows. I have a file "a.xml" that contains just one line:
<queue><item><cause><item>
I want to find how many times <item> occurs; in this case the answer is 2.
However, if I run:
grep -c "<item>" a.xml
It gives me only 1, because grep stops as soon as it matches the first <item>.
So my question is: what simple shell/bash command returns the number of times <item> occurs?
It looks simple, but I cannot find a good way to do it. Any ideas?
Answered by MillaresRoo
You may try something like:
grep -o "<item>" a.xml | wc -l
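For context, grep -c reports the number of matching lines rather than the number of matches, which is why the command in the question prints 1 for a one-line file. The sketch below contrasts the two on a temporary copy of the sample line. The temporary directory and the variable names are illustrative, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Recreate the one-line sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '<queue><item><cause><item>' > "$tmpdir/a.xml"

# -c counts matching lines: only one line matches, however many hits it holds.
lines=$(grep -c "<item>" "$tmpdir/a.xml")

# -o prints every match on its own line, so wc -l counts individual matches.
# tr strips the padding that some wc implementations add.
matches=$(grep -o "<item>" "$tmpdir/a.xml" | wc -l | tr -d ' ')

echo "matching lines: $lines"
echo "matches: $matches"

rm -r "$tmpdir"
```

With the sample line, the first count should be 1 and the second 2.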
Answered by anubhava
Using awk, you can do that in a single command:
awk -F '<item>' '{print NF-1}' a.xml
Online Demo: http://ideone.com/vheDgq
Or, to get the total count for the whole file, use:
awk -F '<item>' '{s+=NF-1}END{print s}' a.xml
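The awk commands above work because each occurrence of the separator adds one field, so a line with n occurrences has n+1 fields. One caveat worth checking: an empty line has NF equal to 0, so NF-1 contributes -1 there. The sketch below uses a hypothetical file b.xml with a blank line in the middle and guards the sum with an NF condition. The guard and the s+0 (which prints 0 instead of an empty string when nothing matched) are adaptations, not part of the answer above.

```shell
# Hypothetical sample: two lines with tags and an empty line between them.
tmpdir=$(mktemp -d)
printf '%s\n' '<queue><item><cause><item>' '' '<item><queue>' > "$tmpdir/b.xml"

# Unguarded sum: the empty line has NF == 0 and subtracts one from the total.
unguarded=$(awk -F '<item>' '{s+=NF-1} END{print s+0}' "$tmpdir/b.xml")

# Guarded sum: the NF pattern skips records that have no fields at all.
guarded=$(awk -F '<item>' 'NF {s+=NF-1} END{print s+0}' "$tmpdir/b.xml")

echo "unguarded total: $unguarded"
echo "guarded total: $guarded"

rm -r "$tmpdir"
```

Here the file contains three <item> occurrences, so the guarded total is the one that matches.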
Answered by John1024
If you just want to count '<item>' alone, then I like MillaresRoo's grep -o solution. If you want to count items more generally, then consider:
$ sed 's/></>\n</g' a.xml | sort | uniq -c
1 <cause>
2 <item>
1 <queue>
Or, showing the input explicitly on the command line:
$ echo '<queue><item><cause><item>' | sed 's/></>\n</g' | sort | uniq -c
1 <cause>
2 <item>
1 <queue>
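The sed commands above rely on \n in the replacement text producing a newline, which GNU sed does but not every sed implementation is guaranteed to do. As a possible alternative, the sketch below builds the same tally with grep -o, which is a different technique from the sed answer. The pattern '<[^>]*>' is an assumption that every tag is a simple <...> token with no '>' inside it, and the trailing awk only normalizes the padding that uniq -c adds.

```shell
# Recreate the one-line sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '<queue><item><cause><item>' > "$tmpdir/a.xml"

# Put each <...> token on its own line, then sort and count the duplicates.
tally=$(grep -o '<[^>]*>' "$tmpdir/a.xml" | sort | uniq -c | awk '{print $1, $2}')

echo "$tally"

rm -r "$tmpdir"
```

For the sample line this should list <cause> and <queue> once each and <item> twice, as in the sed output above.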