Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/10119717/

Time: 2020-08-06 05:44:50  Source: igfitidea

How do I count the number of occurrences of a string in an entire file?

linux perl bash shell unix

Asked by toop

Is there an inbuilt command to do this or has anyone had any luck with a script that does it?

I am looking to count the number of times a certain string (not word) appears in a file. This can include multiple occurrences per line so the count should count every occurrence not just count 1 for lines that have the string 2 or more times.

For example, with this sample file:

blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)

If I am looking to count the occurrences of the string (*), I would expect the count to be 6, i.e. 2 from the first line, 1 from the second line and 3 from the third line. Note that the one spanning lines 2-3 does not count, because an LF character separates its two halves.

Update: great responses so far! Can I ask that the script handle the conversion of (*) to \(*\), etc.? That way I could pass any desired string as an input parameter without worrying about how it needs to be escaped to appear in the correct format.

Accepted answer by TLP

Use Perl's "Eskimo kiss" operator (}{) with the -n switch to print a total at the end. Use \Q...\E to quote any metacharacters so that they are matched literally.

perl -lnwe '$a+=()=/\Q(*)/g; }{ print $a;' file.txt

Script:

use strict;
use warnings;

my $count = 0;    # start at zero so that empty input still prints 0
my $text = shift;

while (<>) {
    $count += () = /\Q$text/g;
}

print "$count\n";

Usage:

perl script.pl "(*)" file.txt 

Answer by DavidO

This loops over the lines of the file, and on each line finds all occurrences of the string "(*)". Each time that string is found, $c is incremented. When there are no more lines to loop over, the value of $c is printed.

perl -ne'$c++ while /\(\*\)/g;END{print"$c\n"}' filename.txt

Update: Regarding your comment asking that this be converted into a solution that accepts the search string as an argument, you might do it like this (the \Q quotes the argument, so it is matched literally rather than as a regex):

perl -ne'BEGIN{$re=shift;}$c++ while /\Q$re/g;END{print"$c\n"}' 'regex' filename.txt

That ought to do the trick. If I felt inclined to skim through perlrun again I might see a more elegant solution, but this should work.

You could also eliminate the explicit inner while loop in favor of an implicit one by providing list context to the regexp:

perl -ne'BEGIN{$re=shift}$c+=()=/\Q$re/g;END{print"$c\n"}' 'regex' filename.txt

Answer by kev

You can use basic tools such as grep and wc:

grep -o '(\*)' input.txt | wc -l
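As a quick check against the sample file from the question, the pipeline above can be sketched as follows. The file name sample.txt is an assumption made for illustration only. grep -o prints each match on its own line, and wc -l then counts those lines.

```shell
# Recreate the sample file from the question (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)
EOF

# -o prints every match on its own line, so wc -l counts occurrences, not lines.
# In a basic regular expression only the * needs escaping here.
# tr strips the padding that some wc implementations add around the number.
count=$(grep -o '(\*)' sample.txt | wc -l | tr -d ' ')
echo "$count"
```

This should print 6, the count the question expects. The (* ... ) pair split across lines 2 and 3 is not counted, because grep matches one line at a time.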

Answer by Jahid

text="(\*)"
# Quote the variable so the shell does not glob-expand or word-split the pattern.
grep -o "$text" file | wc -l

You can make it into a script which accepts arguments like this:

script count:

#!/bin/bash
text="$1"    # first argument: the pattern to search for
file="$2"    # second argument: the file to search in
grep -o "$text" "$file" | wc -l

Usage:

./count "(\*)" file_path
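If the goal is to pass the search string without escaping it first, one option is grep's -F (fixed strings) option, which treats the pattern as literal text rather than as a regular expression. The following is a sketch under that assumption and is not part of the original answer; the function name count_occurrences and the file name demo.txt are made up for illustration.

```shell
#!/bin/bash
# count_occurrences STRING FILE
# Hypothetical helper: counts every occurrence of a literal string in a file.
count_occurrences() {
    local text="$1" file="$2"
    # -F: treat $text as a fixed string, so (*) needs no backslashes.
    # -o: print each match on its own line; wc -l then counts the matches.
    # --: end of options, so a string starting with "-" is not parsed as a flag.
    grep -oF -- "$text" "$file" | wc -l | tr -d ' '
}

# Example with a small throwaway file (demo.txt is a hypothetical name).
printf '%s\n' 'a(*)b(*)c' 'd(*' ')e(*)' > demo.txt
count_occurrences '(*)' demo.txt
```

With this variant the argument is given exactly as it appears in the file, for example '(*)' instead of "(\*)". Single quotes are still advisable so that the shell itself does not expand the * before grep sees it.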

Answer by Arijit Panda

You can use the basic grep command:

Example: to count the lines in a file that contain the word "hello":

grep -c "hello" filename

To count the lines that match a pattern, add -P for Perl-compatible syntax:

grep -c -P "Your Pattern" filename

Pattern examples: hell.w, \d+, etc.
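Keep in mind that grep -c reports the number of matching lines, not the number of matches, so the two figures differ whenever a line contains the string more than once. A small sketch of the difference, using a hypothetical file greet.txt:

```shell
# greet.txt is a hypothetical two-line file; "hello" appears three times in total.
printf '%s\n' 'hello hello' 'hello world' > greet.txt

# -c counts lines that contain at least one match.
lines=$(grep -c 'hello' greet.txt)

# -o emits one line per match, so wc -l counts every occurrence.
matches=$(grep -o 'hello' greet.txt | wc -l | tr -d ' ')

echo "lines=$lines matches=$matches"
```

Here the first line contributes one to the line count but two to the match count, which is the distinction the question asks about.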

Answer by Abhishek Singh

I used the command below to count the lines in a file that contain a particular string:

grep search_String fileName|wc -l
