bash Grep Recursive and Count

Notice: this page is based on a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/884591/

Time: 2020-09-17 20:53:25  Source: igfitidea

Grep Recursive and Count

linux bash shell scripting

Asked by Codex73

I need to search a directory with lots of sub-directories for a string inside files:

I'm using:

grep -c -r "string here" *

How can I get the total count of matches?

How can I output to file only those files with at least one instance?

Accepted answer by Nick Presta

It works for me (it gets the total number of 'string here' found in each file). However, it does not display the total for ALL files searched. Here is how you can get it:

grep -c -r 'string' file > out && \
    awk -F : '{total += $2} END { print "Total:", total }' out

The list will be in out and the total will be sent to STDOUT.

Here is the output on the Python2.5.4 directory tree:

$ grep -c -r 'import' Python-2.5.4/ > out && \
    awk -F : '{total += $2} END { print "Total:", total }' out
Total: 11500

$ head out
Python-2.5.4/Python/import.c:155
Python-2.5.4/Python/thread.o:0
Python-2.5.4/Python/pyarena.c:0
Python-2.5.4/Python/getargs.c:0
Python-2.5.4/Python/thread_solaris.h:0
Python-2.5.4/Python/dup2.c:0
Python-2.5.4/Python/getplatform.c:0
Python-2.5.4/Python/frozenmain.c:0
Python-2.5.4/Python/pyfpe.c:0
Python-2.5.4/Python/getmtime.c:0
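
Note that grep -c counts matching lines, not individual occurrences, and that a colon inside a file name would shift the fields awk sees, so $2 would no longer be the count. A minimal sketch, assuming a throwaway directory made with mktemp and the made-up search word 'needle' (neither is from the answer), that sums the last field ($NF) instead:

```shell
#!/usr/bin/env bash
# Hypothetical sample tree; directory and file names are for illustration only.
dir=$(mktemp -d)
printf 'needle\nhay\nneedle\n' > "$dir/a.txt"
printf 'hay\n' > "$dir/b.txt"
printf 'needle\n' > "$dir/odd:name.txt"

# grep -c prints "path:count" per file; the count is always the last field,
# even when the path itself contains a colon.
total=$(grep -c -r 'needle' "$dir" | awk -F: '{ total += $NF } END { print total + 0 }')
echo "Total: $total"

rm -rf "$dir"
```

The "total + 0" in the END block makes awk print 0 rather than an empty string when nothing was searched.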

If you just want to get lines with occurrences of 'string', change to this:

grep -c -r 'import' Python-2.5.4/ | \
    awk -F : '{total += $2; print $1, $2} END { print "Total:", total }'

That will output:

[... snipped]
Python-2.5.4/Lib/dis.py 4
Python-2.5.4/Lib/mhlib.py 10
Python-2.5.4/Lib/decimal.py 8
Python-2.5.4/Lib/new.py 6
Python-2.5.4/Lib/stringold.py 3
Total: 11500

You can change how the files ($1) and the count per file ($2) are printed.
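
To list only the files that contain at least one match, an awk pattern can filter on the count before printing. A minimal sketch, assuming a temporary sample tree and the made-up search word 'needle' (both are illustration only, not part of the answer):

```shell
#!/usr/bin/env bash
# Hypothetical sample tree: one file with matches, one without.
dir=$(mktemp -d)
printf 'needle\nneedle\n' > "$dir/a.txt"
printf 'hay\n' > "$dir/b.txt"

# The pattern "$2 > 0" keeps only files with at least one matching line;
# files with a count of 0 are neither printed nor added to the total.
report=$(grep -c -r 'needle' "$dir" |
    awk -F: '$2 > 0 { total += $2; print $1, $2 } END { print "Total:", total + 0 }')
echo "$report"

rm -rf "$dir"
```

Like the commands above, this assumes file names without colons, since $1 and $2 depend on the first colon being the separator.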

Answer by ephemient

Using Bash's process substitution, this gives what I believe is the output you want? (Please clarify the question if it's not.)

grep -r "string here" * | tee >(wc -l)

This runs grep -r normally, with output going both to stdout and to a wc -l process.
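
If the count is needed in a variable rather than on the terminal, an ordinary pipeline can do a similar job. A sketch, assuming a temporary sample tree and a hypothetical results file matches.txt; note that it swaps process substitution for a plain tee into a file:

```shell
#!/usr/bin/env bash
# Hypothetical sample data; remove both directories with rm -rf when done.
dir=$(mktemp -d)     # tree to search
out=$(mktemp -d)     # kept separate so the results file is not searched itself
printf 'string here\nother\n' > "$dir/one.txt"
mkdir "$dir/sub"
printf 'x string here y\nstring here\n' > "$dir/sub/two.txt"

# tee saves every matching line to a file; wc -l counts them on the way through.
total=$(grep -r "string here" "$dir" | tee "$out/matches.txt" | wc -l)
total=$((total))     # strip the padding some wc implementations add
echo "matching lines: $total"
```

The matching lines end up in matches.txt and the count in $total, so both can be reused later in a script.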

Answer by Johannes Schaub - litb

Some solutions with AWK:

grep -r "string here" * | awk 'END { print NR } 1'

Next one is total count, number of files, and number of matches for each, displaying the first match of each one (to display all matches, change the condition to ++f[$1]):

grep -r "string here" * |
    awk -F: 'END { print "\nmatches: ", NR, "files: ", length(f);
                   for (i in f) print i, f[i] } !f[$1]++'

Output for the first solution (searching within a directory for "boost::". I manually cut some too long lines so they fit horizontally):

list_inserter.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            ::boost::is_array<T>,
list_of.hpp:            ::boost::decay<const T>,
list_of.hpp:            ::boost::decay<T> >::type type;
list_of.hpp:        return ::boost::iterator_range_detail::equal( l, r );
list_of.hpp:        return ::boost::iterator_range_detail::less_than( l, r );
list_of.hpp:        return ::boost::iterator_range_detail::less_than( l, r );
list_of.hpp:        return Os << ::boost::make_iterator_range( r.begin(), r.end() );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
ptr_list_of.hpp:                          BOOST_DEDUCED_TYPENAME boost::ptr_...
ptr_list_of.hpp:        typedef boost::ptr_vector<T>       impl_type;
13

Output for the second one:

list_inserter.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            ::boost::is_array<T>,
ptr_list_of.hpp:                          BOOST_DEDUCED_TYPENAME boost::ptr_...

matches:  13 files:  3
ptr_list_of.hpp 2
list_of.hpp 10
list_inserter.hpp 1

Colors in the result are nice (--color=always for grep), but they break when piped through awk here. So it is better not to enable them unless you want your whole terminal colored afterwards :) Cheers!

Answer by KrNel

grep -rc "my string" ./ | grep ':[1-9]' >> file_name_by_count.txt

Works like a charm.

Answer by mouviciel

I would try a combination of find and grep.

find . -type f -print0 | xargs -0 grep -c "string here"

Anyway, grep -c -r "string here" * works for me (Mac OS X).

Answer by ASk

To output only file names with matches, use:

grep -r -l "your string here" .

It will output one line with the filename for each file which matches the expression searched for.
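
Since each matching file produces exactly one line, the list can be redirected to a file and counted with wc -l. A minimal sketch, assuming a temporary sample tree and a temporary list file (all names here are hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical sample data; remove "$dir" and "$list" when done.
dir=$(mktemp -d)
list=$(mktemp)       # file that receives the list of matching file names
printf 'your string here\n' > "$dir/hit1.txt"
printf 'nothing\n' > "$dir/miss.txt"
mkdir "$dir/sub"
printf 'your string here\nyour string here\n' > "$dir/sub/hit2.txt"

# -l prints each matching file once, however many lines match inside it.
grep -r -l "your string here" "$dir" > "$list"
files=$(( $(wc -l < "$list") ))
echo "files with matches: $files"
```

The list file is created outside the searched tree so that grep does not end up searching its own output.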
