Linux: find a string inside gzipped files in a folder

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1253816/

Date: 2020-08-03 17:36:46  Source: igfitidea

find string inside a gzipped file in a folder

Tags: linux, shell, directory, grep, gzip

Asked by gagneet

My current problem is that I have around 10 folders, each containing gzipped files (about 5 each on average). That makes roughly 50 files to open and look at.

Is there a simpler method to find out if a gzipped file inside a folder has a particular pattern or not?

zcat ABC/myzippedfile1.txt.gz | grep "pattern match"
zcat ABC/myzippedfile2.txt.gz | grep "pattern match"

Instead of writing a script, can I do the same in a single line, for all the folders and sub folders?

for f in `ls *.gz`; do echo $f; zcat $f | grep <pattern>; done;

Accepted answer by Ned Batchelder

zgrep searches inside gzipped files. It has a -R option for recursion and a -H option to show the file name:

zgrep -R --include=*.gz -H "pattern match" .

Answer by ghostdog74

Use the find command:

find . -name "*.gz" -exec zcat "{}" + |grep "test"

Or try the recursive option (-r) of zcat.

Answer by Nietzche-jou

You don't need zcat here because there are zgrep and zegrep.

If you want to run a command over a directory hierarchy, you use find:

find . -name "*.gz" -exec zgrep 'pattern' \{\} \;

Also, "ls *.gz" is useless in a for loop; in the future you should just use "*.gz".
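
For illustration, here is a rough sketch of the loop from the question with the glob used directly instead of ls. The function name search_gz_dir and its two arguments are made up for this example, and gzip -dc is used as the portable equivalent of zcat.

```shell
# Hypothetical helper: grep every .gz file directly inside directory $2 for pattern $1.
search_gz_dir() {
    pattern=$1
    dir=${2:-.}
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue   # the glob matched nothing: skip the literal "*.gz"
        echo "$f"                 # print the file name before its matches
        # Quoting "$f" keeps names with spaces intact; a file with no match is not an error.
        gzip -dc "$f" | grep -e "$pattern" || true
    done
}
```

Calling search_gz_dir "pattern match" ABC would list each .gz file in ABC followed by its matching lines. Unlike the find-based commands, this only looks at one directory level.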

Answer by Francisco Lavin

Since zgrep doesn't support -R, I think the solution from "Nietzche-jou" is the better answer, but I would add the -H option to show the file name, like this:

find . -name "*.gz" -exec zgrep -H 'PATTERN' \{\} \;
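
A possible variation, sketched here as an assumption rather than part of the original answer: ending -exec with + instead of \; lets find hand many files to each zgrep call, which should start fewer processes. The wrapper name zgrep_tree and its arguments are invented for the example.

```shell
# Hypothetical helper: recursive search that prints the file name with each match.
zgrep_tree() {
    pattern=$1
    root=${2:-.}
    # "{} +" passes files in batches, so zgrep is not started once per file.
    # -H forces the file name prefix even if a batch contains a single file.
    find "$root" -type f -name '*.gz' -exec zgrep -H -e "$pattern" {} +
}
```

For example, zgrep_tree 'PATTERN' . searches the current directory tree. The exit status may be non-zero when some files have no match, so treat it with care in scripts.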

Answer by sleepycal

Coming in a bit late on this, but I had a similar problem and was able to resolve it using:

zcat -r /some/dir/here | grep "blah"

As detailed here:

http://manpages.ubuntu.com/manpages/quantal/man1/gzip.1.html

However, this does not show the original file that the result matched from, instead showing "(standard input)" as it's coming in from a pipe. zcat does not seem to support outputting a name either.

In terms of performance, this is what we got:

$ alias dropcache="sync && echo 3 > /proc/sys/vm/drop_caches"

$ find 09/01 | wc -l
4208

$ du -chs 09/01
24M

$ dropcache; time zcat -r 09/01 > /dev/null
real    0m3.561s

$ dropcache; time find 09/01 -iname '*.txt.gz' -exec zcat '{}' \; > /dev/null
real    0m38.041s

As you can see, the find|zcat method is significantly slower than zcat -r, even with a small number of files. I was also unable to make zcat output the file name (-v will apparently output the file name, but not on every single line). There does not currently appear to be a tool that offers both speed and grep-style file names on each line (i.e. the -H option).

If you need to identify the name of the file that the result belongs to, then you'll need to either write your own tool (could be done in 50 lines of Python code) or use the slower method. If you do not need to identify the name, then use zcat -r.
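
As a rough middle ground (a shell sketch rather than the Python tool mentioned above, and not benchmarked), find can pass files in batches to a single sh process, with grep labelling each match. The function name gz_grep_labeled is made up, and --label is a GNU/BSD grep option rather than POSIX, so this assumes such a grep is installed.

```shell
# Hypothetical helper: label every match with the file it came from.
gz_grep_labeled() {
    pattern=$1
    root=${2:-.}
    # find hands batches of files to one sh process. Inside it, each file is
    # decompressed to a pipe and grep tags the matches with the file name
    # through --label (assumption: GNU or BSD grep).
    find "$root" -type f -name '*.gz' -exec sh -c '
        pattern=$1; shift
        for f do
            gzip -dc "$f" | grep -H --label="$f" -e "$pattern"
        done
    ' sh "$pattern" {} +
}
```

For example, gz_grep_labeled blah 09/01 prints lines of the form path/to/file.gz:matching line. It still starts one gzip and one grep per file, so it is unlikely to match the speed of zcat -r.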

Hope this helps

Answer by Ajit Kumar

find . -name "*.gz" | xargs zcat | grep "pattern" should do.
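
One caveat with a plain find | xargs pipeline is that file names containing spaces get split. A hedged sketch of a safer form follows, assuming a find and xargs that support -print0 and -0 (GNU and BSD extensions, not POSIX). The function name zcat_tree_grep is invented, and gzip -dc stands in for zcat.

```shell
# Hypothetical helper: stream every .gz file under $2 through grep for pattern $1.
zcat_tree_grep() {
    pattern=$1
    root=${2:-.}
    # -print0 and -0 separate file names with NUL bytes, so spaces and
    # newlines in paths do not break the argument list.
    find "$root" -type f -name '*.gz' -print0 | xargs -0 gzip -dc | grep -e "$pattern"
}
```

As with the original one-liner, the output does not say which file each match came from.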

Answer by todipratik

zgrep "string" ./*/*

You can use the above command to search for string in the .gz files of the dir directory, where dir has the following sub-directory structure:

/dir
    /childDir1
              /file1.gz
              /file2.gz
    /childDir2
              /file3.gz
              /file4.gz
    /childDir3
              /file5.gz
              /file6.gz