Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18015866/

Time: 2020-08-07 00:27:02  Source: igfitidea

grep -f on files in a zipped folder

Tags: linux, zip, grep

Asked by yonetpkbji

I have a problem I am hoping someone will be able to help with...

I am performing a recursive fgrep/grep -f search on a zipped up folder using the following command in one of my programs:

The command I am using

grep -r -i -z -I -f /path/to/pattern/file /home/folder/TestZipFolder.zip

Inside the pattern file is the string "Dog" that I am trying to search for.

In the zipped up folder there are a number of text files containing the string "Dog".

The grep -f command successfully finds the string "Dog" in 3 text files inside the zipped up folder, but it prints all of the output on one line, and some strange characters appear at the end of each match, i.e. PK (as shown below). When I try to write the output to a file from my program, other characters appear at the end, such as ^B^T^@

Output from the grep -f command:

TestZipFolder/test.txtThis is a file containing the string DogPKtest1.txtDog, is found again in this file.PKTestZipFolder/another.txtDog is written in this file.PK 

How can I get each file in which the string "Dog" was found to print on its own line, so that the results are not all grouped together on one line as they are now? Also, where do the "PK" and other strange characters in the output come from, and how do I prevent them from appearing?

Desired output

TestZipFolder/test.txt:This is a file containing the string Dog
TestZipFolder/test1.txt:Dog, is found again in this file
TestZipFolder/another.txt:Dog is written in this file

Something along these lines, whereby the user is able to see where the string can be found in the file (you actually get the output in this format if you run the grep command on a file that is not a zip file).

Your help with this is much appreciated, thanks.

Accepted answer by blackSmith

If you need multiline output, it is better to use zipgrep:

zipgrep -s "pattern" TestZipFolder.zip

The -s option suppresses error messages (optional). This command prints every matched line along with its file name. If you want to remove the duplicate names when a file contains more than one match, some further processing is needed, using loops, grep, awk, or sed.
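
As a rough sketch of that extra processing, the helper below reduces "name:matched line" output to one entry per file name. The function name unique_names is made up for illustration, and the sketch assumes that file names inside the archive contain no colon.

```shell
# Hypothetical helper (not part of zipgrep): read "name:matched line"
# records on stdin and print each file name once, in first-seen order.
# Assumes file names contain no ':' character.
unique_names() {
  awk -F: '!seen[$1]++ { print $1 }'
}
```

It could then be used as: zipgrep -s "Dog" TestZipFolder.zip | unique_names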

Actually, zipgrep is a combination of egrep and unzip. Its usage is as follows:

zipgrep [egrep_options] pattern file[.zip] [file(s) ...] [-x xfile(s) ...]

So you can pass any egrep options to it.
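
For instance, the case-insensitive matching requested in the question (the -i flag) can be forwarded in this way. The wrapper below is only a sketch. The name search_zip is hypothetical, and it takes the pattern as an argument rather than reading it from a pattern file.

```shell
# Hypothetical wrapper: case-insensitive search inside a zip archive.
# -i is an egrep option that zipgrep forwards; -s suppresses error messages.
search_zip() {
  pattern=$1   # search pattern, e.g. Dog
  archive=$2   # path to the .zip file
  zipgrep -i -s "$pattern" "$archive"
}
```

It would be called as: search_zip Dog /home/folder/TestZipFolder.zip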
