Notice: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/7584958/

Date: 2020-09-18 00:53:24  Source: igfitidea

Linux: search for multiple words in a set of files

linux bash command

Asked by Stefano

I have a folder containing a set of text files.

-Folder
--- file 1
--- file 2
--- file 3
--- file 4

I have a set of words that I want to check for inside these files: {word1, username, blah blahblah}

Is there a way, with a single command, to find which of these files contain all the words in my list?

I saw that it's possible to do an AND with grep, but I think those approaches work on a single line, while in my case the words are always on different lines.

The number of words is fixed: there are always 3 or 4, so if needed I can hard-code them in the command.

EDIT: The words are ANDed: a file is not accepted unless it contains ALL of them! I would like to avoid doing egrep -l 'word1' . | xargs egrep -l 'word2'

Is there a better solution that calls grep just once?

Cheers, Ste

Answered by jaypal singh

Does this work for you?

grep -IRE 'word1|username|blah blahblah' /path/to/files/ |
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n/d; s/\n//; h; P' |
awk -F: '$1!=p{if(b"" && c > 2)print b; p=$1;c=0;b=s=""}{b=b s $0;s=RS;c++}END {if(b"" && c > 2)print b}' | awk -F: '{print $1}' | sort -u

The first part (grep) lists every matching line, prefixed by its file name. The second part (sed) strips duplicates from that output, leaving only distinct rows. The third part (awk) keeps only the files that occur more than twice (c > 2), the fourth strips the matched text so that only the file name is left, and the last one (sort -u) serves you each file name once, my friend.

My head hurts now ...
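
For comparison, the same idea (count how many distinct words each file matched and keep the files that matched all of them) can be sketched as a single awk pass. This is not the pipeline from the answer above. The function name files_with_all_words is hypothetical, the words are assumed to sit one per line in a file, and they are matched as fixed strings rather than regular expressions.

```shell
# Hypothetical helper: print each file that contains every phrase listed
# (one per line) in WORDS_FILE. Phrases are matched as fixed substrings.
# Usage: files_with_all_words WORDS_FILE FILE...
files_with_all_words() {
  words_file=$1; shift
  # An empty list would make awk mistake the first data file for the word list.
  [ -s "$words_file" ] || { echo "empty word list: $words_file" >&2; return 1; }
  awk '
    NR == FNR {                      # still reading the word list
      if ($0 != "" && !($0 in words)) { words[$0] = 1; n++ }
      next
    }
    {
      for (w in words)
        if (!((FILENAME, w) in seen) && index($0, w)) {
          seen[FILENAME, w] = 1      # count each word once per file
          if (++count[FILENAME] == n) print FILENAME
        }
    }
  ' "$words_file" "$@"
}
```

A call such as files_with_all_words words.txt Folder/* reads every file once, and the words may be on different lines of a file.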

Answered by Fredrik Pihl

use:

grep -f words.txt input

Example:

$ cat words
word1
username
blah blahbla

a
word1
username blah blahblah
b
username blah blahblah
c
word1
d
word1, username, blah blahblah}

$ grep -f words.txt *
a:word1
a:username blah blahblah
b:username blah blahblah
c:word1
d:word1, username, blah blahblah}

Answered by Marc B

Use grep:

grep -E '(word1|username|blah blahblah)' Folder/*

The -E flag puts grep into 'extended' mode for regular expressions. By default this shows the filename AND the matching text. If you want just the filename, add -l to the options.

Answered by Hai Vu

Another solution, which works best for a small set of words:

grep -e word1 -e username -e "blah blahblah" Folder/*

Answered by jflaflamme

Use the following if you want to grep through a directory tree:

egrep -E '(word1|username|blah blahblah)' `find . -type f -print`

I also suggest using the term directory instead of folder when you search for answers about *nix systems :-)
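
Note that an alternation such as 'word1|username|blah blahblah', or several -e options, reports a file as soon as any one of the words matches, while the question asks for files that contain ALL of them. A minimal sketch that enforces AND over a directory tree could look like this. The function name find_files_with_all is hypothetical, the words are treated as fixed strings, and file names are assumed not to contain newlines. It runs grep once per word and file, so it trades the "call grep just once" wish for simplicity.

```shell
# Hypothetical helper: list the files under DIR that contain ALL given words.
# Usage: find_files_with_all DIR WORD...
find_files_with_all() {
  dir=$1; shift
  find "$dir" -type f | while IFS= read -r file; do
    for word in "$@"; do
      # -q: no output, -F: fixed string, --: the word may start with a dash
      grep -qF -- "$word" "$file" || continue 2   # a word is missing: next file
    done
    printf '%s\n' "$file"                         # every word was found
  done
}
```

For the question's example the call would be find_files_with_all Folder word1 username "blah blahblah".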