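Linux: How can I search for a multiline pattern in a file?
Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must likewise follow the CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/152708/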
How can I search for a multiline pattern in a file?
Asked by Oli
I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:
find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN'
But if I need to find patterns that span more than one line, I'm stuck, because vanilla grep can't find multiline patterns.
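One caveat with the find | xargs pipeline in the question: file names containing spaces are split into several arguments. A minimal sketch of a NUL-delimited variant follows, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The directory and file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Throwaway sample tree; the file names are hypothetical.
tmpdir=$(mktemp -d)
printf 'YOUR_PATTERN here\n' > "$tmpdir/my file.py"
printf 'nothing to see\n' > "$tmpdir/other.py"

# -print0 / -0 pass file names NUL-delimited, so "my file.py" stays one argument.
# -l lists only the names of files that contain a match.
hits=$(cd "$tmpdir" && find . -iname '*.py' -print0 | xargs -0 grep -l -e 'YOUR_PATTERN')
printf '%s\n' "$hits"

rm -rf "$tmpdir"
```

The same -print0 / -0 pair can be used with any of the multiline-capable tools discussed in the answers.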
Accepted answer by Oli
So I discovered pcregrep, which stands for Perl Compatible Regular Expressions GREP.
For example, you need to find files where the '_name' variable is immediately followed by the '_description' variable:
find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'
Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', ...
Answer by ayaz
Answer by Oli
Here is a more useful example:
pcregrep -Mi "<title>(.*\n){0,5}</title>" afile.html
It finds the title tag in an HTML file even if the tag spans up to 5 lines.
Here is an example with no limit on the number of lines:
pcregrep -Mi "(?s)<title>.*</title>" example.html
Answer by albfan
This answer might be useful:
Regex (grep) for multi-line search needed
To search recursively, you can use the flags -R (recursive) and --include (glob pattern). See:
Use grep --exclude/--include syntax to not grep through certain files
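As a rough sketch of how these flags combine with a multiline search, the following assumes GNU grep built with PCRE support (-P) and uses -z so that each file is read as a single record. The sample files and their contents are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Throwaway sample tree; file names and contents are hypothetical.
tmpdir=$(mktemp -d)
printf 'class A:\n    _name = "a"\n    _description = "first"\n' > "$tmpdir/match.py"
printf 'class B:\n    _name = "b"\n    other = 1\n    _description = "x"\n' > "$tmpdir/nomatch.py"
printf '_name\n_description\n' > "$tmpdir/notes.txt"

# -R descends into directories, --include restricts the search to *.py,
# -z reads each file as one record so \n can appear in the pattern,
# -l prints only the names of matching files.
found=$(cd "$tmpdir" && grep -R --include='*.py' -lPz '_name.*\n.*_description' . | tr '\0' '\n')
printf '%s\n' "$found"

rm -rf "$tmpdir"
```

Only match.py is reported: notes.txt is excluded by the glob, and in nomatch.py the two variables are separated by an extra line, which the pattern does not allow because '.' does not match a newline here.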
Answer by bukzor
grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an HTML document, even if it spans multiple lines, you can use the command below. GNU grep reads its input line by line, so -z is needed to treat the whole file as a single record, and -o prints only the matched part:
grep -Pzo '(?s)<title>.*</title>' example.html
Since the PCRE project implements the Perl standard, use the Perl documentation for reference.
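A minimal sketch for trying the grep -P approach on a throwaway file, assuming GNU grep built with PCRE support. It uses -z so the whole file is one record and -o so only the match is printed. The file contents are made up, and the non-greedy .*? is a small variation so the match stops at the first closing tag.

```shell
#!/usr/bin/env bash
# Hypothetical sample document with a title spread over several lines.
tmpdir=$(mktemp -d)
printf '<html>\n<title>\nMultiline\nTitle\n</title>\n<body></body>\n</html>\n' > "$tmpdir/example.html"

# (?s) lets '.' match newlines; -z makes the file a single NUL-terminated record.
# grep ends the output record with NUL under -z, so tr turns it into a newline.
title=$(grep -Pzo '(?s)<title>.*?</title>' "$tmpdir/example.html" | tr '\0' '\n')
printf '%s\n' "$title"

rm -rf "$tmpdir"
```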
Answer by Shwaydogg
With The Silver Searcher (ag):
ag 'abc.*(\n|.)*efg'
The speed optimizations of The Silver Searcher could possibly shine here.
Answer by svent
You can use the grep alternative sift here (disclaimer: I am the author).
It supports multiline matching and limiting the search to specific file types out of the box:
sift -m --files '*.py' 'YOUR_PATTERN'
(search all *.py files for the specified multiline regex pattern)
It is available for all major operating systems. Take a look at the samples page to see how it can be used to extract multiline values from an XML file.
Answer by Martin
@Marcin: awk example non-greedy:
awk '{if ($0 ~ /Start pattern/) {triggered=1;} if (triggered) {print; if ($0 ~ /End pattern/) {exit;}}}' filename
Answer by kenorb
Using the ex/vi editor and the globstar option (syntax similar to awk and sed):
ex +"/string1/,/string3/p" -R -scq! file.txt
where string1 is your starting point and string3 is your ending text (aaa and bbb in the recursive example that follows).
To search recursively, try:
ex +"/aaa/,/bbb/p" -scq! **/*.py
Note: To enable ** syntax in Bash 4 or later, run shopt -s globstar (zsh supports ** without it).
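The start/end range idea from Martin's awk answer can be tried on a throwaway file. This is only a sketch: the sample text is invented, and the awk program is a restructured equivalent of the one-liner that uses pattern-action rules instead of nested if statements.

```shell
#!/usr/bin/env bash
# Hypothetical sample input with two Start/End blocks.
tmpfile=$(mktemp)
printf 'before\nStart pattern\nmiddle\nEnd pattern\nafter\nStart pattern\nEnd pattern\n' > "$tmpfile"

# Start printing at the first "Start pattern" line and stop (exit)
# right after the first "End pattern" line, so only the first block is shown.
block=$(awk '
  $0 ~ /Start pattern/             { triggered = 1 }
  triggered                        { print }
  triggered && $0 ~ /End pattern/  { exit }
' "$tmpfile")
printf '%s\n' "$block"

rm -f "$tmpfile"
```

Because of the exit, the second Start/End block in the sample is never printed, which is what makes the example "non-greedy".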