Linux: How to search for a multiline pattern in a file?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/152708/

Date: 2020-08-03 16:32:57  Source: igfitidea

How can I search for a multiline pattern in a file?

Tags: linux, command-line, grep, find, pcregrep

Asked by Oli

I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:

find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN'

But if I need to find patterns that span more than one line, I'm stuck, because vanilla grep can't find multiline patterns.

Accepted answer by Oli

So I discovered pcregrep, which stands for Perl Compatible Regular Expressions GREP.

For example, suppose you need to find files where the '_name' variable is immediately followed by the '_description' variable:

find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'

Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', ...

Answer by ayaz

Here is an example using GNU grep:

grep -Pzo '_name.*\n.*_description'

-z/--null-data: Treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.

See also here
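
As a minimal sketch of the -z approach (assuming GNU grep built with PCRE support; the file name sample.py and its contents are made up for illustration), the command can be tried on a throwaway file. Because -z also makes grep terminate each output record with a NUL byte, piping through tr makes the result readable:

```shell
# Work in a throwaway directory (hypothetical sample data).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.py" <<'EOF'
class Product:
    _name = 'product'
    _description = 'A product'
    _other = 1
EOF

# -P: Perl-compatible regex
# -z: NUL-separated records, so the whole file is read as one record
# -o: print only the matched part
# tr turns the trailing NUL that -z emits into a newline.
match=$(grep -Pzo '_name.*\n.*_description' "$tmpdir/sample.py" | tr '\0' '\n')
printf '%s\n' "$match"
```

Without -o, grep would print the entire record, which in -z mode is the whole file.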

Answer by Oli

Here is a more useful example:

pcregrep -Mi "<title>(.*\n){0,5}</title>" afile.html

It finds the title tag in an HTML file even if the tag spans up to 5 lines.

Here is an example of unlimited lines:

pcregrep -Mi "(?s)<title>.*</title>" example.html 

Answer by Amit

Why don't you go for awk:

awk '/Start pattern/,/End pattern/' filename
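
To illustrate what the range pattern prints, here is a small sketch on made-up input (the file notes.txt and its contents are hypothetical). awk prints every line from the first one matching the start pattern through the next one matching the end pattern, inclusive:

```shell
# Build a small sample file in a throwaway directory (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'before' 'Start pattern here' 'middle line' 'End pattern here' 'after' > "$tmpdir/notes.txt"

# A range pattern /start/,/end/ with no action prints each line in the range.
block=$(awk '/Start pattern/,/End pattern/' "$tmpdir/notes.txt")
printf '%s\n' "$block"
```

Unlike the regex-based tools, this prints whole lines, and a new range begins each time the start pattern appears again later in the file.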

Answer by albfan

This answer might be useful:

Regex (grep) for multi-line search needed

To search recursively, you can use the flags -R (recursive) and --include (glob pattern). See:

Use grep --exclude/--include syntax to not grep through certain files
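
Combining these flags with the multiline options from the other answers might look like the sketch below (assuming GNU grep with PCRE support; the directory layout and file names are invented for the demonstration). The -l flag lists only the names of matching files:

```shell
# Hypothetical directory tree: two .py files and one .txt file.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/pkg/sub"
printf '%s\n' "_name = 'a'" "_description = 'b'" > "$tmpdir/pkg/sub/model.py"
printf '%s\n' "_name = 'a'" "x = 1" "_description = 'b'" > "$tmpdir/pkg/other.py"
printf '%s\n' "_name = 'a'" "_description = 'b'" > "$tmpdir/pkg/notes.txt"

# -R: recurse into directories
# --include: only consider files whose names match the glob
# -P -z: let the PCRE pattern cross line boundaries
# -l: print only the names of files that contain a match
found=$(cd "$tmpdir" && grep -R -l -Pz --include='*.py' '_name.*\n.*_description' .)
printf '%s\n' "$found"
```

Here only pkg/sub/model.py is reported: other.py has an extra line between the two variables, and notes.txt is filtered out by --include.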

Answer by bukzor

grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an HTML document, even if it spans multiple lines, you can use this (the -z flag is needed so that grep reads the whole file as a single record; without it, grep matches line by line, and -o prints only the matched part):

grep -Pzo '(?s)<title>.*</title>' example.html

Since the PCRE project implements the Perl regular expression standard, use the Perl documentation for reference.

Answer by Shwaydogg

With silver searcher:

ag 'abc.*(\n|.)*efg'

Speed optimizations of silver searcher could possibly shine here.

Answer by svent

You can use the grep alternative sift here (disclaimer: I am the author).

It supports multiline matching and limiting the search to specific file types out of the box:

sift -m --files '*.py' 'YOUR_PATTERN'

(search all *.py files for the specified multiline regex pattern)

It is available for all major operating systems. Take a look at the samples page to see how it can be used to extract multiline values from an XML file.

Answer by Martin

@Marcin: awk example non-greedy:

awk '{if ($0 ~ /Start pattern/) {triggered=1;}if (triggered) {print; if ($0 ~ /End pattern/) { exit;}}}' filename

Answer by kenorb

Using the ex/vi editor and the globstar option (syntax similar to awk and sed):

ex +"/string1/,/string3/p" -R -scq! file.txt

where string1 is your starting point and string3 is your ending text (aaa and bbb in the recursive example).

To search recursively, try:

ex +"/aaa/,/bbb/p" -scq! **/*.py

Note: To enable the ** syntax in Bash 4 or later, run shopt -s globstar (zsh supports ** by default).