bash grep - regular expression - match till a specific word

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/20373334/

Time: 2020-09-18 08:51:55

Tags: regex, bash, grep

Asked by thefourtheye

Let's say I have a file with lines like this:

abcefghijklxyz
abcefghijkl

I want to get only the string between abc and the end of the line. The end of the line can be either the normal end of line or the string xyz.

My question is:

How can I get only the matched string using grep and regular expressions? For example, the expected output for the two lines shown above would be:

efghijkl
efghijkl

I don't want the starting and ending markers.


What I have tried so far:

grep -oh "abc.*xyz"

I use Ubuntu 13.04 and the Bash shell.

Answered by Kent

This line chops off the leading abc and the trailing xyz (if there is one), and gives you the part you need:

grep -oP '^abc\K.*?(?=xyz$|$)'

With your example:

kent$  echo "abcefghijklxyz
abcefghijkl"|grep -oP '^abc\K.*?(?=xyz$|$)'
efghijkl
efghijkl

Another example, with xyz in the middle of the text:

kent$  echo "abcefghijklxyz
abcefghijkl
abcfffffxyzbbbxyz
abcffffxyzbbb"|grep -oP '^abc\K.*?(?=xyz$|$)'
efghijkl
efghijkl
fffffxyzbbb
ffffxyzbbb
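
To make the pieces of that pattern easier to see, here is a small sketch that wraps it in a shell function. The function name strip_markers is made up for illustration, and the sketch assumes GNU grep built with PCRE support (the -P option).

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name is not from the answer): prints the text
# between a leading "abc" and either a trailing "xyz" or the end of line.
# Assumes GNU grep with PCRE support (-P).
strip_markers() {
    # ^abc        the line must start with abc
    # \K          discard what has been matched so far from the reported match
    # .*?         take as few characters as possible ...
    # (?=xyz$|$)  ... until a trailing xyz or the end of the line follows
    grep -oP '^abc\K.*?(?=xyz$|$)'
}

printf '%s\n' 'abcefghijklxyz' 'abcefghijkl' 'abcfffffxyzbbbxyz' | strip_markers
# prints:
# efghijkl
# efghijkl
# fffffxyzbbb
```

Because the lookahead only accepts xyz when it sits at the very end of the line, an xyz in the middle of the text is kept as part of the result.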

Answered by perreal

Using sed:

sed -n '/abc/{s/.*abc\(.*\)/\1/;s/xyz.*//;p}' input

Produces:


efghijkl
efghijkl
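
One thing to keep in mind: the s/xyz.*// step cuts the line at the first xyz, wherever it appears. If only an xyz at the very end of the line should count as the end marker (that is an assumption about what is wanted, not something stated in the question), an anchored variant might look like this. The function name strip_trailing_xyz is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical variant, not the answer's exact command: strips a leading
# "abc" and only a *trailing* "xyz", and prints just the lines that
# start with "abc".
strip_trailing_xyz() {
    sed -n '/^abc/{s/^abc//;s/xyz$//;p;}'
}

printf '%s\n' 'abcefghijklxyz' 'abcffffxyzbbb' | strip_trailing_xyz
# prints:
# efghijkl
# ffffxyzbbb
```

The xyz in the middle of the second line survives because the pattern xyz$ only matches at the end of the line.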

Answered by fedorqui 'SO stop harming'

Use a look-behind like this:

$ grep -Po '(?<=abc)[^x]*' file
efghijkl
efghijkl

It fetches everything after abc, up to the first x it finds.
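
A caveat worth illustrating: [^x]* stops at any x, not only at the x that starts xyz. The sample line abcboxesxyz below is invented for the demonstration, and the function name until_first_x is hypothetical; GNU grep with -P is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the look-behind pattern shown above.
# Assumes GNU grep with PCRE support (-P).
until_first_x() {
    grep -Po '(?<=abc)[^x]*'
}

# Works for the sample data, where the only x belongs to xyz:
printf '%s\n' 'abcefghijklxyz' | until_first_x
# prints: efghijkl

# Invented line with an x inside the wanted text: the match ends early.
printf '%s\n' 'abcboxesxyz' | until_first_x
# prints: bo
```

So this form is fine as long as the text between the markers is known never to contain an x.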



Based on Kent's answer (not to copy, but for completeness), you can grep everything between abc and xyz (or end of line):

$ grep -Po '(?<=abc).*(?=xyz|$)' file
efghijklxyz
efghijkl
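
As the first output line shows, the greedy .* runs to the end of the line, so the trailing xyz is still included. A possible adjustment, sketched here under the assumption that GNU grep with -P is available, is to make the quantifier lazy. The function name between_lazy is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the command above with a lazy quantifier:
# .*? stops at the first position where xyz (or the end of line) follows.
# Assumes GNU grep with PCRE support (-P).
between_lazy() {
    grep -Po '(?<=abc).*?(?=xyz|$)'
}

printf '%s\n' 'abcefghijklxyz' 'abcefghijkl' | between_lazy
# prints:
# efghijkl
# efghijkl
```

Unlike Kent's pattern, this one stops at the first xyz anywhere in the line, so abcfffffxyzbbbxyz would yield fffff rather than fffffxyzbbb.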

Answered by Jotne

Or you can just remove what you do not like:


awk '/^abc/{sub(/^abc/,x);sub(/xyz.*$/,x)}1' file
efghijkl
efghijkl

xyz.*$ represents everything from xyz to the end of the line.
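
Two details of that one-liner are easy to miss: x is an unset awk variable, so it acts as an empty replacement string, and the trailing 1 prints every line, including lines that do not start with abc. A more explicit sketch, which prints only the lines that start with abc (an assumption about the desired behavior), could look like this. The function name strip_with_awk is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical rewrite of the awk one-liner with an explicit empty
# replacement string; prints only lines that start with "abc".
strip_with_awk() {
    awk '/^abc/ {
        sub(/^abc/, "")     # drop the leading marker
        sub(/xyz.*$/, "")   # drop everything from the first xyz onward
        print
    }'
}

printf '%s\n' 'abcefghijklxyz' 'some other line' 'abcefghijkl' | strip_with_awk
# prints:
# efghijkl
# efghijkl
```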