bash grep - regular expression - match till a specific word
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license, cite the original address and authors, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/20373334/
Asked by thefourtheye
Let's say I have a file with lines like this:
abcefghijklxyz
abcefghijkl
I want to get only the string between abc and the end of the line. The end of the line can be defined as the normal end of line or the string xyz.
My question is:
How can I get only the matched string using grep and regular expressions? For example, the expected output for the two lines shown above would be:
efghijkl
efghijkl
I don't want the starting and ending markers.
What I have tried so far:
grep -oh "abc.*xyz"
I use Ubuntu 13.04 and Bash shell.
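To see why this attempt falls short, here is a small sketch that runs it on the two sample lines. The helper name sample is made up for illustration, and -h is left out because it only suppresses file name prefixes, which do not appear when reading standard input.

```shell
# Hypothetical helper: emit the two sample lines from the question.
sample() {
  printf '%s\n' 'abcefghijklxyz' 'abcefghijkl'
}

# -o prints only the matched part, but the match itself still contains
# both markers, and the second line has no xyz, so it does not match at all.
sample | grep -o 'abc.*xyz'
# abcefghijklxyz
```

Only the first line is printed, with both markers included.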
Answer by Kent
This line chops off the leading abc and the trailing xyz (if there is one), and gives you the part you need:
grep -oP '^abc\K.*?(?=xyz$|$)'
With your example:
kent$ echo "abcefghijklxyz
abcefghijkl"|grep -oP '^abc\K.*?(?=xyz$|$)'
efghijkl
efghijkl
Another example, with xyz in the middle of the text:
kent$ echo "abcefghijklxyz
abcefghijkl
abcfffffxyzbbbxyz
abcffffxyzbbb"|grep -oP '^abc\K.*?(?=xyz$|$)'
efghijkl
efghijkl
fffffxyzbbb
ffffxyzbbb
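For readers unfamiliar with the Perl-style syntax, the pattern can be read piece by piece. The sketch below assumes GNU grep built with PCRE support (the -P option); the function name between_markers is hypothetical.

```shell
# Hypothetical helper: print the text after a leading abc, dropping a trailing xyz.
# Assumes GNU grep with PCRE support (-P).
between_markers() {
  # ^abc        anchor at the start of the line and consume the leading marker
  # \K          discard what was matched so far, so abc is not printed by -o
  # .*?         lazily take as few characters as possible
  # (?=xyz$|$)  stop where xyz ends the line, or at the end of the line
  grep -oP '^abc\K.*?(?=xyz$|$)'
}

printf '%s\n' 'abcefghijklxyz' 'abcffffxyzbbb' | between_markers
# efghijkl
# ffffxyzbbb
```

Because the look-ahead only accepts xyz at the very end of the line, an xyz in the middle of the text is kept.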
Answer by perreal
Using sed:
sed -n '/abc/{s/.*abc\(.*\)/\1/;s/xyz.*//;p}' input
Produces:
efghijkl
efghijkl
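The sed approach works in two steps: keep what follows abc through the \1 back-reference, then delete xyz and everything after it. A minimal sketch of that idea follows. The function name strip_markers is made up for illustration, and the third input line is an extra case that is not part of the question.

```shell
# Hypothetical helper wrapping the two-step substitution.
strip_markers() {
  # /abc/             only act on lines that contain abc
  # s/.*abc\(.*\)/\1/ keep only the text captured after abc
  # s/xyz.*//         delete the first xyz and everything after it
  # p                 print the result (-n suppresses all other output)
  sed -n '/abc/{s/.*abc\(.*\)/\1/;s/xyz.*//;p;}'
}

printf '%s\n' 'abcefghijklxyz' 'abcefghijkl' 'abcffffxyzbbb' | strip_markers
# efghijkl
# efghijkl
# ffff
```

Note that this version cuts at the first xyz anywhere in the line, so a line with xyz in the middle is truncated there rather than kept whole as in Kent's grep answer.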
Answer by fedorqui 'SO stop harming'
Use a look-behind like this:
$ grep -Po '(?<=abc)[^x]*' file
efghijkl
efghijkl
It fetches everything after abc up to the first x. Note that [^x]* stops at any x, not only at the start of xyz.
Based on Kent's answer (not to copy, but for completeness), you can grep everything between abc and xyz (or the end of the line):
$ grep -Po '(?<=abc).*(?=xyz|$)' file
efghijklxyz
efghijkl
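The first output line above still ends in xyz because .* is greedy: it runs to the end of the line, where the $ alternative of the look-ahead succeeds. As a sketch, again assuming GNU grep with -P, making the quantifier lazy stops at the first xyz instead. The function name after_abc_lazy is hypothetical.

```shell
# Hypothetical helper: lazy variant of the look-behind/look-ahead pattern.
# Assumes GNU grep with PCRE support (-P).
after_abc_lazy() {
  # (?<=abc)   match must be preceded by abc (abc itself is not printed)
  # .*?        take as few characters as possible
  # (?=xyz|$)  stop at the first xyz, or at the end of the line
  grep -Po '(?<=abc).*?(?=xyz|$)'
}

printf '%s\n' 'abcefghijklxyz' 'abcefghijkl' | after_abc_lazy
# efghijkl
# efghijkl
```

Unlike Kent's pattern, this one is not anchored with ^ and cuts at the first xyz anywhere in the line.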
Answer by Jotne
Or you can just remove what you do not like:
awk '/^abc/{sub(/^abc/,x);sub(/xyz.*$/,x)}1' file
efghijkl
efghijkl
xyz.*$ represents everything from xyz to the end of the line.
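In this awk one-liner, x is an unset variable, so it expands to the empty string, and the trailing 1 is an always-true pattern that prints every line, changed or not. A sketch that makes both points explicit follows. The function name strip_with_awk is made up, and the third input line is an added example that is not from the question.

```shell
# Hypothetical helper: same substitutions, with "" written out
# instead of relying on the unset variable x.
strip_with_awk() {
  # /^abc/ { ... }  edit only lines that start with abc
  # 1               always-true pattern: print every line, edited or not
  awk '/^abc/ { sub(/^abc/, ""); sub(/xyz.*$/, "") } 1'
}

printf '%s\n' 'abcefghijklxyz' 'abcefghijkl' 'no marker here' | strip_with_awk
# efghijkl
# efghijkl
# no marker here
```

To print only the lines that start with abc, one option is to drop the trailing 1 and add print inside the action block.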