How to use sed/grep to extract text between two words?
Original question: http://stackoverflow.com/questions/13242469/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me).
Asked by user1190650
I am trying to output a string that contains everything between two words of a string:
input:
"Here is a String"
output:
"is a"
Using:
sed -n '/Here/,/String/p'
includes the endpoints, but I don't want to include them.
Accepted answer by Brian Campbell
sed -e 's/Here\(.*\)String/\1/'
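To see what this does on the question's input, here is a small sketch (the variable name `between` is only illustrative). The group `\(.*\)` captures the text between the two words and `\1` writes it back in place of the whole match. The surrounding spaces are kept because they sit between the words.

```shell
# Capture everything between "Here" and "String", then replace the
# whole match with the captured group.
between=$(echo "Here is a String" | sed -e 's/Here\(.*\)String/\1/')
printf '[%s]\n' "$between"   # prints: [ is a ]
```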
Answer by anishsane
GNU grep also supports positive and negative lookahead and lookbehind (with the -P option). For your case, the command would be:
echo "Here is a string" | grep -o -P '(?<=Here).*(?=string)'
If there are multiple occurrences of Here and string, you can choose whether to match from the first Here to the last string, or to match each pair individually. In regex terms, these are called a greedy match (first case) and a non-greedy match (second case):
$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*(?=string)' # Greedy match
is a string, and Here is another
$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*?(?=string)' # Non-greedy match (Notice the '?' after '*' in .*)
is a
is another
Answer by wheeler
The accepted answer does not remove text that could be before Here or after String. This will:
sed -e 's/.*Here\(.*\)String.*/\1/'
The main difference is the addition of .* immediately before Here and after String.
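As a sketch of the difference, assume an input line with extra text on both sides (the sample line and variable names are made up for illustration). The leading and trailing `.*` swallow the surrounding text. Adding `-n` together with the `p` flag is a common way to also suppress lines that contain no match at all.

```shell
line='prefix Here is a String suffix'

# Without the extra .* the surrounding text survives.
partial=$(echo "$line" | sed -e 's/Here\(.*\)String/\1/')
printf '[%s]\n' "$partial"   # prints: [prefix  is a  suffix]

# With .* on both sides only the captured text remains.
only=$(echo "$line" | sed -e 's/.*Here\(.*\)String.*/\1/')
printf '[%s]\n' "$only"      # prints: [ is a ]

# -n plus the p flag prints only lines where the substitution happened.
matched=$(printf '%s\n' 'nothing to extract' "$line" | sed -n 's/.*Here\(.*\)String.*/\1/p')
printf '[%s]\n' "$matched"   # prints: [ is a ]
```

Because `.*` is greedy and sed's regular expressions have no non-greedy quantifier, a line with several Here ... String pairs yields the text after the last Here.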
Answer by ghoti
You can strip strings in Bash alone:
$ foo="Here is a String"
$ foo=${foo##*Here }
$ echo "$foo"
is a String
$ foo=${foo%% String*}
$ echo "$foo"
is a
$
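The doubled operators used above matter when a marker occurs more than once: `##` and `%%` remove the longest match, while `#` and `%` remove the shortest. A small sketch with a made-up input (the variable names are illustrative):

```shell
text="Here one Here two String three String"

# '#' strips the shortest prefix matching '*Here ', '##' the longest.
short_prefix=${text#*Here }    # one Here two String three String
long_prefix=${text##*Here }    # two String three String

# '%' strips the shortest suffix matching ' String*', '%%' the longest.
short_suffix=${text% String*}  # Here one Here two String three
long_suffix=${text%% String*}  # Here one Here two

echo "$short_prefix"
echo "$long_prefix"
echo "$short_suffix"
echo "$long_suffix"
```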
And if you have a GNU grep that includes PCRE, you can use a zero-width assertion:
$ echo "Here is a String" | grep -Po '(?<=(Here )).*(?= String)'
is a
Answer by Avinash Raj
With GNU awk:
$ echo "Here is a string" | awk -v FS="(Here|string)" '{print $2}'
is a
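Here the field separator is a regular expression, so either word splits the line and the text between them lands in the second field. A sketch of the same idea using the `-F` option (the variable name `middle` is illustrative):

```shell
# Fields: $1 is the text before "Here" (empty here), $2 is the text
# between the two words, $3 is whatever follows "string".
middle=$(echo "Here is a string" | awk -F 'Here|string' '{ print $2 }')
printf '[%s]\n' "$middle"   # prints: [ is a ]
```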
grep with the -P (perl-regexp) parameter supports \K, which helps in discarding the previously matched characters. In our case, the previously matched string was Here, so it got discarded from the final output.
$ echo "Here is a string" | grep -oP 'Here\K.*(?=string)'
is a
$ echo "Here is a string" | grep -oP 'Here\K(?:(?!string).)*'
is a
If you want the output to be is a (without the surrounding spaces), then you could try the following:
$ echo "Here is a string" | grep -oP 'Here\s*\K.*(?=\s+string)'
is a
$ echo "Here is a string" | grep -oP 'Here\s*\K(?:(?!\s+string).)*'
is a
Answer by alemol
If you have a long file with many multi-line occurrences, it is useful to number the lines first:
cat -n file | sed -n '/Here/,/String/p'
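A sketch of what that looks like on a small sample file. The file contents and variable names are made up for illustration, and `mktemp` is assumed to be available. Note that the range still selects whole lines, so the lines containing the markers are included.

```shell
# Build a small sample file.
sample=$(mktemp)
printf '%s\n' 'first line' 'Here it starts' 'middle' 'ends with String' 'last line' > "$sample"

# cat -n prefixes every line with its number before sed selects the range,
# so the output shows lines 2 to 4 with their original line numbers.
numbered=$(cat -n "$sample" | sed -n '/Here/,/String/p')
echo "$numbered"

rm -f "$sample"
```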
Answer by potong
This might work for you (GNU sed):
sed '/Here/!d;s//&\n/;s/.*\n//;:a;/String/bb;$!{n;ba};:b;s//\n&/;P;D' file
This prints each piece of text found between the two markers (in this instance Here and String) on its own line and preserves newlines within the text.
Answer by Gary Dean
All of the above solutions fall short when the last search string is repeated elsewhere in the string. I found it best to write a bash function.
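function str_str {
  local str
  str="${1#*"$2"}"     # remove everything up to and including the first marker
  str="${str%%"$3"*}"  # remove the second marker and everything after it
  echo -n "$str"
}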
# test it ...
mystr="this is a string"
str_str "$mystr" "this " " string"
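A sketch of how such a function could be written and checked. The name `between_words` and the use of `printf` instead of `echo -n` are choices made here for illustration, not part of the original answer. It removes the shortest prefix ending in the first marker, then the longest suffix starting at the second marker, so the result stops at the first occurrence of the end marker even when that marker is repeated later.

```shell
# between_words STRING START END
between_words() {
    local rest
    rest="${1#*"$2"}"       # drop everything through the first START
    rest="${rest%%"$3"*}"   # drop everything from the first END onward
    printf '%s' "$rest"
}

result=$(between_words "this is a string with a string" "this " " string")
echo "$result"   # prints: is a
```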
Answer by mvairavan
You can use \1 (refer to http://www.grymoire.com/Unix/Sed.html#uh-4):
echo "Hello is a String" | sed 's/Hello\(.*\)String/\1/g'
The content matched inside the escaped parentheses is stored as \1.
Answer by Sabrina
To understand the sed command, we have to build it step by step.
Here is your original text:
user@linux:~$ echo "Here is a String"
Here is a String
user@linux:~$
Let's try to remove Here with the substitution (s) command in sed:
user@linux:~$ echo "Here is a String" | sed 's/Here //'
is a String
user@linux:~$
At this point, I believe you would be able to remove String as well:
user@linux:~$ echo "Here is a String" | sed 's/String//'
Here is a
user@linux:~$
But this is not your desired output.
To combine two sed commands, use the -e option:
user@linux:~$ echo "Here is a String" | sed -e 's/Here //' -e 's/String//'
is a
user@linux:~$
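Note that removing String on its own leaves the space that preceded it. A small variation, assuming you also want that trailing space gone, is to include the space in the second expression (the variable names are illustrative):

```shell
# The command above keeps the space that preceded "String".
with_space=$(echo "Here is a String" | sed -e 's/Here //' -e 's/String//')
printf '[%s]\n' "$with_space"   # prints: [is a ]

# Matching the space as part of the second pattern removes it too.
trimmed=$(echo "Here is a String" | sed -e 's/Here //' -e 's/ String//')
printf '[%s]\n' "$trimmed"      # prints: [is a]
```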
Hope this helps