原文地址: http://stackoverflow.com/questions/8752268/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Sed is not replacing all instances in a file when areas overlap
Asked by Village
I need to replace several words with other words.
For example, replace "apple" with "FRUIT" in file, but only in these 4 situations (_ marks a blank space):
_apple_ has a blank space before and after.
[apple_ has a square opening bracket before and a blank space after.
_apple] has a blank space before and a square closing bracket after.
[apple] has square brackets before and after.
I do not want the replaces to occur in any other situation.
I have tried using the following code:
a="apple"
b="fruit"
sed -i "s/ $a / $b /g" ./file
sed -i "s/\[$a /\[$b /g" ./file
sed -i "s/ $a\]/ $b\]/g" ./file
sed -i "s/\[$a\]/\[$b\]/g" ./file
I thought the option "g" at the end would mean it would replace all instances, but I found this is not a thorough solution. For example, if file contains this:
apple spider apple apple spider tree apple tree
The third occurrence of "apple" is not being replaced. In this example too, several occurrences of the word are not changed:
apple spider apple apple apple apple apple spider tree apple tree
I suspect this is because of the shared space between adjacent words.
How can I get this to find and replace all instances of $a with $b, regardless of any overlap?
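The suspicion about the shared space can be checked with a small sketch. It is only an illustration: the sample line and the variable names input and once are made up here, not taken from the question.

```shell
# A regex match consumes the text it covers. The trailing space of one match
# therefore cannot also serve as the leading space of the next match, even
# with the "g" flag.
input='apple spider apple apple apple tree'
once=$(printf '%s\n' "$input" | sed 's/ apple / fruit /g')
printf '%s\n' "$once"
# prints: apple spider fruit apple fruit tree
```

The middle "apple" of the three adjacent ones survives because the space before it was already consumed by the previous match.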
Accepted answer by igorrs
The quick-and-dirty solution is to perform the replacement twice.
$ echo apple apple apple apple[apple apple] | sed -e 's/\(\[\| \)apple\( \|\]\)/\1FRUIT\2/g; s/\(\[\| \)apple\( \|\]\)/\1FRUIT\2/g'
apple FRUIT FRUIT apple[FRUIT FRUIT]
This is safe because, after the first command, the resulting text won't contain any occurrences of (\[| )apple( |\]) that were not already in the original text.
The drawback is that two replacements take roughly twice as long to run.
If you break it into two executions of sed, you can see the steps more clearly:
$ echo apple apple apple apple apple apple[apple apple] | sed -e 's/\(\[\| \)apple\( \|\]\)/\1FRUIT\2/g'
apple FRUIT apple FRUIT apple apple[FRUIT apple]
$ echo apple FRUIT apple FRUIT apple apple[FRUIT apple] | sed -e 's/\(\[\| \)apple\( \|\]\)/\1FRUIT\2/g'
apple FRUIT FRUIT FRUIT FRUIT apple[FRUIT FRUIT]
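As a variation that is not part of the answer above, the "run it again" idea can be written as a loop inside a single sed call, using a label and the t command (branch if the last substitution succeeded). This is a sketch under assumptions: the label name again and the variables input and looped are invented for illustration, and bracket expressions replace \| so that only basic regular expressions are needed.

```shell
# Replace one occurrence at a time and jump back to the label until
# nothing matches any more.
input='apple apple apple apple[apple apple]'
looped=$(printf '%s\n' "$input" |
  sed -e ':again' -e 's/\([[ ]\)apple\([] ]\)/\1FRUIT\2/' -e 't again')
printf '%s\n' "$looped"
# prints: apple FRUIT FRUIT apple[FRUIT FRUIT]
```

The loop ends only because the replacement text can never match the pattern again; if the replacement itself contained the search word in a matching position, it would run forever.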
Answer by SiegeX
You can do this using backreferences. This should be fully POSIX compatible.
sed -i 's/^badger\([] ]\)/SNAKE\1/g;
s/\([[ ]\)badger$/\1SNAKE/g;
s/\([[ ]\)badger\([] ]\)/\1SNAKE\2/g;
s/ badger]/ SNAKE]/g' ./infile
Example
$ sed 's/^badger\([] ]\)/SNAKE\1/g;s/\([[ ]\)badger$/\1SNAKE/g;s/\([[ ]\)badger\([] ]\)/\1SNAKE\2/g;s/ badger]/ SNAKE]/g' <<<"badger [badger badger] [badger] badger foobadger badgering mushroom badger"
SNAKE [SNAKE SNAKE] [SNAKE] SNAKE foobadger badgering mushroom SNAKE
Answer by mvds
sed -i "s/\bapple\b/FRUIT/g" file
\b matches word boundaries. It is probably not entirely portable; at least it doesn't work on Mac OS X.
And a more interesting test:
$ cat file; sed "s/\bapple\b/FRUIT/g" file
apple apple apple spider tree apple tree applejuice pineapple apple.com etc
FRUIT FRUIT FRUIT spider tree FRUIT tree applejuice pineapple FRUIT.com etc
Answer by fardjad
Consider using lookaheads and lookbehinds:
s/(?<=[\s\[])apple(?=[\s\]])/FRUIT/g
Demo: http://regexr.com?2vl8p
Okay, I have now tested the regex on my computer and found that lookaheads and lookbehinds don't work in standard sed; you would use ssed with the --regexp-perl option instead:
$ uname -msrv
Darwin 11.2.0 Darwin Kernel Version 11.2.0: Tue Aug 9 20:54:00 PDT 2011; root:xnu-1699.24.8~1/RELEASE_X86_64 x86_64
$ ssed --ver
super-sed version 3.62
based on GNU sed version 4.1
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
$ ssed -R 's/(?<=[\s\[])apple(?=[\s\]])/FRUIT/g'
apple spider apple apple spider tree apple tree
apple spider FRUIT FRUIT spider tree FRUIT tree
Answer by Birei
One way using sed:
sed "s/\([^ ]\)\([ ]\)\([^ ]\)/\1\2\2\3/g; s/\( \|\[\)$a\( \|\]\)/\1$b\2/g; s/\([^ ]\)\([ ]\{2\}\)\([^ ]\)/\1 \3/g" file
There are three substitution commands. Explanation:
s/\([^ ]\)\([ ]\)\([^ ]\)/\1\2\2\3/g # Duplicate each space character surrounded by non-space
# characters.
s/\( \|\[\)$a\( \|\]\)/\1$b\2/g # Substitute the content of variable '$a' when just before it there is a
# blank or '[' and just after it a blank or ']', in any combination
# of those. Replace it with the content of variable '$b' and the same
# groups of the pattern (\1 and \2).
s/\([^ ]\)\([ ]\{2\}\)\([^ ]\)/\1 \3/g # Remove one space when two consecutive spaces are found surrounded by
# non-space characters.
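To see what each of the three commands does, they can be run one at a time on a single line. This is a sketch rather than the answer's own test: the sample line and the names line, step1, step2 and step3 are invented here, and the middle command uses bracket expressions instead of \| so that it does not depend on a GNU extension.

```shell
a="apple"
b="fruit"
line='apple spider apple apple apple tree'

# Step 1: double every single space that sits between two non-space characters.
step1=$(printf '%s\n' "$line" | sed 's/\([^ ]\)\([ ]\)\([^ ]\)/\1\2\2\3/g')
# Step 2: each word now has its own leading and trailing space, so no match
# has to share a space with its neighbour.
step2=$(printf '%s\n' "$step1" | sed "s/\([[ ]\)$a\([] ]\)/\1$b\2/g")
# Step 3: collapse the doubled spaces back to single ones.
step3=$(printf '%s\n' "$step2" | sed 's/\([^ ]\)\([ ]\{2\}\)\([^ ]\)/\1 \3/g')

printf '%s\n' "$step1" "$step2" "$step3"
```

The final line should read "apple spider fruit fruit fruit tree". Step 1 relies on every word being at least two characters long; a one-letter word would leave one of its neighbouring spaces undoubled.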
My test:
Content of file:
apple spider apple apple spider tree apple tree
apple spider [apple apple spider tree apple] tree
apple spider apple apple spider tree appletree
apple spider apple apple spider tree [apple] tree
apple spider apple apple apple apple apple spider tree apple tree
Set variables:
a="apple"
b="fruit"
Run sed command:
sed "s/\([^ ]\)\([ ]\)\([^ ]\)/\1\2\2\3/g; s/\( \|\[\)$a\( \|\]\)/\1$b\2/g; s/\([^ ]\)\([ ]\{2\}\)\([^ ]\)/\1 \3/g" file
Result:
apple spider fruit fruit spider tree fruit tree
apple spider [fruit fruit spider tree fruit] tree
apple spider fruit fruit spider tree appletree
apple spider fruit fruit spider tree [fruit] tree
apple spider fruit fruit fruit fruit fruit spider tree fruit tree
It won't work if your real file has a different distribution of spaces or a strange format. In that case sed is a limited tool; perl or a similar tool with lookaheads and lookbehinds would be better.

