Original question: http://stackoverflow.com/questions/29606527/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Replace multiple patterns, but not with the same string
Asked by ornit
Is it possible to replace multiple patterns with different values in a single command? Let's say I have
A B C D ABC
and I want to change every A to 1, every B to 2, and every C to 3
so the output will be
1 2 3 D 123
Since I have 3 patterns to change, I would like to avoid substituting them separately. I thought there would be something like
sed -r s/'(A|B|C)'/(1|2|3)/
but of course this just replaces A, B, or C with the literal string (1|2|3). I should mention that my real patterns are more complicated than that...
thank you!
Accepted answer by choroba
Easy in Perl:
perl -pe '%h = (A => 1, B => 2, C => 3); s/(A|B|C)/$h{$1}/g'
If you use more complex patterns, put the more specific ones before the more general ones in the alternative list. Sorting by length might be enough:
perl -pe 'BEGIN { %h = (A => 1, AA => 2, AAA => 3);
                  $re = join "|", sort { length $b <=> length $a } keys %h; }
          s/($re)/$h{$1}/g'
To add word or line boundaries, just change the pattern to
/\b($re)\b/
# or
/^($re)$/
# resp.
Answer by hek2mgl
Easy in sed:
sed 's/WORD1/NEW_WORD1/g;s/WORD2/NEW_WORD2/g;s/WORD3/NEW_WORD3/g'
You can separate multiple commands on the same line with a ;
Update
Probably this was too easy. NeronLeVelu pointed out that the above command can lead to unwanted results because the second substitution might touch the results of the first substitution (and so on).
If you care about this, you can avoid this side effect with the t command. The t command branches to the end of the script, but only if a substitution did happen, so once one substitution succeeds on a line, the remaining ones are skipped for that line:
sed 's/WORD1/NEW_WORD1/g;t;s/WORD2/NEW_WORD2/g;t;s/WORD3/NEW_WORD3/g'
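The difference between the two sed commands shows up when one replacement text is itself a pattern. The following is a small sketch with made-up words ("cat" should become "dog", "dog" should become "bird"); it passes each command with a separate -e, which is assumed here to be the most portable way to write a label-less t.

```shell
# Illustrative input: "cat" should become "dog" and "dog" should become "bird".
chained=$(printf 'cat dog\n' | sed -e 's/cat/dog/g' -e 's/dog/bird/g')
with_t=$(printf 'cat dog\n' | sed -e 's/cat/dog/g' -e t -e 's/dog/bird/g')

echo "$chained"   # prints: bird bird  (the new "dog" was replaced again)
echo "$with_t"    # prints: dog dog    (t skipped the second substitution)
```

Neither result is "dog bird": the t variant stops the chain reaction, but it also leaves the original "dog" untouched on any line where "cat" was replaced. A single-pass alternation, as in the Perl answer, avoids both problems.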
Answer by Ed Morton
This will work if your "words" don't contain RE metachars (. * ? etc.):
$ cat file
there is the problem when the foo is closed
$ cat tst.awk
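BEGIN {
    # Alternating old/new words: "the" -> "a", "foo" -> "bar"
    split("the a foo bar",tmp)
    for (i=1;i in tmp;i+=2) {
        # Build the regexp \<(the|foo)\> one alternative at a time
        old = (i>1 ? old "|" : "\\<(") tmp[i]
        map[tmp[i]] = tmp[i+1]
    }
    old = old ")\\>"
}
{
    head = ""
    tail = $0
    while ( match(tail,old) ) {
        # Copy the text before the match, then the replacement for the matched word
        head = head substr(tail,1,RSTART-1) map[substr(tail,RSTART,RLENGTH)]
        tail = substr(tail,RSTART+RLENGTH)
    }
    print head tail
}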
$ awk -f tst.awk file
there is a problem when a bar is closed
The above obviously maps "the" to "a" and "foo" to "bar" and uses GNU awk for word boundaries.
If your "words" do contain RE metachars etc. then you need a string-based solution using index() instead of an RE-based one using match() (note that sed ONLY supports REs, not strings).
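One way such a string-based variant could look is sketched below. It is not the answer author's code: the function name replace_literal, the arrays src and dst, and the sample strings are all assumptions made for illustration. On each pass it uses index() to find the leftmost occurrence of any source string, copies the text before it, appends the replacement, and continues after the match, so replaced text is never scanned again.

```shell
# replace_literal is a hypothetical helper; src/dst hold illustrative pairs.
replace_literal() {
  awk '
    BEGIN {
      # Literal strings that contain RE metacharacters on purpose.
      src[1] = "a.b"; dst[1] = "X"
      src[2] = "c*";  dst[2] = "Y"
      n = 2
    }
    {
      head = ""; tail = $0
      while (tail != "") {
        best = 0; bestpos = 0
        # Find the source string that occurs earliest in the remaining text.
        for (i = 1; i <= n; i++) {
          pos = index(tail, src[i])
          if (pos && (!bestpos || pos < bestpos)) { bestpos = pos; best = i }
        }
        if (!best) break
        # Keep the text before the match, then append the replacement.
        head = head substr(tail, 1, bestpos - 1) dst[best]
        tail = substr(tail, bestpos + length(src[best]))
      }
      print head tail
    }'
}

printf 'a.b axb c* cc\n' | replace_literal   # prints: X axb Y cc
```

Because index() compares plain strings, "axb" and "cc" are left alone, whereas the regexps a.b and c* would have matched them. When two source strings start at the same position, this sketch picks the one listed first, so longer strings should be listed before their prefixes.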