How to reference captures in bash regex replacement

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/5624969/

Time: 2020-09-17 23:46:35  Source: igfitidea

regex, bash

Asked by joshuapoehls

How can I include the regex match in the replacement expression in BASH?

Non-working example:

#!/bin/bash
name=joshua
echo ${name//[oa]/X\1}

I expect this to output jXoshuXa, with \1 being replaced by the matched character.

This doesn't actually work, though, and outputs jX1shuX1 instead.

Accepted answer by Andrew Clark

bash> name=joshua
bash> echo $name | sed 's/\([oa]\)/X\1/g'
jXoshuXa
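
As a rough sketch of how the sed approach might be wrapped for reuse: in sed's basic regular expressions, \( and \) delimit a capture group and \1 refers back to it in the replacement. The function name prefix_oa below is hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# prefix_oa is a hypothetical helper name, not something from the answer.
prefix_oa() {
    # \( \) captures the matched character; \1 re-inserts it after the X.
    # The trailing g applies the substitution to every match on the line.
    printf '%s\n' "$1" | sed 's/\([oa]\)/X\1/g'
}

prefix_oa joshua
```

Calling prefix_oa joshua prints jXoshuXa. Note that this still launches sed as a sub-process for every call.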

Answer by nickl-

Perhaps not as intuitive as sed, and arguably quite obscure, but in the spirit of completeness: while BASH will probably never support capture variables in replacements (at least not in the usual fashion, since parentheses are used for extended pattern matching), it is still possible to capture a pattern when testing with the binary operator =~, which produces an array of matches called BASH_REMATCH.

This makes the following example possible:

#!/bin/bash
name='joshua'
[[ $name =~ ([ao].*)([oa]) ]] && \
    echo ${name/$BASH_REMATCH/X${BASH_REMATCH[1]}X${BASH_REMATCH[2]}}

The conditional match of the regular expression ([ao].*)([oa]) captures the following values to $BASH_REMATCH:

$ echo ${BASH_REMATCH[*]}
oshua oshu a
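
To make the layout of that array more explicit, here is a minimal sketch that prints each index alongside its value. Index 0 holds the whole match and the following indices hold the parenthesised groups in order. The function name show_captures is hypothetical, and the regex is stored in a variable only so that it can be used unquoted on the right-hand side of =~.

```shell
#!/bin/bash
# show_captures is a hypothetical helper used only for this illustration.
show_captures() {
    local re='([ao].*)([oa])'   # same regex as in the example above
    local i
    if [[ $1 =~ $re ]]; then
        # Walk every index bash filled in: 0 = whole match, 1.. = groups.
        for i in "${!BASH_REMATCH[@]}"; do
            printf '%s: %s\n' "$i" "${BASH_REMATCH[$i]}"
        done
    fi
}

show_captures joshua
```

For joshua this prints three lines: 0: oshua, 1: oshu and 2: a.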

If a match is found, we use the ${parameter/pattern/string} expansion to search for the pattern oshua in the parameter with value joshua and replace it with the combined string Xoshu and Xa. However, this only works for our example string because we know what to expect.

For something that functions more like the match-all or global regex counterparts, the following example greedily matches any unchanged o or a, inserting X from back to front.

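#!/bin/bash
name='joshua'
# Loop while some o or a is still preceded by a character other than X.
while [[ $name =~ .*[^X]([oa]) ]]; do
    # ${BASH_REMATCH:0:-1} drops the captured character (negative length needs bash 4.2+).
    name=${name/$BASH_REMATCH/${BASH_REMATCH:0:-1}X${BASH_REMATCH[1]}}
done
echo $name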

The first iteration changes $name to joshuXa and the second to jXoshuXa, after which the condition fails and the loop terminates. This example works similarly to the lookbehind expression /(?<!X)([oa])/X\1/, which only cares about the o or a characters that are not prefixed by an X.

The output for both examples:

jXoshuXa
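
The looping idea can also be sketched as a reusable function. This is a variant under stated assumptions, not the answer's exact code: the name mark_oa is hypothetical, the regex is anchored with ^ and given an optional first group so that an o or a at the very start of the string is handled too, and the string is rebuilt from the captures instead of going through ${name/pattern/string}, so the matched text is never reinterpreted as a glob pattern.

```shell
#!/bin/bash
# mark_oa is a hypothetical helper name, not part of bash.
mark_oa() {
    local s=$1
    # Anchored at ^ so every match starts at index 0. The optional group
    # lets an o or a at the very start of the string match as well.
    local re='^(.*[^X])?([oa])'
    while [[ $s =~ $re ]]; do
        # Rebuild: prefix + X + matched char + whatever followed the match.
        s="${BASH_REMATCH[1]}X${BASH_REMATCH[2]}${s:${#BASH_REMATCH[0]}}"
    done
    printf '%s\n' "$s"
}

mark_oa joshua
```

Calling mark_oa joshua prints jXoshuXa without starting a sub-process. As in the loop above, any o or a that already follows an X is treated as done and left alone.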

nJoy!

Answer by 18446744073709551615

The question "bash string substitution: reference matched subexpressions" was marked a duplicate of this one, in spite of the requirement that

The code runs in a long loop, it should be a one-liner that does not launch sub-processes.

So the answer is:

If you really cannot afford launching sed in a subprocess, do not use bash! Use perl instead; its read-update-output loop will be several times faster, and the difference in syntax is small. (Well, you must not forget semicolons.)

I switched to perl, and there was only one gotcha: Unicode support was not available on one of the computers, so I had to reinstall packages.