Bash Script - split string using regex delimiter
Notice: this content is a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original address and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/23114583/
Asked by user3541984
I want to split a string such as 'substring1 substring2 ONCE[0,10s] substring3'. The expected result (with delimiter 'ONCE[0,10s]') is:
substring1 substring2
substring3
The problem is that the number in the delimiter is variable: it could be 'ONCE[0,1s]' or 'ONCE[0,3m]' or 'ONCE[0,10d]' and so on.
How can I do this in a bash script? Any ideas?
Thank you
Answered by rici
The example provided in the OP, as well as the two answers provided by @GlennJackman and @devnull, assume that the actual question could have been:
In bash, how do I replace the match for a regular expression in a string with a newline?
That's not actually the same as "split a string using a regular expression", unless you add the constraint that the string does not contain any newline characters. And even then, it's not actually "splitting" the string; the presumption is that some other process will use a newline to split the result.
Once the question has been reformulated, the solution is not challenging. You could use any tool which supports regular expressions, such as sed:
sed 's/ *ONCE\[[^]]*] */\n/g' <<<"$variable"
(Remove the g if you only want to replace the first sequence; you may need to adjust the regular expression, since it wasn't quite clear what the desired constraints are.)
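As a rough sketch of the "some other process splits on newlines" step, the sed output can be read into a bash array. This assumes GNU sed (which accepts \n in the replacement text), bash 4 or later (for mapfile), and an input string that contains no newlines of its own; the array name parts is only illustrative.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed, which understands \n in the replacement text.
# Assumption: bash 4+, which provides the mapfile builtin.
variable='substring1 substring2 ONCE[0,10s] substring3'

# Replace every delimiter with a newline, then read one array element per line.
mapfile -t parts < <(sed 's/ *ONCE\[[^]]*] */\n/g' <<<"$variable")

# Show each piece between angle brackets so stray whitespace would be visible.
printf '<%s>\n' "${parts[@]}"
```

Each element of parts then holds one piece of the original string.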
bash itself does not provide a "replace all" primitive using regular expressions, although it does have "patterns" and, if the option extglob is set (which is the default on some distributions), those patterns are powerful enough to express this delimiter, so you could use:
echo "${variable//*( )ONCE\[*([^]])]*( )/$'\n'}"
Again, you can make the substitution only happen once by changing // to /, and you may need to change the pattern to meet your precise needs.
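A self-contained version of that substitution might look like the following sketch. It enables extglob explicitly rather than relying on the distribution default, and it stores the newline in a variable (nl, an illustrative name) so the replacement does not depend on how $'...' is treated inside double quotes.

```shell
#!/usr/bin/env bash
# Enable extended patterns such as *( ) before they are used.
shopt -s extglob

variable='substring1 substring2 ONCE[0,10s] substring3'
nl=$'\n'   # a literal newline, kept in a variable for clarity

# *( )       zero or more spaces
# ONCE\[     the literal text "ONCE["
# *([^]])    zero or more characters other than "]"
# ]          the closing bracket
result="${variable//*( )ONCE\[*([^]])]*( )/$nl}"

printf '%s\n' "$result"
```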
That leaves open the question of how to actually split a bash variable using a delimiter specified by a regular expression, for some definition of "split". One possible definition is "call a function with the parts of the string as arguments"; that's the one which we use here:
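# Usage:
#   call_with_split <pattern> <string> <cmd> <args>...
# Splits string according to regular expression pattern and then invokes
#   cmd args string-pieces
# Note: pattern must not match the empty string, or the recursion never ends.
call_with_split () {
  # BASH_REMATCH[0] is everything from the first delimiter to the end of the
  # string; BASH_REMATCH[1] is the delimiter itself.
  if [[ $2 =~ ($1).* ]]; then
    # Recurse on the text after the delimiter, appending the text before the
    # delimiter to the command's argument list.
    call_with_split "$1" \
      "${2:$((${#2} - ${#BASH_REMATCH[0]} + ${#BASH_REMATCH[1]}))}" \
      "${@:3}" \
      "${2:0:$((${#2} - ${#BASH_REMATCH[0]}))}"
  else
    # No delimiter left: run the command with the final piece.
    "${@:3}" "$2"
  fi
}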
Example:
$ var="substring1 substring2 ONCE[0,10s] substring3"
$ call_with_split " ONCE\[[^]]*] " "$var" printf "%s\n"
substring1 substring2
substring3
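If the pieces are wanted in an array rather than as arguments to a command, the same BASH_REMATCH arithmetic can be used in a loop. The following is only a sketch: split_regex is a hypothetical helper name, not part of the answer above, and it assumes bash 4.3 or later for the nameref (local -n).

```shell
#!/usr/bin/env bash
# Usage (hypothetical helper): split_regex <pattern> <string> <array-name>
# Stores the pieces of string, split on the regular expression pattern,
# in the named array. Assumes bash 4.3+ for "local -n".
split_regex () {
  local pattern=$1 rest=$2
  local -n _split_out=$3   # nameref to the caller's array
  _split_out=()
  while [[ $rest =~ ($pattern).* ]]; do
    # Stop if the pattern matched the empty string, to avoid looping forever.
    [[ -n ${BASH_REMATCH[1]} ]] || break
    # Text before the delimiter becomes the next piece.
    _split_out+=("${rest:0:${#rest} - ${#BASH_REMATCH[0]}}")
    # Continue with the text after the delimiter.
    rest=${rest:${#rest} - ${#BASH_REMATCH[0]} + ${#BASH_REMATCH[1]}}
  done
  _split_out+=("$rest")
}

var="substring1 substring2 ONCE[0,10s] substring3"
split_regex ' ONCE\[[^]]*] ' "$var" parts
printf '%s\n' "${parts[@]}"
```

Unlike the newline-replacement approaches, this keeps any newlines inside the pieces intact, since nothing is re-split afterwards.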
Answered by glenn jackman
bash:
s='substring1 substring2 ONCE[0,10s] substring3'
if [[ $s =~ (.+)" ONCE["[0-9]+,[0-9]+[smhd]"] "(.+) ]]; then
  echo "${BASH_REMATCH[1]}"
  echo "${BASH_REMATCH[2]}"
else
  echo no match
fi
substring1 substring2
substring3
Answered by devnull
You could use awk. Specify the field separator as:
'ONCE[[]0,[^]]*[]] *'
For example, using your sample input:
$ awk -F 'ONCE[[]0,[^]]*[]] *' '{for(i=1;i<=NF;i++){print $i}}' <<< "substring1 substring2 ONCE[0,10s] substring3"
substring1 substring2
substring3
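Note that the field separator above does not consume the space before ONCE, so the first field keeps a trailing space. A possible variation, assuming that space should also be dropped, is to let the separator start with ' *'. The angle brackets in this sketch are only there to make the field boundaries visible.

```shell
#!/usr/bin/env bash
input='substring1 substring2 ONCE[0,10s] substring3'

# The separator now also absorbs any spaces before the delimiter.
pieces=$(awk -F ' *ONCE[[]0,[^]]*[]] *' \
  '{ for (i = 1; i <= NF; i++) print "<" $i ">" }' <<<"$input")

printf '%s\n' "$pieces"
```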