How to replace paired square brackets with other syntax with sed?
Source: http://stackoverflow.com/questions/10646418/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Village
I want to replace all pairs of square brackets in a file, e.g., [some text], with \macro{some text}, e.g.:
This is some [text].
This [line] has [some more] text.
This becomes:
This is some \macro{text}.
This \macro{line} has \macro{some more} text.
- The pairs only occur on individual lines, never across multiple lines.
- Sometimes there might be more than one pair on a single line, but they are never nested.
- If a bracket is found alone on a line, without a pair, then it should not be changed.
How can I replace these pairs of brackets with this code?
Accepted answer by dcbyers
sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g' file.txt
This looks for an opening bracket, any number of explicitly non-closing brackets, then a closing bracket. The group is captured by the parens and inserted into the replacement expression.
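As a quick check, here is a minimal sketch that runs this substitution on the sample lines from the question. The third input line, with an unpaired bracket, is an assumption added here to illustrate the "lone bracket" requirement; it is not part of the original question.

```shell
# Sample input from the question, plus one hypothetical line with an unpaired bracket.
input='This is some [text].
This [line] has [some more] text.
A lone [ stays as it is.'

# \\ in the replacement emits a literal backslash; \1 is the captured text.
output=$(printf '%s\n' "$input" | sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g')
printf '%s\n' "$output"
```

The lone bracket is left alone because the pattern only matches when a closing bracket follows on the same line.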
Answer by David W.
It took a little doing, but here:
sed -i.bkup 's/\[\([^]]*\)\]/\\macro{\1}/g' test.txt
Let's see if I can explain this regular expression:
1. The \[ is matching a square bracket. Since [ is a valid magic regular expression character, the backslash means to match the literal character.
2. The \(...\) is a capture group. It captures the part of the regular expression I want. I can have many capture groups, and in sed I can reference them as \1, \2, etc.
3. Inside the capture group \(...\), I have [^]]*.
   1. The [^...] syntax means any character except the ones listed.
   2. The [^]] means any character but a closing bracket.
   3. The * means zero or more of the preceding. That means I am capturing zero or more characters that are not closing square brackets.
4. The \] means the closing square bracket.
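The capture-group and "zero or more" points can be sketched with two small commands. The input strings here are made up for illustration and do not come from the original answer.

```shell
# Two capture groups: \1 holds the first word, \2 the second.
swapped=$(printf '%s\n' 'hello world' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
printf '%s\n' "$swapped"

# [^]]* matches zero or more non-] characters, so an empty pair also matches.
empty=$(printf '%s\n' 'empty [] pair' | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')
printf '%s\n' "$empty"
```

The first command prints the two words swapped. The second shows that an empty pair of brackets is rewritten too, which may or may not be what you want.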
Let's look at the line this is [some] more [text]
- In #1 above, I match the first open square bracket in front of the word some. However, it's not in a capture group. This is the first character I'm going to substitute.
- I now start a capture group. I am capturing according to 3.2 and 3.3 above, starting with the letter s in some, as many characters as possible that are not closing square brackets. This means I am matching [some, but only capturing some.
- In #4, I have ended my capture group. I've matched [some for substitution purposes, and now I'm matching on the closing square bracket. That means I'm matching [some]. Note that regular expressions are normally greedy. I'll explain below why this is important.
- Now, I can write the replacement string. This is much easier. It's \\macro{\1}. The \1 is replaced by my capture group. The \\ is just a backslash. Thus, I'll replace [some] with \macro{some}.
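The walkthrough above can be reproduced step by step. This is a small sketch on the same example line; the variable names are only for illustration.

```shell
line='this is [some] more [text]'

# Without the g flag, only the first pair is replaced, as in the walkthrough.
first=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/')
printf '%s\n' "$first"

# With the g flag, every pair on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')
printf '%s\n' "$all"
```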
It would be much easier if I could be guaranteed a single set of square brackets in each line. Then I could have done this:
sed -i.bkup 's/\[\(.*\)\]/\\macro{\1}/g'
The capture group is now saying anything between two square brackets. However, the problem is that regular expressions are greedy, which means I would have matched from the s in some all the way to the final t in text. The x's below show the capture group. The [ and ] show the square brackets I'm matching on:
this is [some] more [text]
        [xxxxxxxxxxxxxxxx]
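To see the greedy match in practice, here is a short sketch that runs the .* variant on the same line, reading from standard input rather than editing a file in place.

```shell
line='this is [some] more [text]'

# .* is greedy: it runs from the first [ to the last ] on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\[\(.*\)\]/\\macro{\1}/g')
printf '%s\n' "$greedy"
```

The result is a single \macro{...} wrapped around some] more [text, which is why the [^]]* form is needed when a line can contain more than one pair.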
This became more complex because I had to match on characters that have special meaning in regular expressions, so we see a lot of backslashes. Plus, I had to account for regular expression greediness, which got me the nice-looking, non-matching string [^]]* to match anything that is not a closing bracket. Add the escaped square brackets before and after, \[[^]]*\], and don't forget the \(...\) capture group: \[\([^]]*\)\]. And you get one big mess of a regular expression.
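Some of the backslashes can be avoided with extended regular expressions. The sketch below is not from the original answer; it assumes a sed that supports the -E option (GNU sed and BSD sed both do), where the capture group is written with plain parentheses.

```shell
line='this is [some] more [text]'

# -E switches sed to extended regular expressions, so the group is ( ... ).
ere=$(printf '%s\n' "$line" | sed -E 's/\[([^]]*)\]/\\macro{\1}/g')
printf '%s\n' "$ere"
```

The literal brackets still need escaping, so the gain is modest, but the group itself is easier to read.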
Answer by Tiago Peczenyj
Use groups:
sed 's|\[\([^]]*\)\]|\\macro{\1}|g' file
Answer by sashang
The following expression matches the pattern [a-z, A-Z and space] and replaces it with \macro{<whatever was between the []>}:
sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g'
In the expression, the \( ... \) form a match group that can be referenced later in the substitution as \1.
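One consequence of the narrower character class is worth seeing: brackets whose contents include digits or punctuation are left untouched. The input line in this sketch is hypothetical, chosen to show both cases.

```shell
input='keep [some-text 1] but change [plain words]'

# Only letters and spaces are allowed between the brackets.
restricted=$(printf '%s\n' "$input" | sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g')
printf '%s\n' "$restricted"
```

If every bracket pair should be converted regardless of its contents, the [^]]* form from the other answers is the safer choice.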