bash: How to replace paired square brackets with other syntax using sed?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/10646418/

Date: 2020-09-09 22:07:42  Source: igfitidea

How to replace paired square brackets with other syntax with sed?

bash sed

Asked by Village

I want to replace all pairs of square brackets in a file, e.g., [some text], with \macro{some text}, e.g.:

This is some [text].
This [line] has [some more] text.

This becomes:

This is some \macro{text}.
This \macro{line} has \macro{some more} text.
  • The pairs only occur on individual lines, never across multiple lines.
  • Sometimes there might be more than one pair on a single line, but they are never nested.
  • If a bracket is found alone on a line, without a pair, then it should not be changed.

How can I replace these pairs of brackets with this code?

Accepted answer by dcbyers

sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g' file.txt

This looks for an opening bracket, any number of characters that are not closing brackets, then a closing bracket. The text between the brackets is captured by the escaped parentheses and inserted into the replacement expression.
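
As a quick check, the command can be run against the sample lines from the question. This is a minimal sketch: the here-document stands in for file.txt, and the replacement is written as \\macro{\1}, which assumes the intended output is a literal backslash followed by the captured text. The last input line is an extra case added here to show that an unpaired bracket is left alone.

```bash
#!/usr/bin/env bash
# Run the substitution on the question's sample input (fed via a here-document
# instead of file.txt). The last line is an added case with a lone bracket.
result=$(sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g' <<'EOF'
This is some [text].
This [line] has [some more] text.
A lone [ bracket stays as it is.
EOF
)
printf '%s\n' "$result"
```

The first two lines should come out with each pair rewritten, while the third has no closing bracket to match and passes through unchanged.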

Answer by David W.

It took a little doing, but here:

sed -i.bkup 's/\[\([^]]*\)\]/\\macro{\1}/g' test.txt

Let's see if I can explain this regular expression:

  1. The \[ matches an opening square bracket. Since [ is a special ("magic") regular expression character, the backslash means to match the literal character.
  2. The \(...\) is a capture group. It captures the part of the match I want to keep. I can have many capture groups, and in sed I can reference them as \1, \2, etc.
  3. Inside the capture group \(...\), I have [^]]*.
    1. The [^...] syntax means any character except the ones listed.
    2. The [^]] means any character except a closing square bracket.
    3. The * means zero or more of the preceding. That means I am capturing zero or more characters that are not closing square brackets.
  4. The \] matches the closing square bracket.
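
The points about numbered capture groups and about * matching zero characters can be tried out directly. The following is only an illustrative sketch: the input strings and the variable names swapped and empty are made up for the example and do not come from the answer.

```bash
#!/usr/bin/env bash
# \1 and \2 refer to the first and second \(...\) group, counted left to right.
# Here the contents of two bracket pairs are swapped.
swapped=$(printf '%s\n' '[first] [second]' |
  sed 's/\[\([^]]*\)\] \[\([^]]*\)\]/[\2] [\1]/')
printf '%s\n' "$swapped"

# [^]]* may match zero characters, so an empty pair is still rewritten.
empty=$(printf '%s\n' 'empty [] pair' | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')
printf '%s\n' "$empty"
```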

Let's look at the line this is [some] more [text]:

  • In #1 above, I match the first opening square bracket, in front of the word some. However, it's not in a capture group. This is the first character I'm going to substitute.
  • I now start a capture group. I am capturing according to 3.2 and 3.3 above: starting with the letter s in some, as many characters as possible that are not closing square brackets. This means I am matching [some, but only capturing some.
  • In #4, I have ended my capture group. I've matched [some for substitution purposes, and now I'm matching the closing square bracket that follows. That means I'm matching [some]. Note that regular expressions are normally greedy. I'll explain below why this is important.
  • Now for the replacement string. This is much easier. It's \\macro{\1}. The \1 is replaced by my capture group. The \\ is just a backslash. Thus, I'll replace [some] with \macro{some}.
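
The walk-through above covers a single match. A small sketch, assuming the same sample line, shows what happens with and without the g flag; the variable names line, first_only and all_pairs are chosen here purely for illustration.

```bash
#!/usr/bin/env bash
line='this is [some] more [text]'

# Without the g flag, sed stops after the first match on the line: [some].
first_only=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/')

# With the g flag, the same steps repeat for the next match: [text].
all_pairs=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf '%s\n' "$first_only" "$all_pairs"
```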

It would be much easier if I could be guaranteed a single set of square brackets in each line. Then I could have done this:

sed -i.bkup 's/\[\(.*\)\]/\\macro{\1}/g' test.txt

The capture group now says: anything between two square brackets. However, the problem is that regular expressions are greedy, which means I would have matched from the s in some all the way to the final t in text. The x characters below show the capture group. The [ and ] show the square brackets I'm matching on:

 this is [some] more [text]
         [xxxxxxxxxxxxxxxx]
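
The difference is easy to see by running both patterns on that line. This is a minimal sketch that pipes the sample line through sed instead of editing a file in place; the variable names greedy and negated are illustrative only.

```bash
#!/usr/bin/env bash
line='this is [some] more [text]'

# Greedy .* runs to the last ] on the line, so both pairs collapse into one.
greedy=$(printf '%s\n' "$line" | sed 's/\[\(.*\)\]/\\macro{\1}/g')

# [^]]* cannot cross a ], so each pair is matched separately.
negated=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf '%s\n' "$greedy" "$negated"
```

With the greedy pattern the inner ] and [ end up inside the macro argument, which is exactly the failure the diagram illustrates.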

This became more complex because I had to match on characters that have special meaning in regular expressions, so we see a lot of backslashes. Plus, I had to account for regular expression greediness, which is why I used the odd-looking negated expression [^]]* to match anything that is not a closing bracket. Add the escaped square brackets before and after it, \[[^]]*\], and don't forget the \(...\) capture group, \[\([^]]*\)\], and you get one big mess of a regular expression.

Answer by Tiago Peczenyj

Use groups:

sed 's|\[\([^]]*\)\]|\\macro{\1}|g' file

Answer by sashang

The following expression matches the pattern [a-z, A-Z and space] and replaces it with \macro{<whatever was between the []>}:

sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g'

In the expression, the \( ... \) form a match group that can be referenced later in the substitution as \1.

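
One consequence of limiting the bracket contents to letters and spaces is that pairs containing anything else are skipped. The sketch below uses a made-up input line to show this; the variable name restricted is illustrative.

```bash
#!/usr/bin/env bash
# Only letters and spaces may appear between the brackets with [a-zA-Z ]*,
# so [item 2] (which contains a digit) is not rewritten.
restricted=$(printf '%s\n' 'see [some text] and [item 2]' |
  sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g')
printf '%s\n' "$restricted"
```

If digits or punctuation can appear inside the brackets, the [^]]* form from the other answers is the more general choice.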