Bash regex to match the end of line

Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/30386455/

Time: 2020-09-18 13:01:50  Source: igfitidea

Regex to match the end of line

regex bash

Asked by steve

I am looking for a bash regex to pull the 'db' arguments from the commands below. The order of the arguments is not guaranteed, however. For some reason I cannot get it to work completely.

What I have so far

regex="--db (.*)($| --)"
[[ $@ =~ $regex ]]
DB_NAMES="${BASH_REMATCH[1]}"

# These are example lines
somecommand --db myDB --conf /var/home # should get "myDB"
somecommand --db myDB anotherDB manymoreDB --conf /home # should get "myDB anotherDB manymoreDB" 
somecommand --db myDB # should get "myDB"
somecommand --db myDB anotherDB # should get "myDB anotherDB"

Any suggestion on the regex?

Answered by axiac

The problem is that bash uses a flavor of regex that does not include non-greedy repetition operators (*?, +?). Because * is greedy and there is no way to tell it not to be, the first parenthesized subexpression, (.*), matches everything up to the end of the line.
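
To make the greedy behavior concrete, here is a minimal sketch that runs the pattern from the question against the first example line. The variable names line and greedy_capture are illustrative assumptions, not part of the original post.

```shell
#!/usr/bin/env bash
# Pattern from the question: (.*) is greedy, so it runs to the end of the line.
regex='--db (.*)($| --)'
line='somecommand --db myDB --conf /var/home'

greedy_capture=''
if [[ $line =~ $regex ]]; then
    # BASH_REMATCH[1] holds the text matched by the first parenthesized group.
    greedy_capture="${BASH_REMATCH[1]}"
fi
printf 'captured: %s\n' "$greedy_capture"
# prints: captured: myDB --conf /var/home
```

The wanted value was only myDB; the $ alternative lets the greedy group take the rest of the line.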

You can work around this if you know that the values you want to capture never contain a certain character: replace . with a character class that excludes that character.

For example, if the values after --db do not contain dashes (-), you can use this regex:

regex='--db ([^-]*)($| --)'

It matches all the examples posted in the question.
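
As a quick check, the sketch below runs that regex over the four example lines from the question. The helper name get_db_names and the examples array are hypothetical, introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Regex from this answer: [^-]* cannot cross a dash, so it stops before " --".
regex='--db ([^-]*)($| --)'

# Hypothetical helper: print the captured db names for one command line.
get_db_names() {
    [[ $1 =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# The example lines from the question.
examples=(
    'somecommand --db myDB --conf /var/home'
    'somecommand --db myDB anotherDB manymoreDB --conf /home'
    'somecommand --db myDB'
    'somecommand --db myDB anotherDB'
)

for line in "${examples[@]}"; do
    printf '%s => "%s"\n' "$line" "$(get_db_names "$line")"
done
```

As the answer states, this relies on the database names themselves containing no dash.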

Answered by Martin Konecny

The following works:

regex="--db[[:space:]]([[:alnum:][:space:]]+)([[:space:]]--|$)"
[[ "$@" =~ $regex ]]

There were two issues:

  1. Character classes such as [:space:] should be used to represent whitespace.
  2. (.*) is greedy and will go as far as your last -- literal. Since bash doesn't support non-greedy matching, we have to match using [[:alnum:][:space:]], which guarantees that we stop at the next --.
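
One way this regex might be used in a script is sketched below. The function parse_db_names and the array DB_NAMES are assumptions made for the example (the question only uses a scalar DB_NAMES), and the sketch assumes the default IFS so that "$*" joins the arguments with single spaces.

```shell
#!/usr/bin/env bash
# Regex from this answer.
regex='--db[[:space:]]([[:alnum:][:space:]]+)([[:space:]]--|$)'

# Hypothetical wrapper: fill the global array DB_NAMES from the arguments.
parse_db_names() {
    DB_NAMES=()
    # "$*" joins all arguments into one string (spaces, with the default IFS).
    if [[ "$*" =~ $regex ]]; then
        # Split the captured text on whitespace into separate names.
        read -r -a DB_NAMES <<< "${BASH_REMATCH[1]}"
    fi
}

parse_db_names --db myDB anotherDB manymoreDB --conf /home
printf 'found %d name(s): %s\n' "${#DB_NAMES[@]}" "${DB_NAMES[*]}"
# prints: found 3 name(s): myDB anotherDB manymoreDB
```

Because the class is [[:alnum:][:space:]], names with other characters (an underscore, for example) would not match unless the class is widened.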

Answered by Downgoat

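By default, regex quantifiers are greedy and match as much as possible; use a non-greedy (lazy) quantifier instead. You might also want to put the -- alternative first so the engine tries it first. Note that lazy quantifiers and (?:...) groups need a PCRE-style engine; bash's =~ uses POSIX extended regular expressions, which do not define them.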

--db[[:space:]](.*?)([[:space:]]--|$)

Demo

If you don't want the -- in a capture group, you can use a non-capturing group:

--db[[:space:]](.*?)(?:[[:space:]]--|$)
                     ^^ Notice the ?:

Demo

Answered by KatonahMike

I think you want to match on non-space characters to catch the first grouping:

regex="--db (\S+)( --|$)"
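
The sketch below tries this idea with [^[:space:]] in place of \S, since \S is not part of POSIX ERE and whether bash accepts it depends on the system's regex library. The helper name first_db is hypothetical. Because the group cannot contain a space, the pattern only matches when exactly one name follows --db.

```shell
#!/usr/bin/env bash
# Same idea as above, with the POSIX class [^[:space:]] standing in for \S.
regex='--db ([^[:space:]]+)( --|$)'

# Hypothetical helper: print the single captured name, or return non-zero.
first_db() {
    [[ $1 =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# One name after --db: the name is captured.
first_db 'somecommand --db myDB --conf /var/home'

# Two names after --db: what follows the first name is neither " --" nor
# the end of the line, so the pattern does not match at all.
first_db 'somecommand --db myDB anotherDB' || echo 'no match'
```

For the multi-name examples in the question, one of the character-class approaches from the earlier answers is therefore needed.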