Bash Regular Expression Condition
Source: http://stackoverflow.com/questions/5186292/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by jayem
I have a regular expression that I need to verify. The regular expression has double quotes in it, but I can't seem to figure out how to properly escape them.
First attempt, which doesn't work because the quotes are not escaped:
while read line
do
    if [[ $line =~ "<a href="(.+)">HTTP</a>" ]]; then
        SOURCE=${BASH_REMATCH[1]}
        break
    fi
done < tmp/source.html
echo "{$SOURCE}" #output = {"link.html"} (with double quotes)
How can I properly run this so that the output is link.html without double quotes?
I have tried...
while read line
do
    if [[ $line =~ "<a href=/"(.+)/">HTTP</a>" ]]; then
        SOURCE=${BASH_REMATCH[1]}
        break
    fi
done < tmp/source.html
echo "{$SOURCE}" #output = {}
Without luck. Can someone please help me so I can stop beating my head on my desk? I am not great with Bash. Thank you!
Answered by Paused until further notice.
It's always best to put your regex in a variable.
pattern='<a href="(.+)">HTTP</a>'
while read line
do
    if [[ $line =~ $pattern ]]; then
        SOURCE=${BASH_REMATCH[1]}
        break
    fi
done < tmp/source.html
echo "{$SOURCE}" #output = {link.html} (without double quotes)
If you quote the right-hand side (the pattern), it changes the match from a regex match to a plain string match: the quoted text is matched literally, so =~ effectively behaves like a string comparison rather than a regex test.
As a side note, escaping is done with backslashes (\) rather than slashes (/), but that would not help your situation because of the outer quotes as mentioned in my previous paragraph.
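To make the quoting point concrete, here is a minimal sketch, assuming bash 3.2 or later with default compatibility settings. The sample line and the variable names unquoted_result and quoted_result are made up for illustration and stand in for a line read from tmp/source.html.

```shell
#!/usr/bin/env bash
# Hypothetical sample line standing in for one line of tmp/source.html.
line='<a href="link.html">HTTP</a>'

# Single quotes keep the inner double quotes literal; no backslashes needed.
pattern='<a href="(.+)">HTTP</a>'

# Unquoted $pattern: treated as a regex, so the capture group is filled in.
if [[ $line =~ $pattern ]]; then
    unquoted_result=${BASH_REMATCH[1]}
else
    unquoted_result='no match'
fi

# Quoted "$pattern": matched as literal text, so the characters "(.+)"
# would have to appear verbatim in the line for this to succeed.
if [[ $line =~ "$pattern" ]]; then
    quoted_result='match'
else
    quoted_result='no match'
fi

echo "unquoted: $unquoted_result"   # unquoted: link.html
echo "quoted: $quoted_result"       # quoted: no match
```

The only difference between the two tests is the pair of double quotes around $pattern, which is why storing the regex in a variable and expanding it unquoted tends to be the least error-prone form.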
Answered by Satyajit
$line =~ "<a href=\"(.+)\">HTTP</a>"
Answered by zhigang
I recommend always using a variable when specifying the regex:
#!/bin/bash
SOURCE=
url_re='<a href="(.+)">HTTP</a>'
while read line
do
    if [[ "$line" =~ $url_re ]]; then
        SOURCE=${BASH_REMATCH[1]}
        break
    fi
done < test.txt
echo $SOURCE # http://example.com/
# test.txt contents:
# <a href="http://example.com/">HTTP</a>
Answered by yong321
Without an intermediate variable (i.e. using the regex directly after =~), it works only if the regex pattern doesn't contain certain characters (space, < or >, etc.) and you remove the quotes around the regex, or if the regex is a plain alphanumeric string:
$ x='Hello'
$ [[ $x =~ ^H ]] && echo OK
OK
$ [[ $x =~ 'H' ]] && echo OK
OK
$ [[ $x =~ H ]] && echo OK
OK
I stumbled across this page while looking for an explanation of why bash is designed so that you generally can't use a regex directly after =~. For example,
$ re='^H'
$ [[ $x =~ $re ]] && echo OK
OK
works as expected, while
$ [[ $x =~ '^H' ]] && echo OK
does not. I personally always put the regex in a variable first. But I still wonder why bash is designed this way. You can argue assigning the regex to a variable first would overall make the code look neater. Any other reason? If a regex is not supposed to be interpreted as a string, bash could use other ways to represent it. For example, Perl uses slashes, /regex/, or more explicitly m/regex/.
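For what it's worth, the behaviour described above looks consistent with the quoting rule mentioned earlier in the thread: quoted parts of the pattern are matched literally, so '^H' is searched for as a caret followed by H. A rough sketch follows, assuming bash 4 or later; the sample line and the variable names anchored, literal, and inline_result are made up for illustration. It also shows that a regex can be written inline if each character the shell would otherwise interpret is backslash-escaped.

```shell
#!/usr/bin/env bash
x='Hello'
line='<a href="link.html">HTTP</a>'   # hypothetical sample input

# Unquoted: ^ keeps its regex meaning (anchor at start of string).
[[ $x =~ ^H ]] && anchored='OK'

# Quoted: '^H' is looked for as the two literal characters ^ and H.
[[ $x =~ '^H' ]] && literal='OK' || literal='no match'

# Inline form of the link pattern: backslash-escape the characters the
# shell would otherwise act on (<, >, space, ") and leave (.+) bare so
# that it is still treated as a regex group.
if [[ $line =~ \<a\ href=\"(.+)\"\>HTTP\</a\> ]]; then
    inline_result=${BASH_REMATCH[1]}
fi

echo "$anchored / $literal / $inline_result"   # OK / no match / link.html
```

The escaped inline form works but is hard to read, which is probably part of why putting the regex in a variable is the usual recommendation.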
Answered by yong321
Try this: "<a href="""(.+)""">HTTP</a>"
Edit: well, try this:
"<a href="\""(.+)"\"">HTTP</a>"
or
'<a href="(.+)">HTTP</a>'
or
'<a href='\"'(.+)'\"'>HTTP</a>' <-- this will give the right syntax in Bash; as for the regex (.+), I don't know how that will behave
Edit: what do you get when you use this regex: "<a href=(.+)>HTTP</a>"?

