Operator precedence in Oracle regular expressions
Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/36870168/
Asked by Kenny
What is the default operator precedence in Oracle's regular expressions when they don't contain parentheses?
For example, given H|ha+, would it be evaluated as H|h and then concatenated to a, as in ((H|h)a), or would the H be alternated with ha, as in (H|(ha))?
Also, when does the + kick in, etc.?
Answered by Thomas Ayoub
Given the Oracle doc:
Table 4-2 lists the list of metacharacters supported for use in regular expressions passed to SQL regular expression functions and conditions. These metacharacters conform to the POSIX standard; any differences in behavior from the standard are noted in the "Description" column.
And taking a look at the | value in that table:
The expression a|b matches character a or character b.
Plus taking a look at the POSIX doc:
Operator precedence
The order of precedence of operators is as follows:
1. Collation-related bracket symbols [==] [::] [..]
2. Escaped characters \
3. Character set (bracket expression) []
4. Grouping ()
5. Single-character-ERE duplication * + ? {m,n}
6. Concatenation
7. Anchoring ^$
8. Alternation |
I would say that H|ha+ would be the same as (?:H|ha+).
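One way to sanity-check this reading is to compare the unparenthesized pattern against the two candidate groupings from the question. The sketch below uses Python's re module purely as a stand-in, on the assumption that it parses these operators with the same precedence as Oracle's POSIX-style engine. The sample strings and the helper accepted are made up for illustration, and (?:...) is Python's non-capturing group syntax.

```python
import re

# Assumption: Python's re is only a stand-in for Oracle's regex engine here.
unparenthesized = re.compile(r"H|ha+")
alternation_last = re.compile(r"H|(?:ha+)")    # H alternated with ha+
alternation_first = re.compile(r"(?:H|h)a+")   # (H|h) concatenated with a+

# Hypothetical sample inputs chosen to tell the two groupings apart.
samples = ["H", "h", "ha", "haaa", "Ha", "Haaa"]


def accepted(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if pattern.fullmatch(s)]


print(accepted(unparenthesized))    # ['H', 'ha', 'haaa']
print(accepted(alternation_last))   # ['H', 'ha', 'haaa']
print(accepted(alternation_first))  # ['ha', 'haaa', 'Ha', 'Haaa']
```

The unparenthesized pattern accepts exactly the same samples as the "alternation last" grouping, and differs from the "alternation first" grouping on H, Ha, and Haaa.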
Answered by robinCTS
Using capturing groups to demonstrate the order of evaluation, the regex H|ha+ is equivalent to the following:
(H|(h(a+)))
This is because the precedence rules (as seen below) are applied in order from the highest precedence (the lowest numbered) one to the lowest precedence (the highest numbered) one:
Rule 5 → (a+)
The + is grouped with the a because this operator works on the preceding single character, back-reference, group (a "marked sub-expression" in Oracle parlance), or bracket expression (character class).
Rule 6 → (h(a+))
The h is then concatenated with the group in the preceding step.
Rule 8 → (H|(h(a+)))
The H is then alternated with the group in the preceding step.
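The three steps above can be observed directly by running the fully parenthesized form and inspecting what each group captures. This is a minimal sketch in Python's re module, used as a stand-in on the assumption that it groups these operators the same way as Oracle; the input strings are illustrative only.

```python
import re

# Assumption: Python's re stands in for Oracle's regex engine.
grouped = re.compile(r"(H|(h(a+)))")

# Group 1 is the whole alternation, group 2 is h(a+), group 3 is a+.
m_lower = grouped.fullmatch("haaa")
print(m_lower.groups())   # ('haaa', 'haaa', 'aaa')

# When the H branch matches, the groups in the other branch stay empty.
m_upper = grouped.fullmatch("H")
print(m_upper.groups())   # ('H', None, None)

# The + repeats only the a, not the pair ha, so "haha" is rejected.
repeated_pair = re.fullmatch(r"H|ha+", "haha")
print(repeated_pair)      # None
```

Group 3 holding only the run of a characters reflects rule 5, and group 2 holding h plus that run reflects rule 6.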
Precedence table from section 9.4.8 of the POSIX docs for regular expressions (there doesn't seem to be an official Oracle table):
+---+-----------------------------------+----------------------+
|   | ERE Precedence (from high to low)                        |
+---+-----------------------------------+----------------------+
| 1 | Collation-related bracket symbols | [==] [::] [..]       |
| 2 | Escaped characters                | \<special character> |
| 3 | Bracket expression                | []                   |
| 4 | Grouping                          | ()                   |
| 5 | Single-character-ERE duplication  | * + ? {m,n}          |
| 6 | Concatenation                     |                      |
| 7 | Anchoring                         | ^ $                  |
| 8 | Alternation                       | |                    |
+---+-----------------------------------+----------------------+
The table above is for Extended Regular Expressions. For Basic Regular Expressions see 9.3.7.
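Two other rows of the table can be checked the same way: duplication binding tighter than concatenation (rule 5 over rule 6), and anchoring binding tighter than alternation (rule 7 over rule 8). As before, this is only a sketch in Python's re module, assumed to follow the same grouping as the POSIX table; the patterns and inputs are invented for illustration.

```python
import re

# Assumption: Python's re stands in for a POSIX ERE / Oracle engine.

# Rule 5 over rule 6: in ab*, the * applies to b alone, not to ab.
star_on_b = re.fullmatch(r"ab*", "abbb") is not None      # a followed by b's
star_on_pair = re.fullmatch(r"ab*", "abab") is not None   # would need (ab)*
print(star_on_b, star_on_pair)        # True False

# Rule 7 over rule 8: ^a|b$ groups as (^a)|(b$), not ^(a|b)$.
anchored_branch = re.search(r"^a|b$", "ax") is not None       # starts with a
anchored_whole = re.search(r"^(?:a|b)$", "ax") is not None    # must be a or b
print(anchored_branch, anchored_whole)  # True False
```

To anchor the whole alternation rather than each branch, the alternation has to be parenthesized explicitly.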