Oracle - Determine maximum supported size for regular expression

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/2694023/

Time: 2020-09-18 20:28:26  Source: igfitidea

Tags: regex, oracle

Asked by FrustratedWithFormsDesigner

I have a regular expression that throws ORA-12733, "regular expression is too long". How do I determine what the maximum supported size is?

FYI: the offending regex is 892 characters long. It is a generated regex, so I could change how I generate and execute it, but I would like to know what the maximum size is before I do that.

(running Oracle 10.2g)

UPDATE:

If it depends on the actual regex, here is the beginning of it (the rest is just the same thing repeated, with different values between ^ and $):

(^R_1A$|^R_2A$|^R_3A$|^R_4A$|^R_4B$|^R_5A$|^R_5B$...

Answered by Ian Carpenter

The documentation for the regex functions REGEXP_SUBSTR, REGEXP_INSTR and REGEXP_REPLACE says the following about the pattern argument:

pattern is the regular expression. It is usually a text literal and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the datatype of pattern is different from the datatype of source_char, Oracle Database converts pattern to the datatype of source_char. For a listing of the operators you can specify in pattern

Taken from here
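
Since the quoted limit is stated in bytes rather than characters, one way to stay under it is to measure the generated pattern on the generator side and split the alternatives across several patterns. The following is only a minimal Python sketch of that idea. The question does not say what generates the regex, so the language, the helper names (`pattern_bytes`, `build_pattern`, `chunk_patterns`), the UTF-8 encoding and the token list are all assumptions made for illustration.

```python
ORACLE_PATTERN_LIMIT_BYTES = 512  # figure taken from the documentation quoted above


def pattern_bytes(pattern, encoding="utf-8"):
    """Size of the pattern in bytes.

    The encoding is an assumed stand-in for the database character set.
    """
    return len(pattern.encode(encoding))


def build_pattern(tokens):
    """Build an anchored alternation of the form ^(a|b|c)$.

    Assumes the tokens are plain literals with no regex metacharacters.
    """
    return "^(" + "|".join(tokens) + ")$"


def chunk_patterns(tokens, limit=ORACLE_PATTERN_LIMIT_BYTES):
    """Greedily pack tokens, in order, into patterns that fit the limit.

    A single token that is too long on its own still gets its own
    (oversized) pattern; this sketch does not try to handle that case.
    """
    patterns, current = [], []
    for token in tokens:
        candidate = current + [token]
        if current and pattern_bytes(build_pattern(candidate)) > limit:
            # Adding this token would overflow: close the current chunk.
            patterns.append(build_pattern(current))
            current = [token]
        else:
            current = candidate
    if current:
        patterns.append(build_pattern(current))
    return patterns


# Hypothetical token list, shaped like the values shown in the question.
tokens = [f"R_{n}{s}" for n in range(1, 101) for s in "AB"]
patterns = chunk_patterns(tokens)

for p in patterns:
    print(pattern_bytes(p), p[:40] + "...")
```

Each resulting pattern could then be tested in its own REGEXP_LIKE call, with the calls combined by OR. Whether that is preferable to a plain list of literal values depends on the query, which the question does not show.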

Answered by Abecee

The sample regex does not need a pair of start/end-of-line anchors around every alternative. ^(R_1A|R_2A|R_3A|R_4A|R_4B|R_5A|R_5B)$ would work just as well.

Actually, if the search tokens are really as similar as in the sample, one might want to take advantage of that with ^(R_[1-5]A|R_[4-5]B)$ or ^(R_([1-5]A|[4-5]B))$ (for the part of the search string given in the question).

Verified in 11.2:

SELECT i, t FROM (
  SELECT 1 i, 'R_1A' t FROM DUAL UNION ALL
  SELECT 2,   'xR_2A'  FROM DUAL UNION ALL
  SELECT 3,   'R_3Ax'  FROM DUAL UNION ALL
  SELECT 4,   'xR_4Ax' FROM DUAL UNION ALL
  SELECT 5,   'R_4B'   FROM DUAL UNION ALL
  SELECT 6,   'R_5A'   FROM DUAL UNION ALL
  SELECT 7,   'R_5B'   FROM DUAL)
--WHERE REGEXP_LIKE(t, '(^R_1A$|^R_2A$|^R_3A$|^R_4A$|^R_4B$|^R_5A$|^R_5B$)')
--WHERE REGEXP_LIKE(t, '^(R_1A|R_2A|R_3A|R_4A|R_4B|R_5A|R_5B)$')
--WHERE REGEXP_LIKE(t, '^(R_[1-5]A|R_[4-5]B)$')
WHERE REGEXP_LIKE(t, '^(R_([1-5]A|[4-5]B))$')
ORDER BY i;
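
As a rough cross-check outside the database, the three pattern forms discussed in this answer can be compared for length and for which sample strings they accept. This is a Python sketch only: Python's `re` module is not Oracle's regex engine, so it serves as a sanity check on these simple patterns, not as a substitute for the query above. The variable names are made up for illustration, and the sample strings are the ones from the query.

```python
import re

tokens = ["R_1A", "R_2A", "R_3A", "R_4A", "R_4B", "R_5A", "R_5B"]

# Form from the question: every alternative carries its own anchors.
per_token = "(" + "|".join(f"^{t}$" for t in tokens) + ")"
# Factored form from the answer: one pair of anchors around the group.
factored = "^(" + "|".join(tokens) + ")$"
# Character-class form from the answer.
compact = "^(R_([1-5]A|[4-5]B))$"

# Same test strings as the rows in the SQL query above.
samples = ["R_1A", "xR_2A", "R_3Ax", "xR_4Ax", "R_4B", "R_5A", "R_5B"]


def matching(pattern):
    """Return the sample strings accepted by the pattern."""
    return [s for s in samples if re.search(pattern, s)]


for name, pattern in [("per_token", per_token),
                      ("factored", factored),
                      ("compact", compact)]:
    print(name, len(pattern), matching(pattern))
```

Factoring the anchors out saves two characters per alternative, so the saving grows with the number of generated values. The character-class form is shorter still, but only applies when the values follow a regular scheme like the one in the sample.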