Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/7543746/

Time: 2020-10-30 20:24:49  Source: igfitidea

Java regex error - Look-behind group does not have an obvious maximum length

java regex lookbehind

Asked by user963263

I get this error:

java.util.regex.PatternSyntaxException: Look-behind group does not have an
    obvious maximum length near index 22
([a-z])(?!.*\1)(?<!\1.+)([a-z])(?!.*\2)(?<!\2.+)(.)(\3)(.)(\5)
                      ^

I'm trying to match COFFEE, but not BOBBEE.

I'm using java 1.6.

Accepted answer by Kobi

Java doesn't support variable length in look behind.
In this case, it seems you can easily ignore it (assuming your entire input is one word):

([a-z])(?!.*\1)([a-z])(?!.*\2)(.)(\3)(.)(\5)

Neither lookbehind adds anything: the first needs at least two characters behind it at a point where there is only one, so the negative assertion always succeeds, and the second checks that the second character is different from the first, which was already covered by (?!.*\1).

Working example: http://regexr.com?2up96
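
A minimal Java sketch for trying the simplified pattern against the two words from the question. The class name CoffeeRegexDemo and the method matchesWord are made up for illustration. Two assumptions are baked in: the input is lowercased first because the pattern only contains [a-z], and matches() is used because the answer assumes the entire input is one word.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class CoffeeRegexDemo {
    // Simplified pattern from the answer above, with no lookbehinds.
    // Groups 1 and 2 are letters that must not occur again later in the input;
    // groups 3/4 and 5/6 are two doubled characters.
    private static final Pattern WORD =
            Pattern.compile("([a-z])(?!.*\\1)([a-z])(?!.*\\2)(.)(\\3)(.)(\\5)");

    // Hypothetical helper: lowercases the word (assumption, the pattern is
    // lowercase-only) and requires the whole input to match.
    static boolean matchesWord(String word) {
        return WORD.matcher(word.toLowerCase(Locale.ROOT)).matches();
    }

    public static void main(String[] args) {
        System.out.println("COFFEE -> " + matchesWord("COFFEE")); // true
        System.out.println("BOBBEE -> " + matchesWord("BOBBEE")); // false
    }
}
```

BOBBEE is rejected by the first lookahead alone: after the leading B is captured, another B still appears later in the input.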

Answer by luobo25

To avoid this error, you should replace + with a bounded range like {0,10}:

([a-z])(?!.*\1)(?<!\1.{0,10})([a-z])(?!.*\2)(?<!\2.{0,10})(.)(\3)(.)(\5)
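
One way to see the difference between an unbounded and a bounded quantifier inside a lookbehind is to ask the running JDK whether each pattern compiles. This is only a sketch: the class LookbehindCompileCheck and the method compiles are hypothetical, and the patterns use a literal x in place of the question's backreference so that the check isolates the quantifier. The result for the unbounded form is printed rather than assumed, since it may depend on the JDK version.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LookbehindCompileCheck {
    // Returns true if the regex compiles on the running JDK,
    // false if Pattern.compile rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            System.out.println("rejected: " + e.getDescription());
            return false;
        }
    }

    public static void main(String[] args) {
        // Unbounded quantifier inside the lookbehind (the form from the question).
        System.out.println("(?<!x.+)y      -> " + compiles("(?<!x.+)y"));
        // Bounded quantifier: the lookbehind has an obvious maximum length.
        System.out.println("(?<!x.{0,10})y -> " + compiles("(?<!x.{0,10})y"));
    }
}
```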

Answer by boiledwater

Java takes things a step further by allowing finite repetition. You still cannot use the star or plus, but you can use the question mark and the curly braces with the max parameter specified. Java determines the minimum and maximum possible lengths of the lookbehind.
The lookbehind in the regex (?<!ab{2,4}c{3,5}d)test has 5 possible lengths. It can be from 7 to 11 characters long. When Java (version 6 or later) tries to match the lookbehind, it first steps back the minimum number of characters (7 in this example) in the string and then evaluates the regex inside the lookbehind as usual, from left to right. If it fails, Java steps back one more character and tries again. If the lookbehind continues to fail, Java continues to step back until the lookbehind either matches or it has stepped back the maximum number of characters (11 in this example). This repeated stepping back through the subject string kills performance when the number of possible lengths of the lookbehind grows. Keep this in mind. Don't choose an arbitrarily large maximum number of repetitions to work around the lack of infinite quantifiers inside lookbehind.
Java 4 and 5 have bugs that cause lookbehind with alternation or variable quantifiers to fail when it should succeed in some situations. These bugs were fixed in Java 6.
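
The example pattern from this answer can be tried directly. The sketch below is illustrative only: the class BoundedLookbehindDemo, the method findsTest, and the sample strings are assumptions, not part of the quoted text. It shows the observable behaviour (which inputs still contain a permitted "test"), not how the engine steps back internally.

```java
import java.util.regex.Pattern;

class BoundedLookbehindDemo {
    // "test" must NOT be preceded by a, 2-4 b's, 3-5 c's, d (7 to 11 characters).
    private static final Pattern TEST_NOT_AFTER_PREFIX =
            Pattern.compile("(?<!ab{2,4}c{3,5}d)test");

    // Hypothetical helper: true if some occurrence of "test" in the input
    // is not preceded by the forbidden prefix.
    static boolean findsTest(String input) {
        return TEST_NOT_AFTER_PREFIX.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "test",             // nothing before it: found
            "abbcccdtest",      // shortest forbidden prefix (7 chars): not found
            "abbbbcccccdtest",  // longest forbidden prefix (11 chars): not found
            "abcccdtest"        // only one b, so the prefix does not apply: found
        };
        for (String s : samples) {
            System.out.println(s + " -> " + findsTest(s));
        }
    }
}
```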

Copied from Here
