Java regexp error: \( is not a valid character
Original question: http://stackoverflow.com/questions/5260364/
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Asked by Marthin
I was using Java regexp today and found that you are not allowed to use the following regexp sequence:
String pattern = "[a-zA-Z\\s\\.-\\)\\(]*";
If I do use it, it fails and tells me that \( is not a valid character.
But if I change the regexp to
String pattern = "[[a-zA-Z\\s\\.-]|[\\(\\)]]*";
Then it will work. Is this a bug in the regexp engine, or am I not understanding how to work with the engine?
EDIT: I had an error in my string: there shouldn't be two starting brackets ([[), only one. This is now corrected.
Answered by codaddict
Your regex has two problems.
1. You've not closed the character class.
2. The `-` is acting as a range operator with `.` on the LHS and `(` on the RHS. But `(` comes before `.` in Unicode, so this results in an invalid range.
To fix problem 1, close the char class or, if you did not mean to include `[` in the allowed characters, delete one of the `[`.
To fix problem 2, either escape the `-` as `\\-` or move the `-` to the beginning or to the end of the char class.
So you can use:
String pattern = "[a-zA-Z\\s\\.\\-\\)\\(]*";
or
String pattern = "[a-zA-Z\\s\\.\\)\\(-]*";
or
String pattern = "[-a-zA-Z\\s\\.\\)\\(]*";
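A minimal sketch for checking the three variants above, assuming they are written as Java string literals with each regex backslash doubled. The class name `DashPlacementDemo`, the helper `allMatch`, and the sample inputs are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Pattern;

class DashPlacementDemo {
    // The three character classes from the answer, as Java string literals.
    static final String[] PATTERNS = {
        "[a-zA-Z\\s\\.\\-\\)\\(]*", // dash escaped
        "[a-zA-Z\\s\\.\\)\\(-]*",   // dash placed last
        "[-a-zA-Z\\s\\.\\)\\(]*"    // dash placed first
    };

    // Returns true only if every pattern compiles and matches the whole input.
    static boolean allMatch(String input) {
        for (String p : PATTERNS) {
            if (!Pattern.compile(p).matcher(input).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: the first uses only allowed characters, the second contains '+'.
        System.out.println(allMatch("J. Smith-Jones (ed.)"));
        System.out.println(allMatch("a+b"));
    }
}
```

All three variants should accept the same inputs, since in each of them the dash is a literal character rather than a range operator.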
Answered by Tim
You should only use the dash `-` at the end of the character class, since it is normally used to show a range (as in `a-z`). Rearrange it:
String pattern = "[[a-zA-Z\\s\\.\\)\\(-]*";
Also, I don't think you have to escape `(.)` characters inside brackets.
Update: As others pointed out, you must also escape the `[` in a Java regex character class.
Answered by Joachim Sauer
The problem here is that `\.-\)` (`"\\.-\\)"` in a Java string literal) tries to define a range from `.` to `)`. Since the Unicode codepoint of `.` (U+002E) is higher than that of `)` (U+0029), this is an error.
Try using this pattern and you'll see: `[z-a]`.
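The reversed-range behaviour can be tried out with a small sketch like the one below. The class name `ReversedRangeDemo` and the helper `compiles` are hypothetical; only `Pattern.compile` and `PatternSyntaxException` come from `java.util.regex`.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ReversedRangeDemo {
    // Returns true if the regex compiles, false if it is rejected as invalid syntax.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Code points decide the order of a range: '.' is U+002E, ')' is U+0029.
        System.out.printf("'.' = U+%04X, ')' = U+%04X%n", (int) '.', (int) ')');
        System.out.println(compiles("[z-a]"));     // reversed range
        System.out.println(compiles("[a-z]"));     // ascending range
        System.out.println(compiles("[\\.-\\)]")); // range from '.' down to ')'
        System.out.println(compiles("[\\)-\\.]")); // range from ')' up to '.'
    }
}
```

Only the ascending ranges should compile; the two descending ones are rejected for the same reason.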
The correct solution is to either put the dash `-` at the end of the character group (at which point it will lose its special meaning) or to escape it.
You also need to close the unclosed open square bracket or escape it, if it was not intended for grouping.
Also, escaping the full stop `.` is not necessary inside a character group.
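A short sketch of the point about the full stop, assuming the standard `java.util.regex` engine. The class name `LiteralDotDemo`, the helper `matches`, and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

class LiteralDotDemo {
    // Thin wrapper around Pattern.matches, which tests the whole input against the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[.]", "."));   // inside brackets the dot is a literal dot
        System.out.println(matches("[.]", "x"));   // so it does not match other characters
        System.out.println(matches("[\\.]", "x")); // the escaped form behaves the same way
        System.out.println(matches(".", "x"));     // outside brackets the dot is a wildcard
        // Unescaped '.', '(' and ')' with the dash placed last:
        System.out.println(matches("[a-z.()-]*", "j.(x-y)"));
    }
}
```

Under these assumptions, the simplified class `[a-zA-Z\s.()-]*` would be an equivalent way to write the pattern from the question.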
Answered by Denis Tulskiy
You have to escape the dash and close the unmatched square bracket. So you are going to get two errors with this regex:
java.util.regex.PatternSyntaxException: Illegal character range near index 14
because the dash is used to specify a range, and `\)` is not a valid end for a range that starts at `\.`. If you escape the dash, making it `[[a-zA-Z\s\.\-\)\(]*`, you'll get
java.util.regex.PatternSyntaxException: Unclosed character class near index 19
which means that you have an extra opening square bracket, which starts another character class. I don't know what you meant by putting an extra bracket here, but either escaping or removing it will make it a valid regex.
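The sequence of errors described above can be reproduced with a sketch along these lines. The class name `SyntaxErrorDemo` and the helper `describe` are hypothetical; `getDescription()` is a real method of `PatternSyntaxException`, though the exact message text may vary between JDK versions.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDemo {
    // Returns the exception's description, or "ok" if the regex compiles.
    static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "[[a-zA-Z\\s\\.-\\)\\(]*",     // extra '[' and a dash between '.' and ')'
            "[[a-zA-Z\\s\\.\\-\\)\\(]*",   // dash escaped, extra '[' still unclosed
            "[\\[a-zA-Z\\s\\.\\-\\)\\(]*", // extra '[' escaped
            "[a-zA-Z\\s\\.\\-\\)\\(]*"     // extra '[' removed
        };
        for (String regex : candidates) {
            System.out.println(regex + " -> " + describe(regex));
        }
    }
}
```

The first two candidates should be rejected and the last two should compile, which matches the order in which the errors are fixed in the answer.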