Is Java RegEx case-insensitive?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/3436118/
Asked by Crystal
In Java, when doing a replaceAll to look for a regex pattern like:
replaceAll("\\?i\\b(\\w+)\\b(\\s+\\1)+\\b", "$1");
(to remove duplicate consecutive case-insensitive words, e.g. Test test), I'm not sure where to put the ?i. I read that it is supposed to be at the beginning, but if I take it out then I catch duplicate consecutive words (e.g. test test), but not case-insensitive words (e.g. Test test). So I thought I could add the ?i at the beginning, but that does not seem to get the job done. Any thoughts? Thanks!
Accepted answer by cnanney
RegexBuddy is telling me that if you want to include it at the beginning, this is the correct syntax:
"(?i)\\b(\\w+)\\b(\\s+\\1)+\\b"
Answer by relet
If your whole expression is case-insensitive, you can just specify the CASE_INSENSITIVE flag:
Pattern.compile(regexp, Pattern.CASE_INSENSITIVE)
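As a minimal sketch of how that flag could be applied to the duplicate-word problem from the question: the class name FlagDemo and the helper dedupe are made up for illustration, and the pattern is the one from the question without the inline ?i.

```java
import java.util.regex.Pattern;

public class FlagDemo {
    // Hypothetical helper: collapses consecutive repeats of a word, ignoring case.
    static String dedupe(String input) {
        // The flag applies to the whole pattern, including the \1 backreference.
        Pattern p = Pattern.compile("\\b(\\w+)(\\s+\\1)+\\b", Pattern.CASE_INSENSITIVE);
        // "$1" keeps the first occurrence of the repeated word.
        return p.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(dedupe("Test test one")); // Test one
    }
}
```

Compiling the Pattern once and reusing it is usually preferable to calling String.replaceAll repeatedly, since the latter recompiles the regex on every call.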
Answer by polygenelubricants
Yes, case insensitivity can be enabled and disabled at will in Java regex.
It looks like you want something like this:
System.out.println(
    "Have a meRry MErrY Christmas ho Ho hO"
        .replaceAll("(?i)\\b(\\w+)(\\s+\\1)+\\b", "$1")
);
// Have a meRry Christmas ho
Note that the embedded Pattern.CASE_INSENSITIVE flag is (?i), not \?i. Note also that one superfluous \b has been removed from the pattern.
The (?i) is placed at the beginning of the pattern to enable case-insensitivity. In this particular case, it is not overridden later in the pattern, so in effect the whole pattern is case-insensitive.
It is worth noting that you can in fact limit case-insensitivity to only parts of the whole pattern. Thus, the question of where to put it really depends on the specification (although for this particular problem it doesn't matter, since \w is case-insensitive).
To demonstrate, here's a similar example of collapsing runs of letters like "AaAaaA" to just "A".
System.out.println(
    "AaAaaA eeEeeE IiiIi OoooOo uuUuUuu"
        .replaceAll("(?i)\\b([A-Z])\\1+\\b", "$1")
); // A e I O u
Now suppose that we specify that the run should be collapsed only if it starts with an uppercase letter. Then we must put the (?i) in the appropriate place:
System.out.println(
    "AaAaaA eeEeeE IiiIi OoooOo uuUuUuu"
        .replaceAll("\\b([A-Z])(?i)\\1+\\b", "$1")
); // A eeEeeE I O uuUuUuu
More generally, you can enable and disable any flag within the pattern as you wish.
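A small sketch of toggling the flag mid-pattern, using the (?i) ... (?-i) and (?i:...) forms; the class name InlineFlagDemo, the constant names, and the sample strings are made up for illustration.

```java
public class InlineFlagDemo {
    // (?i) turns case-insensitivity on, (?-i) turns it off again.
    static final String TOGGLED = "first(?i)second(?-i)third";
    // (?i:...) limits the flag to the enclosed group only.
    static final String SPAN = "first(?i:second)third";

    public static void main(String[] args) {
        System.out.println("firstSECONDthird".matches(TOGGLED)); // true
        System.out.println("firstSECONDTHIRD".matches(TOGGLED)); // false: "third" is case-sensitive
        System.out.println("firstSeCoNdthird".matches(SPAN));    // true
        System.out.println("FIRSTsecondthird".matches(SPAN));    // false: "first" is case-sensitive
    }
}
```

The two forms behave the same here; the span form tends to be easier to read because the affected part is visibly delimited.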
See also
- java.util.regex.Pattern
- regular-expressions.info/Modifiers
  - Specifying Modes Inside The Regular Expression
    - Instead of /regex/i (Pattern.CASE_INSENSITIVE in Java), you can do /(?i)regex/
  - Turning Modes On and Off for Only Part of The Regular Expression
    - You can also do /first(?i)second(?-i)third/
  - Modifier Spans
    - You can also do /first(?i:second)third/
- regular-expressions.info/Word Boundaries (there's always a \b between a \w and a \s)
Answer by Alexander Drobyshevsky
You can also convert the input string that you are going to match against to lower case, and use only lower-case characters in your pattern.
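A rough sketch of that lower-casing approach applied to the duplicate-word problem; the class name LowerCaseDemo and the helper dedupeLower are hypothetical. Note the trade-off: the result loses the original capitalization, which may or may not be acceptable.

```java
import java.util.Locale;

public class LowerCaseDemo {
    // Hypothetical helper: lower-cases the input first, so a plain
    // case-sensitive pattern is enough to find repeated words.
    static String dedupeLower(String input) {
        return input.toLowerCase(Locale.ROOT)
                    .replaceAll("\\b(\\w+)(\\s+\\1)+\\b", "$1");
    }

    public static void main(String[] args) {
        System.out.println(dedupeLower("Test test TEST again")); // test again
    }
}
```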
Answer by Christian Vielma
You can also match case-insensitive regexes, and make the code more readable, by using the Pattern.CASE_INSENSITIVE constant, like:
Pattern mypattern = Pattern.compile(MYREGEX, Pattern.CASE_INSENSITIVE);
Matcher mymatcher = mypattern.matcher(mystring);
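To show how the resulting Matcher might be used, here is a self-contained sketch that counts case-insensitive matches. The value of MYREGEX, the class name MatcherDemo, and the helper countMatches are placeholders chosen for illustration, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherDemo {
    // Placeholder pattern, assumed for this example only.
    static final String MYREGEX = "christmas";

    // Hypothetical helper: counts matches of MYREGEX in text, ignoring case.
    static int countMatches(String text) {
        Pattern mypattern = Pattern.compile(MYREGEX, Pattern.CASE_INSENSITIVE);
        Matcher mymatcher = mypattern.matcher(text);
        int count = 0;
        while (mymatcher.find()) { // advance to each successive match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("Christmas, CHRISTMAS, christmas")); // 3
    }
}
```

By default CASE_INSENSITIVE only covers US-ASCII characters; for other alphabets it can be combined with Pattern.UNICODE_CASE.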