Java regular expressions and dollar sign

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/3853726/

Time: 2020-08-14 05:52:54  Source: igfitidea

Tags: java, regex

Asked by azec-pdx

I have a Java string:

    String b = "/feedback/com.school.edu.domain.feedback.Review$0/feedbackId";

I also have a generated pattern against which I want to match this string:

    String pattern = "/feedback/com.school.edu.domain.feedback.Review$0(.)*";

When I call b.matches(pattern), it returns false. I know the dollar sign is special in Java regex, but I don't know what my pattern should look like. I assume the $ in the pattern needs to be escaped, but I don't know how many escape characters it takes. The $ sign is important to me because it helps me distinguish elements in a list (the number after the dollar), so I can't drop it.

Accepted answer by Colin Hebert

You need to escape $ in the regex with a backslash (\), but because a backslash is also an escape character in Java string literals, you need to escape the backslash itself.

You will need to escape any other special regex character the same way, for example ".".

    String pattern = "/feedback/com\\.navteq\\.lcms\\.common\\.domain\\.poi\\.feedback\\.Review\\$0(.)*";
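
To see the difference in practice, here is a minimal sketch that assumes the string from the question and applies the same escaping to the question's package name. The class and method names are made up for illustration.

```java
class DollarEscapeDemo {
    // The input string from the question.
    static final String INPUT =
        "/feedback/com.school.edu.domain.feedback.Review$0/feedbackId";

    // Unescaped: '$' is an end-of-input anchor, so the match fails right after "Review".
    static boolean matchesUnescaped() {
        return INPUT.matches("/feedback/com.school.edu.domain.feedback.Review$0(.)*");
    }

    // Escaped: "\\$" in source code becomes \$ in the regex, a literal dollar sign.
    static boolean matchesEscaped() {
        return INPUT.matches(
            "/feedback/com\\.school\\.edu\\.domain\\.feedback\\.Review\\$0(.)*");
    }

    public static void main(String[] args) {
        System.out.println(matchesUnescaped()); // false
        System.out.println(matchesEscaped());   // true
    }
}
```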

Answered by kennytm

In Java regex both . and $ are special. You need to escape each of them with 2 backslashes, i.e.

    "/feedback/com\\.navtag\\.etc\\.Review\\$0(.*)"

(1 backslash is for the Java string, and 1 is for the regex engine.)
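
The two layers can be made visible by printing the string that the regex engine actually receives. This is only an illustrative sketch; the class name and the shortened pattern are assumptions, not part of the original answer.

```java
class BackslashLayersDemo {
    // Source code contains two backslashes; the compiler turns them into one.
    static String regexSeenByEngine() {
        return "Review\\$0";
    }

    public static void main(String[] args) {
        String regex = regexSeenByEngine();
        System.out.println(regex);          // Review\$0
        System.out.println(regex.length()); // 9
        // The single remaining backslash tells the regex engine that '$' is literal.
        System.out.println("Review$0".matches(regex)); // true
    }
}
```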

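Answered by Julien Hoarau

Escape the dollar with a backslash (written \\ in a Java string literal):

    String pattern =
      "/feedback/com.navteq.lcms.common.domain.poi.feedback.Review\\$0(.)*";

I advise you to escape . as well, because an unescaped . matches any character:

    String pattern =
      "/feedback/com\\.navteq\\.lcms\\.common\\.domain\\.poi\\.feedback\\.Review\\$0(.)*";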
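
A short sketch of why the unescaped dot is too permissive. The shortened patterns and the class name are hypothetical and chosen only to keep the example small.

```java
class DotEscapeDemo {
    // Dot left unescaped: it accepts any single character in that position.
    static final String LOOSE = "com.school\\$0";

    // Dot escaped: only a literal '.' is accepted.
    static final String STRICT = "com\\.school\\$0";

    public static void main(String[] args) {
        System.out.println("comXschool$0".matches(LOOSE));  // true, probably unintended
        System.out.println("comXschool$0".matches(STRICT)); // false
        System.out.println("com.school$0".matches(STRICT)); // true
    }
}
```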

Answered by Tim Pietzcker

Use

    String escapedString = java.util.regex.Pattern.quote(myString);

to automatically escape all special regex characters in a given string.
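
Applied to the question, one possible approach is to quote only the literal part and append the regex part afterwards. The sketch below assumes the prefix from the question; the helper name buildPattern is made up for illustration.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Quote the literal prefix, then append the part that should stay a regex.
    static String buildPattern(String literalPrefix) {
        return Pattern.quote(literalPrefix) + "(.)*";
    }

    public static void main(String[] args) {
        String prefix = "/feedback/com.school.edu.domain.feedback.Review$0";
        String pattern = buildPattern(prefix);

        String b = "/feedback/com.school.edu.domain.feedback.Review$0/feedbackId";
        System.out.println(b.matches(pattern)); // true

        // The quoted dots no longer match arbitrary characters.
        String other = "/feedback/comXschool.edu.domain.feedback.Review$0/feedbackId";
        System.out.println(other.matches(pattern)); // false
    }
}
```

Quoting only the prefix matters here: passing the whole pattern to Pattern.quote would also turn the trailing (.)* into literal text.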

Answered by rps

The answer by @Colin Hebert (edited by @theon) is correct. The explanation is as follows, @azec-pdx:

  1. The pattern is a regex written as a Java string literal (within double quotes).

  2. The period (.) and the dollar sign ($) are special regex characters (metacharacters).

  3. To make the regex engine treat them as a literal period and a literal dollar sign, prefix each with a single backslash. The backslash (itself a special regex character) quotes the character that follows it, thereby escaping it.

  4. Because the regex is written as a string literal, each of those backslashes must itself be prefixed with another backslash. Otherwise the compiler reads the backslash as the start of a string-literal escape (character, string, or Unicode escape), and since \. and \$ are not valid string escapes, compilation fails.

  5. The same applies to regex constructs that coincide with a valid string escape. For example, the regex construct \b (word boundary) clashes with the string escape \b (backspace): "\b" compiles, but the regex engine then receives a backspace character rather than a word boundary. Writing "\\b" passes \b through to the regex engine, which reads it as a word boundary.

  6. To be safe, prefix every regex backslash escape inside a string literal with another backslash. For example, the string literal "\(hello\)" is illegal and leads to a compile-time error; to match the string (hello), use the string literal "\\(hello\\)".

  7. The final period in (.)* is meant to act as a special regex character, so it needs no backslash at all, let alone two.
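
Points 5 and 6 can be checked with a small sketch. The class name, method names, and sample strings are assumptions for illustration only.

```java
class EscapeLayersDemo {
    // "\\b" reaches the regex engine as \b, a word boundary.
    static boolean wordBoundaryMatches() {
        return "say hello".matches(".*\\bhello\\b.*");
    }

    // "\b" is the string escape for a backspace character (U+0008),
    // so this pattern requires backspace characters around "hello".
    static boolean backspaceMatches() {
        return "say hello".matches(".*\bhello\b.*");
    }

    // "\\(" and "\\)" reach the regex engine as \( and \), literal parentheses.
    static boolean literalParensMatch() {
        return "(hello)".matches("\\(hello\\)");
    }

    public static void main(String[] args) {
        System.out.println(wordBoundaryMatches()); // true
        System.out.println(backspaceMatches());    // false
        System.out.println(literalParensMatch());  // true
    }
}
```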
