Java: String.replaceAll single backslashes with double backslashes
Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/1701839/
Asked by Frank Groeneveld
I'm trying to convert the String \something\ into the String \\something\\ using replaceAll, but I keep getting all kinds of errors. I thought this was the solution:
theString.replaceAll("\\", "\\\\");
But this gives the below exception:
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
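For readers who want to reproduce the failure, here is a minimal sketch. The class and method names are made up for illustration, and the exact exception message may differ between Java versions, so the code only checks the exception type.

```java
import java.util.regex.PatternSyntaxException;

class BackslashPatternError {
    // Returns true if a lone backslash is rejected as a regex.
    static boolean loneBackslashFails() {
        String input = "\\something\\"; // the runtime text is \something\
        try {
            // The literal "\\" is a one-character string holding a single
            // backslash, which is an incomplete escape when read as a regex.
            input.replaceAll("\\", "\\\\");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("pattern rejected: " + loneBackslashFails());
    }
}
```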
Accepted answer by BalusC
String#replaceAll() interprets its first argument as a regular expression. The \ is an escape character in both String literals and regex. You need to double-escape it for regex:
string.replaceAll("\\\\", "\\\\\\\\");
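As a quick check of the double-escaped call, the following sketch applies it to the string from the question. The class and method names are hypothetical.

```java
class DoubleBackslashes {
    static String viaReplaceAll(String s) {
        // Pattern literal "\\\\" is the two-character regex \\ (one literal backslash).
        // Replacement literal "\\\\\\\\" is four characters, which the
        // replacement syntax turns into two literal backslashes.
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    public static void main(String[] args) {
        String input = "\\something\\";           // runtime text: \something\
        System.out.println(viaReplaceAll(input)); // prints \\something\\
    }
}
```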
But you don't necessarily need regex for this, because you want an exact character-by-character replacement and don't need patterns here. So String#replace() should suffice:
string.replace("\\", "\\\\");
Update: as per the comments, you appear to want to use the string in a JavaScript context. You'd perhaps better use StringEscapeUtils#escapeEcmaScript() instead to cover more characters.
Answer by sfussenegger
You'll need to escape the (escaped) backslash in the first argument, as it is a regular expression. The replacement (2nd argument, see Matcher#replaceAll(String)) also gives backslashes a special meaning, so you'll have to double those as well:
theString.replaceAll("\\\\", "\\\\\\\\");
Answer by Jonathan Feinberg
Yes... by the time the regex compiler sees the pattern you've given it, it sees only a single backslash (since Java's lexer has turned the double backwhack into a single one). You need to use "\\\\" as the pattern and "\\\\\\\\" as the replacement, believe it or not! Java really needs a good raw string syntax.
Answer by Fabian Steeg
To avoid this sort of trouble, you can use replace (which takes a plain string) instead of replaceAll (which takes a regular expression). You will still need to escape backslashes, but not in the wild ways required with regular expressions.
Answer by Pshemo
TLDR: use theString = theString.replace("\\", "\\\\"); instead.
Problem
replaceAll(target, replacement) uses regular expression (regex) syntax for target and partially for replacement.
The problem is that \ is a special character in regex (it can be used like \d to represent a digit) and in String literals (it can be used like "\n" to represent a line separator, or like \" to escape the double-quote symbol, which would normally mark the end of the string literal).
In both cases, to create a \ symbol we can escape it (make it a literal instead of a special character) by placing an additional \ before it (like we escape " in string literals via \").
So the target regex representing the \ symbol needs to hold \\, and the string literal representing such text needs to look like "\\\\".
So we escaped \ twice:
- once in regex: \\
- once in String literal: "\\\\" (each \ is represented as "\\").
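The two escaping layers can be made visible by looking at runtime string lengths. This is only an illustrative sketch, and the class and constant names are invented for it.

```java
import java.util.regex.Pattern;

class EscapeLayers {
    // Two characters in source, one character at runtime: \
    static final String ONE_BACKSLASH = "\\";
    // Four characters in source, two at runtime: the regex \\
    static final String REGEX_FOR_BACKSLASH = "\\\\";

    public static void main(String[] args) {
        System.out.println(ONE_BACKSLASH.length());       // prints 1
        System.out.println(REGEX_FOR_BACKSLASH.length()); // prints 2
        // The two-character regex matches exactly one backslash.
        System.out.println(Pattern.matches(REGEX_FOR_BACKSLASH, ONE_BACKSLASH)); // prints true
    }
}
```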
In the case of replacement, \ is also special. It allows us to escape another special character, $, which via $x notation lets us use the portion of data matched by the regex and held by the capturing group indexed as x. For example, "012".replaceAll("(\\d)", "$1$1") will match each digit, place it in capturing group 1, and $1$1 will replace it with two copies of it (duplicating it), resulting in "001122".
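A small sketch of both behaviours in the replacement string: $1 as a group reference, and \$ as an escaped, literal dollar sign. The class and method names are hypothetical.

```java
class ReplacementSpecials {
    // "$1$1" inserts the text of capturing group 1 twice.
    static String duplicateDigits(String s) {
        return s.replaceAll("(\\d)", "$1$1");
    }

    // "\\$1" is the runtime text \$1: an escaped (literal) $ followed by the character 1.
    static String literalDollarOne(String s) {
        return s.replaceAll("(\\d)", "\\$1");
    }

    public static void main(String[] args) {
        System.out.println(duplicateDigits("012"));  // prints 001122
        System.out.println(literalDollarOne("012")); // prints $1$1$1
    }
}
```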
So again, to let replacement represent a \ literal we need to escape it with an additional \, which means that:
- replacement must hold two backslash characters: \\
- and the String literal which represents \\ looks like "\\\\"
BUT since we want replacement to hold two backslashes, we will need "\\\\\\\\" (each \ represented by one "\\\\").
So the version with replaceAll can look like
replaceAll("\\\\", "\\\\\\\\");
Easier way
To make our life easier, Java provides tools to automatically escape text into target and replacement parts. So now we can focus only on strings and forget about regex syntax:
replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement))
which in our case can look like
replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"))
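Wrapped into a helper, the quoting approach might look like the sketch below. The helper name replaceLiterally and the sample strings are assumptions for illustration. Pattern.quote and Matcher.quoteReplacement are the standard java.util.regex methods named above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedReplaceAll {
    // Treats both target and replacement as plain text, even though
    // replaceAll itself works with regex and replacement syntax.
    static String replaceLiterally(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // One backslash becomes two: prints \\something\\
        System.out.println(replaceLiterally("\\something\\", "\\", "\\\\"));
        // A $ in the replacement stays literal: prints cost: $5
        System.out.println(replaceLiterally("cost: X", "X", "$5"));
    }
}
```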
Even better
If we don't really need regex syntax support, let's not involve replaceAll at all. Instead, let's use replace. Both methods will replace all targets, but replace doesn't involve regex syntax. So you could simply write
theString = theString.replace("\\", "\\\\");
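As a final illustration, here is the plain replace version applied to a Windows-style path. The helper name and the sample path are made up for this sketch.

```java
class PlainReplace {
    // String.replace(CharSequence, CharSequence) replaces every occurrence
    // and treats both arguments as plain text, so only Java's own
    // string-literal escaping is needed.
    static String doubleBackslashes(String s) {
        return s.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\file.txt";          // runtime text: C:\temp\file.txt
        System.out.println(doubleBackslashes(path)); // prints C:\\temp\\file.txt
    }
}
```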