Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/33893701/

Date: 2020-11-02 22:15:20  Source: igfitidea

Java - escaping double quotes in string from file

Tags: java, regex

Asked by Pavel_K

I have an HTML string read from a file, and I need to escape all double quotes in it. This is what I do:

String content=readFile(file.getAbsolutePath(), StandardCharsets.UTF_8);
content=content.replaceAll("\"","\\\"");
System.out.println(content);

However, the double quotes are not escaped, and the string is the same as it was before the replaceAll call. When I do this instead:

String content=readFile(file.getAbsolutePath(), StandardCharsets.UTF_8);
content=content.replaceAll("\"","^^^");
System.out.println(content);

All double quotes are replaced with ^^^.

Why doesn't content.replaceAll("\"","\\\""); work?

Answered by Wiktor Stribiżew

You need to use 4 backslashes to denote one literal backslash in the replacement pattern:

content=content.replaceAll("\"","\\\\\"");

Here, \\\\ means a literal \ and \" means a literal ".
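
To make the difference concrete, here is a minimal sketch that runs both replacement literals against a made-up HTML snippet. The class name, method names and sample string are only for illustration and are not part of the original question.

```java
// Minimal sketch: compares the asker's replacement literal with the corrected one.
// ReplacementBackslashDemo and the sample HTML are hypothetical names/data.
class ReplacementBackslashDemo {

    // Three backslashes in source: the replacement engine receives \" and
    // treats the backslash as an escape, so it emits a bare " again.
    static String tooFewBackslashes(String s) {
        return s.replaceAll("\"", "\\\"");
    }

    // Five backslashes in source: the replacement engine receives \\" and
    // emits a literal backslash followed by the quote.
    static String enoughBackslashes(String s) {
        return s.replaceAll("\"", "\\\\\"");
    }

    public static void main(String[] args) {
        String html = "<a href=\"index.html\">home</a>";
        System.out.println(tooFewBackslashes(html));  // quotes unchanged
        System.out.println(enoughBackslashes(html));  // quotes prefixed with \
    }
}
```

The first call returns the input unchanged, which matches the behaviour described in the question.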

More details in the Java String#replaceAll documentation:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll

And later in the Matcher.replaceAll documentation:

Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

Another fun replacement is replacing quotes with a dollar sign: the replacement is "\\$". The 2 \s in the Java literal turn into 1 literal \ for the regex engine, and it escapes the special character $ that is used to define backreferences. So the $ is now a literal inside the replacement pattern.
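
A small sketch of the dollar-sign case, assuming a made-up sample string. It also shows an unescaped $0 (a backreference to the whole match) for contrast; the class and method names are hypothetical.

```java
// Minimal sketch: escaped vs. unescaped dollar sign in a replaceAll replacement.
// DollarReplacementDemo and its method names are illustrative only.
class DollarReplacementDemo {

    // "\\$" reaches the replacement engine as \$, i.e. a literal dollar sign.
    static String quotesToDollar(String s) {
        return s.replaceAll("\"", "\\$");
    }

    // An unescaped $0 refers to the whole match, so every quote
    // is re-inserted between square brackets.
    static String bracketQuotes(String s) {
        return s.replaceAll("\"", "[$0]");
    }

    public static void main(String[] args) {
        String text = "say \"hi\"";
        System.out.println(quotesToDollar(text)); // say $hi$
        System.out.println(bracketQuotes(text));  // say ["]hi["]
    }
}
```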

Answered by Konstantin Yovkov

You need to do this:

String content = "some content with \" quotes.";
content = content.replaceAll("\"", "\\\\\"");

Why will this work?

\" represents the " symbol, while you need \".

If you add a \ as a prefix (\\"), you have to escape the prefix too, which gives \\\". In a Java string literal this represents \", where \ is not the escaping character but the symbol \.

However, replaceAll treats a \ in the replacement string as an escape character as well, so that backslash has to be escaped once more. Therefore prefixing again with \\ will do fine:

x = x.replaceAll("\"", "\\\\\"");
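
One way to see the two layers is to look at what the compiler actually stores for each literal before replaceAll ever sees it. The sketch below only inspects the strings; the class and constant names are made up for this example.

```java
// Minimal sketch: what each source literal looks like after the Java compiler
// has processed its escapes. EscapeLayersDemo and the constant names are hypothetical.
class EscapeLayersDemo {

    // Source has 3 backslashes; the stored string is 2 chars: \ and "
    static final String THREE_BACKSLASH_LITERAL = "\\\"";

    // Source has 5 backslashes; the stored string is 3 chars: \, \ and "
    static final String FIVE_BACKSLASH_LITERAL = "\\\\\"";

    public static void main(String[] args) {
        // This is the text handed to the replacement engine, which then
        // applies its own backslash escaping on top.
        System.out.println(THREE_BACKSLASH_LITERAL + " has length " + THREE_BACKSLASH_LITERAL.length());
        System.out.println(FIVE_BACKSLASH_LITERAL + " has length " + FIVE_BACKSLASH_LITERAL.length());
    }
}
```

The shorter string leaves the replacement engine with nothing but an escaped quote, while the longer one carries an escaped backslash in front of the quote.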

Answered by Andy Turner

Honestly, I am surprised by the behaviour, but it seems like you need to double-escape the backslash:

System.out.println("\"Hello world\"".replaceAll("\"", "\\\\\""));

which outputs:

\"Hello world\"

Answered by GaspardP

It took me way too long in Java to discover Pattern.quote and Matcher.quoteReplacement. These will help you achieve what you are trying to do here - which is a simple "find" and "replace" - without any regex and escape logic. The Pattern.quote here is not necessary, but it shows how you can ensure that the "find" part is not interpreted as a regex string:

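// Needs java.util.regex.Pattern, java.util.regex.Matcher and JUnit 4 (org.junit.Test, org.junit.Assert)
@Test
public void testEscapeQuotes()
{
    String content = "some content with \"quotes\".";
    // Pattern.quote makes the "find" part literal; Matcher.quoteReplacement does the same for the replacement
    content = content.replaceAll(Pattern.quote("\""), Matcher.quoteReplacement("\\\""));
    Assert.assertEquals("some content with \\\"quotes\\\".", content);
}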
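
To see what the two helpers do, the following sketch just returns their results for the strings used in this question. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch: inspects the strings produced by the two quoting helpers.
// QuoteHelpersDemo and its method names are hypothetical.
class QuoteHelpersDemo {

    // quoteReplacement escapes backslashes and dollar signs, so the
    // two-character string \" becomes the three-character string \\"
    static String quotedReplacement() {
        return Matcher.quoteReplacement("\\\"");
    }

    // Pattern.quote wraps the text in \Q...\E so the regex engine reads it literally.
    static String quotedPattern() {
        return Pattern.quote("\"");
    }

    public static void main(String[] args) {
        System.out.println(quotedReplacement()); // \\"
        System.out.println(quotedPattern());     // \Q"\E
    }
}
```

In other words, Matcher.quoteReplacement builds the same replacement string that the other answers write by hand with extra backslashes.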

Remember that you can also use the simple .replace method, which also replaces all occurrences but does not interpret its parameters as regular expressions:

@Test
public void testEscapeQuotes()
{
    String content = "some content with \"quotes\".";
    // String.replace takes plain text, so only the Java literal escaping applies
    content = content.replace("\"", "\\\"");
    Assert.assertEquals("some content with \\\"quotes\\\".", content);
}
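
Tying this back to the original question, here is a rough sketch of the whole read-and-escape flow using String.replace. It assumes Java 11+ (for Files.readString and Files.writeString) and uses a temporary file as stand-in data; FileQuoteEscaper and its methods are hypothetical names, not the asker's readFile helper.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Minimal sketch, assuming Java 11+. The class, methods and sample file are illustrative.
class FileQuoteEscaper {

    // Plain-text replacement: every " becomes \" with no regex involved.
    static String escapeQuotes(String content) {
        return content.replace("\"", "\\\"");
    }

    // Reads the file as UTF-8 and returns its content with quotes escaped.
    static String escapeQuotesInFile(Path file) throws IOException {
        String content = Files.readString(file, StandardCharsets.UTF_8);
        return escapeQuotes(content);
    }

    public static void main(String[] args) throws IOException {
        // Temporary file standing in for the asker's HTML file.
        Path tmp = Files.createTempFile("sample", ".html");
        Files.writeString(tmp, "<p class=\"intro\">Hi</p>", StandardCharsets.UTF_8);
        System.out.println(escapeQuotesInFile(tmp));
        Files.delete(tmp);
    }
}
```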