Java: replace double backslashes with a single backslash

Question

Asked by Vinay thallam

I have a string "\\u003c", which belongs to the UTF-8 charset. I am unable to decode it to Unicode because of the presence of double backslashes. How do I get "\u003c" from "\\u003c"? I am using Java.

I tried:

myString.replace("\\\\", "\\");

but could not achieve what I wanted.

This is my code:

String myString = FileUtils.readFileToString(file);
String a = myString.replace("\\\\", "\\");
byte[] utf8 = a.getBytes();

// Convert from UTF-8 to Unicode
a = new String(utf8, "UTF-8");
System.out.println("Converted string is:"+a);

and the content of the file is:

\\u003c

Answer 1

Accepted answer by anubhava

Not sure if you're still looking for a solution to your problem (since you have an accepted answer) but I will still add my answer as a possible solution to the stated problem:

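import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The literal below holds a single backslash followed by u003c at runtime
String str = "\\u003c";
Matcher m = Pattern.compile("(?i)\\\\u([\\da-f]{4})").matcher(str);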
if (m.find()) {
    String a = String.valueOf((char) Integer.parseInt(m.group(1), 16));
    System.out.printf("Unicode String is: [%s]%n", a);
}

OUTPUT:

Unicode String is: [<]
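
The snippet above decodes only the first escape it finds. As a minimal sketch of extending the same idea to every escape in a string, the class below loops over all matches; the class name UnicodeEscapeDecoder and the helper decode are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDecoder {
    // Matches a backslash, the letter u, and exactly four hex digits (case-insensitive).
    private static final Pattern ESCAPE = Pattern.compile("(?i)\\\\u([\\da-f]{4})");

    // Hypothetical helper: replaces each escape sequence with the character it names.
    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded backslash or dollar sign from
            // being treated specially by appendReplacement.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("a \\u003c b \\u003E c")); // a < b > c
    }
}
```

Text without any escape sequence passes through unchanged, because appendTail copies whatever follows the last match.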

Answer 2

Answered by mtyson

You can use String#replaceAll:

String str = "\\\\u003c";
str = str.replaceAll("\\\\\\\\", "\\\\");
System.out.println(str);

It looks weird because the first argument is a string defining a regular expression, and \ is a special character both in string literals and in regular expressions. To actually put a \ in our search string, we need to escape it (\\) in the literal. But to actually put a \ in the regular expression, we have to escape it at the regular expression level as well. So to literally get \\ in a string, we need to write \\\\ in the string literal; and to get two literal \\ to the regular expression engine, we need to escape those as well, so we end up with \\\\\\\\. That is:

String Literal        String                      Meaning to Regex
--------------------- --------------------------- -------------------------
\                     Escape the next character   Would depend on next char
\\                    \                           Escape the next character
\\\\                  \\                          Literal \
\\\\\\\\              \\\\                        Literal \\

In the replacement parameter, even though it's not a regex, it still treats \ and $ specially, and so we have to escape them in the replacement as well. So to get one backslash in the replacement, we need four in that string literal.
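
If counting backslashes by hand feels error-prone, one possible alternative is to let the standard library do the regex-level escaping with Pattern.quote and Matcher.quoteReplacement. The sketch below assumes that approach; the class name QuotedReplaceAll and the helper replaceLiteral are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedReplaceAll {
    // Hypothetical helper: replaceAll with both arguments treated as literal text.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String doubled = "C:\\\\temp\\\\file.txt";   // runtime value: C:\\temp\\file.txt
        // Only the string-literal level of escaping is needed here.
        System.out.println(replaceLiteral(doubled, "\\\\", "\\")); // C:\temp\file.txt
    }
}
```

With both arguments quoted, two backslashes in the literal mean one backslash at runtime and nothing more.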

Answer 3

Answered by podnov

Another option: capture one of the two backslashes and replace both with the captured group:

public static void main(String[] args)
{
    String str = "C:\\\\";
    // Group 1 captures the first backslash; $1 puts it back in place of the pair
    str = str.replaceAll("(\\\\)\\\\", "$1");

    System.out.println(str);
}

Answer 4

Answered by jakub.g

Regarding the problem of "replacing double backslashes with single backslashes" or, more generally, "replacing a simple string, containing \, with a different simple string, containing \" (which is not entirely the OP problem, but part of it):

Most of the answers in this thread mention replaceAll, which is the wrong tool for the job here. The easier tool is replace, but confusingly, the OP states that replace("\\\\", "\\") doesn't work for him, which is perhaps why all the answers focus on replaceAll.

Important note for people with a JavaScript background: replace(CharSequence, CharSequence) in Java does replace ALL occurrences of a substring, unlike in JavaScript, where it only replaces the first one!

Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
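
As a small illustration of that behaviour, the sketch below contrasts replace with replaceFirst on a string holding two doubled backslashes. The class name ReplaceAllOccurrences and both helper names are hypothetical, chosen only for this example.

```java
public class ReplaceAllOccurrences {
    // Literal replacement of every doubled backslash.
    static String collapseAll(String s) {
        return s.replace("\\\\", "\\");
    }

    // Regex replacement of the first doubled backslash only.
    static String collapseFirst(String s) {
        return s.replaceFirst("\\\\\\\\", "\\\\");
    }

    public static void main(String[] args) {
        String path = "a\\\\b\\\\c";               // runtime value: a\\b\\c
        System.out.println(collapseAll(path));     // a\b\c
        System.out.println(collapseFirst(path));   // a\b\\c
    }
}
```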

On the other hand, replaceAll(String regex, String replacement) treats both parameters as more than plain strings:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string.

(This is because $ introduces a reference to a captured regex group and \ is the escape character in the replacement string, hence if you want to use them literally, you need to escape them.)

In other words, the first and second parameters of replace and replaceAll behave differently. For replace you need to double the \ in both parameters (standard escaping of a backslash in a string literal), whereas in replaceAll you need to quadruple it (standard string escape plus function-specific escape).

To sum up, for simple replacements, one should stick to replace("\\\\", "\\") (it needs only one level of escaping, not two).

https://ideone.com/ANeMpw

System.out.println("a\\\\b\\\\c");                                 // "a\\b\\c"
System.out.println("a\\\\b\\\\c".replaceAll("\\\\\\\\", "\\\\"));  // "a\b\c"
//System.out.println("a\\\\b\\\\c".replaceAll("\\\\\\\\", "\\"));  // runtime error
System.out.println("a\\\\b\\\\c".replace("\\\\", "\\"));           // "a\b\c"

https://www.ideone.com/Fj4RCO

String str = "\\\\u003c";
System.out.println(str);                                // "\\u003c"
System.out.println(str.replaceAll("\\\\\\\\", "\\\\")); // "\u003c"
System.out.println(str.replace("\\\\", "\\"));          // "\u003c"

Answer 5

Answered by Jaykishan

Try using:

myString.replaceAll("[\\\\]{2}", "\\\\");

Answer 6

Answered by Naveen Kumar Yadav

This replaces a double backslash with a single backslash:

public static void main(String[] args)
{
      String str = "\\\\u003c";
      str = str.replaceAll("\\\\\\\\", "\\\\");

      System.out.println(str);
}

Answer 7

Answered by user207421

"\\u003c" does not 'belong to UTF-8 charset' at all. It is six characters: '\', 'u', '0', '0', '3', and 'c'. The real question here is why the double backslashes are there at all. Or are they really there, and is your problem perhaps something completely different? If the String "\\u003c" is in your source code, there are no double backslashes in it at all at runtime, and whatever your problem may be, it doesn't concern decoding in the presence of double backslashes.

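A quick way to check whether doubled backslashes really exist at runtime is to inspect the string itself. The sketch below does that; the class name LiteralVersusRuntime and the helper hasDoubledBackslash are hypothetical names used only for illustration.

```java
public class LiteralVersusRuntime {
    // True only if the runtime string really contains two backslashes in a row.
    static boolean hasDoubledBackslash(String s) {
        return s.contains("\\\\");
    }

    public static void main(String[] args) {
        String fromSource = "\\u003c";   // two backslashes in the source, one at runtime
        System.out.println(fromSource.length());              // 6
        System.out.println(hasDoubledBackslash(fromSource));  // false
        // A string that does hold two backslashes needs four in the literal.
        System.out.println(hasDoubledBackslash("\\\\u003c")); // true
    }
}
```

If the text comes from a file rather than from a literal, the same check shows what was actually read.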