Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2523284/

Time: 2020-08-13 08:48:42  Source: igfitidea

Java string replace and the NUL (NULL, ASCII 0) character?

Tags: java, string, replace, nul

Asked by praspa

Testing out someone else's code, I noticed a few JSP pages printing funky non-ASCII characters. Taking a dip into the source, I found this tidbit:

// remove any periods from first name e.g. Mr. John --> Mr John
firstName = firstName.trim().replace('.','\0');

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a C-string. Would this be the culprit behind the funky characters?

Accepted answer by polygenelubricants

Does replacing a character in a String with a null character even work in Java? I know that '\0' will terminate a c-string.

That depends on how you define what is working. Does it replace all occurrences of the target character with '\0'? Absolutely!

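String s = "food".replace('o', '\0');
System.out.println(s.indexOf('\0')); // "1"
System.out.println(s.indexOf('d')); // "3"
System.out.println(s.length()); // "4"
System.out.println(s.hashCode() == 'f'*31*31*31 + 'd'); // "true"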

Everything seems to work fine to me! indexOf can find it, it counts as part of the length, and its value for hash code calculation is 0; everything is as specified by the JLS/API.

It DOESN'T work if you expect that replacing a character with the null character would somehow remove that character from the string. Of course it doesn't work like that. A null character is still a character!

String s = Character.toString('\0');
System.out.println(s.length()); // "1"
assert s.charAt(0) == 0;

It also DOESN'T work if you expect the null character to terminate a string. It's evident from the snippets above, but it's also clearly specified in JLS (10.9. An Array of Characters is Not a String):

In the Java programming language, unlike C, an array of char is not a String, and neither a String nor an array of char is terminated by '\u0000' (the NUL character).
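
The JLS statement can be checked directly. The following is a minimal sketch (the class name NulNotTerminator and the helper fromChars are made up for illustration) that builds a String from a char array with a NUL in the middle and shows that nothing gets cut off:

```java
public class NulNotTerminator {
    // Illustrative helper: builds a String from a char array
    // that has a NUL character in the middle.
    static String fromChars() {
        char[] chars = {'a', '\0', 'b'};
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = fromChars();
        System.out.println(s.length());              // 3
        System.out.println(s.indexOf('b'));          // 2
        System.out.println(s.charAt(1) == '\0');     // true
        System.out.println(s.equals("a\0b"));        // true
    }
}
```

In C, the same three bytes read as a string would stop after the 'a'. In Java the length is stored separately, so the NUL is just another char.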



Would this be the culprit to the funky characters?

Now we're talking about an entirely different thing, i.e. how the string is rendered on screen. Truth is, even "Hello world!" will look funky if you use a dingbats font. A Unicode string may look funky in one locale but not in another. Even a properly rendered Unicode string containing, say, Chinese characters, may still look funky to someone from, say, Greenland.

That said, the null character will probably look funky regardless; it's usually not a character that you want to display. Still, since the null character is not the string terminator, Java is more than capable of handling it one way or another.
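
When hunting for stray NULs like this, it can help to make them visible instead of relying on how a console or browser happens to render them. Here is a rough sketch of one way to do that; the class VisibleControls, the helper escapeControls, and the <U+XXXX> output format are all arbitrary choices for illustration, not anything from the original code:

```java
public class VisibleControls {
    // Hypothetical debugging helper: replaces every ISO control character
    // (NUL included, but also tab, newline, etc.) with a visible <U+XXXX> tag.
    static String escapeControls(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                sb.append(String.format("<U+%04X>", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "Mr. Foo".replace('.', '\0');
        System.out.println(escapeControls(s)); // Mr<U+0000> Foo
    }
}
```

This is only meant for logging or debugging. It does not fix the underlying bug, which is that the NUL was put into the string in the first place.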



Now to address what we assume is the intended effect, i.e. removing all periods from a string, the simplest solution is to use the replace(CharSequence, CharSequence) overload.

System.out.println("A.E.I.O.U".replace(".", "")); // AEIOU

The replaceAll solution is mentioned here too, but that works with regular expressions, which is why you need to escape the dot metacharacter, and it is likely to be slower.
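
To make the difference concrete, here is a small side-by-side sketch. The class and method names (RemovePeriods, withReplace, and so on) are made up for this example; the String and Pattern calls themselves are standard library methods:

```java
import java.util.regex.Pattern;

public class RemovePeriods {
    // Literal replacement: no regex involved, "." means a period.
    static String withReplace(String s) {
        return s.replace(".", "");
    }

    // Regex replacement: the dot must be escaped to match a literal period.
    static String withReplaceAll(String s) {
        return s.replaceAll("\\.", "");
    }

    // Regex replacement, letting Pattern.quote do the escaping.
    static String withQuote(String s) {
        return s.replaceAll(Pattern.quote("."), "");
    }

    // The classic mistake: an unescaped dot matches (almost) any character.
    static String unescaped(String s) {
        return s.replaceAll(".", "");
    }

    public static void main(String[] args) {
        String name = "Mr. John";
        System.out.println(withReplace(name));     // Mr John
        System.out.println(withReplaceAll(name));  // Mr John
        System.out.println(withQuote(name));       // Mr John
        System.out.println(unescaped(name));       // prints an empty line
    }
}
```

For a fixed literal like a period, the plain replace overload is the one that is hardest to get wrong.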

Answer by Michael Borgwardt

Does replacing a character in a String with a null character even work in Java?

No.

Would this be the culprit to the funky characters?

Quite likely.

Answer by Valentin Rocher

I think it should be the case. To erase the character, you should use replace(".", "") instead.

Answer by Roman

Should probably be changed to

firstName = firstName.trim().replaceAll("\\.", "");

Answer by Jim Ferrans

This does cause "funky characters":

System.out.println("Mr. Foo".trim().replace('.', '\0'));

produces:

Mr[] Foo

in my Eclipse console, where the [] is shown as a square box. As others have posted, use String.replace().
