Java - é becomes ?? - How to fix it
Original question: http://stackoverflow.com/questions/16208517/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow and keep it under the same license.
Asked by user2172625
I have a folder tree with French names. When I read its folders and files, Java returns ?? instead of é. I currently replace the character, but that is not a good solution. How can I fix this? I found some answers on Google, but they did not help me.
Thanks!
Answered by Afriza N. Arief
When starting the application, set the encoding to UTF-8:
java -Dfile.encoding="UTF-8" YourMainClass
Note that, as mentioned in the link above, many Java classes cache the encoding; therefore, if you change the encoding at run time, the change may not affect all of the classes we are concerned with.
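To check which encoding the JVM actually picked up, a small sketch like the following may help. The class name DefaultEncodingCheck is hypothetical; System.getProperty and Charset.defaultCharset() are standard JDK calls. On JDK 18 and later the default charset is already UTF-8 unless overridden, so the output depends on your JDK and platform.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original answer.
public class DefaultEncodingCheck {
    // Reports the file.encoding system property and the charset the JVM uses by default.
    static String describeDefaults() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run once with and once without -Dfile.encoding="UTF-8" to compare.
        System.out.println(describeDefaults());
    }
}
```

Running it as java -Dfile.encoding="UTF-8" DefaultEncodingCheck shows whether the flag took effect for your JVM.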
Copying the explanation from tchrist in his answer to another question:
A \N{LATIN SMALL LETTER E WITH ACUTE} character is code point U+00E9. In UTF-8, that is \xC3\xA9. But if you turn around and treat those two bytes as distinct code points U+00C3 and U+00A9, those are \N{LATIN CAPITAL LETTER A WITH TILDE} and \N{COPYRIGHT SIGN}, respectively.
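The byte-level explanation above can be reproduced in a few lines of Java. This is only an illustrative sketch: the class MojibakeDemo and its methods are hypothetical names, and the string is written with a Unicode escape so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
public class MojibakeDemo {
    // Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misreadUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Formats bytes as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(hex(eAcute.getBytes(StandardCharsets.UTF_8))); // prints C3A9

        String wrong = misreadUtf8AsLatin1(eAcute);
        // One character went in, two come out: U+00C3 followed by U+00A9.
        System.out.println(wrong.length()); // prints 2
    }
}
```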
Answered by Walter Macambira
You are facing an encoding problem.
Any string is actually a sequence of bits. To make them readable, we map groups of bits to character representations we can read. Such a 'map' is what is called an encoding.
The problem you are having arises because you are reading bits encoded using one 'map' and displaying them using another 'map'.
Be sure to use the same encoding and always check if your string manipulation functions work with the encoding being used. It is fundamental for proper working of your application.
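One way to follow this advice in Java is to name the charset explicitly on both the writing and the reading side instead of relying on the platform default. The sketch below assumes UTF-8 is the encoding you want; the class SameEncodingDemo and the temporary file are hypothetical and used only for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original answer.
public class SameEncodingDemo {
    // Writes one line to a temporary file and reads it back, using UTF-8 on both sides.
    static String roundTrip(String line) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    writer.write(line);
                }
                try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                    return reader.readLine();
                }
            } finally {
                Files.deleteIfExists(file); // clean up the temporary file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00E9";
        // Same 'map' on both sides, so the text survives unchanged.
        System.out.println(original.equals(roundTrip(original))); // prints true
    }
}
```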
Answered by Padrus
This typically happens when you are not decoding the text with the right encoding (probably UTF-8).
If you want a more precise answer, post us your code so we can try to correct it.
Answered by tchrist
The code is displaying the right bits — what is wrong is that the thing you are using to look at those bits has been told that the bits are in a different encoding than they actually are.
This is not a Java problem. This is a problem with whatever software you are using to look at the Java output. For example, your Terminal encoding might be set to ISO-8859-15 rather than the UTF-8 that Java is emitting.
It really helps to have an all–UTF-8 workflow for the external world, and an internal world of abstract Unicode code points.
I suppose it is possible that you are misreading some input: input that is in UTF-8 but which you are treating as being in some legacy 8-bit encoding. But my best guess is the one already given, that your display device's or program's encoding is mis-set.
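To see which bytes Java emits for a given output encoding, and to force UTF-8 on standard output, something like the following sketch can be used. The class OutputEncodingDemo and its helper methods are hypothetical; the PrintStream(OutputStream, boolean, String) constructor is part of the JDK. Whether the terminal then shows é still depends on the terminal's own encoding setting.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the original answer.
public class OutputEncodingDemo {
    // Wraps a stream in a PrintStream that encodes text with the named charset.
    static PrintStream printStream(OutputStream target, String charsetName) {
        try {
            return new PrintStream(target, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Returns the bytes a PrintStream emits for the text in the given charset.
    static byte[] emit(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = printStream(sink, charsetName);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = emit("\u00E9", "UTF-8");     // two bytes: 0xC3 0xA9
        byte[] ascii = emit("\u00E9", "US-ASCII"); // one byte: '?', because ASCII cannot represent the character
        System.out.println(utf8.length + " " + ascii.length); // prints 2 1

        // Emit UTF-8 on the real console regardless of the platform default.
        PrintStream console = printStream(System.out, "UTF-8");
        console.println("\u00E9");
    }
}
```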
Answered by karthikeyan paneerselvam
I used the code below to write é to a file; with it, writing Java Unicode text to the file works:
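// outputFile and stringBuffer are assumed to be defined earlier in the program.
FileWriter writer1 = new FileWriter(outputFile, true);  // true = append to the file
BufferedWriter writer2 = new BufferedWriter(writer1);
// Workaround: take the bytes in the platform default charset and re-decode them as ISO-8859-1.
// The result depends on the platform default encoding, so it may not behave the same everywhere.
String str = new String(stringBuffer.toString().getBytes(), "ISO-8859-1");
writer2.write(str);
writer2.flush();  // flushing the BufferedWriter also flushes the underlying FileWriter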