Java convert ISO-8859-1 to UTF-8
Original question: http://stackoverflow.com/questions/22017774/
Notice: this content comes from StackOverFlow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: StackOverFlow
Asked by user3351706
I have a properties file with Asian translations in it, which I believe is saved as ISO-8859-1. I'm trying to convert them to UTF-8. So è-|?: would equal 警告:
I've tried several methods listed on this site as well as some other sites but have had no luck.
// Attempt 1: decodes the bytes with the same charset that produced them,
// so it only re-encodes the already garbled text as UTF-8.
byte[] isoBytes = line.getBytes("ISO-8859-1");
byte[] utf8 = new String(isoBytes, "ISO-8859-1").getBytes("UTF-8");

// Attempt 2: isoBuf is not defined in the snippet as posted.
CharBuffer charBuf = null;
Charset isocharset = Charset.forName("iso-8859-1");
CharsetDecoder isoDecoder = Charset.forName("iso-8859-1").newDecoder();
CharsetDecoder utf8Decoder = Charset.forName("UTF-8").newDecoder();
byte sByte[] = line.getBytes("iso-8859-1");
charBuf = utf8Decoder.decode(isoBuf);
What is the easiest way to convert è-|?: to 警告: ?
Thank you, Rich
@Pshemo had the answer I was looking for
byte[] isoBytes = line.getBytes("ISO-8859-1");
System.out.println(new String(isoBytes, "UTF-8"));
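The two lines above work because ISO-8859-1 maps the characters U+0000 to U+00FF one-to-one onto bytes, so encoding the garbled string with it recovers the original bytes, which can then be decoded as UTF-8. A minimal sketch of that round trip follows; the class name `MojibakeRepair` and the method `repair` are made up for illustration, and the garbled input is simulated in `main` rather than read from a real properties file.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Hypothetical helper: re-interprets text that was wrongly decoded as ISO-8859-1.
    static String repair(String misdecoded) {
        // ISO-8859-1 turns each char below 256 back into the byte it came from.
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes with the charset the file was really written in.
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The two CJK characters from the question plus a colon, written as escapes.
        String original = "\u8b66\u544a:";
        // Simulate the bug: UTF-8 bytes read with the wrong charset.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(repair(misdecoded).equals(original)); // prints "true"
    }
}
```

Using the `StandardCharsets` constants instead of charset names avoids the checked `UnsupportedEncodingException` that `getBytes("ISO-8859-1")` declares.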
Thank you all for your help
Answered by Thomas
The easiest and safest way would be to save the file as UTF-8 and read it as such.
The answers you found here most likely already stated that ISO Latin-1 (ISO-8859-1) can't represent all the code points that UTF-8 can (Asian characters in particular), so storing properties (text resources?) as ISO Latin-1 will lose data.
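The loss described above can be seen directly: when text really is encoded as ISO-8859-1, characters outside its range are replaced with `?` and cannot be recovered. This is only an illustrative sketch; the class name `Latin1Loss` and its two helper methods are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Loss {
    // Encodes the text as ISO-8859-1 and decodes it again.
    static String roundTripLatin1(String text) {
        // Unmappable characters are replaced by the charset's replacement byte, '?'.
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Reports whether every character of the text exists in ISO-8859-1.
    static boolean fitsInLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String warning = "\u8b66\u544a"; // the two CJK characters from the question
        System.out.println(fitsInLatin1(warning));    // prints "false"
        System.out.println(roundTripLatin1(warning)); // prints "??"
    }
}
```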
Thus either store it as UTF-8 or use Unicode escapes, e.g. \u8b66\u544a for 警告 (Warning :) ).
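To illustrate the escape option, here is a small sketch of `java.util.Properties` turning such an escaped value back into the real characters while loading. The key name `label`, the class name `EscapedProperties`, and the in-memory file content are assumptions made for the example; a real application would load from a file or stream.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapedProperties {
    // Parses properties-file text held in a string.
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The doubled backslashes put literal escape sequences into the file text,
        // exactly as they would appear in an ASCII-only properties file.
        String fileContent = "label=\\u8b66\\u544a\n";
        String value = parse(fileContent).getProperty("label");
        System.out.println(value.length());                  // prints "2"
        System.out.println(value.equals("\u8b66\u544a"));    // prints "true"
    }
}
```

A file written this way contains only ASCII, so it reads the same whether it is treated as ISO-8859-1 or UTF-8.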
Answered by Joop Eggen
In fact, displaying UTF-8 content as ISO-8859-1 would yield: è-|? (plus something). So what you see is expected.
So the file is in UTF-8. The JDK has the tool native2ascii, which converts non-ASCII characters to \uXXXX escapes and back.
native2ascii -encoding UTF-8 old.properties new.properties
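For readers without the tool at hand (as far as I know, native2ascii was removed from the JDK in version 9), the forward conversion can be approximated in a few lines. This is a rough sketch of the idea, not a reimplementation of the tool; the class name `UnicodeEscaper` and the method `escapeNonAscii` are hypothetical.

```java
public class UnicodeEscaper {
    // Replaces every char outside 7-bit ASCII with a backslash, the letter u,
    // and four hex digits; ASCII characters pass through unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is the key "label=" (an assumed name) followed by two CJK characters.
        System.out.println(escapeNonAscii("label=\u8b66\u544a"));
    }
}
```

Iterating over `char` values means a character outside the Basic Multilingual Plane comes out as two escapes, one per surrogate, which `java.util.Properties` reads back correctly.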
Use a programmer's editor like JEdit or Notepad++ to be sure of encodings.
Answered by Gustavo
This worked for me:
@Pshemo had the answer I was looking for
byte[] isoBytes = line.getBytes("ISO-8859-1");
System.out.println(new String(isoBytes, "UTF-8"));
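One caveat with this approach: `new String(bytes, "UTF-8")` silently substitutes replacement characters when the bytes are not valid UTF-8, so a line that was never garbled can be damaged by the conversion. A stricter variant, sketched below under the assumption that returning `null` is an acceptable way to signal "leave this line alone", uses a `CharsetDecoder` configured to report errors. The class name `StrictRepair` and the method `repairOrNull` are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictRepair {
    // Returns the repaired text, or null when the underlying bytes are not valid UTF-8.
    static String repairOrNull(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(rawBytes)).toString();
        } catch (CharacterCodingException e) {
            // The text was probably genuine Latin-1, not mis-decoded UTF-8.
            return null;
        }
    }

    public static void main(String[] args) {
        String original = "\u8b66\u544a:";
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(repairOrNull(misdecoded))); // prints "true"
        System.out.println(repairOrNull("caf\u00e9") == null);         // prints "true"
    }
}
```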