Java string encoding - Shift_JIS / UTF-8
Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/30341853/
String encoding - Shift_JIS / UTF-8
Asked by user1011394
I get a string from a 3rd party library, which is not well encoded. Unfortunately I'm not allowed to change the library or use another one...
So the actual problem is that the 3rd party library's result string encodes characters like "è ò à ù ì ? ? ü, ..." as SHIFT_JIS (Kanji) inside a UTF-8 string, but only if the character is attached to a word and isn't standalone.
For example:
"? Just a simple test"
"?Just a simple test"
I tried the following without success:
byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
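For reference, the idea behind this attempt can be sketched as a round trip. This is only a minimal illustration under one assumption that the question does not confirm: that the text started as UTF-8 bytes and was wrongly decoded as Shift_JIS somewhere inside the library. The class name `MojibakeRoundTrip` and the helpers `garble` and `repair` are hypothetical, and the sketch assumes the JVM ships the Shift_JIS charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRoundTrip {
    // Assumption: the running JVM provides the Shift_JIS charset.
    private static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    /** Simulates the assumed bug: UTF-8 bytes wrongly decoded as Shift_JIS. */
    static String garble(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), SHIFT_JIS);
    }

    /** Reverses it: recover the raw bytes via Shift_JIS, then decode them as UTF-8. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(SHIFT_JIS), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e8Just a simple test"; // "èJust a simple test"
        String garbled = garble(original);
        String repaired = repair(garbled);
        // Compare the repaired text with the original to see whether the round trip held.
        System.out.println(repaired.equals(original));
    }
}
```

The round trip can only be lossless when every byte of the original happens to form a valid Shift_JIS sequence. Bytes that do not are replaced during the first decode, and no later re-encoding can bring them back.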
UPDATE 1:
That's the content of "resultString".
Note: The byte array shown is without any modifications (such as getBytes("Shift_JIS")); it's just the resultString as bytes.
Do you have any ideas? Any help would be greatly appreciated. Thank you.
Answered by user1011394
Well, very strange:
As
byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
didn't work for me, I tried the following:
String value = new String(resultString.getBytes("SHIFT-JIS"), "UTF-8");
Works like a charm. Maybe it was because of the underscore and lowercase characters in "Shift_JIS".
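One way to test that guess: the java.nio.charset.Charset documentation states that charset names are not case-sensitive, so the spelling alone may not explain the difference. The sketch below prints how the local JVM resolves both names. The class name `CharsetNameCheck` and the helper `sameCharset` are hypothetical, and the result depends on which charsets and aliases the installed JDK provides.

```java
import java.nio.charset.Charset;

public class CharsetNameCheck {
    /** Returns true when both names are supported and resolve to the same Charset. */
    static boolean sameCharset(String a, String b) {
        return Charset.isSupported(a) && Charset.isSupported(b)
                && Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        Charset cs = Charset.forName("Shift_JIS");
        // The canonical name and the aliases this JVM accepts for it.
        System.out.println("canonical name: " + cs.name());
        System.out.println("aliases: " + cs.aliases());
        // Whether the two spellings used above end up at the same charset.
        System.out.println("same charset: " + sameCharset("Shift_JIS", "SHIFT-JIS"));
    }
}
```

If both spellings resolve to the same charset, the two snippets are equivalent and the different outcome probably came from something else, such as a different input string.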