Java string encoding - Shift_JIS / UTF-8

Disclaimer: This page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original URL: http://stackoverflow.com/questions/30341853/

Date: 2020-11-02 16:58:03  Source: igfitidea

String encoding - Shift_JIS / UTF-8

Tags: java, android, character-encoding

Asked by user1011394

I get a string from a 3rd-party library that is not properly encoded. Unfortunately, I'm not allowed to change the library or use another one...

So the actual problem is that the 3rd-party library's result string encodes characters like "è ò à ù ì ? ? ü, ..." as Shift_JIS (kanji) inside a UTF-8 string, but only if the character is attached to a word and isn't standalone.

For example:

"? Just a simple test"

Standalone

"?Just a simple test"

Connected

I tried the following without success:

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
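
For reference, here is a minimal sketch of what this conversion is meant to do. It assumes the garbling comes from UTF-8 bytes having been decoded as Shift_JIS somewhere upstream; that is an assumption, since the library's actual behavior is not shown here. The class and method names (MojibakeRoundTrip, corrupt, repair) are hypothetical and only serve the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: simulates the assumed corruption and reverses it.
class MojibakeRoundTrip {
    // Looked up by name because StandardCharsets has no Shift_JIS constant.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Assumed corruption: the UTF-8 bytes of the text are decoded as Shift_JIS.
    static String corrupt(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), SJIS);
    }

    // Reversal: re-encode as Shift_JIS to recover the raw bytes, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(SJIS), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e8Just a simple test"; // "è" attached to a word
        String garbled = corrupt(original);
        System.out.println("garbled:  " + garbled);
        System.out.println("repaired: " + repair(garbled));
    }
}
```

The round trip is only lossless when every byte sequence in the UTF-8 data happens to be a valid Shift_JIS sequence; bytes that the decoder cannot map are replaced and cannot be recovered this way.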

UPDATE 1:

That's the content of "resultString".

Note: The byte array shown is unmodified (no conversion such as getBytes("Shift_JIS") was applied); it is just the resultString as bytes.

[Two images showing the content of resultString as bytes; not available in this copy]

Do you have any ideas? Any help would be greatly appreciated. Thank you.

Answered by user1011394

Well, very strange:

Since

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");

didn't work for me, I tried the following:

String value = new String(resultString.getBytes("SHIFT-JIS"), "UTF-8");

That works like a charm. Maybe it was because of the underscore and the lowercase characters in "Shift_JIS".
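
One way to check that guess is to ask the runtime what each name resolves to. The Charset documentation states that charset names are not case-sensitive, so the lowercase letters alone are unlikely to matter; whether "Shift_JIS" and "SHIFT-JIS" map to the same charset depends on the aliases the runtime registers, which may differ between a desktop JVM and Android. The sketch below is only an illustration, and the CharsetLookup class and its resolve method are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Optional;

// Hypothetical helper: resolves a charset name without throwing.
class CharsetLookup {
    // Returns the charset for the given name, or empty if the name is
    // syntactically illegal or not supported by this runtime.
    static Optional<Charset> resolve(String name) {
        try {
            return Charset.isSupported(name)
                    ? Optional.of(Charset.forName(name))
                    : Optional.empty();
        } catch (IllegalCharsetNameException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Prints the canonical name each spelling resolves to on this runtime.
        for (String name : new String[] {"Shift_JIS", "SHIFT-JIS", "shift_jis"}) {
            String canonical = resolve(name).map(Charset::name).orElse("unsupported");
            System.out.println(name + " -> " + canonical);
        }
    }
}
```

If both spellings print the same canonical name, the difference between the two attempts most likely lies elsewhere, for example in the input data.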
