Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/4383504/

Time: 2020-09-19 15:27:38  Source: igfitidea

Convert File with known encoding to UTF-8

java eclipse unicode encoding utf-8

Asked by HymanBauer

I need to read a text file into a String, which I will ultimately pass as an input parameter (of type InputStream) to IFile.create (Eclipse). I have been looking for an example of how to do this but still cannot figure it out. I need your help!

Just for testing, I tried to convert the original text file to UTF-8 with this code:

FileInputStream fis = new FileInputStream(FilePath);
InputStreamReader isr = new InputStreamReader(fis);

Reader in = new BufferedReader(isr);
StringBuffer buffer = new StringBuffer();

int ch;
while ((ch = in.read()) > -1) {
    buffer.append((char)ch);
}
in.close();


FileOutputStream fos = new FileOutputStream(FilePath+".test.txt");
Writer out = new OutputStreamWriter(fos, "UTF8");
out.write(buffer.toString());
out.close();

But even though the final *.test.txt file is UTF-8 encoded, the characters inside are corrupted.

Answered by Matt Ball

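You need to specify the encoding of the InputStreamReader using the Charset parameter. Without it, the reader decodes the bytes with the platform default charset, which corrupts the text whenever that default differs from the file's actual encoding.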

                                    // ↓ whatever the input's encoding is
Charset inputCharset = Charset.forName("ISO-8859-1");
InputStreamReader isr = new InputStreamReader(fis, inputCharset);

This also works:

InputStreamReader isr = new InputStreamReader(fis, "ISO-8859-1");
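
Putting the pieces together, a minimal sketch of the whole conversion might look like the following. The class name EncodingConverter and both method names are hypothetical, chosen only for illustration. The sketch assumes the source encoding is known in advance and, for the second method, that the file is small enough to hold in memory. That second method returns the text as a UTF-8 InputStream, the type the question says IFile.create expects; the Eclipse call itself is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original answer.
public class EncodingConverter {

    // Decodes `source` with the given (known) charset and re-encodes it
    // into `target` as UTF-8, copying in chunks rather than char by char.
    public static void convertToUtf8(Path source, Charset sourceCharset, Path target)
            throws IOException {
        try (Reader in = new InputStreamReader(Files.newInputStream(source), sourceCharset);
             Writer out = new OutputStreamWriter(Files.newOutputStream(target),
                                                 StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    // Reads the whole file into a String using the known charset, then
    // exposes that text as an in-memory stream of UTF-8 bytes.
    public static InputStream readAsUtf8Stream(Path source, Charset sourceCharset)
            throws IOException {
        String text = new String(Files.readAllBytes(source), sourceCharset);
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

With the question's naming scheme, a call such as EncodingConverter.convertToUtf8(Paths.get(filePath), StandardCharsets.ISO_8859_1, Paths.get(filePath + ".test.txt")) would produce the converted copy, where filePath stands in for the question's FilePath variable and ISO-8859-1 is only an example of a source encoding.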


See also:

SO search where I found all these links: https://stackoverflow.com/search?q=java+detect+encoding

You can get the default charset, which comes from the system the JVM is running on, at runtime via Charset.defaultCharset().
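
As a small illustration, the sketch below reports that default. The class and method names are hypothetical. The single-argument InputStreamReader constructor used in the question falls back to this charset, so checking it can help explain why text was decoded incorrectly. The value varies by platform and JVM version (on Java 18 and later it is UTF-8 unless overridden), so no particular output is assumed.

```java
import java.nio.charset.Charset;

// Hypothetical demo class for inspecting the JVM's default charset.
public class DefaultCharsetDemo {

    // Returns the canonical name of the charset the JVM uses when
    // no charset is given explicitly, e.g. by new InputStreamReader(in).
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }
}
```

Printing DefaultCharsetDemo.defaultCharsetName() on the machine where the conversion runs shows which encoding the original code was implicitly using to read the file.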
