Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/9852978/

Time: 2020-08-16 12:21:31  Source: igfitidea

Write a file in UTF-8 using FileWriter (Java)?

java file-io unicode utf-8 file-format

Asked by user1280970

I have the following code; however, I want it to write a UTF-8 file so that it can handle foreign characters. Is there a way of doing this? Does it need an extra parameter?

I would really appreciate your help with this. Thanks.

try {
  BufferedReader reader = new BufferedReader(new FileReader("C:/Users/Jess/My Documents/actresses.list"));
  writer = new BufferedWriter(new FileWriter("C:/Users/Jess/My Documents/actressesFormatted.csv"));
  while( (line = reader.readLine()) != null) {
    //If the line starts with a tab then we just want to add a movie
    //using the current actor's name.
    if(line.length() == 0)
      continue;
    else if(line.charAt(0) == '\t') {
      readMovieLine2(0, line, surname.toString(), forename.toString());
    } //Else we've reached a new actor
    else {
      readActorName(line);
    }
  }
} catch (IOException e) {
  e.printStackTrace();
}

Answered by Edwin Dalorzo

You need to use the OutputStreamWriter class as the writer parameter for your BufferedWriter. It does accept an encoding. Review the Javadoc for it.

Somewhat like this:

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
    new FileOutputStream("jedis.txt"), "UTF-8"
));
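
As a fuller illustration of the snippet above, here is a minimal sketch that writes a line as UTF-8 and reads it back. The class name, method names, and temp file are hypothetical, and StandardCharsets.UTF_8 (Java 7+) is used in place of the "UTF-8" string; try-with-resources closes (and therefore flushes) the writer.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8WriteExample {

    // Writes one line of text to the given file as UTF-8,
    // regardless of the platform default encoding.
    static void writeUtf8(String fileName, String text) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(fileName), StandardCharsets.UTF_8))) {
            out.write(text);
            out.newLine();
        }
    }

    // Reads the first line back, decoding the bytes as UTF-8.
    static String readFirstLineUtf8(String fileName) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream(fileName), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("jedis", ".txt"); // hypothetical temp file
        // Unicode escapes keep this source file ASCII-only.
        writeUtf8(tmp.toString(), "Zo\u00EB Salda\u00F1a, \u7AE0\u5B50\u6021");
        System.out.println(readFirstLineUtf8(tmp.toString()));
        Files.delete(tmp);
    }
}
```

The same two helpers could replace the FileReader/FileWriter pair in the question's loop without changing the rest of the code.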

Or you can set the current system encoding to UTF-8 with the system property file.encoding.

java -Dfile.encoding=UTF-8 com.jediacademy.Runner arg1 arg2 ...

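You could also try setting it as a system property at runtime with System.setProperty(...) if you only need it for this specific file, though the JVM typically reads the default charset once at startup, so a later change may have no effect. In a case like this I think I would prefer the OutputStreamWriter.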

By setting the system property you can use FileWriter and expect that it will use UTF-8 as the default encoding for your files. In this case, that applies to all the files that you read and write.
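
To check which encoding a plain FileWriter would actually use, you can print the JVM's default charset. This is a small sketch with a hypothetical class name; launched with -Dfile.encoding=UTF-8 it should report UTF-8, but the result depends on your JVM version and platform.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {

    // Returns the charset that FileWriter and FileReader fall back to
    // when no charset is given explicitly.
    static Charset platformDefault() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        System.out.println("default charset:        " + platformDefault());
    }
}
```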

EDIT

  • Starting from Java 7 (API level 19 on Android), you can replace the String "UTF-8" with StandardCharsets.UTF_8

  • As suggested in the comments below by tchrist, if you intend to detect encoding errors in your file you would be forced to use the OutputStreamWriter approach and use the constructor that receives a charset encoder.

    Somewhat like

    CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
    encoder.onMalformedInput(CodingErrorAction.REPORT);
    encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("jedis.txt"),encoder));
    

    You may choose between actions IGNORE | REPLACE | REPORT
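
To make the effect of REPORT concrete, here is a minimal sketch (hypothetical class and method names, writing to an in-memory stream instead of a file). The input "a\uD800b" contains an unpaired surrogate, which cannot be encoded as UTF-8: the strict encoder raises a CharacterCodingException, while the plain charset constructor quietly substitutes a replacement byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncoderExample {

    // Encodes text with a UTF-8 encoder configured to REPORT problems.
    // Returns true if the text was encoded cleanly, false if the encoder
    // rejected it with a CharacterCodingException.
    static boolean encodesCleanly(String text) throws IOException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try (Writer out = new OutputStreamWriter(new ByteArrayOutputStream(), encoder)) {
            out.write(text);
            out.flush();
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Encodes text through the charset-based constructor, which does not
    // report errors; bad input is replaced instead.
    static byte[] encodeLeniently(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(encodesCleanly("Zo\u00EB"));     // well-formed text
        System.out.println(encodesCleanly("a\uD800b"));     // unpaired surrogate
        System.out.println(encodeLeniently("a\uD800b").length);
    }
}
```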

Also, this question was already answered here.

Answered by Michael Borgwardt

Ditch FileWriter and FileReader, which are useless exactly because they do not allow you to specify the encoding. Instead, use

new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)

and

new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);
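
Put together, the two constructors allow a file to be copied from one encoding to another, much like the reader/writer loop in the question. The sketch below uses hypothetical names, and which charset the source file is really in is something you have to know in advance; the ISO-8859-1 choice in main is purely an assumption for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TranscodeExample {

    // Copies a text file line by line, decoding with sourceCharset
    // and encoding with targetCharset.
    static void transcode(Path source, Charset sourceCharset,
                          Path target, Charset targetCharset) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                     new FileInputStream(source.toFile()), sourceCharset));
             BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(target.toFile()), targetCharset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("actresses", ".list");         // hypothetical input
        Path target = Files.createTempFile("actressesFormatted", ".csv"); // hypothetical output
        // "Zoë" encoded as ISO-8859-1 (assumed source encoding)
        Files.write(source, new byte[] {'Z', 'o', (byte) 0xEB, '\n'});
        transcode(source, StandardCharsets.ISO_8859_1, target, StandardCharsets.UTF_8);
        System.out.println(Files.size(target) + " bytes written as UTF-8");
    }
}
```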

Answered by tchrist

Safe Encoding Constructors

Getting Java to properly notify you of encoding errors is tricky. You must use the most verbose and, alas, the least used of the four alternate constructors for each of InputStreamReader and OutputStreamWriter to receive a proper exception on an encoding glitch.

For file I/O, always make sure to pass the fancy encoder argument (or, for InputStreamReader, the matching decoder) as the second argument to both OutputStreamWriter and InputStreamReader:

  Charset.forName("UTF-8").newEncoder()

There are other even fancier possibilities, but none of the three simpler possibilities work for exception handling. These do:

 OutputStreamWriter char_output = new OutputStreamWriter(
     new FileOutputStream("some_output.utf8"),
     Charset.forName("UTF-8").newEncoder() 
 );

 InputStreamReader char_input = new InputStreamReader(
     new FileInputStream("some_input.utf8"),
     Charset.forName("UTF-8").newDecoder() 
 );
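
On the reading side, the difference can be seen with a deliberately broken byte sequence. This is a sketch with hypothetical names that reads from an in-memory stream: the bytes 'a', 0xC3, '(' are not valid UTF-8 because 0xC3 must be followed by a continuation byte. The decoder form throws, while the charset form substitutes U+FFFD and carries on.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

class StrictDecoderExample {

    // Drains a Reader into a String, one char at a time.
    private static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Uses the decoder constructor: a fresh decoder reports malformed input,
    // so invalid bytes surface as a CharacterCodingException (an IOException).
    static String decodeStrictly(byte[] bytes) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), decoder)) {
            return readAll(in);
        }
    }

    // Uses the charset constructor: invalid bytes are replaced with U+FFFD.
    static String decodeLeniently(byte[] bytes) throws IOException {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            return readAll(in);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] broken = {'a', (byte) 0xC3, '('};
        System.out.println(decodeLeniently(broken).length() + " chars decoded leniently");
        try {
            decodeStrictly(broken);
        } catch (IOException e) {
            System.out.println("strict decoder reported: " + e);
        }
    }
}
```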

As for running with

 $ java -Dfile.encoding=utf8 SomeTrulyRemarkablyLongcLassNameGoeShere

The problem is that this will not use the full encoder argument form for the character streams, and so you will again miss encoding problems.

Longer Example

Here's a longer example, this one managing a process instead of a file, where we promote two different input byte streams and one output byte stream all to UTF-8 character streams with full exception handling:

 // this runs a perl script with UTF-8 STD{IN,OUT,ERR} streams
 Process
 slave_process = Runtime.getRuntime().exec("perl -CS script args");

 // fetch his stdin byte stream...
 OutputStream
 __bytes_into_his_stdin  = slave_process.getOutputStream();

 // and make a character stream with exceptions on encoding errors
 OutputStreamWriter
   chars_into_his_stdin  = new OutputStreamWriter(
                             __bytes_into_his_stdin,
         /* DO NOT OMIT! */  Charset.forName("UTF-8").newEncoder()
                         );

 // fetch his stdout byte stream...
 InputStream
 __bytes_from_his_stdout = slave_process.getInputStream();

 // and make a character stream with exceptions on encoding errors
 InputStreamReader
   chars_from_his_stdout = new InputStreamReader(
                             __bytes_from_his_stdout,
         /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
                         );

// fetch his stderr byte stream...
 InputStream
 __bytes_from_his_stderr = slave_process.getErrorStream();

 // and make a character stream with exceptions on encoding errors
 InputStreamReader
   chars_from_his_stderr = new InputStreamReader(
                             __bytes_from_his_stderr,
         /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
                         );

Now you have three character streams that all raise exceptions on encoding errors, respectively called chars_into_his_stdin, chars_from_his_stdout, and chars_from_his_stderr.

This is only slightly more complicated than what you need for your problem, whose solution I gave in the first half of this answer. The key point is that this is the only way to detect encoding errors.

Just don't get me started about PrintStreams eating exceptions.

Answered by Phuong

With Chinese text, I tried the UTF-16 charset and luckily it worked.

Hope this could help!

PrintWriter out = new PrintWriter( file, "UTF-16" );
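
For comparison, here is a small sketch (hypothetical class name and temp file) that writes Chinese text with the same PrintWriter constructor and reads it back with the same charset name. Both UTF-16 and UTF-8 can represent Chinese characters, so either should round-trip as long as the reader uses the encoding the writer used.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;

class ChineseTextExample {

    // Writes text to a temp file with the named charset, then decodes the
    // file's bytes with the same charset and returns the result.
    static String roundTrip(String text, String charsetName) throws IOException {
        Path tmp = Files.createTempFile("chinese", ".txt"); // hypothetical temp file
        try {
            try (PrintWriter out = new PrintWriter(tmp.toFile(), charsetName)) {
                out.print(text);
            }
            return new String(Files.readAllBytes(tmp), charsetName);
        } finally {
            Files.delete(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4E2D\u6587"; // the two characters of the word for "Chinese (language)"
        System.out.println(text.equals(roundTrip(text, "UTF-16")));
        System.out.println(text.equals(roundTrip(text, "UTF-8")));
    }
}
```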

Answered by Phan Ng?c Hoàng D??ng

In my opinion

If you want to write UTF-8, you can create a byte array from your string, making sure to name the charset when you call getBytes. For example: byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);

Then, you can write each byte to the file you created. Example:

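// requires java.nio.charset.StandardCharsets
OutputStream f = new FileOutputStream(xmlfile);
// Name the charset explicitly; a bare getBytes() would use the platform default
byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);
// Write the encoded bytes one at a time
for (int i = 0; i < by.length; i++) {
    byte b = by[i];
    f.write(b);
}
f.close();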
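
A slightly shorter variant of the same idea, as a sketch with hypothetical names: OutputStream.write(byte[]) writes the whole array in one call, and try-with-resources closes the stream even if the write fails.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ByteArrayWriteExample {

    // Prepends an XML declaration to body, encodes the result as UTF-8,
    // and writes all bytes to xmlFile in a single call.
    static void writeXml(File xmlFile, String body) throws IOException {
        String document = "<?xml version=\"1.0\" encoding=\"utf-8\"?>" + body;
        byte[] bytes = document.getBytes(StandardCharsets.UTF_8);
        try (OutputStream out = new FileOutputStream(xmlFile)) {
            out.write(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        File xmlFile = File.createTempFile("example", ".xml"); // hypothetical temp file
        writeXml(xmlFile, "<name>Zo\u00EB</name>");
        System.out.println(xmlFile.length() + " bytes written");
    }
}
```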

Answered by Lars Briem

Since Java 7 there is an easy way to handle the character encoding of BufferedWriters and BufferedReaders. You can create a BufferedWriter directly by using the Files class instead of creating various instances of Writer. You can simply create a BufferedWriter that takes the character encoding into account by calling:

Files.newBufferedWriter(file.toPath(), StandardCharsets.UTF_8);

You can find more about it in the Javadoc.
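
A minimal sketch of this approach, with hypothetical class and method names, writes a few lines through Files.newBufferedWriter and reads them back with Files.readAllLines. As far as I know, the writer returned by Files.newBufferedWriter also throws an IOException when text cannot be encoded, rather than replacing it silently.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class NioWriterExample {

    // Writes each string as one UTF-8 encoded line.
    static void writeLines(Path path, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    // Reads the whole file back as UTF-8 lines.
    static List<String> readLines(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("actresses", ".csv"); // hypothetical temp file
        writeLines(path, Arrays.asList("Zo\u00EB Salda\u00F1a", "\u7AE0\u5B50\u6021"));
        System.out.println(readLines(path).size() + " lines read back");
        Files.delete(path);
    }
}
```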

Answered by mortensi

Since Java 11 you can do:

FileWriter fw = new FileWriter("filename.txt", Charset.forName("utf-8"));
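
A short sketch of how this might be used end to end (hypothetical class and method names; needs Java 11 or later). FileReader gained a matching constructor in the same release, so the file can be read back with the same charset.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileWriterCharsetExample {

    // Writes text as UTF-8 using the Java 11 FileWriter(String, Charset) constructor.
    static void write(String fileName, String text) throws IOException {
        try (FileWriter fw = new FileWriter(fileName, StandardCharsets.UTF_8)) {
            fw.write(text);
        }
    }

    // Reads the whole file as UTF-8 using the Java 11 FileReader(String, Charset) constructor.
    static String read(String fileName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (FileReader fr = new FileReader(fileName, StandardCharsets.UTF_8)) {
            int c;
            while ((c = fr.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("filename", ".txt"); // hypothetical temp file
        write(tmp.toString(), "Zo\u00EB");
        System.out.println(read(tmp.toString()).length() + " chars read back");
        Files.delete(tmp);
    }
}
```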

Answered by code ??

OK it's 2019 now, and from Java 11 you have a constructor with Charset:

FileWriter(String fileName, Charset charset)

Unfortunately, we still cannot modify the byte buffer size, and it's set to 8192. (https://www.baeldung.com/java-filewriter)

Answered by zakaria

Use an OutputStreamWriter wrapped around a FileOutputStream instead of a FileWriter to set the encoding type:

// file is the File object you want to write your data to
OutputStream outputStream = new FileOutputStream(file);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
outputStreamWriter.write(json); // json is your data 
outputStreamWriter.flush();
outputStreamWriter.close();