Original source: http://stackoverflow.com/questions/4389005/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to add a UTF-8 BOM in Java?
Asked by Fadd
I have a Java stored procedure which fetches records from a table using a ResultSet object and creates a CSV file.
BLOB retBLOB = BLOB.createTemporary(conn, true, BLOB.DURATION_SESSION);
retBLOB.open(BLOB.MODE_READWRITE);
OutputStream bOut = retBLOB.setBinaryStream(0L);
ZipOutputStream zipOut = new ZipOutputStream(bOut);
PrintStream out = new PrintStream(zipOut,false,"UTF-8");
out.write('\ufeff');
out.flush();
zipOut.putNextEntry(new ZipEntry("filename.csv"));
while (rs.next()) {
    out.print("\"" + rs.getString(i) + "\"");
    out.print(",");
}
out.flush();
zipOut.closeEntry();
zipOut.close();
retBLOB.close();
return retBLOB;
But the generated CSV file doesn't display German characters correctly. The Oracle database also has an NLS_CHARACTERSET value of UTF8.
Please suggest.
Accepted answer by axtavt
To write a BOM in UTF-8 you need PrintStream.print(), not PrintStream.write().
Also, if you want to have a BOM in your CSV file, I guess you need to print the BOM after putNextEntry(), so that it becomes part of that entry.
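The two points above can be sketched together as follows. This is a minimal illustration, not the asker's actual stored procedure: it assumes an in-memory ByteArrayOutputStream in place of the Oracle BLOB stream, and the class name ZipCsvWithBom, the method buildZip, and the sample CSV content are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCsvWithBom {

    // Builds a ZIP archive in memory with a single CSV entry that starts with a UTF-8 BOM.
    // The ByteArrayOutputStream stands in for the BLOB stream used in the question.
    static byte[] buildZip(String entryName, String csvContent) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ZipOutputStream zipOut = new ZipOutputStream(buffer);
        PrintStream out = new PrintStream(zipOut, false, "UTF-8");

        // Open the entry first, so everything printed afterwards belongs to it.
        zipOut.putNextEntry(new ZipEntry(entryName));

        // print(char) encodes U+FEFF with the stream's charset (UTF-8 here).
        out.print('\ufeff');
        out.print(csvContent);
        out.flush();

        zipOut.closeEntry();
        zipOut.close();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Sample row with German characters, written with Unicode escapes.
        byte[] zip = buildZip("filename.csv", "\"Gr\u00f6\u00dfe\",\"Stra\u00dfe\"\n");
        System.out.println("ZIP size in bytes: " + zip.length);
    }
}
```

Reading the entry back with a ZipInputStream should show the bytes 0xEF 0xBB 0xBF at the start of filename.csv, followed by the CSV text.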
Answer by Stephen C
I think that out.write('\ufeff'); should actually be out.print('\ufeff');.
According to the javadoc, the write(int) method actually writes a single byte, without any character encoding. So out.write('\ufeff'); writes only the byte 0xff (the low-order 8 bits of the argument). By contrast, the print(char) method encodes the character as one or more bytes using the stream's encoding, and then writes those bytes.
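The difference can be checked with a small experiment. The sketch below is illustrative only: the class name WriteVersusPrint and its helper methods are hypothetical, and it captures the output in a ByteArrayOutputStream so the emitted bytes can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class WriteVersusPrint {

    // Returns the bytes produced by PrintStream.write(int) for U+FEFF.
    static byte[] viaWrite() throws UnsupportedEncodingException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, false, "UTF-8");
        out.write('\ufeff'); // write(int) keeps only the low-order 8 bits
        out.flush();
        return buffer.toByteArray();
    }

    // Returns the bytes produced by PrintStream.print(char) for U+FEFF.
    static byte[] viaPrint() throws UnsupportedEncodingException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, false, "UTF-8");
        out.print('\ufeff'); // print(char) encodes the character as UTF-8
        out.flush();
        return buffer.toByteArray();
    }

    // Formats a byte array as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println("write: " + hex(viaWrite())); // prints "write: FF"
        System.out.println("print: " + hex(viaPrint())); // prints "print: EF BB BF"
    }
}
```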
Answer by astro
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8));
out.write('\ufeff');
out.write(...);
This correctly writes out 0xEF 0xBB 0xBF to the file, which is the UTF-8 representation of the BOM.
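A runnable variant of this Writer-based approach might look like the sketch below. It is an assumption-laden illustration: it swaps in Files.newBufferedWriter for the OutputStreamWriter/FileOutputStream chain, writes to a temporary file, and uses the hypothetical names BomFileWriter and writeCsv.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BomFileWriter {

    // Writes a UTF-8 BOM followed by the given CSV text to the target file.
    static void writeCsv(Path target, String csvContent) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            // Writer.write(int) writes one character, which the writer encodes as EF BB BF.
            out.write('\ufeff');
            out.write(csvContent);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom-demo", ".csv");
        writeCsv(tmp, "\"M\u00fcnchen\"\n");

        // Read the file back and show its first three bytes.
        byte[] bytes = Files.readAllBytes(tmp);
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF); // prints "EF BB BF"
        Files.delete(tmp);
    }
}
```

Unlike PrintStream.write(int), Writer.write(int) treats its argument as a character rather than a raw byte, which is why the same call produces the full three-byte BOM here.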
Answer by Rocio
In my case it works with the code:
PrintWriter out = new PrintWriter(new File(filePath), "UTF-8");
out.write(csvContent);
out.flush();
out.close();
Answer by Christopher Schultz
Just in case people are using PrintStreams, you need to do it a little differently. While a Writer will do some magic to convert a single character (U+FEFF) into 3 bytes, PrintStream.write(int) emits only one byte per call, so it requires all 3 bytes of the UTF-8 BOM individually:
// Print utf-8 BOM
PrintStream out = System.out;
out.write('\ufeef'); // emits 0xef
out.write('\ufebb'); // emits 0xbb
out.write('\ufebf'); // emits 0xbf
Alternatively, you can use the hex values for those directly:
PrintStream out = System.out;
out.write(0xef); // emits 0xef
out.write(0xbb); // emits 0xbb
out.write(0xbf); // emits 0xbf