How to add a UTF-8 BOM in Java?

Question

Asked by Fadd

I have a Java stored procedure which fetches records from a table using a ResultSet object and creates a CSV file.

BLOB retBLOB = BLOB.createTemporary(conn, true, BLOB.DURATION_SESSION);
retBLOB.open(BLOB.MODE_READWRITE);
OutputStream bOut = retBLOB.setBinaryStream(0L);

ZipOutputStream zipOut = new ZipOutputStream(bOut);
PrintStream out = new PrintStream(zipOut,false,"UTF-8");
out.write('\ufeff');
out.flush();

zipOut.putNextEntry(new ZipEntry("filename.csv"));
while (rs.next()){
    out.print("\"" + rs.getString(i) + "\"");
    out.print(",");
}
out.flush();

zipOut.closeEntry();
zipOut.close();
retBLOB.close();

return retBLOB;

But the generated CSV file doesn't show German characters correctly. The Oracle database also has an NLS_CHARACTERSET value of UTF8.

Please suggest a fix.

Answer 1

Accepted answer by axtavt

To write a BOM in UTF-8 you need PrintStream.print(), not PrintStream.write().


Also, if you want the BOM to be inside your CSV file, I guess you need to print the BOM after putNextEntry().
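As a minimal sketch of that suggestion, the following writes the BOM with print() after putNextEntry(), so it becomes the first three bytes of the CSV entry. The class name ZipCsvWithBom, the helper methods, and the sample row are hypothetical, and a ByteArrayOutputStream stands in for the BLOB stream from the question. Note that PrintStream does not propagate IOExceptions, so a write issued before any entry is open would likely fail silently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipCsvWithBom {

    // Builds an in-memory zip holding one CSV entry that starts with a UTF-8 BOM.
    static byte[] buildZip(String csvLine) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ZipOutputStream zipOut = new ZipOutputStream(buf);
        PrintStream out = new PrintStream(zipOut, false, "UTF-8");

        zipOut.putNextEntry(new ZipEntry("filename.csv"));
        out.print('\ufeff'); // BOM goes inside the entry, encoded as UTF-8
        out.print(csvLine);
        out.flush();

        zipOut.closeEntry();
        zipOut.close();
        return buf.toByteArray();
    }

    // Reads back the raw bytes of the first entry so the BOM can be inspected.
    static byte[] readFirstEntry(byte[] zip) throws IOException {
        try (ZipInputStream zipIn = new ZipInputStream(new ByteArrayInputStream(zip))) {
            zipIn.getNextEntry();
            ByteArrayOutputStream content = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = zipIn.read(chunk)) != -1) {
                content.write(chunk, 0, n);
            }
            return content.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample row containing German characters.
        byte[] content = readFirstEntry(buildZip("\"Gr\u00fc\u00dfe\","));
        System.out.printf("%02X %02X %02X%n",
                content[0] & 0xFF, content[1] & 0xFF, content[2] & 0xFF);
    }
}
```

Reading the entry back is only there to check the result; in the stored procedure the zip bytes would go to the BLOB instead.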

Answer 2

Answer by Stephen C

I think that out.write('\ufeff'); should actually be out.print('\ufeff');.

According to the javadoc, the write(int) method actually writes a single byte, without any character encoding. So out.write('\ufeff'); writes the byte 0xff. By contrast, the print(char) method encodes the character as one or more bytes using the stream's encoding, and then writes those bytes.
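The difference can be checked with a small sketch that captures what each call emits on a UTF-8 PrintStream. The class name WriteVsPrint and its helper methods are hypothetical, and a ByteArrayOutputStream is used only to make the bytes visible.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class WriteVsPrint {

    // Bytes produced by PrintStream.write(int): only the low-order byte is written.
    static byte[] viaWrite() throws UnsupportedEncodingException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buf, false, "UTF-8");
        out.write('\ufeff');
        out.flush();
        return buf.toByteArray();
    }

    // Bytes produced by PrintStream.print(char): the character is encoded as UTF-8.
    static byte[] viaPrint() throws UnsupportedEncodingException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buf, false, "UTF-8");
        out.print('\ufeff');
        out.flush();
        return buf.toByteArray();
    }

    // Formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println("write: " + hex(viaWrite())); // write: FF
        System.out.println("print: " + hex(viaPrint())); // print: EF BB BF
    }
}
```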

Answer 3

Answer by astro

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8));
out.write('\ufeff');
out.write(...);

This correctly writes out 0xEF 0xBB 0xBF to the file, which is the UTF-8 representation of the BOM.

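A self-contained version of this approach might look like the sketch below. The class name WriterBom, the method writeWithBom, and the temporary file are hypothetical stand-ins for the elided FileOutputStream argument and content above.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class WriterBom {

    // Writes a BOM followed by the text as UTF-8, then returns the file's raw bytes.
    static byte[] writeWithBom(Path target, String text) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(target.toFile()), StandardCharsets.UTF_8))) {
            out.write('\ufeff'); // Writer.write(int) writes a character, not a byte
            out.write(text);
        }
        return Files.readAllBytes(target);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom-demo", ".csv"); // hypothetical target file
        byte[] bytes = writeWithBom(tmp, "a,b");
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
        Files.delete(tmp);
    }
}
```

Unlike PrintStream.write(int), a Writer treats its int argument as a character, which is why the same call yields the three-byte sequence here.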

Answer 4

Answer by Rocio

In my case it works with this code:

PrintWriter out = new PrintWriter(new File(filePath), "UTF-8");
out.write(csvContent);
out.flush();
out.close();

Answer 5

Answer by Christopher Schultz

Just in case people are using PrintStreams, you need to do it a little differently. While a Writer will do some magic to convert a single character into 3 bytes, a PrintStream's write() requires all 3 bytes of the UTF-8 BOM individually:

    // Print utf-8 BOM
    PrintStream out = System.out;
    out.write('\ufeef'); // emits 0xef
    out.write('\ufebb'); // emits 0xbb
    out.write('\ufebf'); // emits 0xbf

Alternatively, you can use the hex values for those directly:


    PrintStream out = System.out;
    out.write(0xef); // emits 0xef
    out.write(0xbb); // emits 0xbb
    out.write(0xbf); // emits 0xbf
