Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/10243029/

Time: 2020-10-31 00:11:21  Source: igfitidea

Exporting to CSV encoding problems

Tags: java, gwt, encoding, jboss, export

Asked by Fofole

I have a listgrid in which all items are displayed correctly, with diacritics exactly as they are stored in the database, both locally and on the JBoss server.

However, on the JBoss server, when I export as CSV, all the characters with diacritics are replaced, so I get something like ????coala instead of ?coala, although the diacritics are displayed correctly in the listgrid.

Locally, both displaying in the listgrid and exporting work fine.

Here is my code for export:

private void Export() {
  String exportAs = (String) m_ExportForm.getField("exportType").getValue();  
  FormItem item = m_ExportForm.getField("showInWindow");  
  boolean showInWindow =  item.getValue() == null ? false : (Boolean) item.getValue();  

  // exportAs is either XML or CSV, which we can do with requestProperties
  Map<String,String> params= new java.util.HashMap<String, String>();
  params.put("Accept-Charset","utf-8");

  DSRequest dsRequestProperties = new DSRequest();
  dsRequestProperties.setHttpHeaders(params);
  dsRequestProperties.setExportValueFields(true);
  dsRequestProperties.setExportAs((ExportFormat)EnumUtil.getEnum(ExportFormat.values(), exportAs));  
  dsRequestProperties.setExportDisplay(showInWindow ? ExportDisplay.WINDOW : ExportDisplay.DOWNLOAD);

  // TODO: move in user-config
  dsRequestProperties.setExportTitleSeparatorChar("_");
  dsRequestProperties.setExportDelimiter(";");

  dsRequestProperties.setExportFilename("export." + extensionsValueMap.get(exportAs));
  dsRequestProperties.setContentType("text/csv; charset=UTF-8");
  m_Target.Export(dsRequestProperties);

  Close();
}

Also, in my JBoss 7 property file I have this:

<system-properties>
  <property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
  <property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
</system-properties>

This works, since the listgrids show diacritics properly.

Also, in my web.xml I have the following for my servlet:

<init-param>
  <param-name>encoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>

Maybe I'm on a wrong track and this is caused by something else.

The file exported locally and the file exported from the JBoss server have exactly the same file size.

Also, for my JBoss JVM I set this property in JAVA_OPTS:

-Dfile.encoding=UTF-8
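
To check whether this flag actually reaches the server JVM, a small check like the following could be run or logged in both environments. It is only a sketch: the class name DefaultCharsetCheck is made up for illustration, and it reports nothing about how the export servlet itself encodes its output.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns a one-line summary of the JVM's encoding-related settings.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Compare this line between the local machine and the JBoss server.
        System.out.println(describe());
    }
}
```

If the two environments print different values, any code that converts between strings and bytes without naming a charset will behave differently on each.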

EDIT: added the params map as suggested. Still no change.

Answered by prunge

It sounds like it's a character encoding/decoding issue.

Your code generated a CSV file in the UTF-8 encoding. However, what program are you using to read the CSV? Windows Notepad? If it's a Windows application, chances are it is assuming the text file is in ISO-8859-1 encoding.
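
The mismatch described here can be reproduced in plain Java. The sketch below encodes a string as UTF-8 and then decodes the bytes as ISO-8859-1, the way a viewer with the wrong assumption would. The class name MisreadDemo and the sample word are hypothetical, not taken from the question.

```java
import java.nio.charset.StandardCharsets;

public class MisreadDemo {
    // Decodes UTF-8 bytes the way a viewer assuming ISO-8859-1 would.
    static String readAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // hypothetical sample: 4 characters, 5 bytes in UTF-8
        String seen = readAsLatin1(word);
        // The accented letter takes two bytes in UTF-8, so the viewer shows two characters for it.
        System.out.println(word.length() + " chars written, " + seen.length() + " chars seen");
    }
}
```

The file itself is intact in this scenario; only the interpretation of its bytes is wrong, which is why changing the viewer's encoding is enough to fix it.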

Option 1:

Tell Notepad or your Windows application which encoding to use. In Notepad, there is an encoding dropdown in the File/Open dialog. Switch this to UTF-8.

Option 2:

Change the encoding in your source code from UTF-8 to ISO-8859-1, which matches Windows' default encoding. Changing the line:

dsRequestProperties.setContentType("application/csv; charset=UTF-8");

to

dsRequestProperties.setContentType("application/csv; charset=ISO-8859-1");

will hopefully do the trick. The org.apache.catalina.connector.URI_ENCODING setting does not affect the file encoding and should be left as it is.
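
Before switching to ISO-8859-1, it may be worth checking that the exported data fits in that charset, since it covers fewer characters than UTF-8. A minimal sketch using the standard CharsetEncoder.canEncode method follows; the class name CharsetFitCheck and the sample strings are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetFitCheck {
    // True if every character of the text can be represented in the given charset.
    static boolean fits(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Hypothetical samples: U+00E9 exists in ISO-8859-1, U+0219 does not.
        System.out.println(fits("caf\u00e9", StandardCharsets.ISO_8859_1));
        System.out.println(fits("\u0219", StandardCharsets.ISO_8859_1));
    }
}
```

Characters that do not fit would be lost on export, so Option 2 only helps when all diacritics in the data are within ISO-8859-1.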

Answered by Joop Eggen

I must admit, I have not seen a charset=... used in this combination. A charset makes more sense for a text type, so first try:

dsRequestProperties.setContentType("text/csv; charset=UTF-8");

The reason: application, which could well indicate binary data, would make applying a charset byte encoding dangerous.



Added: my explanation for the error

Maybe the String asExport received UTF-8 bytes but yields two characters for each multi-byte character instead of one. Those two characters are outside the ASCII range too, and your response somehow wants to deliver ISO-8859-1 (the default, Latin-1), so it writes ??. That makes two errors.
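
The two-step failure suggested here can be imitated in a few lines. This is a sketch under assumptions, not a reconstruction of what the server does: the class name DoubleErrorDemo and the sample character are hypothetical, and US-ASCII stands in for the output charset purely so the replacement is visible with this sample. Which charset the real response writer uses is not known from the question.

```java
import java.nio.charset.StandardCharsets;

public class DoubleErrorDemo {
    // Error 1: UTF-8 bytes are decoded as ISO-8859-1, so one character becomes two.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Error 2: the text is encoded with a charset that lacks those characters.
    // String.getBytes substitutes the charset's replacement byte, which is '?' here.
    static String reencode(String text) {
        byte[] out = text.getBytes(StandardCharsets.US_ASCII); // assumption: stand-in output charset
        return new String(out, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String original = "\u0219"; // hypothetical sample: one character, two bytes in UTF-8
        System.out.println(reencode(misdecode(original))); // prints "??"
    }
}
```

Once the second step has run, the original bytes are gone, so no viewer setting can recover the text.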

You could inspect asExport. As for why writing in UTF-8 does not succeed despite charset=UTF-8...

Answered by Fofole

You probably have some additional FilterServlets in your JBoss setup that are interfering with the encoding. Possibly related to authentication or compression.
