Windows: "Invalid byte 2 of 4-byte UTF-8 sequence", but only when executing the JAR?

Question

Asked by Daniel Montes de Oca

I have a Java program that takes an XML string from a SQL Server database, transforms it with TransformerFactory, and writes it to a file; that file is then used to generate a PDF.

It works fine when I run it from NetBeans, but if I run the JAR from the project's dist folder I get an "Invalid byte 2 of 4-byte UTF-8 sequence" error.

After changing the encoding of the XML string to UTF-8, it now works from the JAR too.

So my question is, why would it work when running the project in NetBeans but not from the JAR file before changing the encoding?

I have only tried this on Windows.

Code:

Here is the code that reads the SQL Server query result (original):

SQLXML xml = null;
String xmlString = "";
while (rs.next()){
    xml = rs.getSQLXML(1);
    xmlString = xml.getString();
}
return xmlString;

...and modified:

SQLXML xml = null;
String xmlString = "";
while (rs.next()){
    xml = rs.getSQLXML(1);
    // Note explicit UTF-8 encoding specified
    xmlString = new String(xml.getString().getBytes(),"UTF8");
}
return xmlString;

And here is the transformation:

public static void serialize(Document doc, OutputStream out) throws Exception {
    TransformerFactory tfactory = TransformerFactory.newInstance();
    try {
        Transformer serializer = tfactory.newTransformer();
        serializer.setOutputProperty("indent", "yes");
        serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        serializer.transform(new DOMSource(doc), new StreamResult(out));
    } catch (TransformerException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}

Answer 1

Accepted answer by Luciano

I tried a simple application in NetBeans that prints Charset.defaultCharset(), and it returns "UTF-8". The same application in Eclipse returns "MacRoman". I'm on a Mac; on a Western-locale Windows machine it would typically return "windows-1252" (also known as Cp1252).
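
A check along these lines can be reproduced with a few lines of Java. This is only a minimal sketch: the class name DefaultCharsetCheck and the helper methods are made up for illustration, and the value printed depends entirely on how the JVM was launched, so no expected output is shown for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original program.
public class DefaultCharsetCheck {

    // Name of the charset the JVM uses whenever none is given explicitly,
    // for example in String.getBytes() or new String(byte[]).
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // Encoding with an explicit charset gives the same bytes
    // no matter how the JVM was started.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        // U+00F1 (n with tilde) takes two bytes in UTF-8.
        System.out.println("UTF-8 bytes for U+00F1: " + toUtf8("\u00f1").length);
    }
}
```

Running the same class once from the IDE and once with java -jar should show whether the two launches really differ. If they do, starting the JAR with the -Dfile.encoding=UTF-8 system property is a common way to align them, although passing the charset explicitly in code is the less fragile option.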

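So yes, when you run an application in NetBeans, the default encoding is UTF-8, which is why you had no problems parsing the XML there. When the JAR is launched on its own, the JVM uses the platform default instead, so any bytes produced without an explicit charset are no longer valid UTF-8 for the XML parser.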
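
The specific wording of the error is consistent with this explanation. In a single-byte Western charset, accented letters such as U+00F1 or U+00F3 are encoded as one byte in the range 0xF0 to 0xF7, which a UTF-8 decoder reads as the first byte of a 4-byte sequence; the plain ASCII byte that follows is then an invalid "byte 2". The sketch below illustrates this under some assumptions: the class name, helper methods, and sample XML are hypothetical, and ISO-8859-1 stands in for windows-1252 because it is guaranteed to be available and encodes these letters identically.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
public class CharsetMismatchDemo {

    // Encodes text the way a single-byte Western default charset would.
    // ISO-8859-1 is used as a stand-in for windows-1252 (assumption for
    // illustration; both map U+00F1 to the byte 0xF1).
    static byte[] encodeLatin(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Returns true only if the bytes are well-formed UTF-8. The decoder is
    // configured to report errors instead of silently replacing bad input.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<name>Mu\u00f1oz</name>"; // hypothetical sample data

        byte[] latinBytes = encodeLatin(xml);
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);

        // 0xF1 followed by 'o' is not a valid UTF-8 sequence.
        System.out.println(isValidUtf8(latinBytes)); // false
        System.out.println(isValidUtf8(utf8Bytes));  // true
    }
}
```

The question does not show where the XML string is turned into bytes, so this only demonstrates the mechanism, not the exact line at fault. In general, passing StandardCharsets.UTF_8 wherever getBytes(), new String(...), or a Reader/Writer is created keeps the result independent of the launch environment.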
