UTF-8 issues in Java XML parsing

Question

Asked by divz

I am using the following two snippets to convert XML content to UTF-8, but neither works properly:

1.

InputStream is = new ByteArrayInputStream(strXMLAlert.getBytes("UTF-8"));
Document doc = db.parse(is);

2.

InputSource is = new InputSource(new ByteArrayInputStream(strXMLAlert.getBytes()));
is.setCharacterStream(new StringReader(strXMLAlert));
is.setEncoding("UTF-8");
Document doc = db.parse(is);

Answer 1

Accepted answer by Mike Mansell

We probably need a bit more information to answer the question properly. For example, what problem are you seeing? Which Java version are you running?

However, I expanded your first example to the following:

DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
String strXMLAlert = "<a>永</a>";
InputStream is = new ByteArrayInputStream(strXMLAlert.getBytes("UTF-8"));
Document document = db.parse(is);
Node item = document.getDocumentElement().getChildNodes().item(0);
String nodeValue = item.getNodeValue();
System.out.println(nodeValue);

In this example, the string contains a Chinese character. It successfully prints out:

永

Your second example should also work, although you are providing the content twice. Either provide it as a set of bytes and provide the encoding, or just provide it as characters (the StringReader) and you don't need the encoding (since as characters, it's already been decoded from bytes to characters).
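
The two options can be sketched side by side. This is only an illustration: the class name XmlInputDemo and the helper names parseFromBytes and parseFromChars are made up here and are not part of the original question. The sample character is written as a Unicode escape so that the encoding of the source file does not affect the result. Note that when an InputSource has a character stream, the parser reads that stream and ignores any byte stream or encoding set on it, which is why the second snippet in the question effectively uses only the StringReader.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XmlInputDemo {

    // Option 1: give the parser bytes and state which encoding produced them.
    static String parseFromBytes(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        source.setEncoding("UTF-8");
        Document doc = db.parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    // Option 2: give the parser characters; no byte encoding is involved.
    static String parseFromChars(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(new StringReader(xml));
        Document doc = db.parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // "\u6c38" is the Chinese character used in the answer's example.
        String xml = "<a>\u6c38</a>";
        System.out.println(parseFromBytes(xml));
        System.out.println(parseFromChars(xml));
    }
}
```

Both helpers should return the same text for the same input. If the printed character still looks wrong, the console's output encoding is worth checking, since the parsed string itself may already be correct.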
