Java: How do I load an org.w3c.dom.Document from XML in a String?

Question

Asked by Frank Krueger

I have a complete XML document in a string and would like a Document object. Google turns up all sorts of garbage. What is the simplest solution? (In Java 1.5)

Solution

Thanks to Matt McMinn, I have settled on this implementation. It has the right level of input flexibility and exception granularity for me. (It's good to know whether an error came from malformed XML, which raises a SAXException, or just from bad IO, which raises an IOException.)

public static org.w3c.dom.Document loadXMLFrom(String xml)
    throws org.xml.sax.SAXException, java.io.IOException {
    // Note: getBytes() encodes with the platform default charset.
    return loadXMLFrom(new java.io.ByteArrayInputStream(xml.getBytes()));
}

public static org.w3c.dom.Document loadXMLFrom(java.io.InputStream is)
    throws org.xml.sax.SAXException, java.io.IOException {
    javax.xml.parsers.DocumentBuilderFactory factory =
        javax.xml.parsers.DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    javax.xml.parsers.DocumentBuilder builder;
    try {
        builder = factory.newDocumentBuilder();
    }
    catch (javax.xml.parsers.ParserConfigurationException ex) {
        // Do not swallow this: a null builder would fail below with a NullPointerException.
        throw new IllegalStateException("Could not create a DocumentBuilder", ex);
    }
    try {
        return builder.parse(is);
    }
    finally {
        // Close the stream even when parsing fails.
        is.close();
    }
}
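To illustrate the exception granularity mentioned above, here is a minimal sketch of how a caller might tell the two failure modes apart. The class name XmlErrorKinds and the methods load and describe are hypothetical names made up for this example, and the sketch parses from a StringReader rather than from bytes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original post.
class XmlErrorKinds {

    // Parses the string and lets SAXException and IOException propagate separately.
    static Document load(String xml) throws SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException ex) {
            // A configuration problem is a setup error, not an input error.
            throw new IllegalStateException("Could not create a DocumentBuilder", ex);
        }
    }

    // Reports which kind of failure occurred, if any.
    static String describe(String xml) {
        try {
            load(xml);
            return "ok";
        } catch (SAXException ex) {
            return "malformed XML";
        } catch (IOException ex) {
            return "I/O failure";
        }
    }
}
```

Calling XmlErrorKinds.describe("<a><b></a>") takes the SAXException branch because the b element is never closed. The JDK's default error handler may also print a message to standard error when that happens.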

Answer 1

Accepted answer by Matt McMinn

This works for me in Java 1.5 - I stripped out specific exceptions for readability.

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;

public Document loadXMLFromString(String xml) throws Exception
{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    factory.setNamespaceAware(true);
    DocumentBuilder builder = factory.newDocumentBuilder();

    return builder.parse(new ByteArrayInputStream(xml.getBytes()));
}

Answer 2

Answered by erickson

Whoa there!

There's a potentially serious problem with this code: it ignores the character encoding specified in the XML declaration inside the String (which is UTF-8 by default). When you call String.getBytes(), the platform default encoding is used to encode Unicode characters to bytes. So the parser may think it's getting UTF-8 data when in fact it's getting EBCDIC or something… not pretty!

Instead, use the parse method that takes an InputSource, which can be constructed with a Reader, like this:

import java.io.StringReader;
import org.xml.sax.InputSource;
…
        return builder.parse(new InputSource(new StringReader(xml)));

It may not seem like a big deal, but ignorance of character encoding issues leads to insidious code rot akin to y2k.
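As a rough, self-contained sketch of the Reader-based approach: the class name ReaderParseDemo and the method rootText are hypothetical, and the sample document is made up for illustration. Because the parser receives characters rather than bytes, no byte encoding step is involved, and the encoding named in the XML declaration is not used to decode the input.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class ReaderParseDemo {

    // Parses from a character stream and returns the root element's text.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u00eb is a non-ASCII character (e with diaeresis).
        String xml =
            "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Zo\u00eb</name>";
        // Prints whether the character survived the round trip through the parser.
        System.out.println(rootText(xml).equals("Zo\u00eb"));
    }
}
```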

Answer 3

Answered by shsteimer

I just had a similar problem, except I needed a NodeList rather than a Document. Here's what I came up with. It's mostly the same solution as before, extended to fetch the root element as a NodeList, and it follows erickson's suggestion of using an InputSource to avoid character encoding issues.

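import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Usage fragment: getXmlString() is not shown in the post.
private String DOC_ROOT = "root";
String xml = getXmlString();
Document xmlDoc = loadXMLFrom(xml);
Element template = xmlDoc.getDocumentElement();
NodeList nodes = xmlDoc.getElementsByTagName(DOC_ROOT);

public static Document loadXMLFrom(String xml) throws Exception {
    // Parsing from a Reader avoids any byte-level encoding step.
    InputSource is = new InputSource(new StringReader(xml));
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    DocumentBuilder builder = factory.newDocumentBuilder();
    return builder.parse(is);
}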
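Once you have a NodeList, you typically walk it by index. The following is a minimal sketch under that assumption; the class NodeListDemo, its methods, and the sample element names are hypothetical and only illustrate one way of reading the list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class NodeListDemo {

    static Document load(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the names of the root element's direct child elements,
    // skipping text nodes and other non-element children.
    static List<String> childElementNames(Document doc) {
        List<String> names = new ArrayList<String>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                names.add(node.getNodeName());
            }
        }
        return names;
    }
}
```

A NodeList has no iterator, so the index loop with getLength() and item(i) is the usual way to traverse it.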

Answer 4

Answered by Xavier Dury

To manipulate XML in Java, I always tend to use the Transformer API:

import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public static Document loadXMLFrom(String xml) throws TransformerException {
    Source source = new StreamSource(new StringReader(xml));
    DOMResult result = new DOMResult();
    // An identity transform copies the parsed source into the DOM result.
    TransformerFactory.newInstance().newTransformer().transform(source, result);
    return (Document) result.getNode();
}
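The same API also works in the opposite direction, which may be part of its appeal. Below is a minimal sketch of a round trip from String to Document and back. The class name TransformerRoundTrip and the methods parse and serialize are hypothetical names chosen for this example, and omitting the XML declaration is just one possible output setting.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of the original answer.
class TransformerRoundTrip {

    // String -> Document, using an identity transform into a DOMResult.
    static Document parse(String xml) throws TransformerException {
        DOMResult result = new DOMResult();
        TransformerFactory.newInstance().newTransformer()
            .transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    // Document -> String, using an identity transform into a StringWriter.
    static String serialize(Document doc) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> header so only the element markup is written.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```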
