Java: parsing the contents of an XML file without knowing its structure

Question

Asked by canadiancreed

I've been learning some new tech, using Java to parse files, and for the most part it's going well. However, I'm at a loss as to how to parse an XML file whose structure is not known on receipt. There are lots of examples of how to do so if you know the structure (getElementsByTagName seems to be the way to go), but no dynamic options, at least not that I've found.

So the tl;dr version of this question: how can I parse an XML file when I cannot rely on knowing its structure?

Answer 1

Accepted answer by Jason C

Well, the parsing part is easy. As helderdarocha stated in the comments, the parser only requires valid XML (well-formed, in XML terms) and does not care about the structure. You can use Java's standard DocumentBuilder to obtain a Document:

InputStream in = new FileInputStream(...);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
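To illustrate the point that the parser only cares about well-formedness, here is a minimal sketch. The class name WellFormedCheck and the helper isWellFormed are hypothetical names chosen for this illustration, and the input comes from in-memory strings rather than a file. Note that the JDK's default error handler may also print a diagnostic to standard error when it rejects a document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class WellFormedCheck {

    // Hypothetical helper: true if the parser accepts the text,
    // false if it rejects it as malformed XML.
    static boolean isWellFormed(String xml) throws ParserConfigurationException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed input, e.g. an unclosed tag.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Arbitrary tag and attribute names are accepted as long as the markup is well-formed.
        System.out.println(isWellFormed("<anything><at all='fine'/></anything>"));
        // An unclosed element is rejected regardless of its name.
        System.out.println(isWellFormed("<unclosed>"));
    }
}
```

The first call should report true and the second false: no schema or list of expected tags is involved at any point.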

(If you're parsing multiple documents, you can keep reusing the same DocumentBuilder.)
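As a rough sketch of that reuse, the following parses several documents with a single DocumentBuilder and collects each root tag name. The class ReuseBuilder and the method rootTagNames are made-up names for illustration, and the documents are in-memory strings rather than files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ReuseBuilder {

    // Hypothetical helper: parses every XML string with one shared
    // DocumentBuilder and returns the tag name of each root element.
    static List<String> rootTagNames(List<String> xmlDocuments) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<String> names = new ArrayList<>();
        for (String xml : xmlDocuments) {
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            names.add(doc.getDocumentElement().getTagName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        List<String> inputs = Arrays.asList("<objects/>", "<shapes><circle/></shapes>");
        System.out.println(rootTagNames(inputs)); // prints [objects, shapes]
    }
}
```

This sketch reuses the builder sequentially on one thread; sharing a single DocumentBuilder between threads is a different matter and is not assumed to be safe here.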

Then you can start with the root document element and use the familiar DOM methods from there on out:

Element root = doc.getDocumentElement(); // perform DOM operations starting here.

As for processing it, that really depends on what you want to do with it, but you can use Node methods such as getFirstChild() and getNextSibling() to iterate through the children and process them as you see fit based on structure, tags, and attributes.

Consider the following example:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;


public class XML {

    public static void main (String[] args) throws Exception {

        String xml = "<objects><circle color='red'/><circle color='green'/><rectangle>hello</rectangle><glumble/></objects>";

        // parse
        InputStream in = new ByteArrayInputStream(xml.getBytes("utf-8"));
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);

        // process
        Node objects = doc.getDocumentElement();
        for (Node object = objects.getFirstChild(); object != null; object = object.getNextSibling()) {
            if (object instanceof Element) {
                Element e = (Element)object;
                if (e.getTagName().equalsIgnoreCase("circle")) {
                    String color = e.getAttribute("color");
                    System.out.println("It's a " + color + " circle!");
                } else if (e.getTagName().equalsIgnoreCase("rectangle")) {
                    String text = e.getTextContent();
                    System.out.println("It's a rectangle that says \"" + text + "\".");
                } else {
                    System.out.println("I don't know what a " + e.getTagName() + " is for.");
                }
            }
        }

    }

}

The input XML document (hard-coded in the example) is:

<objects>
    <circle color='red'/>
    <circle color='green'/>
    <rectangle>hello</rectangle>
    <glumble/>
</objects>

The output is:

It's a red circle!
It's a green circle!
It's a rectangle that says "hello".
I don't know what a glumble is for.
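
When not even the tag names are known in advance, the same getFirstChild()/getNextSibling() loop can be applied recursively to discover the structure. The following is a minimal sketch under that assumption; the class XmlWalker and the methods walk and describe are hypothetical names introduced here, and text nodes are deliberately skipped to keep the output short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

class XmlWalker {

    // Appends one line per element: indentation by depth, the tag name,
    // then each attribute as name=value. Recurses into child elements.
    static void walk(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getTagName());
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            out.append(' ').append(attribute.getNodeName())
               .append('=').append(attribute.getNodeValue());
        }
        out.append('\n');
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) { // skip text, comments, etc.
                walk((Element) child, depth + 1, out);
            }
        }
    }

    // Parses the XML text and returns an indented outline of its elements.
    static String describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(describe(
                "<objects><circle color='red'/><rectangle>hello</rectangle><glumble/></objects>"));
    }
}
```

For the sample input this produces an outline with objects at the top level and circle color=red, rectangle, and glumble indented beneath it. From there, the discovered names and attributes can drive whatever processing you need.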
