Java: Why am I getting extra text nodes as child nodes of the root node?

Question

Asked by Vikas Mangal

I want to print the child elements of the root node. This is my XML file.

<?xml version="1.0"?>
<!-- Comment-->
<company>
   <staff id="1001">
       <firstname>yong</firstname>
       <lastname>mook kim</lastname>
       <nickname>mkyong</nickname>
       <salary>100000</salary>
   </staff>
   <staff id="2001">
       <firstname>low</firstname>
       <lastname>yin fong</lastname>
       <nickname>fong fong</nickname>
       <salary>200000</salary>
   </staff>
</company>

As I understand it, the root node is 'company' and its child nodes should be the two 'staff' elements. But when I read them through my Java code, I get 5 child nodes. Where are the 3 extra text nodes coming from?

Java Code:

package com.training.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadingXML {

public static void main(String[] args) {
    try {

        // The backslash must be escaped in a Java string literal.
        File file = new File("D:\\TestFile.xml");

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println("root element: " + doc.getDocumentElement().getNodeName());

        Node rootNode = doc.getDocumentElement(); 
        System.out.println("root: " + rootNode.getNodeName());

        NodeList nList = rootNode.getChildNodes(); 

        for(int i = 0; i < nList.getLength(); i++) {
            System.out.println("node name: " + nList.item(i).getNodeName() );
        }           
    } catch(Exception e) {
        e.printStackTrace();
    }
}
}

OUTPUT:

root element: company
root: company
node name: #text
node name: staff
node name: #text
node name: staff
node name: #text

Why are the three text nodes showing up here?

Answer 1

Accepted answer by Jon Skeet

Why are the three text nodes showing up here?

They're the whitespace between the child elements. If you only want the child elements, you should just ignore nodes of other types:

for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("node name: " + node.getNodeName());
    }
}
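
To confirm that the extra nodes really are just whitespace, it can help to print what each #text child contains. The following is a minimal, self-contained sketch of that idea. The class name ChildNodeDump, the method describeChildren, and the inline XML string (a shortened stand-in for the file in the question) are illustrative assumptions, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ChildNodeDump {

    // Hypothetical sample: a shortened version of the XML from the question.
    static final String XML =
          "<company>\n"
        + "   <staff id=\"1001\"><nickname>mkyong</nickname></staff>\n"
        + "   <staff id=\"2001\"><nickname>fong fong</nickname></staff>\n"
        + "</company>";

    // Returns one description per child of the root element.
    // Text nodes are shown with newlines written as \n and spaces as _.
    static List<String> describeChildren(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.TEXT_NODE) {
                String visible = node.getNodeValue()
                    .replace("\n", "\\n")
                    .replace(" ", "_");
                lines.add("#text \"" + visible + "\"");
            } else {
                lines.add(node.getNodeName());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeChildren(XML)) {
            System.out.println(line);
        }
    }
}
```

For this sample, each #text entry consists only of a line break plus indentation, which is the formatting between the staff elements.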

Or you could change your document to not have that whitespace.
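
As a quick check of this option, the sketch below parses a document with nothing between the tags and counts the root's children. The class name NoWhitespaceDemo, the method countRootChildren, and the compact XML string are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NoWhitespaceDemo {

    // Hypothetical sample: no line breaks or indentation between the tags.
    static final String COMPACT_XML =
        "<company><staff id=\"1001\"/><staff id=\"2001\"/></company>";

    // Counts all child nodes of the root element, whatever their type.
    static int countRootChildren(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // With no text between the tags, only the two staff elements remain.
        System.out.println("child count: " + countRootChildren(COMPACT_XML));
    }
}
```

The trade-off is that the file becomes harder to read by hand, so filtering by node type is usually the less intrusive fix.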

Or you could use a different XML API which allows you to easily ask for just elements. (The DOM API is a pain in various ways.)
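
The answer does not name a particular alternative. As one possibility that ships with the JDK, an XPath expression such as /company/* selects only the element children, so no manual filtering is needed. This is a sketch using javax.xml.xpath; the choice of XPath, the class name XPathChildren, and the inline XML are assumptions for illustration, not something the answer prescribes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathChildren {

    // Hypothetical sample: a shortened version of the XML from the question.
    static final String XML =
          "<company>\n"
        + "   <staff id=\"1001\"><nickname>mkyong</nickname></staff>\n"
        + "   <staff id=\"2001\"><nickname>fong fong</nickname></staff>\n"
        + "</company>";

    // Returns the names of the element children of the root <company> element.
    // In XPath, "*" matches element nodes only, so whitespace text is skipped.
    static List<String> elementChildNames(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            "/company/*",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        for (String name : elementChildNames(XML)) {
            System.out.println("node name: " + name);
        }
    }
}
```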

If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace.
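
One caveat worth noting: a parser can only flag whitespace as element content whitespace when it knows the element's content model, which in practice means the document comes with a DTD or schema. The XML in the question has no DTD, so this check would most likely return false for its text nodes. The sketch below therefore adds a hypothetical internal DTD and turns on validation; the class name, the DTD, and the simplified staff content are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

public class ElementContentWhitespaceDemo {

    // Hypothetical sample: the DTD declares that <company> holds only <staff> elements.
    static final String XML_WITH_DTD =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE company [\n"
        + "  <!ELEMENT company (staff*)>\n"
        + "  <!ELEMENT staff (#PCDATA)>\n"
        + "  <!ATTLIST staff id CDATA #REQUIRED>\n"
        + "]>\n"
        + "<company>\n"
        + "   <staff id=\"1001\">mkyong</staff>\n"
        + "   <staff id=\"2001\">fong fong</staff>\n"
        + "</company>";

    // Returns the names of root children, skipping element content whitespace.
    static List<String> significantChildren(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Validation makes the parser apply the content model from the DTD.
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Text && ((Text) node).isElementContentWhitespace()) {
                continue; // formatting whitespace between elements
            }
            names.add(node.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(significantChildren(XML_WITH_DTD));
    }
}
```

Unlike the node type filter, this approach keeps text nodes that carry real content and drops only the whitespace the content model marks as insignificant.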
