Java: Why am I getting extra text nodes as child nodes of the root node?
Original question: http://stackoverflow.com/questions/20259742/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Why am I getting extra text nodes as child nodes of root node?
Asked by Vikas Mangal
I want to print the child elements of the root node. This is my XML file.
<?xml version="1.0"?>
<!-- Comment-->
<company>
    <staff id="1001">
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2001">
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>
As I understand it, the root node is 'company' and its child nodes should be the two 'staff' elements. But when I read them in my Java code, I get 5 child nodes. Where are the 3 extra text nodes coming from?
Java Code:
package com.training.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadingXML {
    public static void main(String[] args) {
        try {
            // The backslash must be escaped inside a Java string literal.
            File file = new File("D:\\TestFile.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(file);
            doc.getDocumentElement().normalize();
            System.out.println("root element: " + doc.getDocumentElement().getNodeName());

            Node rootNode = doc.getDocumentElement();
            System.out.println("root: " + rootNode.getNodeName());

            // getChildNodes() returns every child node, not only elements.
            NodeList nList = rootNode.getChildNodes();
            for (int i = 0; i < nList.getLength(); i++) {
                System.out.println("node name: " + nList.item(i).getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
OUTPUT:
root element: company
root: company
node name: #text
node name: staff
node name: #text
node name: staff
node name: #text
Why are the three text nodes showing up here?
Accepted answer by Jon Skeet
Why are the three text nodes showing up here?
They're the whitespace between the child elements. If you only want the child elements, you should just ignore nodes of other types:
for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("node name: " + node.getNodeName());
    }
}
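To see that the extra nodes really are whitespace, here is a minimal sketch that parses a shortened copy of the document from a string and labels each child of the root. The class name WhitespaceNodesDemo, the describeChildren helper, and the inline XML are illustrative assumptions, not part of the original question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WhitespaceNodesDemo {

    // Hypothetical, shortened copy of the question's document.
    static final String XML =
            "<company>\n"
          + "  <staff id=\"1001\"><firstname>yong</firstname></staff>\n"
          + "  <staff id=\"2001\"><firstname>low</firstname></staff>\n"
          + "</company>";

    // Labels every child of the root; text nodes are marked as blank or not.
    static List<String> describeChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> labels = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.TEXT_NODE) {
                    // A text node made only of newlines and spaces trims to "".
                    boolean blank = node.getNodeValue().trim().isEmpty();
                    labels.add("#text blank=" + blank);
                } else {
                    labels.add(node.getNodeName());
                }
            }
            return labels;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        describeChildren(XML).forEach(System.out::println);
    }
}
```

Each of the three text nodes holds only the line break and indentation that sit before, between, and after the two staff elements.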
Or you could change your document to not have that whitespace.
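The effect of removing the whitespace can be checked with a small comparison. This is only a sketch: the class CompactDocumentDemo, the countRootChildren helper, and both inline documents are hypothetical stand-ins for the file in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CompactDocumentDemo {

    // Same elements, once with line breaks and indentation, once without.
    static final String INDENTED =
            "<company>\n"
          + "  <staff id=\"1001\"/>\n"
          + "  <staff id=\"2001\"/>\n"
          + "</company>";
    static final String COMPACT =
            "<company><staff id=\"1001\"/><staff id=\"2001\"/></company>";

    // Counts all child nodes of the root, whatever their node type.
    static int countRootChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("indented: " + countRootChildren(INDENTED));
        System.out.println("compact:  " + countRootChildren(COMPACT));
    }
}
```

The trade-off is readability: a document without any whitespace between elements is harder to edit by hand, so filtering in code is usually the more practical option.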
Or you could use a different XML API which allows you to easily ask for just elements. (The DOM API is a pain in various ways.)
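The answer does not name a specific alternative. One option that ships with the JDK is javax.xml.xpath, where the expression /company/* selects only element children of the root. The sketch below assumes that choice; the class XPathChildElementsDemo, the childElementNames helper, and the inline XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathChildElementsDemo {

    // Hypothetical, shortened copy of the question's document.
    static final String XML =
            "<company>\n"
          + "  <staff id=\"1001\"><firstname>yong</firstname></staff>\n"
          + "  <staff id=\"2001\"><firstname>low</firstname></staff>\n"
          + "</company>";

    // Returns the names of the root's element children; "*" never matches text nodes.
    static List<String> childElementNames(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList elements = (NodeList) xpath.evaluate(
                    "/company/*",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < elements.getLength(); i++) {
                names.add(elements.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childElementNames(XML));
    }
}
```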
If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace.
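A caveat worth hedging: the parser can only flag whitespace as element content whitespace when it knows the content model, which in practice means a DTD or schema is available. As far as I understand, without one the method generally returns false. The sketch below therefore assumes a hypothetical internal DTD and turns validation on; the class ElementContentWhitespaceDemo, its helper, and the DTD are illustrative, not part of the original answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ElementContentWhitespaceDemo {

    // Hypothetical document; the internal DTD says company holds only staff elements.
    static final String XML_WITH_DTD =
            "<!DOCTYPE company [\n"
          + "  <!ELEMENT company (staff*)>\n"
          + "  <!ELEMENT staff (firstname)>\n"
          + "  <!ATTLIST staff id CDATA #REQUIRED>\n"
          + "  <!ELEMENT firstname (#PCDATA)>\n"
          + "]>\n"
          + "<company>\n"
          + "  <staff id=\"1001\"><firstname>yong</firstname></staff>\n"
          + "  <staff id=\"2001\"><firstname>low</firstname></staff>\n"
          + "</company>";

    // Lists the root's children, skipping text flagged as element content whitespace.
    static List<String> childrenWithoutElementWhitespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // assumption: validate against the DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored */ }
                @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            NodeList children = builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node instanceof Text && ((Text) node).isElementContentWhitespace()) {
                    continue; // whitespace the DTD marks as insignificant
                }
                names.add(node.getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childrenWithoutElementWhitespace(XML_WITH_DTD));
    }
}
```

Unlike the node-type filter, this keeps text nodes that carry real content and drops only the whitespace the content model declares insignificant.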