Looping over nodes and extracting specific subnode values using Java's XPath
Source: http://stackoverflow.com/questions/3996385/ (StackOverflow, licensed under CC BY-SA 4.0; credit belongs to the original authors)
Asked by BoomShaka
I understand from Googling that it makes more sense to extract data from XML using XPath than by using DOM looping.
At the moment, I have implemented a solution using DOM, but the code is verbose, and it feels untidy and unmaintainable, so I would like to switch to a cleaner XPath solution.
Let's say I have this structure:
<products>
    <product>
        <title>Some title 1</title>
        <image>Some image 1</image>
    </product>
    <product>
        <title>Some title 2</title>
        <image>Some image 2</image>
    </product>
    ...
</products>
I want to be able to run a for loop over each of the <product> elements and, inside this loop, extract the title and image node values.
My code looks like this:
InputStream is = conn.getInputStream();
DocumentBuilder builder =
    DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(is);

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList products = (NodeList) result;

for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        // do some DOM navigation to get the title and image
    }
}
Inside my for loop I get each <product> as a Node, which is cast to an Element.
Can I simply use my instance of XPathExpression to compile and run another XPath on the Node or the Element?
Answered by Gopi
Yes, you can do it like this:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);

// The new XPath expression to find 'title' within 'product'.
expr = xpath.compile("title");

NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        // Find the 'title' in the 'product'
        NodeList nodes = (NodeList) expr.evaluate(product, XPathConstants.NODESET);
        // And here is the title
        System.out.println("TITLE: " + nodes.item(0).getTextContent());
    }
}
Here I have given an example of extracting the 'title' value. You can do the same for 'image'.
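As a rough sketch of how the same idea extends to both child values, the example below compiles one relative expression for 'title' and one for 'image' and evaluates each against every <product> node. The class name ProductXPathDemo, the method extractProducts, and the inline XML string are made up for illustration and are not part of the original answer. It uses the single-argument XPathExpression.evaluate, which returns the string value of the first match and an empty string when nothing matches.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class ProductXPathDemo {

    // Returns one {title, image} pair per <product> element.
    static List<String[]> extractProducts(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression productExpr = xpath.compile("/products/product");
        // Relative expressions (no leading slash) resolve against the context node.
        XPathExpression titleExpr = xpath.compile("title");
        XPathExpression imageExpr = xpath.compile("image");

        NodeList products = (NodeList) productExpr.evaluate(doc, XPathConstants.NODESET);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < products.getLength(); i++) {
            Node product = products.item(i);
            // String-valued evaluation: empty string if the child is missing.
            String title = titleExpr.evaluate(product);
            String image = imageExpr.evaluate(product);
            rows.add(new String[] {title, image});
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<products>"
                + "<product><title>Some title 1</title><image>Some image 1</image></product>"
                + "<product><title>Some title 2</title><image>Some image 2</image></product>"
                + "</products>";
        for (String[] row : extractProducts(xml)) {
            System.out.println("TITLE: " + row[0] + ", IMAGE: " + row[1]);
        }
    }
}
```

Compiling the child expressions once, outside the loop, avoids recompiling them for every product. Asking for a string instead of a NodeList also sidesteps a NullPointerException when a product has no matching child.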
Answered by dogbane
I'm not a big fan of this approach because you have to build a document (which might be expensive) before you can apply XPaths to it.
I've found VTD-XML a lot more efficient when it comes to applying XPaths to documents, because you don't need to load the whole document into memory. Here is some sample code:
final VTDGen vg = new VTDGen();
vg.parseFile("file.xml", false);
final VTDNav vn = vg.getNav();
final AutoPilot ap = new AutoPilot(vn);

ap.selectXPath("/products/product");
while (ap.evalXPath() != -1) {
    System.out.println("PRODUCT:");

    // you could either apply another xpath or simply get the first child
    if (vn.toElement(VTDNav.FIRST_CHILD, "title")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Title: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
    if (vn.toElement(VTDNav.FIRST_CHILD, "image")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Image: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
}
Also see this post on Faster XPaths with VTD-XML.