Note: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/5465840/
XPath, XML Namespaces and Java
Asked by MrWizard54
I've spent the past day attempting to extract a single XML node from the following document, and I can't grasp the nuances of XML namespaces well enough to make it work.
The XML file is too large to post in full, so here is the portion that concerns me:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<XFDL xmlns="http://www.PureEdge.com/XFDL/6.5" xmlns:custom="http://www.PureEdge.com/XFDL/Custom" xmlns:designer="http://www.PureEdge.com/Designer/6.1" xmlns:pecs="http://www.PureEdge.com/PECustomerService" xmlns:xfdl="http://www.PureEdge.com/XFDL/6.5">
<globalpage sid="global">
<global sid="global">
<xmlmodel xmlns:xforms="http://www.w3.org/2003/xforms">
<instances>
<xforms:instance id="metadata">
<form_metadata>
<metadataver version="1.0"/>
<metadataverdate>
<date day="05" month="Jul" year="2005"/>
</metadataverdate>
<title>
<documentnbr number="2062" prefix.army="DA" scope="army" suffix=""/>
<longtitle>HAND RECEIPT/ANNEX NUMBER </longtitle>
</title>
The document continues and is well formed all the way down. I am attempting to extract the "number" attribute from the "documentnbr" tag (three from the bottom).
The code that I'm using to do this looks like this:
/***
* Locates the Document Number information in the file and returns the form number.
* @return File's self-declared number.
* @throws InvalidFormException Thrown when XPath cannot find the "documentnbr" element in the file.
*/
public String getFormNumber() throws InvalidFormException
{
try{
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new XFDLNamespaceContext());
Node result = (Node)xPath.evaluate(QUERY_FORM_NUMBER, doc, XPathConstants.NODE);
if(result != null) {
return result.getNodeValue();
} else {
throw new InvalidFormException("Unable to identify form.");
}
} catch (XPathExpressionException err) {
throw new InvalidFormException("Unable to find form number in file.");
}
}
Where QUERY_FORM_NUMBER is my XPath expression, and XFDLNamespaceContext implements NamespaceContext and looks like this:
public class XFDLNamespaceContext implements NamespaceContext {
@Override
public String getNamespaceURI(String prefix) {
if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
return "http://www.PureEdge.com/XFDL/6.5";
else if ("custom".equals(prefix))
return "http://www.PureEdge.com/XFDL/Custom";
else if ("designer".equals(prefix))
return "http://www.PureEdge.com/Designer/6.1";
else if ("pecs".equals(prefix))
return "http://www.PureEdge.com/PECustomerService";
else if ("xfdl".equals(prefix))
return "http://www.PureEdge.com/XFDL/6.5";
else if ("xforms".equals(prefix))
return "http://www.w3.org/2003/xforms";
else
return XMLConstants.NULL_NS_URI;
}
@Override
public String getPrefix(String arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public Iterator getPrefixes(String arg0) {
// TODO Auto-generated method stub
return null;
}
}
I've tried many different XPath queries but I keep feeling like this should work:
protected static final String QUERY_FORM_NUMBER =
"/globalpage/global/xmlmodel/xforms:instances/instance" +
"/form_metadata/title/documentnbr[number]";
Unfortunately it does not work and I continually get a null return.
I've done a fair amount of reading here, here, and here, but nothing has proved sufficiently illuminating to help me get this working.
I'm almost positive that I'm going to face-palm when I figure this out but I'm really at wit's end as to what I'm missing.
Thank you for reading through all of this and thanks in advance for the help.
-Andy
Accepted answer by Jason S
Aha, I tried to debug your expression and got it to work. You missed a few things. This XPath expression should do it:
/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number
- You need to include the root element (XFDL in this case).
- I didn't end up needing to use any namespaces in the expression for some reason. Not sure why. If this is the case, then NamespaceContext.getNamespaceURI() never gets called. If I replace instance with xforms:instance, then getNamespaceURI() gets called once with xforms as the input argument, but the program throws an exception.
- The syntax for attribute values is @attr, not [attr].
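A likely explanation for the second point, assuming the document is parsed with default settings as in the sample code: DocumentBuilderFactory is not namespace-aware unless setNamespaceAware(true) is called, so the DOM elements carry no namespace and unprefixed names match. Once namespace awareness is switched on, XPath 1.0 treats an unprefixed name as "no namespace", so every element step needs a prefix, and a mapping for the default (empty) prefix in the NamespaceContext is never consulted. The sketch below shows that variant. The class name NamespaceAwareLookup, the trimmed SAMPLE document, and the PREFIXES map are hypothetical and only for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceAwareLookup {
    // Hypothetical, trimmed-down stand-in for the form in the question.
    static final String SAMPLE =
        "<XFDL xmlns=\"http://www.PureEdge.com/XFDL/6.5\">"
      + "<globalpage sid=\"global\"><global sid=\"global\">"
      + "<xmlmodel xmlns:xforms=\"http://www.w3.org/2003/xforms\"><instances>"
      + "<xforms:instance id=\"metadata\"><form_metadata><title>"
      + "<documentnbr number=\"2062\" scope=\"army\"/>"
      + "</title></form_metadata></xforms:instance>"
      + "</instances></xmlmodel></global></globalpage></XFDL>";

    // Prefixes used by the XPath expression; they only have to map to the
    // right URIs, not to match the prefixes written in the document.
    static final Map<String, String> PREFIXES = Map.of(
        "xfdl", "http://www.PureEdge.com/XFDL/6.5",
        "xforms", "http://www.w3.org/2003/xforms");

    // Every element step is prefixed; the attribute has no namespace.
    static final String QUERY =
        "/xfdl:XFDL/xfdl:globalpage/xfdl:global/xfdl:xmlmodel/xfdl:instances"
      + "/xforms:instance/xfdl:form_metadata/xfdl:title/xfdl:documentnbr/@number";

    static String extractNumber(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if (prefix == null) throw new IllegalArgumentException("null prefix");
                    return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                }
                @Override
                public String getPrefix(String namespaceURI) {
                    return null; // not needed for evaluating this expression
                }
                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.emptyIterator();
                }
            });
            // The two-argument evaluate returns the string value of the match.
            return xPath.evaluate(QUERY, doc);
        } catch (Exception e) {
            throw new IllegalStateException("lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractNumber(SAMPLE));
    }
}
```

Note that form_metadata, title and documentnbr inherit the default XFDL namespace from the root element, which is why they take the xfdl prefix here even though they sit inside xforms:instance.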
My complete sample code:
import java.io.File;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
public class XPathNamespaceExample {
static public class MyNamespaceContext implements NamespaceContext {
final private Map<String, String> prefixMap;
MyNamespaceContext(Map<String, String> prefixMap)
{
if (prefixMap != null)
{
this.prefixMap = Collections.unmodifiableMap(new HashMap<String, String>(prefixMap));
}
else
{
this.prefixMap = Collections.emptyMap();
}
}
public String getPrefix(String namespaceURI) {
// TODO Auto-generated method stub
return null;
}
public Iterator getPrefixes(String namespaceURI) {
// TODO Auto-generated method stub
return null;
}
public String getNamespaceURI(String prefix) {
if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
return "http://www.PureEdge.com/XFDL/6.5";
else if ("custom".equals(prefix))
return "http://www.PureEdge.com/XFDL/Custom";
else if ("designer".equals(prefix))
return "http://www.PureEdge.com/Designer/6.1";
else if ("pecs".equals(prefix))
return "http://www.PureEdge.com/PECustomerService";
else if ("xfdl".equals(prefix))
return "http://www.PureEdge.com/XFDL/6.5";
else if ("xforms".equals(prefix))
return "http://www.w3.org/2003/xforms";
else
return XMLConstants.NULL_NS_URI;
}
}
protected static final String QUERY_FORM_NUMBER =
"/XFDL/globalpage/global/xmlmodel/xforms:instances/instance" +
"/form_metadata/title/documentnbr[number]";
public static void main(String[] args) {
try
{
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.parse(new File(args[0]));
System.out.println(extractNodeValue(doc, "/XFDL/globalpage/@sid"));
System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/@id" ));
System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number" ));
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch (ParserConfigurationException e) {
e.printStackTrace();
}
}
private static String extractNodeValue(Document doc, String expression) {
try{
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new MyNamespaceContext(null));
Node result = (Node)xPath.evaluate(expression, doc, XPathConstants.NODE);
if(result != null) {
return result.getNodeValue();
} else {
throw new RuntimeException("can't find expression");
}
} catch (XPathExpressionException err) {
throw new RuntimeException(err);
}
}
}
Answer by Grzegorz Szpetkowski
SAX (alternative to XPath) version:
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
final String[] number = new String[1];
DefaultHandler handler = new DefaultHandler()
{
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException
{
if (qName.equals("documentnbr"))
number[0] = attributes.getValue("number");
}
};
saxParser.parse("input.xml", handler);
System.out.println(number[0]);
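The handler above matches on qName, which works as long as the parser is not namespace-aware (the SAXParserFactory default) and the document writes the element without a prefix. A namespace-aware variant, sketched here as an assumption about how one might harden it, compares the namespace URI and local name instead, so it does not depend on which prefix the document uses. The class name SaxNumberLookup and the choice to keep only the first match are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxNumberLookup {
    static final String XFDL_NS = "http://www.PureEdge.com/XFDL/6.5";

    // Returns the "number" attribute of the first XFDL documentnbr element,
    // or null if the document contains none.
    static String extractNumber(String xml) {
        final String[] number = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so uri and localName are populated
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    if (number[0] == null
                            && XFDL_NS.equals(uri)
                            && "documentnbr".equals(localName)) {
                        // Unprefixed attributes are in no namespace: empty URI.
                        number[0] = attributes.getValue("", "number");
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("SAX lookup failed", e);
        }
        return number[0];
    }
}
```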
Using XPath with namespaces seems more complicated than it should be (in my opinion). Here is my (simple) code:
XPath xpath = XPathFactory.newInstance().newXPath();
NamespaceContextMap contextMap = new NamespaceContextMap();
contextMap.put("custom", "http://www.PureEdge.com/XFDL/Custom");
contextMap.put("designer", "http://www.PureEdge.com/Designer/6.1");
contextMap.put("pecs", "http://www.PureEdge.com/PECustomerService");
contextMap.put("xfdl", "http://www.PureEdge.com/XFDL/6.5");
contextMap.put("xforms", "http://www.w3.org/2003/xforms");
contextMap.put("", "http://www.PureEdge.com/XFDL/6.5");
xpath.setNamespaceContext(contextMap);
String expression = "//:documentnbr/@number";
InputSource inputSource = new InputSource("input.xml");
String number;
number = (String) xpath.evaluate(expression, inputSource, XPathConstants.STRING);
System.out.println(number);
You can get the NamespaceContextMap class (not mine) from here (GPL license). There is also bug 6376058.
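NamespaceContextMap is a third-party class. Where pulling it in is not an option, a small map-backed NamespaceContext can be written by hand. The following is a minimal sketch, not that library's implementation: SimpleNamespaceContext and its bind method are hypothetical names, and it only covers the three interface methods plus the reserved xml and xmlns prefixes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class SimpleNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new HashMap<>();

    // Registers a prefix; returns this so calls can be chained.
    public SimpleNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        if (XMLConstants.XML_NS_PREFIX.equals(prefix)) {
            return XMLConstants.XML_NS_URI;
        }
        if (XMLConstants.XMLNS_ATTRIBUTE.equals(prefix)) {
            return XMLConstants.XMLNS_ATTRIBUTE_NS_URI;
        }
        // Unknown prefixes map to the empty string, as the interface asks.
        return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                matches.add(entry.getKey());
            }
        }
        return Collections.unmodifiableList(matches).iterator();
    }
}
```

It would be passed to xpath.setNamespaceContext(...) in place of contextMap, for example new SimpleNamespaceContext().bind("xfdl", "http://www.PureEdge.com/XFDL/6.5"), with the expression written using the xfdl prefix.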
Answer by gioele
Have a look at the XPathAPI library. It is a simpler way to use XPath without messing with the low-level Java API, especially when dealing with namespaces.
The code to get the number attribute would be:
String num = XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number");
Namespaces are automatically extracted from the root node (doc in this case). In case you need to explicitly define additional namespaces you can use this:
Map<String, String> nsMap = new HashMap<String, String>();
nsMap.put("xforms", "http://www.w3.org/2003/xforms");
String num =
XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number", nsMap);
(Disclaimer: I'm the author of the library.)