XPath, XML Namespaces and Java

Question

Asked by MrWizard54

I've spent the past day attempting to extract a single XML node out of the following document, and I'm unable to grasp the nuances of XML namespaces well enough to make it work.

The XML file is too large to post in full, so here is the portion that concerns me:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<XFDL xmlns="http://www.PureEdge.com/XFDL/6.5" xmlns:custom="http://www.PureEdge.com/XFDL/Custom" xmlns:designer="http://www.PureEdge.com/Designer/6.1" xmlns:pecs="http://www.PureEdge.com/PECustomerService" xmlns:xfdl="http://www.PureEdge.com/XFDL/6.5">
   <globalpage sid="global">
      <global sid="global">
         <xmlmodel xmlns:xforms="http://www.w3.org/2003/xforms">
            <instances>
               <xforms:instance id="metadata">
                  <form_metadata>
                     <metadataver version="1.0"/>
                     <metadataverdate>
                        <date day="05" month="Jul" year="2005"/>
                     </metadataverdate>
                     <title>
                        <documentnbr number="2062" prefix.army="DA" scope="army" suffix=""/>
                        <longtitle>HAND RECEIPT/ANNEX NUMBER </longtitle>
                     </title>

The document continues and is well-formed all the way down. I am attempting to extract the "number" attribute from the "documentnbr" tag (third line from the bottom).

The code that I'm using to do this looks like this:

/***
     * Locates the Document Number information in the file and returns the form number.
     * @return File's self-declared number.
     * @throws InvalidFormException Thrown when XPath cannot find the "documentnbr" element in the file.
     */
    public String getFormNumber() throws InvalidFormException
    {
        try{
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new XFDLNamespaceContext());

            Node result = (Node)xPath.evaluate(QUERY_FORM_NUMBER, doc, XPathConstants.NODE);
            if(result != null) {
                return result.getNodeValue();
            } else {
                throw new InvalidFormException("Unable to identify form.");
            }

        } catch (XPathExpressionException err) {
            throw new InvalidFormException("Unable to find form number in file.");
        }

    }

Where QUERY_FORM_NUMBER is my XPath expression, and XFDLNamespaceContext implements NamespaceContext and looks like this:

public class XFDLNamespaceContext implements NamespaceContext {

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
        else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
            return "http://www.PureEdge.com/XFDL/6.5";
        else if ("custom".equals(prefix))
            return "http://www.PureEdge.com/XFDL/Custom";
        else if ("designer".equals(prefix)) 
            return "http://www.PureEdge.com/Designer/6.1";
        else if ("pecs".equals(prefix)) 
            return "http://www.PureEdge.com/PECustomerService";
        else if ("xfdl".equals(prefix))
            return "http://www.PureEdge.com/XFDL/6.5";      
        else if ("xforms".equals(prefix)) 
            return "http://www.w3.org/2003/xforms";
        else    
            return XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String arg0) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public Iterator getPrefixes(String arg0) {
        // TODO Auto-generated method stub
        return null;
    }

}

I've tried many different XPath queries but I keep feeling like this should work:

protected static final String QUERY_FORM_NUMBER = 
        "/globalpage/global/xmlmodel/xforms:instances/instance" + 
        "/form_metadata/title/documentnbr[number]";

Unfortunately it does not work and I continually get a null return.

I've done a fair amount of reading here, here, and here, but nothing has proved sufficiently illuminating to help me get this working.

I'm almost positive that I'm going to face-palm when I figure this out but I'm really at wit's end as to what I'm missing.

Thank you for reading through all of this and thanks in advance for the help.

-Andy

Answer 1

Accepted answer by Jason S

Aha, I tried to debug your expression and got it to work. You missed a few things. This XPath expression should do it:

/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number

You need to include the root element (XFDL in this case).
I didn't end up needing to use any namespaces in the expression for some reason. Not sure why. If this is the case, then NamespaceContext.getNamespaceURI() never gets called. If I replace instance with xforms:instance, then getNamespaceURI() gets called once with xforms as the input argument, but the program throws an exception.
The syntax for attribute values is @attr, not [attr].
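
A likely explanation for the second point is that DocumentBuilderFactory is not namespace-aware unless setNamespaceAware(true) is called, so the sample code parses the file without namespace information and the unprefixed names match. The sketch below is not part of the original answer; it shows what the lookup might look like once namespace-aware parsing is switched on. The class name NamespaceAwareLookup, the prefix x, and the trimmed SAMPLE document are assumptions made for illustration. XPath 1.0 has no notion of a default namespace, so the XFDL namespace needs some explicit prefix in the expression.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceAwareLookup {
    static final String XFDL_NS = "http://www.PureEdge.com/XFDL/6.5";
    static final String XFORMS_NS = "http://www.w3.org/2003/xforms";

    // Trimmed stand-in for the real form (illustration only).
    static final String SAMPLE =
        "<XFDL xmlns='" + XFDL_NS + "'>"
        + "<globalpage sid='global'><global sid='global'>"
        + "<xmlmodel xmlns:xforms='" + XFORMS_NS + "'><instances>"
        + "<xforms:instance id='metadata'><form_metadata><title>"
        + "<documentnbr number='2062' scope='army'/>"
        + "</title></form_metadata></xforms:instance>"
        + "</instances></xmlmodel></global></globalpage></XFDL>";

    // "x" is an arbitrary prefix chosen here for the document's default namespace.
    // Only <instance> carries the xforms prefix; its children fall back to the default.
    static final String QUERY =
        "/x:XFDL/x:globalpage/x:global/x:xmlmodel/x:instances"
        + "/xforms:instance/x:form_metadata/x:title/x:documentnbr/@number";

    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            final Map<String, String> prefixes = new HashMap<String, String>();
            prefixes.put("x", XFDL_NS);
            prefixes.put("xforms", XFORMS_NS);

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    String uri = prefixes.get(prefix);
                    return uri != null ? uri : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String namespaceURI) {
                    return null; // not needed for this lookup
                }
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.<String>emptyIterator();
                }
            });
            // String value of the first match, or "" when nothing matches.
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(SAMPLE, QUERY));
        // With namespace-aware parsing the unprefixed path no longer matches.
        System.out.println(evaluate(SAMPLE, "/XFDL/globalpage/@sid").isEmpty());
    }
}
```

With this setup the prefixes used in the expression only need to map to the right URIs; they do not have to be the same prefixes the document uses.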

My complete sample code:

import java.io.File;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class XPathNamespaceExample {
    static public class MyNamespaceContext implements NamespaceContext {
        final private Map<String, String> prefixMap;
        MyNamespaceContext(Map<String, String> prefixMap)
        {
            if (prefixMap != null)
            {
                this.prefixMap = Collections.unmodifiableMap(new HashMap<String, String>(prefixMap));
            }
            else
            {
                this.prefixMap = Collections.emptyMap();
            }
        }
        public String getPrefix(String namespaceURI) {
            // TODO Auto-generated method stub
            return null;
        }
        public Iterator getPrefixes(String namespaceURI) {
            // TODO Auto-generated method stub
            return null;
        }
        public String getNamespaceURI(String prefix) {
            if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
            else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
                return "http://www.PureEdge.com/XFDL/6.5";
            else if ("custom".equals(prefix))
                return "http://www.PureEdge.com/XFDL/Custom";
            else if ("designer".equals(prefix))
                return "http://www.PureEdge.com/Designer/6.1";
            else if ("pecs".equals(prefix))
                return "http://www.PureEdge.com/PECustomerService";
            else if ("xfdl".equals(prefix))
                return "http://www.PureEdge.com/XFDL/6.5";
            else if ("xforms".equals(prefix))
                return "http://www.w3.org/2003/xforms";
            else
                return XMLConstants.NULL_NS_URI;
        }


    }

    // Corrected expression: root element included, no prefixes, @number for the attribute.
    protected static final String QUERY_FORM_NUMBER = 
        "/XFDL/globalpage/global/xmlmodel/instances/instance" + 
        "/form_metadata/title/documentnbr/@number";

    public static void main(String[] args) {
        try
        {
            DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
            Document doc = docBuilder.parse(new File(args[0]));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/@sid"));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/@id" ));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number" ));
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }

    private static String extractNodeValue(Document doc, String expression) {
        try{

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MyNamespaceContext(null));

            Node result = (Node)xPath.evaluate(expression, doc, XPathConstants.NODE);
            if(result != null) {
                return result.getNodeValue();
            } else {
                throw new RuntimeException("can't find expression");
            }

        } catch (XPathExpressionException err) {
            throw new RuntimeException(err);
        }
    }
}
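
The difference between [attr] and @attr can be checked in isolation. The following is a small sketch, not taken from the answer above; the class name PredicateVersusAttribute and the one-line SAMPLE document are made up for illustration. It evaluates three variants of the last step and returns the matching nodes so they can be inspected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateVersusAttribute {
    // Namespace-free stand-in for the <title> fragment (illustration only).
    static final String SAMPLE =
        "<title><documentnbr number='2062' scope='army'/></title>";

    static NodeList select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // [number] keeps elements that have a CHILD ELEMENT named "number".
        System.out.println(select("/title/documentnbr[number]").getLength());

        // [@number] keeps the element if it has the attribute; the result is
        // still the element, whose getNodeValue() is null in the DOM.
        System.out.println(select("/title/documentnbr[@number]").item(0).getNodeName());

        // /@number steps to the attribute node, whose value is the number.
        System.out.println(select("/title/documentnbr/@number").item(0).getNodeValue());
    }
}
```

This also suggests why code that calls getNodeValue() on the result needs the expression to end in an attribute step rather than in a predicate.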

Answer 2

Answer by Grzegorz Szpetkowski

SAX (alternative to XPath) version:

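import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The factory is not namespace-aware by default, so qName is the raw tag name.
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
// One-element array so the anonymous handler can write the result.
final String[] number = new String[1];
DefaultHandler handler = new DefaultHandler()
{
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException
    {
        if (qName.equals("documentnbr"))
            number[0] = attributes.getValue("number");
    }
};
saxParser.parse("input.xml", handler);
System.out.println(number[0]);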
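
The handler above matches on qName, which works because SAXParserFactory is not namespace-aware by default. As a rough sketch of a namespace-aware variant (not part of the original answer), the handler can compare the namespace URI and local name instead. The class name SaxFormNumber, the in-memory SAMPLE document, and the decoy element in the urn:example:other namespace are assumptions added for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxFormNumber {
    static final String XFDL_NS = "http://www.PureEdge.com/XFDL/6.5";

    // The first documentnbr is a decoy in a different, made-up namespace.
    static final String SAMPLE =
        "<XFDL xmlns='" + XFDL_NS + "' xmlns:other='urn:example:other'>"
        + "<other:documentnbr number='1'/>"
        + "<title><documentnbr number='2062'/></title>"
        + "</XFDL>";

    static String formNumber(String xml) {
        // One-element array so the anonymous handler can write the result.
        final String[] number = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                            String qName, Attributes attributes) {
                        // Keep the first documentnbr in the XFDL namespace only.
                        if (number[0] == null
                                && XFDL_NS.equals(uri)
                                && "documentnbr".equals(localName)) {
                            // Unprefixed attributes are in no namespace ("").
                            number[0] = attributes.getValue("", "number");
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return number[0];
    }

    public static void main(String[] args) {
        System.out.println(formNumber(SAMPLE));
    }
}
```

Matching on URI and local name keeps the lookup independent of whichever prefix a given file happens to use.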

In my opinion, using XPath with namespaces is more complicated than it should be. Here is my (simple) code:

XPath xpath = XPathFactory.newInstance().newXPath();

NamespaceContextMap contextMap = new NamespaceContextMap();
contextMap.put("custom", "http://www.PureEdge.com/XFDL/Custom");
contextMap.put("designer", "http://www.PureEdge.com/Designer/6.1");
contextMap.put("pecs", "http://www.PureEdge.com/PECustomerService");
contextMap.put("xfdl", "http://www.PureEdge.com/XFDL/6.5");
contextMap.put("xforms", "http://www.w3.org/2003/xforms");
contextMap.put("", "http://www.PureEdge.com/XFDL/6.5");

xpath.setNamespaceContext(contextMap);
String expression = "//:documentnbr/@number";
InputSource inputSource = new InputSource("input.xml");
String number;
number = (String) xpath.evaluate(expression, inputSource, XPathConstants.STRING);
System.out.println(number);

You can get the NamespaceContextMap class (not mine) from here (GPL license). There is also bug 6376058.
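
If pulling in a GPL-licensed helper is not an option, a map-backed NamespaceContext is short enough to write by hand. The class below is a hypothetical stand-in, not the NamespaceContextMap class mentioned above and not its API; the name SimpleNamespaceContext and the bind method are invented for this sketch. It follows the contract described in the NamespaceContext Javadoc for null arguments, the reserved xml and xmlns prefixes, and unbound prefixes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

final class SimpleNamespaceContext implements NamespaceContext {
    // Insertion-ordered so getPrefix() returns the first prefix that was bound.
    private final Map<String, String> prefixToUri = new LinkedHashMap<String, String>();

    // Hypothetical fluent helper for registering a prefix.
    SimpleNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        if (XMLConstants.XML_NS_PREFIX.equals(prefix)) {
            return XMLConstants.XML_NS_URI;
        }
        if (XMLConstants.XMLNS_ATTRIBUTE.equals(prefix)) {
            return XMLConstants.XMLNS_ATTRIBUTE_NS_URI;
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> matches = new ArrayList<String>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                matches.add(entry.getKey());
            }
        }
        return Collections.unmodifiableList(matches).iterator();
    }
}
```

It would be used along the lines of xpath.setNamespaceContext(new SimpleNamespaceContext().bind("xfdl", "http://www.PureEdge.com/XFDL/6.5").bind("xforms", "http://www.w3.org/2003/xforms")), after which prefixed steps such as xfdl:documentnbr can be resolved, provided the document was parsed namespace-aware.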

Answer 3

Answer by gioele

Have a look at the XPathAPI library. It is a simpler way to use XPath without messing with the low-level Java API, especially when dealing with namespaces.

The code to get the number attribute would be:

String num = XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number");

Namespaces are automatically extracted from the root node (doc in this case). In case you need to explicitly define additional namespaces you can use this:

Map<String, String> nsMap = new HashMap<String, String>();
nsMap.put("xforms", "http://www.w3.org/2003/xforms");

String num =
    XPathAPI.selectSingleNodeAsString(doc, "//documentnbr/@number", nsMap);

(Disclaimer: I'm the author of the library.)
