Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/6390339/
How to query XML using namespaces in Java with XPath?
Asked by Inez
When my XML looks like this (no xmlns), I can easily query it with an XPath like /workbook/sheets/sheet[1]:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook>
<sheets>
<sheet name="Sheet1" sheetId="1" r:id="rId1"/>
</sheets>
</workbook>
But when it looks like this, I can't:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<sheets>
<sheet name="Sheet1" sheetId="1" r:id="rId1"/>
</sheets>
</workbook>
Any ideas?
Accepted answer by Mads Hansen
In the second example XML file the elements are bound to a namespace. Your XPath is attempting to address elements that are bound to the default "no namespace" namespace, so they don't match.
The preferred method is to register the namespace with a namespace-prefix. It makes your XPath much easier to develop, read, and maintain.
However, it is not mandatory that you register the namespace and use the namespace-prefix in your XPath.
You can formulate an XPath expression that uses a generic match for an element and a predicate filter that restricts the match to the desired local-name() and namespace-uri(). For example:
/*[local-name()='workbook'
and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
/*[local-name()='sheets'
and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
/*[local-name()='sheet'
and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]
As you can see, it produces an extremely long and verbose XPath statement that is very difficult to read (and maintain).
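To make the prefix-free approach concrete, here is a minimal Java sketch that evaluates that kind of expression with no NamespaceContext registered. The class name LocalNameQuery and the step() helper are made up for illustration; the XML is the second document from the question, inlined as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of any answer on this page.
class LocalNameQuery {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Builds one location step that matches on local name and namespace URI. */
    static String step(String localName) {
        return "/*[local-name()='" + localName + "' and namespace-uri()='" + MAIN_NS + "']";
    }

    /** Returns the name attribute of the first sheet, without any NamespaceContext. */
    static String firstSheetName() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // The parser must be namespace-aware, or the nodes carry no namespace URI.
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            String expr = step("workbook") + step("sheets") + step("sheet") + "[1]";
            Element sheet = (Element) xpath.evaluate(expr, doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetName());
    }
}
```

Generating the steps with a helper hides some of the verbosity, but the expression that reaches the XPath engine is still the long form shown above.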
You could also just match on the local-name() of the element and ignore the namespace. For example:
/*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1]
However, you run the risk of matching the wrong elements. If your XML has mixed vocabularies (which may not be an issue in this instance) that use the same local-name(), your XPath could match the wrong elements and select the wrong content.
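The risk can be seen in a small sketch. The document below is a made-up variation of the workbook in which a second vocabulary (the invented namespace urn:example:other) also defines a sheet element; the class and constant names are hypothetical too.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; the mixed-vocabulary document is invented for illustration.
class MixedVocabularyDemo {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // "urn:example:other" is an assumed second vocabulary that also has a <sheet> element.
    static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" xmlns:x=\"urn:example:other\">"
        + "<sheets>"
        + "<x:sheet name=\"NotASpreadsheetSheet\"/>"
        + "<sheet name=\"Sheet1\" sheetId=\"1\"/>"
        + "</sheets></workbook>";

    // Matches on local name only, so both vocabularies qualify.
    static final String BY_LOCAL_NAME =
        "/*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1]";

    // Adds the namespace check on the last step, so only the spreadsheet <sheet> qualifies.
    static final String BY_LOCAL_NAME_AND_URI =
        "/*[local-name()='workbook']/*[local-name()='sheets']"
        + "/*[local-name()='sheet' and namespace-uri()='" + MAIN_NS + "'][1]";

    /** Returns the name attribute of the element selected by the expression, or null. */
    static String nameOfMatch(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            Element match = (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            return match == null ? null : match.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOfMatch(BY_LOCAL_NAME));
        System.out.println(nameOfMatch(BY_LOCAL_NAME_AND_URI));
    }
}
```

With the local-name()-only expression the first match is the x:sheet element from the other vocabulary; adding the namespace-uri() test selects the spreadsheet sheet instead.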
Answer by stevevls
Your problem is the default namespace. Check out this article for how to deal with namespaces in your XPath: http://www.edankert.com/defaultnamespaces.html
One of the conclusions they draw is:
So, to be able to use XPath expressions on XML content defined in a (default) namespace, we need to specify a namespace prefix mapping
Note that this doesn't mean that you have to change your source document in any way (though you're free to put the namespace prefixes in there if you so desire). Sounds strange, right? What you will do is create a namespace prefix mapping in your Java code and use said prefix in your XPath expression. Here, we'll create a mapping from the prefix spreadsheet to your default namespace.
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();

// there's no default implementation for NamespaceContext...seems kind of silly, no?
xpath.setNamespaceContext(new NamespaceContext() {
    @Override
    public String getNamespaceURI(String prefix) {
        // The NamespaceContext contract asks for IllegalArgumentException on a null prefix.
        if (prefix == null) throw new IllegalArgumentException("Null prefix");
        else if ("spreadsheet".equals(prefix)) return "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
        else if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
    }

    // This method isn't necessary for XPath processing.
    @Override
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // This method isn't necessary for XPath processing either.
    @Override
    public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
});

// note that all the elements in the expression are prefixed with our namespace mapping!
XPathExpression expr = xpath.compile("/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]");

// assuming you've got your XML document in a variable named doc...
Node result = (Node) expr.evaluate(doc, XPathConstants.NODE);
And voila... Now you've got your element saved in the result variable.
Caveat: if you're parsing your XML as a DOM with the standard JAXP classes, be sure to call setNamespaceAware(true) on your DocumentBuilderFactory. Otherwise, this code won't work!
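As a rough illustration of that caveat, the sketch below parses the question's second document twice, once with and once without namespace awareness, and runs the same prefixed XPath against both. The class and method names are invented for this example, and the XML is inlined as a string rather than read from a file.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class showing why the parser has to be namespace-aware.
class NamespaceAwareParsing {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Returns true if the prefixed XPath selects a node for the given parser setting. */
    static boolean findsFirstSheet(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "spreadsheet".equals(prefix) ? MAIN_NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            Node sheet = (Node) xpath.evaluate(
                    "/spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1]",
                    doc, XPathConstants.NODE);
            return sheet != null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:     " + findsFirstSheet(true));
        System.out.println("not namespace-aware: " + findsFirstSheet(false));
    }
}
```

Without namespace awareness the DOM nodes carry no namespace URI, so the prefixed steps have nothing to match against.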
Answer by cordsen
Make sure that you are referencing the namespace in your XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" >
Answer by Wayne
All namespaces that you intend to select from in the source XML must be associated with a prefix in the host language. In Java/JAXP this is done by specifying the URI for each namespace prefix using an instance of javax.xml.namespace.NamespaceContext. Unfortunately, there is no implementation of NamespaceContext provided in the SDK.
Fortunately, it's very easy to write your own:
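import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class SimpleNamespaceContext implements NamespaceContext {

    // Maps each prefix used in XPath expressions to its namespace URI.
    private final Map<String, String> prefixMap = new HashMap<String, String>();

    public SimpleNamespaceContext(final Map<String, String> prefMap) {
        prefixMap.putAll(prefMap);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        // The NamespaceContext contract expects "" rather than null for unbound prefixes.
        String uri = prefixMap.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    // Not needed for XPath evaluation.
    @Override
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // Not needed for XPath evaluation.
    @Override
    public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
}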
Use it like this:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();

// Map each prefix used in the XPath expression to its namespace URI.
Map<String, String> prefMap = new HashMap<String, String>();
prefMap.put("main", "http://schemas.openxmlformats.org/spreadsheetml/2006/main");
prefMap.put("r", "http://schemas.openxmlformats.org/officeDocument/2006/relationships");

SimpleNamespaceContext namespaces = new SimpleNamespaceContext(prefMap);
xpath.setNamespaceContext(namespaces);

XPathExpression expr = xpath.compile("/main:workbook/main:sheets/main:sheet[1]");
// With XPathConstants.NODESET the result is an org.w3c.dom.NodeList.
Object result = expr.evaluate(doc, XPathConstants.NODESET);
Note that even though the first namespace does not specify a prefix in the source document (i.e. it is the default namespace) you must associate it with a prefix anyway. Your expression should then reference nodes in that namespace using the prefix you've chosen, like this:
/main:workbook/main:sheets/main:sheet[1]
The prefix names you choose to associate with each namespace are arbitrary; they do not need to match what appears in the source XML. This mapping is just a way to tell the XPath engine that a given prefix name in an expression correlates with a specific namespace in the source document.
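To illustrate that the prefixes really are arbitrary, the following sketch binds x to the default namespace and rel to the namespace the document calls r, then reads the r:id attribute through the rel prefix. The class name and both prefixes are invented for this example.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: the prefixes "x" and "rel" appear nowhere in the document.
class ArbitraryPrefixes {
    static final String MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static final String REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    static final String XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" xmlns:r=\"" + REL_NS + "\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Reads the r:id attribute of the first sheet using prefixes of our own choosing. */
    static String firstSheetRelationshipId() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

            final Map<String, String> prefixes = new HashMap<>();
            prefixes.put("x", MAIN_NS);   // the document's default namespace
            prefixes.put("rel", REL_NS);  // the namespace the document binds to "r"

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return prefixes.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                }

                @Override
                public String getPrefix(String uri) {
                    throw new UnsupportedOperationException();
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    throw new UnsupportedOperationException();
                }
            });

            // The two-argument evaluate returns the string value of the selected node.
            return xpath.evaluate("/x:workbook/x:sheets/x:sheet[1]/@rel:id", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetRelationshipId());
    }
}
```

Only the namespace URIs have to agree with the document; the prefix spelling is local to the XPath expression.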
Answer by tomaj
I've written a simple NamespaceContext implementation (linked in the original answer) that takes a Map<String, String> as input, where the key is a prefix and the value is a namespace.
It follows the NamespaceContext specification, and you can see how it works in the unit tests.
Map<String, String> mappings = new HashMap<>();
mappings.put("foo", "http://foo");
mappings.put("foo2", "http://foo");
mappings.put("bar", "http://bar");

NamespaceContext context = new SimpleNamespaceContext(mappings);

context.getNamespaceURI("foo");    // "http://foo"
context.getPrefix("http://foo");   // "foo" or "foo2"
context.getPrefixes("http://foo"); // ["foo", "foo2"]
Note that it has a dependency on Google Guava.
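If the Guava dependency is unwanted, a map-backed context that implements all three methods can be sketched with the JDK alone. The class below, MapNamespaceContext, is a hypothetical stand-in written for this page and is not the implementation linked in the answer; it assumes the supplied map contains no null values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

/** Hypothetical, dependency-free NamespaceContext backed by a prefix-to-URI map. */
class MapNamespaceContext implements NamespaceContext {

    // LinkedHashMap keeps the caller's iteration order, so getPrefix is predictable
    // when the caller passes an ordered map.
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    MapNamespaceContext(Map<String, String> mappings) {
        prefixToUri.putAll(mappings);
        // Bindings that the NamespaceContext contract fixes for every implementation.
        prefixToUri.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
        prefixToUri.put(XMLConstants.XMLNS_ATTRIBUTE, XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // Unbound prefixes map to the empty string, not to null.
        return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        // The contract asks for an iterator that does not support remove().
        return Collections.unmodifiableList(prefixes).iterator();
    }
}
```

It can be passed to xpath.setNamespaceContext(...) in the same way as the other contexts on this page.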
Answer by kasi
If you are using Spring, it already contains org.springframework.util.xml.SimpleNamespaceContext.
import org.springframework.util.xml.SimpleNamespaceContext;
...
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
SimpleNamespaceContext nsc = new SimpleNamespaceContext();
nsc.bindNamespaceUri("a", "http://some.namespace.com/nsContext");
xpath.setNamespaceContext(nsc);
XPathExpression xpathExpr = xpath.compile("//a:first/a:second");
String result = (String) xpathExpr.evaluate(object, XPathConstants.STRING);
Answer by rogerdpack
Startlingly, if I don't call factory.setNamespaceAware(true), then the XPath you mentioned works both with and without namespaces at play. You just aren't able to select things "with a namespace specified", only with generic XPaths. Go figure. So this may be an option:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);
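A slightly fuller sketch of this option follows, assuming the JDK's built-in DOM parser and XPath engine behave as described in this answer. The class name is invented, and the XML is the namespaced document from the question inlined as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: queries the namespaced document with the plain XPath.
class NonNamespaceAwareQuery {
    static final String XML =
        "<workbook xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    /** Returns the name attribute of the first sheet using the unprefixed XPath. */
    static String firstSheetName() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false); // explicit here, although false is the default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            Element sheet = (Element) xpath.evaluate(
                    "/workbook/sheets/sheet[1]", doc, XPathConstants.NODE);
            return sheet == null ? null : sheet.getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetName());
    }
}
```

The trade-off is that the query can no longer tell namespaces apart, so this is best kept for documents with a single vocabulary.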
Answer by joriki
Two things to add to the existing answers:
1. I don't know whether this was the case when you asked the question: with Java 10, your XPath actually works for the second document if you don't call setNamespaceAware(true) on the document builder factory (false is the default).
2. If you do want to use setNamespaceAware(true), other answers have already shown how to do this using a namespace context. However, you don't need to provide the mapping of prefixes to namespaces yourself, as these answers do: it's already there in the document element, and you can use that for your namespace context:
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DocumentNamespaceContext implements NamespaceContext {

    private final Element documentElement;

    public DocumentNamespaceContext(Document document) {
        documentElement = document.getDocumentElement();
    }

    // Reads the xmlns / xmlns:prefix declaration from the document element.
    public String getNamespaceURI(String prefix) {
        return documentElement.getAttribute(prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix);
    }

    public String getPrefix(String namespaceURI) {
        throw new UnsupportedOperationException();
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        throw new UnsupportedOperationException();
    }
}
The rest of the code is as in the other answers. Then the XPath /:workbook/:sheets/:sheet[1] yields the sheet element. (You could also use a non-empty prefix for the default namespace, as the other answers do, by replacing prefix.isEmpty() with e.g. prefix.equals("spreadsheet") and using the XPath /spreadsheet:workbook/spreadsheet:sheets/spreadsheet:sheet[1].)
P.S.: I just found that there's actually a method Node.lookupNamespaceURI(String prefix), so you could use that instead of the attribute lookup:
public String getNamespaceURI(String prefix) {
    return documentElement.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
}
Also, note that namespaces can be declared on elements other than the document element, and those wouldn't be recognized (by either version).
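That limitation can be checked with a short sketch. The document below is made up for illustration: it declares the prefix n (bound to the invented namespace urn:example:nested) on a nested data element rather than on the root, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: a prefix declared below the document element.
class NestedDeclarationDemo {
    // "urn:example:nested" is an invented namespace, declared on <data> only.
    static final String XML =
        "<root><data xmlns:n=\"urn:example:nested\"><n:item>v</n:item></data></root>";

    /** Looks up a prefix starting from the first element with the given tag name. */
    static String lookup(String elementName, String prefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            Element element = (Element) doc.getElementsByTagName(elementName).item(0);
            // lookupNamespaceURI searches the element and its ancestors, never its descendants.
            return element.lookupNamespaceURI(prefix);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("from <root>: " + lookup("root", "n"));
        System.out.println("from <data>: " + lookup("data", "n"));
    }
}
```

Starting from the document element the prefix is not found, while starting from the element that declares it succeeds, so a context built only on the document element would miss such declarations.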