Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/7178111/

Time: 2020-09-03 15:50:37  Source: igfitidea

Why is XmlNamespaceManager necessary?

Tags: .net, xpath, xml-namespaces, selectsinglenode

Asked by Code Jockey

I've come up kinda dry as to why -- at least in the .NET Framework -- it is necessary to use an XmlNamespaceManager in order to handle namespaces (or the rather clunky and verbose [local-name()=... XPath predicate/function/whatever) when performing XPath queries. I do understand why namespaces are necessary or at least beneficial, but why is it so complex?

In order to query a simple XML Document (no namespaces)...

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode>
   <nodeName>Some Text Here</nodeName>
</rootNode>

...one can use something like doc.SelectSingleNode("//nodeName") (which would match <nodeName>Some Text Here</nodeName>)

Mystery #1: My first annoyance -- if I understand correctly -- is that merely adding a namespace reference to the parent/root tag (whether used as part of a child node tag or not) like so:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode xmlns="http://example.com/xmlns/foo">
   <nodeName>Some Text Here</nodeName>
</rootNode>

...requires several extra lines of code to get the same result:

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("ab", "http://example.com/xmlns/foo")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//ab:nodeName", nsmgr)

...essentially dreaming up a non-existent prefix ("ab") to find a node that doesn't even use a prefix. How does this make sense? What is wrong (conceptually) with doc.SelectSingleNode("//nodeName")?

Mystery #2: So, say you've got an XML document that uses prefixes:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode xmlns:cde="http://example.com/xmlns/foo" xmlns:feg="http://example.com/xmlns/bar">
   <cde:nodeName>Some Text Here</cde:nodeName>
   <feg:nodeName>Some Other Value</feg:nodeName>
   <feg:otherName>Yet Another Value</feg:otherName>
</rootNode>

... If I understand correctly, you would have to add both namespaces to the XmlNamespaceManager, in order to make a query for a single node...

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("cde", "http://example.com/xmlns/foo")
nsmgr.AddNamespace("feg", "http://example.com/xmlns/bar")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//feg:nodeName", nsmgr)

... Why, in this case, do I need (conceptually) a namespace manager?

******REDACTED into comments below****

Edit Added: My revised and refined question is based upon the apparent redundancy of the XmlNamespaceManager in what I believe to be the majority of cases, and the use of the namespace manager to specify a mapping of prefix to URI:

When the direct mapping of the namespace prefix ("cde") to the namespace URI ("http://example.com/xmlns/foo") is explicitly stated in the source document:

...<rootNode xmlns:cde="http://example.com/xmlns/foo"...

what is the conceptual need for a programmer to recreate that mapping before making a query?

Accepted answer by Paul Butcher

The basic point (as Kev points out in his answer) is that the namespace URI, not the namespace prefix, is the important part of a namespace; the prefix is an "arbitrary convenience".

As for why you need a namespace manager, rather than there being some magic that works it out using the document, I can think of two reasons.

Reason 1

If it were permitted to only add namespace declarations to the documentElement, as in your examples, it would indeed be trivial for selectSingleNode to just use whatever is defined.

However, you can define namespace prefixes on any element in a document, and a given prefix is not uniquely bound to one namespace throughout a document. Consider the following example:

<w xmlns:a="mynamespace">
  <a:x>
    <y xmlns:a="myOthernamespace">
      <z xmlns="mynamespace" />
      <b:z xmlns:b="mynamespace" />
      <z xmlns="myOthernamespace" />
      <b:z xmlns:b="myOthernamespace" />
    </y>
  </a:x>
</w>

In this example, what would you want //z, //a:z and //b:z to return? How, without some kind of external namespace manager, would you express that?
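
To make the ambiguity concrete, here is a minimal C# sketch. It assumes a well-formed version of the document above (the z elements are self-closed here), and the class name and the query prefixes "m" and "o" are made up for illustration. It shows that the document's own prefix "a" means different things at different depths, while the query's prefixes are defined once, outside the document.

```csharp
using System;
using System.Xml;

class PrefixScopeDemo
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<w xmlns:a='mynamespace'>" +
            "<a:x>" +
            "<y xmlns:a='myOthernamespace'>" +
            "<z xmlns='mynamespace' />" +
            "<b:z xmlns:b='mynamespace' />" +
            "<z xmlns='myOthernamespace' />" +
            "<b:z xmlns:b='myOthernamespace' />" +
            "</y>" +
            "</a:x>" +
            "</w>");

        // Inside the document, the prefix "a" is rebound on <y>.
        XmlNode y = doc.SelectSingleNode("//y");
        Console.WriteLine(doc.DocumentElement.GetNamespaceOfPrefix("a")); // mynamespace
        Console.WriteLine(y.GetNamespaceOfPrefix("a"));                   // myOthernamespace

        // The query's prefixes are defined here, independently of the document.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("m", "mynamespace");
        nsmgr.AddNamespace("o", "myOthernamespace");

        Console.WriteLine(doc.SelectNodes("//m:z", nsmgr).Count); // 2
        Console.WriteLine(doc.SelectNodes("//o:z", nsmgr).Count); // 2
        Console.WriteLine(doc.SelectNodes("//z").Count);          // 0 (no z element is in "no namespace")
    }
}
```

Each z element is selected by its namespace URI; whether the document wrote it as z or b:z makes no difference to the query.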

Reason 2

It allows you to reuse the same XPath expression for any equivalent document, without needing to know anything about the namespace prefixes in use.

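// nsmgr is an XmlNamespaceManager that maps the query prefix "z" to "mynamespace"
string myXPathExpression = "//z:y";
doc1.SelectSingleNode(myXPathExpression, nsmgr);
doc2.SelectSingleNode(myXPathExpression, nsmgr);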

doc1:

<x>
  <z:y xmlns:z="mynamespace" />
</x>

doc2:

<x xmlns="mynamespace">
  <y />
</x>

In order to achieve this latter goal without a namespace manager, you would have to inspect each document, building a custom XPath expression for each one.

Answer by Adrian Zanescu

The reason is simple. There is no required connection between the prefixes you use in your XPath query and the prefixes declared in the XML document. For example, the following two XML documents are semantically equivalent:

<aaa:root xmlns:aaa="http://someplace.org">
 <aaa:element>text</aaa:element>
</aaa:root>

vs

  <bbb:root xmlns:bbb="http://someplace.org">
     <bbb:element>text</bbb:element>
  </bbb:root>

The "ccc:root/ccc:element" query will match both instances, provided the namespace manager maps the "ccc" prefix to that namespace URI:

nsmgr.AddNamespace("ccc", "http://someplace.org")

The .NET implementation does not care about the literal prefixes used in the XML. It only requires that the prefix used in the query is defined in the namespace manager and that its namespace URI matches the one actually used in the document. This is what lets query expressions stay constant even when prefixes vary between consumed documents, and it is the correct implementation for the general case.
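
As a rough illustration, the following C# sketch runs the same query against both documents from this answer. The class name is made up, and a leading "/" is added to the query so it is anchored at the document root.

```csharp
using System;
using System.Xml;

class ConstantQueryDemo
{
    static void Main()
    {
        // The two semantically equivalent documents, differing only in prefix.
        string[] inputs =
        {
            "<aaa:root xmlns:aaa='http://someplace.org'><aaa:element>text</aaa:element></aaa:root>",
            "<bbb:root xmlns:bbb='http://someplace.org'><bbb:element>text</bbb:element></bbb:root>"
        };

        foreach (string xml in inputs)
        {
            var doc = new XmlDocument();
            doc.LoadXml(xml);

            // "ccc" appears in neither document; only the URI has to match.
            var nsmgr = new XmlNamespaceManager(doc.NameTable);
            nsmgr.AddNamespace("ccc", "http://someplace.org");

            XmlNode node = doc.SelectSingleNode("/ccc:root/ccc:element", nsmgr);
            Console.WriteLine(node.Name + " -> " + node.InnerText);
        }
        // Prints:
        // aaa:element -> text
        // bbb:element -> text
    }
}
```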

Answer by Jez

As far as I can tell, there is no good reason that you should need to manually define an XmlNamespaceManager to get at abc-prefixed nodes if you have a document like this:

<itemContainer xmlns:abc="http://abc.com" xmlns:def="http://def.com">
    <abc:nodeA>...</abc:nodeA>
    <def:nodeB>...</def:nodeB>
    <abc:nodeC>...</abc:nodeC>
</itemContainer>

Microsoft simply couldn't be bothered to write something to detect that xmlns:abc had already been specified in a parent node. I could be wrong, and if so, I'd welcome comments on this answer so I can update it.

However, this blog post seems to confirm my suspicion. It basically says that you need to manually define an XmlNamespaceManager and manually iterate through the xmlns: attributes, adding each one to the namespace manager. Dunno why Microsoft couldn't do this automatically.

Here's a method I created based on that blog post to automatically generate an XmlNamespaceManager based on the xmlns: attributes of a source XmlDocument:

/// <summary>
/// Creates an XmlNamespaceManager based on a source XmlDocument's name table, and prepopulates its namespaces with any 'xmlns:' attributes of the root node.
/// </summary>
/// <param name="sourceDocument">The source XML document to create the XmlNamespaceManager for.</param>
/// <returns>The created XmlNamespaceManager.</returns>
private XmlNamespaceManager createNsMgrForDocument(XmlDocument sourceDocument)
{
    XmlNamespaceManager nsMgr = new XmlNamespaceManager(sourceDocument.NameTable);

    foreach (XmlAttribute attr in sourceDocument.SelectSingleNode("/*").Attributes)
    {
        if (attr.Prefix == "xmlns")
        {
            nsMgr.AddNamespace(attr.LocalName, attr.Value);
        }
    }

    return nsMgr;
}

And I use it like so:

XPathNavigator xNav = xmlDoc.CreateNavigator();
// XPath names are case-sensitive, so this must match <abc:nodeC> exactly.
XPathNodeIterator xIter = xNav.Select("//abc:nodeC", createNsMgrForDocument(xmlDoc));
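
A variation on the same idea, as a sketch rather than a recommendation: instead of looping over attributes by hand, the declarations in scope at the root element can be read with XPathNavigator.GetNamespacesInScope. The class and method names below are made up. Like the method above, it only sees declarations on the root element, and it skips the default namespace because an XPath query cannot refer to it without a prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

class NamespacesInScopeDemo
{
    // Hypothetical helper: copies the prefixed namespace declarations that are
    // in scope at the root element into a new XmlNamespaceManager.
    static XmlNamespaceManager CreateManager(XmlDocument doc)
    {
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        XPathNavigator root = doc.DocumentElement.CreateNavigator();

        foreach (KeyValuePair<string, string> ns in
                 root.GetNamespacesInScope(XmlNamespaceScope.ExcludeXml))
        {
            // An empty key is the default namespace; it has no prefix to register.
            if (ns.Key.Length > 0)
                nsmgr.AddNamespace(ns.Key, ns.Value);
        }
        return nsmgr;
    }

    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<itemContainer xmlns:abc='http://abc.com' xmlns:def='http://def.com'>" +
            "<abc:nodeA>...</abc:nodeA>" +
            "<def:nodeB>...</def:nodeB>" +
            "<abc:nodeC>...</abc:nodeC>" +
            "</itemContainer>");

        XmlNode node = doc.SelectSingleNode("//abc:nodeC", CreateManager(doc));
        Console.WriteLine(node.Name); // abc:nodeC
    }
}
```

Reusing the document's own prefixes this way ties the query to one particular document's prefix choices, which is the trade-off the other answers describe.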

Answer by Kev

To answer point 1:

Setting a default namespace for an XML document still means that the nodes, even without a namespace prefix, i.e.:

<rootNode xmlns="http://someplace.org">
   <nodeName>Some Text Here</nodeName>
</rootNode>

are no longer in the "empty" namespace. You still need some way to reference these nodes using XPath, so you create a prefix to reference them, even if it is "made up".
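
A small C# sketch of this behaviour, using the document above. The class name and the prefix "sp" are made up for illustration.

```csharp
using System;
using System.Xml;

class DefaultNamespaceDemo
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<rootNode xmlns='http://someplace.org'>" +
            "<nodeName>Some Text Here</nodeName>" +
            "</rootNode>");

        // An unprefixed name in XPath 1.0 means "no namespace", so nothing matches.
        Console.WriteLine(doc.SelectSingleNode("//nodeName") == null); // True

        // A made-up prefix bound to the document's default namespace URI does match.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("sp", "http://someplace.org");
        Console.WriteLine(doc.SelectSingleNode("//sp:nodeName", nsmgr).InnerText); // Some Text Here

        // The verbose alternative from the question: match on local name only,
        // ignoring the namespace altogether.
        Console.WriteLine(doc.SelectSingleNode("//*[local-name()='nodeName']").InnerText); // Some Text Here
    }
}
```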

To answer point 2:

<rootNode xmlns:cde="http://someplace.org" xmlns:feg="http://otherplace.net">
   <cde:nodeName>Some Text Here</cde:nodeName>
   <feg:nodeName>Some Other Value</feg:nodeName>
   <feg:otherName>Yet Another Value</feg:otherName>
</rootNode>

Internally in the instance document, the nodes that reside in a namespace are stored with their node name and their long namespace name. This pair is called (in W3C parlance) an expanded name.

For example, <cde:nodeName> is essentially stored as <http://someplace.org:nodeName>. A namespace prefix is an arbitrary convenience for humans, so that when we type out XML or have to read it we don't have to do this:

<rootNode>
   <http://someplace.org:nodeName>Some Text Here</http://someplace.org:nodeName>
   <http://otherplace.net:nodeName>Some Other Value</http://otherplace.net:nodeName>
   <http://otherplace.net:otherName>Yet Another Value</http://otherplace.net:otherName>
</rootNode>

When an XML document is searched, it is not searched by the friendly prefix; the search is done by namespace URI, so you have to tell XPath about your namespaces via a namespace table passed in using XmlNamespaceManager.
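
The expanded name can be inspected directly on each node. A minimal C# sketch, using the document from point 2 and a made-up class name, prints the prefix, local name, and namespace URI of each child element:

```csharp
using System;
using System.Xml;

class ExpandedNameDemo
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<rootNode xmlns:cde='http://someplace.org' xmlns:feg='http://otherplace.net'>" +
            "<cde:nodeName>Some Text Here</cde:nodeName>" +
            "<feg:nodeName>Some Other Value</feg:nodeName>" +
            "<feg:otherName>Yet Another Value</feg:otherName>" +
            "</rootNode>");

        // Each node exposes its prefix, local name and namespace URI separately.
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            Console.WriteLine("{0} | {1} | {2}", child.Prefix, child.LocalName, child.NamespaceURI);
        }
        // Prints:
        // cde | nodeName | http://someplace.org
        // feg | nodeName | http://otherplace.net
        // feg | otherName | http://otherplace.net
    }
}
```

The two nodeName elements share a local name and differ only in namespace URI, which is the part an XPath query has to specify through the manager.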

Answer by Christian Schwarz

You need to register the URI/prefix pairs with the XmlNamespaceManager instance to let SelectSingleNode() know which particular "nodeName" node you're referring to: the one from "http://someplace.org" or the one from "http://otherplace.net".

Please note that the concrete prefix name doesn't matter when you're doing the XPath query. I believe this works too:

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("any", "http://someplace.org")
nsmgr.AddNamespace("thing", "http://otherplace.net")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//thing:nodeName", nsmgr)

SelectSingleNode() just needs a connection between the prefix from your XPath expression and the namespace URI.

Answer by Phil R

This thread has helped me understand the issue of namespaces much more clearly. Thanks. When I saw Jez's code, I tried it because it looked like a better solution than the one I had programmed. I discovered some shortcomings with it, though. As written, it looks only in the root node (but namespaces can be declared anywhere), and it doesn't handle default namespaces. I tried to address these issues by modifying his code, but to no avail.

Here is my version of that function. It uses regular expressions to find the namespace mappings throughout the file; works with default namespaces, giving them the arbitrary prefix 'ns'; and handles multiple occurrences of the same namespace.

// Requires: using System.Collections.Generic; using System.Text.RegularExpressions; using System.Xml;
private XmlNamespaceManager CreateNamespaceManagerForDocument(XmlDocument document)
{
    var nsMgr = new XmlNamespaceManager(document.NameTable);

    // Find and remember each xmlns attribute, assigning the 'ns' prefix to default namespaces.
    // The trailing \2 matches the closing quote, so group 3 captures the whole URI.
    var nameSpaces = new Dictionary<string, string>();
    foreach (Match match in new Regex(@"xmlns:?(.*?)=([\x22\x27])(.+?)\2").Matches(document.OuterXml))
        nameSpaces[match.Groups[1].Value + ":" + match.Groups[3].Value] = match.Groups[1].Value == "" ? "ns" : match.Groups[1].Value;

    // Go through the dictionary, and number non-unique prefixes before adding them to the namespace manager.
    var prefixCounts = new Dictionary<string, int>();
    foreach (var namespaceItem in nameSpaces)
    {
        var prefix = namespaceItem.Value;
        // Split on the first ':' only, because the URI itself usually contains colons.
        var namespaceURI = namespaceItem.Key.Split(new[] { ':' }, 2)[1];
        if (prefixCounts.ContainsKey(prefix))
            prefixCounts[prefix]++;
        else
            prefixCounts[prefix] = 0;
        // The "#;;" format renders 0 as an empty string, so the first occurrence keeps the bare prefix.
        nsMgr.AddNamespace(prefix + prefixCounts[prefix].ToString("#;;"), namespaceURI);
    }
    return nsMgr;
}