XML: What is an empty element?
Original question: http://stackoverflow.com/questions/2279501/
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Asked by Roland Bouman
According to the XML spec, this is the definition of an empty element:
[Definition: An element with no content is said to be empty.] The representation of an empty element is either a start-tag immediately followed by an end-tag, or an empty-element tag.
(see: http://www.w3.org/TR/REC-xml/#NT-content)
Now, I have no problem understanding empty-element tags: <i-am-empty/> leaves no room for misunderstanding. But it seems to me the standard contradicts itself in the other case: on the one hand it says that any tag with no content is empty, on the other hand it says that this can be represented by a start-tag followed immediately by an end-tag. But if we look at the definition of content:
[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*
It seems to me that content consists of two optional parts, CharData? and a group ()*. But since both these parts are optional, it would mean that nothing (as in, absence of characters) matches this production. So if I try to match this definition of content against whatever is inside <am-i-empty-or-not></am-i-empty-or-not>, I get a positive match. So, on the one hand this is an empty tag because it is "a start-tag immediately followed by an end-tag"; on the other hand it is not empty, because between the tags I can positively match production rule [43] for content, in which case it contains content, which means it can't be empty.
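The observation that "nothing" matches production [43] can be checked with a rough sketch. The regular expressions below are simplified stand-ins invented for illustration (CHAR_DATA, EMPTY_TAG and matches_content are not names from the spec, and only empty-element tags stand in for the full element alternative), so this shows the shape of the argument rather than the real grammar:

```python
import re

# Simplified stand-ins for the spec's productions (assumptions, not the real grammar).
CHAR_DATA = r"[^<&]*"               # approximates CharData
EMPTY_TAG = r"<[A-Za-z][\w.-]*/>"   # only empty-element tags stand in for `element`

# content ::= CharData? ((element | ...) CharData?)*
CONTENT = re.compile(rf"(?:{CHAR_DATA})?(?:{EMPTY_TAG}(?:{CHAR_DATA})?)*")


def matches_content(text: str) -> bool:
    """Return True if text matches the simplified content production."""
    return CONTENT.fullmatch(text) is not None


print(matches_content(""))        # True: zero characters still match
print(matches_content(" "))       # True
print(matches_content("x<b/>y"))  # True
print(matches_content("<"))       # False: a lone "<" is not content
```

So the zero-length string does match the production; whether that counts as "having content" is the question being asked.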
Can anybody explain which rule takes precedence? Does anybody know of any DOM or parser implementations that have different opinions on this?
Accepted answer by Thilo
But since both these parts are optional, it would mean that nothing (as in, absence of characters) matches this production.
That may be true, but the wording in the spec on this issue is quite clear. There are even examples for empty elements in the next paragraph.
<IMG align="left"
src="http://www.w3.org/Icons/WWW/w3c_home" />
<br></br>
<br/>
So the only way (in this context, with the surrounding wording and examples) to read
An element with no content
would be to include "content that (while matching the production) is completely empty" (i.e. zero-length, not even white-space).
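As a quick check of this reading, here is a minimal sketch using Python's xml.dom.minidom (my choice of parser, not one mentioned in the answer; child_count is a hypothetical helper). It parses both representations from the spec's example and counts the children of the root element:

```python
from xml.dom import minidom


def child_count(xml_text: str) -> int:
    """Parse xml_text and return how many child nodes the root element has."""
    root = minidom.parseString(xml_text).documentElement
    return len(root.childNodes)


print(child_count("<br></br>"))  # 0: start-tag immediately followed by end-tag
print(child_count("<br/>"))      # 0: empty-element tag
```

Both forms produce an element with no child nodes, so at least this parser treats zero-length content as no content.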
Answer by Ian Boyd
I wanted to check which variations of "empty" actually are empty.
Variation A
<Santa/>
gives a DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
Variation B
<Santa></Santa>
gives a DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
Variation C
<Santa>Space</Santa> (where "Space" stands for a single space character)
gives a DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
Variation D
<Santa>Tab</Santa> (where "Tab" stands for a single tab character)
gives a DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
Variation E
<Santa>CRLF</Santa> (where "CRLF" stands for a carriage return followed by a line feed)
gives a DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
All variations of text give the same DOM tree. When an XML document is asked to serialize itself, the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
results in the serialized text:
<?xml version="1.0"?>
<Santa/>
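The question also asked whether implementations differ. As a point of comparison, here is a small sketch that feeds the same five variations to Python's xml.dom.minidom, which is a different parser from the (unnamed) one used for the trees above. VARIATIONS and describe are names made up for this example:

```python
from xml.dom import minidom

# The five variations, with the whitespace written as Python escapes.
VARIATIONS = {
    "A": "<Santa/>",
    "B": "<Santa></Santa>",
    "C": "<Santa> </Santa>",
    "D": "<Santa>\t</Santa>",
    "E": "<Santa>\r\n</Santa>",
}


def describe(xml_text):
    """Return the text data of every child node of the root element."""
    root = minidom.parseString(xml_text).documentElement
    return [child.data for child in root.childNodes]


for name, text in VARIATIONS.items():
    print(name, describe(text))
```

With minidom, A and B come back with no children, but C, D and E each keep a whitespace-only text node. So whether whitespace-only content parses as "empty" depends on the parser and its whitespace settings, not only on the XML text.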
Manually adding an empty text node
I wanted to see what happens if I build the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text ""
using the pseudo-code:
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(""));
When that DOM document is saved to a stream, it comes out as:
<?xml version="1.0"?>
<Santa/>
Even when the element is forced to have a child (i.e. forced to not be empty), the DOM takes it to be empty.
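For comparison, the same experiment can be sketched with Python's xml.dom.minidom. This is a different DOM implementation from the one behind the pseudo-code above, so the serialized form is not guaranteed to match:

```python
from xml.dom import minidom

doc = minidom.Document()
santa = doc.createElement("Santa")
doc.appendChild(santa)
santa.appendChild(doc.createTextNode(""))  # zero-length text child

serialized = santa.toxml()
print(serialized)  # <Santa></Santa>

# Parse the output again and compare child counts before and after.
reparsed = minidom.parseString(serialized).documentElement
print(len(santa.childNodes), len(reparsed.childNodes))  # 1 0
```

minidom writes the start-tag/end-tag form instead of <Santa/>, but after re-parsing the element has no children. Either way the zero-length text node does not survive a round trip.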
Force text node with whitespace
And then if I make sure to put some whitespace in the TEXT node:
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(" "));
It comes out as the XML:
<?xml version="1.0" ?>
<Santa> </Santa>
with the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text " "
Interesting; it's not round-trippable.
Force a TAB CRLF
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(TAB+LF+CR));
It comes out as the XML:
<?xml version="1.0"?>
<Santa>TAB LF CR</Santa>
with the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text "\t\n\n"
Yes, XML converts all CR into LF, and yes, it's not round-trippable. If you parse:
<?xml version="1.0"?>
<Santa>TAB LF CR</Santa>
you will get the DOM tree of:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
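The CR-to-LF conversion is the line-end normalization that the XML spec requires of parsers, and it can be observed directly. A minimal sketch, again assuming Python's xml.dom.minidom rather than the parser used above (note that minidom keeps the whitespace text node instead of dropping it):

```python
from xml.dom import minidom

source = "<Santa>\t\n\r</Santa>"  # TAB, LF, CR between the tags
root = minidom.parseString(source).documentElement
text = root.firstChild.data

print(repr(text))  # '\t\n\n': the lone CR was normalized to LF
```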
Setting element.text
Finally we come to what happens if you set an element's text through its .text property.
Set no text:
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
//santa.text = ""; example where we don't set the text
gives the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
and the XML:
<?xml version="1.0"?>
<Santa/>
Setting empty text
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = ""; //example where we do set the text
gives the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text ""
and the XML:
<?xml version="1.0"?>
<Santa/>
Setting single space
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = " ";
gives the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text " "
and the XML:
<?xml version="1.0"?>
<Santa> </Santa>
Setting more whitespace
XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = LF+TAB+CR;
gives the DOM tree:
|- NODE_DOCUMENT #document ""
|- NODE_ELEMENT Santa ""
|- NODE_TEXT #text "\n\t\n"
and the XML:
<?xml version="1.0"?>
<Santa>LF TAB LF</Santa>
So what they told you was true, from a certain point of view.
- an XML string that contains only whitespace in the element will be empty when parsed
- a DOM element that contains only whitespace in its text node will render the whitespace when converted to an XML string
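The second point can be reproduced with a .text property in another library. The sketch below uses Python's xml.etree.ElementTree, which is not the API from the pseudo-code above; serialize_with_text is a hypothetical helper written for this example:

```python
import xml.etree.ElementTree as ET


def serialize_with_text(text):
    """Build <Santa> with the given .text (None means unset) and serialize it."""
    santa = ET.Element("Santa")
    if text is not None:
        santa.text = text
    return ET.tostring(santa, encoding="unicode")


for value in (None, "", " ", "\n\t\n"):
    print(repr(value), "->", repr(serialize_with_text(value)))
```

Here unset text and empty text both serialize as an empty-element tag (ElementTree writes it as <Santa />), while any whitespace is written out between a start-tag and an end-tag.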
Answer by John Saunders
<element />
and
<element></element>
are both empty elements. Any productions from standards must be interpreted to have this result.

