XML: the difference between PCDATA and CDATA in a DTD

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/918450/

Time: 2020-09-06 12:32:21  Source: igfitidea

Difference between PCDATA and CDATA in DTD

xml dtd

Asked by Jakub Arnold

What is the difference between #PCDATA and #CDATA in a DTD?

Accepted answer by Matthew Vines

PCDATA - Parsed Character Data

XML parsers normally parse all the text in an XML document.

CDATA - (Unparsed) Character Data

The term CDATA is used for text data that should not be parsed by the XML parser.

Characters like "<" and "&" are illegal as literal text in XML elements, so text containing them must either be escaped or placed in a CDATA section.

Answered by Rose Perrone

  • PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
  • CDATA is text that will not be parsed by a parser. Tags inside the text will not be treated as markup and entities will not be expanded.

By default, everything is PCDATA. In the following example, ignoring the root, <bar> will be parsed, and it will have no text content, only one child element.

<?xml version="1.0"?>
<foo>
<bar><test>content!</test></bar>
</foo>

When we want to specify that an element will contain only text, and no child elements, we use the keyword #PCDATA, because this keyword specifies that the element must contain parsable character data: that is, any text in which the markup characters less-than (<) and ampersand (&) are escaped. Greater-than (>), the apostrophe (') and the double quote (") may appear literally in element content, although the sequence ]]> may not.
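The escaping rule for #PCDATA content can be checked with a short Python sketch. It uses only the standard library; the sample string and the helper name `is_well_formed` are made up for illustration, and the parser used here does not validate against a DTD, it only checks well-formedness.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'if (a < b && c > d) print("it\'s fine")'

# escape() rewrites &, < and > as entity references and leaves quotes alone.
escaped = escape(raw)

# The parser expands the references again, so the application sees the raw text.
bar = ET.fromstring("<bar>" + escaped + "</bar>")


def is_well_formed(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# The unescaped text is rejected because of the literal < and &.
unescaped_ok = is_well_formed("<bar>" + raw + "</bar>")

# Quotes and a lone > are accepted as literal element content.
quotes_ok = is_well_formed("<bar>\"double\", 'single' and > are fine</bar>")

print(escaped)
print(bar.text == raw, unescaped_ok, quotes_ok)
```

In practice an XML writer library usually does this escaping for you; calling `escape()` by hand is mainly useful when building markup as strings.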

In the next example, <bar> contains CDATA. Its content will not be parsed and is thus the literal text <test>content!</test>.

<?xml version="1.0"?>
<foo>
<bar><![CDATA[<test>content!</test>]]></bar>
</foo>
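The difference between the two <bar> examples can be observed from an application's point of view. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the variable names are illustrative, and the documents are the two examples written on one line each.

```python
import xml.etree.ElementTree as ET

parsed_doc = (
    '<?xml version="1.0"?>'
    "<foo><bar><test>content!</test></bar></foo>"
)
cdata_doc = (
    '<?xml version="1.0"?>'
    "<foo><bar><![CDATA[<test>content!</test>]]></bar></foo>"
)

parsed_bar = ET.fromstring(parsed_doc).find("bar")
cdata_bar = ET.fromstring(cdata_doc).find("bar")

# Parsed content: <bar> has one child element and no text of its own.
print(len(parsed_bar), parsed_bar.text, parsed_bar[0].tag)

# CDATA section: <bar> has no children; the markup arrives as plain text.
print(len(cdata_bar), cdata_bar.text)
```"
</test>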

There are several content models in SGML. The #PCDATA content model says that an element may contain plain text. The "parsed" part means that markup in it (including PIs, comments and SGML directives) is parsed instead of being displayed as raw text. It also means that entity references are replaced.

Another type of content model allowing plain text contents is CDATA. In XML, an element's content model cannot be declared as CDATA; in SGML, a CDATA content model means that markup and entity references in the contents of the element are ignored. In attributes of CDATA type, however, entity references are replaced.

In XML, #PCDATA is the only plain text content model. You use it whenever you want to allow text contents in the element. CDATA content can still be used explicitly through CDATA section markup inside #PCDATA content, but element contents cannot be defined as CDATA by default.

In a DTD, the type of an attribute that contains text must be CDATA. The CDATA keyword in an attribute declaration has a different meaning than a CDATA section in an XML document. In a CDATA section all characters are legal (including the <, >, &, ' and " characters), except the ]]> end marker.
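To make the two meanings of CDATA concrete, here is a small Python sketch using only the standard library. The element name `bar` and attribute name `note` are hypothetical; the DTD is an internal subset written for this illustration, and neither parser used here validates against it.

```python
import xml.etree.ElementTree as ET
from xml.dom import Node, minidom

# Hypothetical DTD: <bar> holds #PCDATA and has a CDATA-typed attribute.
dtd = (
    "<!DOCTYPE bar [\n"
    "  <!ELEMENT bar (#PCDATA)>\n"
    "  <!ATTLIST bar note CDATA #IMPLIED>\n"
    "]>\n"
)
element = '<bar note="a &lt; b &amp; c"><![CDATA[a < b & c]]></bar>'

bar = ET.fromstring(dtd + element)

# CDATA attribute type: entity references in the value are still expanded.
attr_value = bar.get("note")

# CDATA section: the characters are passed through without any escaping.
section_text = bar.text

# minidom reports the section as its own node type inside the element.
first_child = minidom.parseString(element).documentElement.firstChild
is_cdata_node = first_child.nodeType == Node.CDATA_SECTION_NODE

print(attr_value, section_text, is_cdata_node)
```

Both routes hand the application the same string, which is why the attribute type and the section are easy to confuse even though they are declared and written differently.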

#PCDATA is not appropriate for the type of an attribute. It is used for the type of "leaf" text.

#PCDATA is prefixed with a hash in the content model to distinguish this keyword from an element named PCDATA (which would be perfectly legal).

Answered by winter

PCDATA – parsed character data. The parser parses all of this data in an XML document.

Example:

<family>
    <mother>mom</mother>
    <father>dad</father>
</family>

Here, the family element contains two more elements, "mother" and "father". The parser therefore parses further to get the text of mother and father, giving the value of family as "mom dad".

CDATA – unparsed character data. This is data that should not be parsed further in an XML document.

<family>
    <![CDATA[ 
       <mother>mom</mother>
       <father>dad</father>
    ]]>
</family>

Here, the value of family will be <mother>mom</mother><father>dad</father>.
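One way to read the "value" of family in both cases is to join all the text below the element. The sketch below does this in Python with the standard library; the helper `text_value` is a hypothetical name, and collapsing whitespace is a choice made here so that the indentation in the samples does not show up in the result.

```python
import xml.etree.ElementTree as ET

parsed_family = ET.fromstring(
    "<family>\n"
    "    <mother>mom</mother>\n"
    "    <father>dad</father>\n"
    "</family>"
)
cdata_family = ET.fromstring(
    "<family>\n"
    "    <![CDATA[\n"
    "       <mother>mom</mother>\n"
    "       <father>dad</father>\n"
    "    ]]>\n"
    "</family>"
)


def text_value(element):
    """Join all text below an element, collapsing runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


# Parsed: two child elements, and their text is what remains.
print(len(parsed_family), text_value(parsed_family))

# CDATA: no child elements, and the tags themselves are the text.
print(len(cdata_family), text_value(cdata_family))
```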

Answered by Oli

From here (Google is your friend):

In a DTD, PCDATA and CDATA are used to assert something about the allowable content of elements and attributes, respectively. In an element's content model, #PCDATA says that the element contains (may contain) "any old text." (With exceptions as noted below.) In an attribute's declaration, CDATA is one sort of constraint you can put on the attribute's allowable values (other sorts, all mutually exclusive, include ID, IDREF, and NMTOKEN). An attribute whose allowable values are CDATA can (like PCDATA in an element) contain "any old text."

A potentially really confusing issue is that there's another "CDATA," also referred to as marked sections. A marked section is a portion of element (#PCDATA) content delimited with special strings: <![CDATA[ to open it and ]]> to close it. If you remember that PCDATA is "parsed character data," a CDATA section is literally the same thing, without the "parsed." Parsers transmit the content of a marked section to downstream applications without hiccupping every time they encounter special characters like < and &. This is useful when you're coding a document that contains lots of those special characters (like scripts and code fragments); it's easier on data entry, and easier on reading, than the corresponding entity references.

So you can infer that the exception to the "any old text" rule is that PCDATA cannot include any of these unescaped special characters, UNLESS they fall within the scope of a CDATA marked section.

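The point about marked sections being easier to write than entity references can be illustrated with a short Python sketch (standard library only). The element name `code` and the code fragment are invented for the example.

```python
import xml.etree.ElementTree as ET

snippet = "if (a < b && b < c) { swap(&a, &c); }"

# The same fragment written with entity references...
with_entities = (
    "<code>if (a &lt; b &amp;&amp; b &lt; c) "
    "{ swap(&amp;a, &amp;c); }</code>"
)
# ...and written verbatim inside a CDATA marked section.
with_cdata = "<code><![CDATA[" + snippet + "]]></code>"

text_from_entities = ET.fromstring(with_entities).text
text_from_cdata = ET.fromstring(with_cdata).text

# Downstream code receives identical text either way.
print(text_from_entities == snippet, text_from_cdata == snippet)
```

The choice between the two is therefore about readability of the source document, not about what the application receives.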

Answered by Rachana K

The main difference between PCDATA and CDATA is:

PCDATA - basically used for elements (ELEMENT declarations), while

CDATA - used for attributes of XML, i.e. ATTLIST declarations.

Answered by Premraj

CDATA (Character DATA): It is similar to a comment, but it is part of the document. That is, CDATA is data that belongs to the document but is not parsed as XML markup.
Note: an XML comment is omitted while parsing an XML document, but CDATA is shown as it is.

PCDATA (Parsed Character DATA): By default, everything is PCDATA. PCDATA is data that is parsed as XML.
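The note about comments versus CDATA can be tried out with Python's standard `xml.etree.ElementTree`, which drops comments in its default configuration. This is a sketch under that assumption; other parsers, or this one configured differently, can keep comments. The element names are made up.

```python
import xml.etree.ElementTree as ET

doc = (
    "<note>"
    "<!-- <hidden>not data</hidden> -->"
    "<![CDATA[<shown>data</shown>]]>"
    "</note>"
)
note = ET.fromstring(doc)

# The comment is gone, the CDATA content survives as text,
# and neither one produced a child element.
print(note.text, len(note))
```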