When do characters in XML need to be escaped?

Question

Asked by Kozlov

When should we replace < > & " ' in XML with entities like &lt; etc.?

My understanding is that it's just to make sure that if the content part of the XML contains > or <, the parser will not treat it as the start or end of a tag.

Also, if I have XML like:

<hello>mor>ning<hello>

should this be replaced with one of the following:

&lthello&gtmor&gtning&lthello&gt
&lthello&gtmor>ning&lthello&gt
<hello>mor&gtning<hello>

I don't understand why the replacement is needed. When exactly is it required, and what exactly (tags or text) should be replaced?

Answer 1

Accepted answer by Quentin

<, >, &, " and ' all have special meanings in XML (such as "start of entity" or "attribute value delimiter").

In order to have those characters appear as data (instead of for their special meaning) they can be represented by entities (&lt; for < and so on).

Sometimes those special meanings are context sensitive (e.g. " doesn't mean "attribute delimiter" outside of a tag) and there are places where they can appear raw as data. Rather than worry about those exceptions, it is simplest to always represent them as entities if you want to avoid their special meaning. Then the only gotcha is explicit CDATA sections, where the special meaning doesn't hold (and & won't start an entity).
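As a rough illustration of the "always escape" approach, here is a minimal Python sketch using the standard library's xml.sax.saxutils module. The helper name escape_text and the sample strings are made up for this example; escape and quoteattr are the real library functions.

```python
from xml.sax.saxutils import escape, quoteattr


def escape_text(value: str) -> str:
    """Escape all five special characters, regardless of context.

    escape_text is a hypothetical helper for this example.
    """
    # escape() always replaces &, < and >; the extra mapping adds
    # the two quote characters so the result is safe everywhere.
    return escape(value, {'"': "&quot;", "'": "&apos;"})


text = 'Tom & "Jerry" <3'
print("<note>" + escape_text(text) + "</note>")

# quoteattr() escapes the value and also picks and adds the
# surrounding quote characters for an attribute.
print("<note title=" + quoteattr("5 > 3 & 2 < 4") + "/>")
```

Escaping the quotes inside element content is not strictly required, but it follows the advice above of not worrying about the exceptions.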

Regarding "should this be replaced to either":

It shouldn't be represented as any of those. Entities must be terminated with a semicolon.

How you should represent it depends on which bit of your example is data and which is markup. You haven't said, for example, whether <hello> is supposed to be data or the start tag for a hello element.

Answer 2

Answer by Cumbayah

Section 2.4 of the XML Specification clearly states:

The ampersand character (&) and the left angle bracket (<) must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and must, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
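To see these rules in practice, here is a small Python sketch that feeds a few documents to the standard library parser (xml.etree.ElementTree) and reports which ones are accepted. The is_well_formed helper and the sample documents are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


samples = [
    "<a>x & y</a>",                    # raw & in content
    "<a>x < y</a>",                    # raw < in content
    "<a>x > y</a>",                    # raw > in content
    "<a>x &amp; y &lt; z</a>",         # escaped & and <
    "<a><!-- x & y < z --></a>",       # raw & and < inside a comment
    "<a><![CDATA[x & y < z]]></a>",    # raw & and < inside CDATA
]

for sample in samples:
    print(is_well_formed(sample), sample)
```

Only the first two samples should be rejected: a raw & or < in ordinary content is an error, while a raw > is tolerated there.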

Answer 3

Answer by Felix Kling

You have to encode all characters that have a special meaning in XML but should not be interpreted by the parser.

Assuming your XML is

<hello>mor>ning</hello>

you would encode it as

<hello>mor&gt;ning</hello>

or use a CDATA section:

<hello><![CDATA[mor>ning]]></hello>
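A quick way to check that the two forms are equivalent is to parse both and compare the resulting text. This is only a sketch using Python's standard xml.etree.ElementTree parser; the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# The same element written with an entity and with a CDATA section.
escaped = ET.fromstring("<hello>mor&gt;ning</hello>")
cdata = ET.fromstring("<hello><![CDATA[mor>ning]]></hello>")

# After parsing, both carry the same character data.
print(escaped.text)
print(cdata.text)
print(escaped.text == cdata.text)
```

Either way the application sees the plain string mor>ning; the escaping exists only in the serialized document.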

Answer 4

Answer by Ronnie

Basically, characters like < and > are significant when parsing an XML document. If extra occurrences of these special characters are included in XML node text or attribute text, the parser will not be able to understand the document properly. If you are sending XML to a web service, all of the special characters should be properly escaped.
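One way to get this right without escaping by hand is to build the document with an XML library and let the serializer do the escaping. Below is a minimal Python sketch using the standard xml.etree.ElementTree module; the build_message function, the element and attribute names, and the sample values are all invented for this example.

```python
import xml.etree.ElementTree as ET


def build_message(author: str, body: str) -> str:
    """Serialize a small message document (hypothetical structure)."""
    root = ET.Element("message", {"author": author})
    ET.SubElement(root, "body").text = body
    # The serializer escapes special characters in text and attributes.
    return ET.tostring(root, encoding="unicode")


payload = build_message('A "B" & C', "if a < b && b > c")
print(payload)

# Parsing the payload again gives back the original, unescaped values.
parsed = ET.fromstring(payload)
print(parsed.get("author"))
print(parsed.find("body").text)
```

Building the string by concatenation instead would require escaping every value yourself before sending it.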

Answer 5

Answer by tammysoliman

https://github.com/savonrb/gyoku/blob/master/README.md

You can use Gyoku to leave the characters in CDATA unescaped.
