When is it required to escape characters in XML?
Original question: http://stackoverflow.com/questions/6898259/
This page reproduces a Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow.
Asked by Kozlov
When should we replace < > & " ' in XML with character sequences like &lt; etc.?
My understanding is that it's just to make sure that if the content part of the XML contains > or <, the parser will not treat it as the start or end of a tag.
Also, if I have XML like:
<hello>mor>ning<hello>
should this be replaced to either:
<hello>mor>ning<hello><hello>mor>ning<hello><hello>mor>ning<hello>
I don't understand why replacing is needed. When exactly is it required and what exactly (tags or text) should be replaced?
Accepted answer by Quentin
<, >, &, " and ' all have special meanings in XML (such as "start of entity" or "attribute value delimiter").
In order to have those characters appear as data (instead of for their special meaning), they can be represented by entities (&lt; for < and so on).
Sometimes those special meanings are context sensitive (e.g. " doesn't mean "attribute delimiter" outside of a tag), and there are places where the characters can appear raw as data. Rather than worry about those exceptions, it is simplest to always represent them as entities whenever you want to avoid their special meaning. Then the only gotcha is explicit CDATA sections, where the special meaning doesn't hold (and & won't start an entity).
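As a minimal sketch of the "always escape" approach, here is how it might look in Python with the standard library's xml.sax.saxutils module. The sample strings are made up for illustration and are not from the question.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical text that should appear as data, not markup.
text = 'mor>ning & "tea" <now>'

# escape() replaces &, < and > by default.
escaped_text = escape(text)

# Quotes are only escaped if you ask for it via the entities mapping.
escaped_all = escape(text, {'"': "&quot;", "'": "&apos;"})

# quoteattr() escapes a value and wraps it in suitable quotes for an attribute.
attr = quoteattr('say "hi" & bye')

print(escaped_text)
print(escaped_all)
print("<hello title=%s/>" % attr)
```

Escaping everything is more than the parser strictly needs in element content, but it spares you from tracking which context a string ends up in.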
should this be replaced to either
It shouldn't be represented as any of those. Entities must be terminated with a semi-colon.
How you should represent it depends on which bit of your example is data and which is markup. You haven't said, for example, whether <hello> is supposed to be data or the start tag for a hello element.
Answer by Cumbayah
Section 2.4 of the XML Specification clearly states:
The ampersand character (&) and the left angle bracket (<) must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and must, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
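The rule quoted above can be checked against a real parser. The sketch below uses Python's xml.etree.ElementTree; the helper name parses is an assumption introduced here for illustration.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is accepted by the parser (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A raw > in element content is allowed.
raw_gt = parses("<hello>mor>ning</hello>")

# A raw < or & in element content is not well-formed.
raw_lt = parses("<hello>mor<ning</hello>")
raw_amp = parses("<hello>mor&ning</hello>")

# The escaped forms are always accepted.
escaped = parses("<hello>mor&lt;ning &amp; mor&gt;ning</hello>")

print(raw_gt, raw_lt, raw_amp, escaped)
```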
Answer by Felix Kling
You have to encode all characters that have a special meaning in XML but should not be interpreted by the parser.
Assuming your XML is
<hello>mor>ning</hello>
you would encode it as
<hello>mor&gt;ning</hello>
or use a CDATA section (see Wikipedia):
<hello><![CDATA[mor>ning]]></hello>
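To illustrate that an entity and a CDATA section are two spellings of the same data, here is a small Python sketch using xml.etree.ElementTree. The third sample string is an assumption added for illustration.

```python
import xml.etree.ElementTree as ET

# The same element written with an entity and with a CDATA section.
escaped_form = "<hello>mor&gt;ning</hello>"
cdata_form = "<hello><![CDATA[mor>ning]]></hello>"

escaped_value = ET.fromstring(escaped_form).text
cdata_value = ET.fromstring(cdata_form).text

# Inside CDATA, < and & are plain data as well (hypothetical sample).
mixed_value = ET.fromstring("<hello><![CDATA[a < b & c]]></hello>").text

print(escaped_value, cdata_value, mixed_value)
```

After parsing, the application sees the same text either way, so the choice between the two is mostly about readability of the source document.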
Answer by Ronnie
Basically, characters like < and > are significant when parsing an XML document. If stray instances of these special characters are included in the XML node text or attribute text, the parser will not be able to properly understand the document. If you are sending XML to a web service, all of the special characters should be properly escaped.
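One way to avoid escaping by hand before sending XML anywhere is to build the document with an XML library and let it serialize the text. The sketch below assumes Python's xml.etree.ElementTree; the element name, attribute, and values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build the element from plain strings; no manual escaping.
root = ET.Element("hello", {"title": 'a "quoted" <value>'})
root.text = "mor>ning & <evening>"

# The serializer escapes special characters in text and attributes.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Parsing the result gives back the original strings.
parsed = ET.fromstring(xml_text)
print(parsed.text, parsed.get("title"))
```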
Answer by tammysoliman
https://github.com/savonrb/gyoku/blob/master/README.md
With Gyoku, you can leave the characters inside CDATA unescaped.

