Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute the content to the original authors on Stack Overflow.
Original URL: http://stackoverflow.com/questions/29398950/
Is there a way to include greater than or less than signs in an XML file?
Asked by Mookayama
I have an XML file from a client that contains greater-than (>) and less-than (<) signs, and it fails an XML format check.
Is there a way to get around this without asking the client to fix the file?
For example:
<?xml version="1.0" encoding="UTF-8"?>
<note Name="PrintPgmInfo <> VDD">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Answered by Rahul Tripathi
You can try writing them like this:
&lt; = <
&gt; = >
These are known as character entity references.
Answered by david tallon
You will have to use XML escape characters:
" to &quot;
' to &apos;
< to &lt;
> to &gt;
& to &amp;
Search for "escaping characters in XML" for more information.
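As a rough illustration of the mapping above, here is a minimal Python sketch that uses the standard library's `xml.sax.saxutils` module. The helper name `escape_xml` and the `QUOTE_ENTITIES` dictionary are assumptions made for this example; they do not come from the original answer.

```python
from xml.sax.saxutils import escape, quoteattr

# escape() handles &, < and > on its own; the two quote characters are
# passed in as extra replacements (assumed mapping for this sketch).
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_xml(text: str) -> str:
    """Return text with the five predefined XML entities applied."""
    return escape(text, QUOTE_ENTITIES)


print(escape_xml("PrintPgmInfo <> VDD"))  # PrintPgmInfo &lt;&gt; VDD

# quoteattr() escapes the value and wraps it in quotes, ready to be used
# as an attribute value.
print(quoteattr("PrintPgmInfo <> VDD"))  # "PrintPgmInfo &lt;&gt; VDD"
```

Note that `&` is replaced first, so the ampersands introduced by the other replacements are not escaped a second time.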
Answered by bjimba
The direct answer to your question:
Is there a way to get around this without asking the client to fix the file?
is "no". The data you are getting is not valid XML, and you are correct in rejecting it. I highly recommend going back to the client and saying that they must provide valid XML, using character entity references as mentioned by David and Rahul.
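To see why the file is rejected, a parser can be asked directly. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the function name `is_well_formed` and the shortened sample documents are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Shortened versions of the client's document (assumed for this sketch).
raw = '<note Name="PrintPgmInfo <> VDD"><to>Tove</to></note>'
fixed = '<note Name="PrintPgmInfo &lt;&gt; VDD"><to>Tove</to></note>'

print(is_well_formed(raw))    # False
print(is_well_formed(fixed))  # True
```

A check like this could be sent back to the client along with the rejected file, so they can verify their output before delivering it.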
Answered by gary
To answer your question plainly: no. A well-formed XML file cannot contain a raw < in any of its value fields (and > is best escaped too), because the XML format uses these characters to delimit the tags of parent and child elements, e.g. <note>, <to>, <from>, etc.
Expanding on my answer: when a Python script writes < or > using the XML library, the library translates them to &lt; or &gt;, respectively. I don't believe it is possible to write the raw characters with that library, since it escapes the < and > characters on output, as well as the & of any character entity reference you type in yourself. This makes sense: the XML library is preventing you from disrupting the syntax used for the parent xml.etree.cElementTree.Element or any child xml.etree.cElementTree.SubElement object fields. For example, use the code block in this great answer to experiment:
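# xml.etree.cElementTree was removed in Python 3.9; ElementTree is the drop-in replacement.
import xml.etree.ElementTree as ET

root = ET.Element("root")
doc = ET.SubElement(root, "doc")

# Text containing < and > is escaped automatically when the tree is written.
ET.SubElement(doc, "field1", name="blah").text = "some <value>"
ET.SubElement(doc, "field2", name="asdfasd").text = "some <other value>"

tree = ET.ElementTree(root)
tree.write("filename.xml")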
This yields <root><doc><field1 name="blah">some &lt;value&gt;</field1><field2 name="asdfasd">some &lt;other value&gt;</field2></doc></root>.
Prettifying it:
<root>
  <doc>
    <field1 name="blah">
      some &lt;value&gt;
    </field1>
    <field2 name="asdfasd">
      some &lt;other value&gt;
    </field2>
  </doc>
</root>
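The escaping is reversed when the file is read back, so no information is lost. Here is a minimal round-trip sketch, assuming an in-memory string (via `ET.tostring`) instead of a file and a single field for brevity:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
doc = ET.SubElement(root, "doc")
ET.SubElement(doc, "field1", name="blah").text = "some <value>"

# Serialize to a str; < and > in the text are written as entity references.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
# <root><doc><field1 name="blah">some &lt;value&gt;</field1></doc></root>

# Parsing turns the entity references back into the original characters.
parsed = ET.fromstring(serialized)
print(parsed.find("doc/field1").text)  # some <value>
```

In other words, the entity references only exist in the serialized file; code that reads the document through a parser still sees the literal < and > characters.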
However, there's nothing stopping you from adding these characters manually: read in the XML file and rewrite it, adding text, even if it contains < or >. If you want a proper XML file, though, just be sure that these characters are only used unescaped within comments.
For your particular problem, you could read in the lines from the client's XML files, then either remove the < and > characters or, if the client requires them, move them to a commented portion of the line. Part of the challenge is that you have to leave the <note>, <to>, etc. tags of the file intact. This is challenging, but it would be possible!
The following is what I'd expect the result to look like.
<?xml version="1.0" encoding="UTF-8"?>
<note Name="PrintPgmInfo VDD"> <!-- PrintPgmInfo <> VDD -->
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
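As an alternative to deleting the characters, the same read-and-rewrite idea can escape them in place. The sketch below is only a starting point under loud assumptions: attribute values are always double-quoted, never contain an embedded double quote, and no element text happens to contain the sequence ="...". The helper name `escape_attribute_values` is hypothetical, and it does not touch & characters.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: attribute values are double-quoted and contain no embedded
# double quotes, and no element text contains ="...".
ATTR_VALUE = re.compile(r'="([^"]*)"')


def escape_attribute_values(xml_text):
    """Escape < and > inside double-quoted attribute values (hypothetical helper)."""
    def _escape(match):
        value = match.group(1).replace("<", "&lt;").replace(">", "&gt;")
        return '="' + value + '"'
    return ATTR_VALUE.sub(_escape, xml_text)


client_xml = """<?xml version="1.0" encoding="UTF-8"?>
<note Name="PrintPgmInfo <> VDD">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
"""

repaired = escape_attribute_values(client_xml)

# Parse as bytes so the parser applies the declared encoding itself.
root = ET.fromstring(repaired.encode("utf-8"))
print(root.get("Name"))  # PrintPgmInfo <> VDD
```

A regular expression cannot handle every malformed file, so a repair step like this is best treated as a stopgap while the client fixes their export.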

