Is there a way to include greater than or less than signs in an XML file?

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/29398950/

Time: 2020-09-06 12:07:42  Source: igfitidea

Is there a way to include greater than or less than signs in an XML file?

xml

Asked by Mookayama

I have an XML file from a client that has greater than (>) and less than (<) signs in it, and it fails an XML format check. Is there a way to get around this without asking the client to fix the file?

For example:

<?xml version="1.0" encoding="UTF-8"?>

<note Name="PrintPgmInfo <> VDD">
 <to>Tove</to>
 <from>Jani</from>
 <heading>Reminder</heading>
 <body>Don't forget me this weekend!</body>
</note>

Answered by Rahul Tripathi

You can try escaping them like this:

< = &lt;

> = &gt;

These are known as Character Entity References (the XML specification calls them predefined entities).
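As a quick check of the substitution above, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The variable names are only illustrative, and the sample is a shortened version of the note from the question with the attribute value escaped by hand. It shows that a parser hands back the original characters once the file is escaped.

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's sample, with < and > in the
# attribute value replaced by &lt; and &gt; (names are illustrative).
escaped_note = (
    '<note Name="PrintPgmInfo &lt;&gt; VDD">'
    "<to>Tove</to>"
    "</note>"
)

note = ET.fromstring(escaped_note)

# The parser decodes the entity references back to the raw characters.
name = note.get("Name")
print(name)
```

So the client loses nothing by escaping: anything that reads the file through an XML parser still sees the literal < and > characters.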

Answered by david tallon

You will have to use XML escape characters:

" to  &quot;
' to  &apos;
< to  &lt;
> to  &gt;
& to  &amp;

Search for "escaping characters in XML" for more information.
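If the file is generated by a script, the five replacements listed above can be applied with the standard library rather than by hand. The following is a minimal sketch, assuming Python and its xml.sax.saxutils.escape function; the helper name escape_all and the sample string are made up for illustration.

```python
from xml.sax.saxutils import escape

# escape() replaces &, < and > by default. The two quote characters are
# passed as extra entities so that all five characters are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_all(text):
    """Return text with the five XML special characters escaped."""
    return escape(text, QUOTES)


raw = 'PrintPgmInfo <> VDD & "more"'
escaped = escape_all(raw)
print(escaped)
```

Note that & has to be replaced before the others, otherwise the ampersands introduced by the other replacements would be escaped a second time; escape() takes care of that ordering.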

Answered by bjimba

The direct answer to your question:

Is there a way to get around this without asking the client to fix the file?

is "no". The data you are getting is not valid XML (strictly speaking, it is not well-formed), and you are correct in rejecting it. I highly recommend going back to the client and saying that they must provide valid XML, using Character Entity References as mentioned by David and Rahul.
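To show the client concretely why the file is rejected, a small check along these lines may help. It is a sketch that assumes Python's standard xml.etree.ElementTree parser; the helper name is_well_formed is hypothetical, and the two strings are shortened versions of the sample from the question.

```python
import xml.etree.ElementTree as ET

# Shortened sample from the question: a raw < inside an attribute value.
bad_note = '<note Name="PrintPgmInfo <> VDD"><to>Tove</to></note>'
# The same data with the characters written as entity references.
good_note = '<note Name="PrintPgmInfo &lt;&gt; VDD"><to>Tove</to></note>'


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(bad_note))
print(is_well_formed(good_note))
```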

Answered by gary

To answer your question plainly: no, you cannot have an XML file with an unescaped < or > in any of its value fields, because the XML format uses these characters to delimit the parent and child elements, e.g. <note>, <to>, <from>, etc. (Strictly, the specification tolerates a bare > in most places, but a bare < is always an error, and escaping both is the safe convention.)

Expanding on my answer: when a Python script writes < or > using the XML library, the library translates them to &lt; or &gt;, respectively. I don't believe it is possible to write the raw characters with that library, since it escapes the < and > characters, as well as any Character Entity References you type in by hand. This makes sense: the XML library is preventing you from disrupting the syntax used for the parent xml.etree.cElementTree.Element or any child xml.etree.cElementTree.SubElement object fields. For example, use the code block in this great answer to experiment:

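import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

# Build a small tree: <root><doc>...</doc></root>
root = ET.Element("root")
doc = ET.SubElement(root, "doc")

# The text deliberately contains < and > to show how the library handles them
ET.SubElement(doc, "field1", name="blah").text = "some <value>"
ET.SubElement(doc, "field2", name="asdfasd").text = "some <other value>"

# Serialize to disk; the library escapes < and > in the text on write
tree = ET.ElementTree(root)
tree.write("filename.xml")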

This yields <root><doc><field1 name="blah">some &lt;value&gt;</field1><field2 name="asdfasd">some &lt;other value&gt;</field2></doc></root>.

Prettifying it:

<root>
    <doc>
        <field1 name="blah">
            some &lt;value&gt;
        </field1>
        <field2 name="asdfasd">
            some &lt;other value&gt;
        </field2>
    </doc>
</root>


However, there's nothing stopping you from adding these characters manually: read in the XML file and rewrite it, adding text, even if it contains < or >. If you want a proper XML file, though, just be sure that these characters are only used within comments.

For your particular problem, you could read in the lines from the client's XML files, then either remove the < and > characters or, if the client requires them, move them to a commented portion of the line. Part of the challenge is that you have to leave in the <note>, etc. portions of the file. This is challenging, but it would be possible!

The following is what I'd expect the result to look like.

<?xml version="1.0" encoding="UTF-8"?>

<note Name="PrintPgmInfo VDD"> <!-- PrintPgmInfo <> VDD -->
 <to>Tove</to>
 <from>Jani</from>
 <heading>Reminder</heading>
 <body>Don't forget me this weekend!</body>
</note>
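A pre-processing pass like the one described above could be sketched as follows. This is only a heuristic under stated assumptions, not a general repair tool: it assumes the stray characters appear only inside double-quoted attribute values, that the text content contains no ="..." sequences, and it escapes the characters instead of removing them or moving them into a comment, so the original value survives. The function name escape_attr_values and the regular expression are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic: anything between =" and the next " is an attribute value.
ATTR_VALUE = re.compile(r'="([^"]*)"')


def escape_attr_values(xml_text):
    """Escape raw < and > inside double-quoted attribute values."""
    def fix(match):
        value = match.group(1)
        value = value.replace("<", "&lt;").replace(">", "&gt;")
        return '="' + value + '"'
    return ATTR_VALUE.sub(fix, xml_text)


# Sample from the question (XML declaration omitted for brevity).
client_xml = """<note Name="PrintPgmInfo <> VDD">
 <to>Tove</to>
 <from>Jani</from>
 <heading>Reminder</heading>
 <body>Don't forget me this weekend!</body>
</note>"""

repaired = escape_attr_values(client_xml)
note = ET.fromstring(repaired)  # now parses without error
print(note.get("Name"))
```

Because the tags themselves are never touched, the <note>, <to>, and other element markup stays intact, which sidesteps the hard part mentioned above. A file that breaks the assumptions (for example, a stray < in element text) would still need a fix from the client.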