How to solve the ampersand (&) conversion issue in XML?
Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/17423495/
Asked by user2493287
I am creating an XML file using XMLDocument, but when an XML node gets '&' as data, it is converted to "&amp;". I need the actual value, '&'. Can anyone tell me how to achieve this?
Result:
Answered by Tim Pietzcker
A single & is illegal in an XML document (outside of CDATA sections; see @rsp's answer), so this is not possible. If there is a verbatim ampersand in your node data, it has to be encoded as &amp;.
But that is also not a problem, because any XML reader will decode &amp; as a literal & when parsing the XML file.
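To make the point above concrete, here is a minimal sketch using Python's standard library rather than the XMLDocument class from the question. The helper names round_trip and is_well_formed are made up for this illustration. It shows that the serializer writes the entity, the parser hands back the literal character, and a raw ampersand in the markup is rejected.

```python
import xml.etree.ElementTree as ET


def round_trip(text):
    """Serialize text inside a <node> element, then parse it back."""
    node = ET.Element("node")
    node.text = text
    serialized = ET.tostring(node, encoding="unicode")
    # The parser decodes the entity again, so the caller sees the raw value.
    return serialized, ET.fromstring(serialized).text


def is_well_formed(xml_string):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


serialized, parsed = round_trip("D&C")
print(serialized)                           # <node>D&amp;C</node>
print(parsed)                               # D&C
print(is_well_formed("<node>D&C</node>"))   # False
```

In other words, the &amp; only exists in the file on disk. Code that reads the node value through an XML API should still see a plain &.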
Answered by rsp
Answered by CtheGood
I once had this situation where I wanted to preserve raw ampersands in XML. Though your parser may not be the same as mine (I use MarkLogic), the following still applies to your situation with any XML parser:
Issues with the ampersand character
The ampersand character can be tricky to construct in an XQuery string, as it is an escape character to the XQuery parser. The ways to construct the ampersand character in XQuery are:
Use the XML entity syntax (for example, &amp;).
Use a CDATA element (<![CDATA[element content here]]>), which tells the XQuery parser to read the content as character data.
Use the repair option on xdmp:document-load, xdmp:document-get, or xdmp:unquote.
https://help.marklogic.com/knowledgebase/article/View/55/0/xquery-ampersand-in-string
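As a rough illustration of the CDATA option in the list above, the following sketch uses Python's standard library, not MarkLogic or XQuery. The helper name wrap_in_cdata is an assumption made for this example. It only demonstrates that a raw ampersand inside a CDATA section is well-formed and is read back unchanged.

```python
import xml.etree.ElementTree as ET
from xml.dom.minidom import Document


def wrap_in_cdata(tag, text):
    """Build <tag><![CDATA[text]]></tag> and return it as a string."""
    if "]]>" in text:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("text must not contain ']]>'")
    doc = Document()
    elem = doc.createElement(tag)
    elem.appendChild(doc.createCDATASection(text))
    return elem.toxml()


xml_string = wrap_in_cdata("node", "D&C")
print(xml_string)                       # <node><![CDATA[D&C]]></node>
print(ET.fromstring(xml_string).text)   # D&C
```

Note that CDATA only changes how the character is written in the file. A parser reports the same text value as it would for D&amp;C.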
Obviously, the first option listed above, which is to escape ampersands, was not the direction we wanted to go. We wanted raw ampersands, not the escaped entity.
The second option seemed at first a good idea, and I played around with CDATA elements for a very long time. CDATA allows "character data", and everything inside is considered character data, not real XML. After playing around with some examples, I discovered that you could potentially make CDATA return ampersands, but CDATA elements are VERY unfriendly. For instance, creating dynamic CDATA elements is near impossible; you cannot simply wrap an XML structure inside of a CDATA. CDATA is meant to have static, predefined characters inside of it. If there is an effective way of using CDATA, I was not able to find it.
xdmp:quote and xdmp:unquote do the trick that we need, though not in the way that we expect them to. For example:
let $xml := <rootNode title="test"><firstLevel type="crazy"><secondLevel reason="testing">D&amp;C</secondLevel><secondLevel owner="clint">D&amp;C</secondLevel></firstLevel></rootNode>
return xdmp:quote($xml//secondLevel[1])
(: Returns <secondLevel reason="testing">D&amp;C</secondLevel> :)
But
let $xml := <rootNode title="test"><firstLevel type="crazy"><secondLevel reason="testing">D&amp;C</secondLevel><secondLevel owner="clint">D&amp;C</secondLevel></firstLevel></rootNode>
return xdmp:quote($xml//secondLevel[1]/node())
(: Returns D&C - an unescaped ampersand! :)
The second example gives us the unescaped ampersand, but only because the object we are passing to xdmp:quote is a text node, not an element. In the first example, where we quote the element, it returns the text version of the XML, but with D&amp;C, the escaped ampersand. Thus, for xdmp:quote to give us a string with raw ampersands, the object containing the ampersand must be stand-alone text.
From here, there are probably a few different directions we could go, and my idea is surely not the most elegant or efficient. But I decided to write a recursive function that walks the XML and builds it up as text, applying xdmp:quote only to pure text nodes so the ampersands stay unescaped.
declare function local:stringify($xml)
{
  (: Text nodes are quoted with the text method, so "&" stays unescaped. :)
  if (xdmp:node-kind($xml) eq "text") then
    xdmp:quote($xml, <options xmlns="xdmp:quote">
                       <method>text</method>
                     </options>)
  (: Elements are rebuilt by hand: open tag, attributes, children, close tag. :)
  else if (xdmp:node-kind($xml) eq "element") then
    fn:string-join(
      (fn:concat("<", fn:local-name($xml)),
       for $attr in $xml/@*
       return fn:concat(' ', fn:local-name($attr), '="', $attr, '"'),
       ">",
       for $node in $xml/node()
       return local:stringify($node),
       fn:concat("</", fn:local-name($xml), ">")
      ), "")
  else ()
};
let $xml := <rootNode title="test"><firstLevel type="crazy"><secondLevel reason="testing">D&amp;C</secondLevel><secondLevel owner="clint">D&amp;C</secondLevel></firstLevel></rootNode>
return local:stringify($xml)
(: Returns <rootNode title="test"><firstLevel type="crazy"><secondLevel reason="testing">D&C</secondLevel><secondLevel owner="clint">D&C</secondLevel></firstLevel></rootNode> :)
So while this solution does not allow an ampersand to exist in XML that is passed around in our application, it does allow this packaged XML that is being treated as text to be passed around.
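For readers without MarkLogic, the same idea can be sketched with Python's standard library. This is not the XQuery function above, only an analogue written for illustration, and the function name stringify is reused here purely for comparison. Like the original, it ignores namespaces, comments, and processing instructions, and it leaves attribute values unescaped.

```python
import xml.etree.ElementTree as ET


def stringify(elem):
    """Rebuild an element as plain text, leaving '&' in text unescaped."""
    attrs = "".join(f' {name}="{value}"' for name, value in elem.attrib.items())
    parts = [f"<{elem.tag}{attrs}>", elem.text or ""]
    for child in elem:
        parts.append(stringify(child))
        # Text that follows a child element belongs to the parent's content.
        parts.append(child.tail or "")
    parts.append(f"</{elem.tag}>")
    return "".join(parts)


source = (
    '<rootNode title="test"><firstLevel type="crazy">'
    '<secondLevel reason="testing">D&amp;C</secondLevel>'
    '<secondLevel owner="clint">D&amp;C</secondLevel>'
    '</firstLevel></rootNode>'
)
print(stringify(ET.fromstring(source)))
```

The result is a string that looks like XML but is no longer well-formed, so it has to be treated as text from that point on and cannot be handed back to an XML parser.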
Answered by Nikunj Vekariya
I guess one can use the line below. An option like "repair-full" will accept a raw & and treat it as a literal &.
let $InputXML := xdmp:unquote($inputSearchDetails, "", ("format-xml", "repair-full"))
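Outside MarkLogic there is no "repair-full" option in most parsers. One commonly used substitute, shown here as a hedged sketch and not as a description of how MarkLogic repairs input, is to escape bare ampersands with a regular expression before parsing. The function name escape_bare_ampersands is hypothetical, and the approach is naive: it would also rewrite ampersands inside CDATA sections or comments.

```python
import re
import xml.etree.ElementTree as ET

# Match "&" unless it already starts a named, decimal, or hex entity reference.
BARE_AMPERSAND = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#\d+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_string):
    """Replace stray '&' characters with '&amp;' so the input can be parsed."""
    return BARE_AMPERSAND.sub("&amp;", xml_string)


broken = "<a>D&C &amp; more</a>"
repaired = escape_bare_ampersands(broken)
print(repaired)                      # <a>D&amp;C &amp; more</a>
print(ET.fromstring(repaired).text)  # D&C & more
```

Existing entity references such as &amp; and &#38; are left alone, so running the function twice does not double-escape the input.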

