How to save newlines in an XML attribute?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2004386/

Date: 2020-09-06 12:54:53  Source: igfitidea

xml xslt newline

Asked by Tommy

I need to save content containing newlines in some XML attributes, not in text nodes. The method should be chosen so that I am able to decode it in XSLT 1.0/EXSLT/XSLT 2.0.

What is the best encoding method?

Please suggest/give some ideas.

Answered by Tomalak

In a compliant DOM API there is nothing you need to do. Simply save actual newline characters to the attribute; the API will encode them correctly on its own (see the Canonical XML spec, section 5.2).
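
As a quick check of this behaviour, here is a minimal sketch in Python. It assumes the standard library's xml.etree.ElementTree as a stand-in for a compliant API (the answer does not name a specific one), and the element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, chosen for illustration only.
original = "line 1\nline 2"
element = ET.Element("xml", {"attribute": original})

# Serialize: the library escapes the newline itself, no manual encoding needed.
serialized = ET.tostring(element, encoding="unicode")

# Parse it back: the attribute value should match what was put in.
round_tripped = ET.fromstring(serialized).get("attribute")

print(serialized)
print(round_tripped == original)
```

The serialized text should contain the character reference &#10; rather than a raw line break, and the parsed value should be identical to the original string.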

If you do your own encoding (i.e. replacing \n with &#10; before saving the attribute value), the API will encode your input again, resulting in &amp;#10; in the XML file.
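
The double-encoding effect can be reproduced with a short Python sketch. Again, xml.etree.ElementTree is only an assumed example of an API that escapes attribute values on its own, and the names are illustrative.

```python
import xml.etree.ElementTree as ET

# Manually "pre-encoded" value: the newline was replaced with the text &#10;
pre_encoded = "line 1&#10;line 2"
element = ET.Element("xml", {"attribute": pre_encoded})

# The serializer escapes the ampersand, so the file contains &amp;#10;
serialized = ET.tostring(element, encoding="unicode")

# After parsing, the value is the literal text "&#10;", not a newline.
parsed_value = ET.fromstring(serialized).get("attribute")

print(serialized)
print(repr(parsed_value))
```

The value survives the round trip, but as six literal characters instead of a line break, which is why pre-encoding defeats its own purpose.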

The bottom line is that the string value is saved verbatim. You get out what you put in; there is no need to interfere.

However, some implementations are not compliant. For example, they will encode & characters in attribute values but forget about newline characters or tabs. This puts you in a losing position, since you can't simply replace newlines with &#10; beforehand (the & would be encoded again).

These implementations will save newline characters unencoded, like this:

<xml attribute="line 1
line 2" />

Upon parsing such a document, literal newlines in attributes are normalized into a single space (again, in accordance with the spec), and thus they are lost.
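
The normalization is easy to observe. The sketch below parses a hand-written document with a literal newline inside an attribute, using Python's xml.etree.ElementTree as an assumed example of a conforming parser.

```python
import xml.etree.ElementTree as ET

# A document as a non-compliant writer might produce it:
# the newline inside the attribute is a raw character, not &#10;
document = '<xml attribute="line 1\nline 2" />'

value = ET.fromstring(document).get("attribute")

# The parser applies attribute-value normalization to the raw newline.
print(repr(value))  # 'line 1 line 2'
```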

Saving (and retaining!) newlines in attributes is impossible in these implementations.

Answered by Asaph

You can use the character reference &#10; to represent a newline in an XML attribute. &#13; can be used to represent a carriage return. A Windows-style CRLF can be represented as &#13;&#10;.

This is legal XML syntax. See the XML spec for more details.
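
To illustrate, the following Python sketch parses an attribute that uses these character references. The element name "note" and attribute name "text" are hypothetical, and xml.etree.ElementTree is just one parser that could be used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: CRLF written as character references.
document = '<note text="line 1&#13;&#10;line 2" />'

value = ET.fromstring(document).get("text")

# Character references are not normalized away, so CR and LF survive parsing.
print(repr(value))  # 'line 1\r\nline 2'
```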

Answered by rosca dragos

A crude answer can be:

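using System.IO;
using System.Xml;

XmlDocument xDoc = new XmlDocument();
xDoc.Load(@"Agenda.xml");
// ... modify the XML here, e.g. set attribute values containing "\r\n"
// (you need both characters to make a new line)

// Treat the serialized XML as a plain string: turn the escaped CR/LF back into
// literal characters and put each element on its own line.
string a = xDoc.InnerXml
    .Replace("&#xD;", "\r")
    .Replace("&#xA;", "\n")
    .Replace("><", ">\r\n<");

// The using block flushes and disposes the writer automatically.
using (StreamWriter sDoc = new StreamWriter(@"Agenda.xml"))
{
    sDoc.Write(a);
}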

As you can see, this simply treats the XML as a string.

Answered by OG Sean

A slightly different approach that has been helpful in some situations:

Placeholders and Find & Replace.

Before parsing, you can put your own custom line break marker (placeholder) into the data. After parsing, string-replace the marker with whatever line break character is effective, whether that's \n or &#10; or \u2028 or any of the various other line break characters out there.

This is useful when parsers like jQuery $.parseXML() strip the unencoded line breaks. For example, you could use {LBREAK} as your line break marker, insert it while the data is still raw text, and replace it later, after the text has been parsed into an XML object. String.prototype.replaceAll() is helpful here.

Here is a rough code concept with jQuery and replaceAll (I have not tested this code, but it shows the concept):

function onXMLHandleLineBreaks(_result) {
    var lineBreaksThatGetLost = ['&#10;', '&#xD;']; // adjust to your data
    var lineBreakThatWorks = '\n';                  // adjust to your data
    var marker = '{mylinebreakmarker}';
    var rawXMLText = String(_result); // hold as text only until line breaks are ready
    lineBreaksThatGetLost.forEach(function (lb) {
        rawXMLText = rawXMLText.replaceAll(lb, marker); // placemark the line breaks
    });
    var xmlObj = $.parseXML(rawXMLText); // to XML document
    $(xmlObj).find('*').each(function () {
        $.each(this.attributes, function () { // add the line breaks back in
            this.value = this.value.replaceAll(marker, lineBreakThatWorks);
        });
    });
    console.log('xml with linebreaks that work: ' + new XMLSerializer().serializeToString(xmlObj));
    return xmlObj;
}

Of course, you can adjust which line break characters work or don't work to suit your data, and you can loop over a whole set of line break characters that don't work, replacing each of them in turn.
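
The same placeholder idea can be sketched in Python. The marker {LBREAK} is the one suggested above; the helper names protect and restore are hypothetical, and the hand-built XML string only imitates a writer that does not escape attribute values.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = "{LBREAK}"  # custom marker, must not occur in real data


def protect(value):
    """Replace line breaks with the placeholder before writing (hypothetical helper)."""
    return value.replace("\r\n", "\n").replace("\n", PLACEHOLDER)


def restore(value, line_break="\n"):
    """Put real line breaks back after parsing (hypothetical helper)."""
    return value.replace(PLACEHOLDER, line_break)


# Imitate a writer that would otherwise emit a raw newline in the attribute.
raw = '<xml attribute="{}" />'.format(protect("line 1\nline 2"))

parsed_value = ET.fromstring(raw).get("attribute")
restored = restore(parsed_value)
print(repr(restored))
```

The trade-off is that every reader of the file, including any XSLT stylesheet, has to know about the marker and replace it as well.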