What is the correct way of encoding CR-LF line breaks in text/xml values?

Question

Asked by AlwaysLearning

As opposed to application/xml files, which could do anything, or normalizedString values, which convert all whitespace sequences to a single space character, I'm asking here specifically in the context of text/xml files with string values. For the sake of simplicity, let's say I'm only using ASCII characters in a UTF-8 encoded file.

Given the following two-line text string I wish to represent in XML:

Hello
World!

Which is the following bytes in memory:

0000: 48 65 6c 6c 6f 0d 0a 57 6f 72 6c 64 21 Hello..World!

According to RFC 2046, any text/* MIME type MUST (not should) represent a line break using Carriage Return followed by Linefeed character sequence. In that light, the following XML fragment should be right:

<tag>Hello
World!</tag>

or

0000: 3c 74 61 67 3e 48 65 6c 6c 6f 0d 0a 57 6f 72 6c <tag>Hello..Worl
0010: 64 21 3c 2f 74 61 67 3e                         d!</tag>

But I regularly see files like the following:

<tag><![CDATA[Hello
World!]]></tag>

Or, even stranger:

<tag>Hello&#xD;
World!</tag>

Where the &#xD; sequence is followed by a single Linefeed character:

0000: 3c 74 61 67 3e 48 65 6c 6c 6f 26 23 78 44 3b 0a <tag>Hello&#xD;.
0010: 57 6f 72 6c 64 21 3c 2f 74 61 67 3e             World!</tag>

What am I missing here? What's the correct way to represent multiple lines of text in an XML string value so that it can come out the other end unmolested?

Answer 1

Accepted answer by AlwaysLearning

After writing NUnit tests in Mono and JUnit tests in Java, the answer would appear to be to use either <tag>Hello&#13;\nWorld!</tag> or <tag>Hello&#xd;\nWorld!</tag> (where \n stands for a literal Linefeed character), as below...

Foo.cs:

using System.IO;
using System.Text;
using System.Xml.Serialization;

namespace XmlStringTests
{
    public class Foo
    {
        public string greeting;

        public static Foo DeserializeFromXmlString (string xml)
        {
            Foo result;
            using (MemoryStream memoryStream = new MemoryStream()) {
                byte[] buffer = Encoding.UTF8.GetBytes (xml);
                memoryStream.Write (buffer, 0, buffer.Length);
                memoryStream.Seek (0, SeekOrigin.Begin);
                XmlSerializer xs = new XmlSerializer (typeof(Foo));
                result = (Foo)xs.Deserialize (memoryStream);
            }
            return result;
        }
    }
}

XmlStringTests.cs:

using NUnit.Framework;

namespace XmlStringTests
{
    [TestFixture]
    public class XmlStringTests
    {
        const string expected = "Hello\u000d\u000aWorld!";

        // Fails: the parser normalizes the literal CR LF to a single LF, even inside CDATA.
        [Test(Description="Fails")]
        public void Cdata ()
        {
            const string test = "<Foo><greeting><![CDATA[Hello\u000d\u000aWorld!]]></greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }

        // Fails: inside CDATA, &#13; is literal text rather than a character reference.
        [Test(Description="Fails")]
        public void CdataWithHash13 ()
        {
            const string test = "<Foo><greeting><![CDATA[Hello&#13;\u000aWorld!]]></greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }

        // Fails: inside CDATA, &#xd; is literal text rather than a character reference.
        [Test(Description="Fails")]
        public void CdataWithHashxD ()
        {
            const string test = "<Foo><greeting><![CDATA[Hello&#xd;\u000aWorld!]]></greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }

        // Fails: the literal CR LF in element content is normalized to a single LF.
        [Test(Description="Fails")]
        public void Simple ()
        {
            const string test = "<Foo><greeting>Hello\u000d\u000aWorld!</greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }

        // Passes: &#13; is a character reference, so the CR survives parsing.
        [Test(Description="Passes")]
        public void SimpleWithHash13 ()
        {
            const string test = "<Foo><greeting>Hello&#13;\u000aWorld!</greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }

        // Passes: &#xd; is a character reference, so the CR survives parsing.
        [Test(Description="Passes")]
        public void SimpleWithHashxD ()
        {
            const string test = "<Foo><greeting>Hello&#xd;\u000aWorld!</greeting></Foo>";
            Foo bar = Foo.DeserializeFromXmlString (test);
            Assert.AreEqual (expected, bar.greeting);
        }
    }
}

Foo.java:

import java.io.StringReader;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement(name = "Foo")
@XmlType(propOrder = { "greeting" })
public class Foo {
    public String greeting;

    public static Foo DeserializeFromXmlString(String xml) {
        try {
            JAXBContext context = JAXBContext.newInstance(Foo.class);
            Unmarshaller unmarshaller = context.createUnmarshaller();
            Foo foo = (Foo) unmarshaller.unmarshal(new StringReader(xml));
            return foo;
        } catch (JAXBException e) {
            e.printStackTrace();
            return null;
        }
    }
}

XmlStringTests.java:

import static org.junit.Assert.*;
import org.junit.Test;


public class XmlStringTests {
    String expected = "Hello\r\nWorld!";

    @Test // Fails: the parser normalizes the literal CR LF to a single LF, even inside CDATA
    public void testCdata ()
    {
        String test = "<Foo><greeting><![CDATA[Hello\r\nWorld!]]></greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }

    @Test // Fails: inside CDATA, &#13; is literal text rather than a character reference
    public void testCdataWithHash13 ()
    {
        String test = "<Foo><greeting><![CDATA[Hello&#13;\nWorld!]]></greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }

    @Test // Fails: inside CDATA, &#xd; is literal text rather than a character reference
    public void testCdataWithHashxD ()
    {
        String test = "<Foo><greeting><![CDATA[Hello&#xd;\nWorld!]]></greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }

    @Test // Fails: the literal CR LF in element content is normalized to a single LF
    public void testSimple ()
    {
        String test = "<Foo><greeting>Hello\r\nWorld!</greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }

    @Test // Passes: &#13; is a character reference, so the CR survives parsing
    public void testSimpleWithHash13 ()
    {
        String test = "<Foo><greeting>Hello&#13;\nWorld!</greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }

    @Test // Passes: &#xd; is a character reference, so the CR survives parsing
    public void testSimpleWithHashxD ()
    {
        String test = "<Foo><greeting>Hello&#xd;\nWorld!</greeting></Foo>";
        Foo bar = Foo.DeserializeFromXmlString (test);
        assertEquals (expected, bar.greeting);
    }
}

I hope this saves some people some time.
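
On the writing side, the same finding suggests escaping each Carriage Return as a character reference before the text goes into an element. The following is a minimal sketch of that idea, not a library API: the class `XmlTextEscaper` and its method `escapeText` are hypothetical names made up for illustration, and many XML serializers may already do something equivalent for you.

```java
public class XmlTextEscaper {
    // Hypothetical helper: escapes a string for use as XML element content.
    // CR is written as &#13; so that the parser's line-end normalization
    // does not collapse CR LF into a single LF on the way back in.
    public static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '\r': out.append("&#13;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<tag>" + escapeText("Hello\r\nWorld!") + "</tag>";
        // The CR is now a character reference; the LF stays literal.
        System.out.println(xml.equals("<tag>Hello&#13;\nWorld!</tag>"));
    }
}
```

This only covers element content. Attribute values go through additional whitespace normalization, so they would need LF and tab escaped as well.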

Answer 2

Answered by Eric Galluzzo

CR (&#x0D;), LF (&#x0A;), CRLF, or a few other combinations are all valid. As noted in the spec, when they appear literally in the document all of these are translated to a single LF (&#x0A;) character by the parser.
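
The normalization can be observed with the DOM parser that ships with the JDK. This is a small sketch rather than a definitive test: the class `LineEndDemo` and its method `rootText` are names invented here, and it assumes the default `DocumentBuilderFactory` implementation is a conforming XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class LineEndDemo {
    // Parses a small XML document and returns the text content of its root element.
    public static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Literal CR LF in the source: the parser normalizes it to a single LF.
        String literal = rootText("<tag>Hello\r\nWorld!</tag>");
        // CR written as a character reference: it is not a literal line end, so it survives.
        String escaped = rootText("<tag>Hello&#xD;\nWorld!</tag>");

        System.out.println(literal.equals("Hello\nWorld!"));   // expected: true
        System.out.println(escaped.equals("Hello\r\nWorld!")); // expected: true
    }
}
```

The second case matches the passing tests in the accepted answer: only the character reference form brings the Carriage Return through to the application.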
