Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/310669/

Time: 2020-08-03 22:40:53

Why does C# XmlDocument.LoadXml(string) fail when an XML header is included?

Tags: c#, .net, xml

Asked by

Does anyone have any idea why the following code sample fails with the XmlException "Data at the root level is invalid. Line 1, position 1."?

var body = "<?xml version="1.0" encoding="utf-16"?><Report> ......"
XmlDocument bodyDoc = new XmlDocument();            
bodyDoc.LoadXml(body);

Answered by

I figured it out. I read the MSDN documentation, and it says to use .Load instead of LoadXml when reading from strings. I found that this works 100% of the time. Oddly enough, using StringReader causes problems. I think the main reason is that this is a Unicode-encoded string, and that could cause problems because StringReader is UTF-8 only.

MemoryStream stream = new MemoryStream();
byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
stream.Write(data, 0, data.Length);
stream.Seek(0, SeekOrigin.Begin);

XmlTextReader reader = new XmlTextReader(stream);

// MSDN recommends using Load instead of LoadXml for in-memory XML payloads
bodyDoc.Load(reader);

Answered by Zach Burlingame

Background

Although your question does have the encoding set as UTF-16, you don't have the string properly escaped so I wasn't sure if you did, in fact, accurately transpose the string into your question.

I ran into the same exception:

System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.

However, my code looked like this:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);

The Problem

The problem is that strings are stored internally as UTF-16 in .NET, whereas the encoding specified in the XML document header may be different. For example:

<?xml version="1.0" encoding="utf-8"?>

From the MSDN documentation for String:

Each Unicode character in a string is defined by a Unicode scalar value, also called a Unicode code point or the ordinal (numeric) value of the Unicode character. Each code point is encoded using UTF-16 encoding, and the numeric value of each element of the encoding is represented by a Char object.

This means that when you pass XmlDocument.LoadXml() your string with an XML header, it must say the encoding is UTF-16. Otherwise, the actual underlying encoding won't match the encoding reported in the header and will result in an XmlException being thrown.
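
To make the difference between the two encodings concrete, here is a minimal sketch that is not part of the original answer. It compares how many bytes the same text takes up as UTF-16 and as UTF-8. The class and method names (`EncodingSizeDemo`, `Utf16ByteCount`, `Utf8ByteCount`) are made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingSizeDemo
{
    // Bytes needed to represent the text as UTF-16, the encoding .NET strings use.
    public static int Utf16ByteCount(string text) => Encoding.Unicode.GetByteCount(text);

    // Bytes needed to represent the same text as UTF-8.
    public static int Utf8ByteCount(string text) => Encoding.UTF8.GetByteCount(text);

    public static void Main()
    {
        string xml = "<event>This is a Test</event>";
        Console.WriteLine("UTF-16 bytes: " + Utf16ByteCount(xml));
        Console.WriteLine("UTF-8 bytes:  " + Utf8ByteCount(xml));
    }
}
```

For ASCII-only text like this, each character takes two bytes in UTF-16 and one byte in UTF-8, so the two byte sequences are not interchangeable.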

The Solution

The solution to this problem is to make sure that the encoding of whatever you pass to the Load or LoadXml method matches the encoding declared in the XML header. For my example above, that means either changing the XML header to state UTF-16, or encoding the input as UTF-8 and using one of the XmlDocument.Load methods.

Below is sample code demonstrating how to use a MemoryStream to build an XmlDocument from a string that defines a UTF-8 encoded XML document (but which is, of course, stored as a UTF-16 .NET string).

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";

// Encode the XML string in a UTF-8 byte array
byte[] encodedString = Encoding.UTF8.GetBytes(xml);

// Put the byte array into a stream; a MemoryStream created from a byte array
// already starts at position 0, so no Flush or rewind is needed
MemoryStream ms = new MemoryStream(encodedString);

// Build the XmlDocument from the MemoryStream of UTF-8 encoded bytes
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(ms);
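
The other option mentioned above, declaring UTF-16 in the header and passing the string straight to LoadXml, could be sketched roughly as follows. This is an illustration rather than part of the original answer, and the names `Utf16HeaderDemo` and `LoadRootName` are hypothetical.

```csharp
using System;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class Utf16HeaderDemo
{
    // Parses XML text with LoadXml and returns the name of the root element.
    public static string LoadRootName(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.Name;
    }

    public static void Main()
    {
        // The declaration states utf-16, which matches how .NET stores the string.
        string xml = "<?xml version=\"1.0\" encoding=\"utf-16\"?>\n<event>This is a Test</event>";
        Console.WriteLine(LoadRootName(xml));
    }
}
```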

Answered by Zach Burlingame

Try this:

XmlDocument bodyDoc = new XmlDocument();
bodyDoc.XmlResolver = null;
// Note: Load(string) treats its argument as a file path or URL, not as XML text
bodyDoc.Load(body);

Answered by keithl8041

This worked for me:

var xdoc = new XmlDocument { XmlResolver = null };  
xdoc.LoadXml(xmlFragment);

Answered by Gunner

A simple and effective solution: instead of using the LoadXml() method, use the Load() method.

For example:

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("sample.xml");

Answered by Sander Kouwenhoven

This really saved my day.

I have written an extension method based on Zach's answer. I also extended it to take the encoding as a parameter, so that encodings other than UTF-8 can be used, and I wrapped the MemoryStream in a 'using' statement.

public static class XmlHelperExtentions
{
    /// <summary>
    /// Loads a string through .Load() instead of .LoadXml()
    /// This prevents character encoding problems.
    /// </summary>
    /// <param name="xmlDocument"></param>
    /// <param name="xmlString"></param>
    public static void LoadString(this XmlDocument xmlDocument, string xmlString, Encoding encoding = null) {

        if (encoding == null) {
            encoding = Encoding.UTF8;
        }

        // Encode the XML string in a byte array
        byte[] encodedString = encoding.GetBytes(xmlString);

        // Put the byte array into a stream and rewind it to the beginning
        using (var ms = new MemoryStream(encodedString)) {
            ms.Flush();
            ms.Position = 0;

            // Build the XmlDocument from the MemoryStream of encoded bytes
            xmlDocument.Load(ms);
        }
    }
}

Answered by Rubarb

I had the same problem when switching from an absolute to a relative path for my XML file. The following solves both the loading issue and the relative source path issue. It uses an XmlDataProvider, which is defined in XAML (it should be possible in code too):

<Window.Resources>
    <XmlDataProvider
        x:Name="myDP"
        x:Key="MyData"
        Source=""
        XPath="/RootElement/Element"
        IsAsynchronous="False"
        IsInitialLoadEnabled="True"
        debug:PresentationTraceSources.TraceLevel="High" />
</Window.Resources>

The data provider automatically loads the document once the source is set. Here's the code:

        m_DataProvider = this.FindResource("MyData") as XmlDataProvider;
        FileInfo file = new FileInfo("MyXmlFile.xml");

        m_DataProvider.Document = new XmlDocument();
        m_DataProvider.Source = new Uri(file.FullName);

Answered by xadriel

A simple one-liner:

// Load (not LoadXml) is the overload that accepts a Stream
bodyDoc.Load(new MemoryStream(Encoding.Unicode.GetBytes(body)));

Answered by Hugh

I had the same issue because the XML file I was uploading was encoded as UTF-8 with a BOM (byte-order mark).

Switched the encoding to UTF-8 in Notepad++ and was able to load the XML file in code.
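
If re-saving the file is not an option, one possible workaround is to drop the byte-order mark before parsing. The sketch below is not from the original answer. It assumes the file contents were read into a string in a way that kept the BOM as a leading U+FEFF character, and the names `BomDemo` and `StripBom` are hypothetical.

```csharp
using System;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class BomDemo
{
    // Removes a single leading U+FEFF (byte-order mark) character, if present.
    public static string StripBom(string text) =>
        text.Length > 0 && text[0] == '\uFEFF' ? text.Substring(1) : text;

    public static void Main()
    {
        // Assumed input: XML text that still carries the BOM character at position 0.
        string withBom = "\uFEFF<event>This is a Test</event>";

        var doc = new XmlDocument();
        doc.LoadXml(StripBom(withBom));
        Console.WriteLine(doc.DocumentElement.Name);
    }
}
```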
