Streaming XML serialization in C# .NET

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/721537/

streaming XML serialization in .net

c# .net xml serialization streaming

Asked by Luca Martinetti

I'm trying to serialize a very large IEnumerable<MyObject> using an XmlSerializer without keeping all the objects in memory.

The IEnumerable<MyObject> is actually lazy.

I'm looking for a streaming solution that will:

  1. Take an object from the IEnumerable<MyObject>
  2. Serialize it to the underlying stream using the standard serialization (I don't want to handcraft the XML here!)
  3. Discard the in-memory data and move to the next

I'm trying with this code:

using (var writer = new StreamWriter(filePath))
{
    var xmlSerializer = new XmlSerializer(typeof(MyObject));
    foreach (var myObject in myObjectsIEnumerable)
    {
        xmlSerializer.Serialize(writer, myObject);
    }
}

but I'm getting multiple XML headers and I cannot specify a root tag <MyObjects>, so my XML is invalid.

Any ideas?

Thanks

Accepted answer by Dour High Arch

The XmlWriter class is a fast streaming API for XML generation. It is rather low-level; MSDN has an article on instantiating a validating XmlWriter using XmlWriter.Create().

Edit: link fixed. Here is sample code from the article:

async Task TestWriter(Stream stream) 
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Async = true;

    using (XmlWriter writer = XmlWriter.Create(stream, settings)) {
        await writer.WriteStartElementAsync("pf", "root", "http://ns");
        await writer.WriteStartElementAsync(null, "sub", null);
        await writer.WriteAttributeStringAsync(null, "att", null, "val");
        await writer.WriteStringAsync("text");
        await writer.WriteEndElementAsync();
        await writer.WriteCommentAsync("cValue");
        await writer.WriteCDataAsync("cdata value");
        await writer.WriteEndElementAsync();
        await writer.FlushAsync();
    }
}
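
To tie this back to the question, here is a minimal sketch (not from the MSDN article) of how an XmlWriter might be combined with the standard XmlSerializer: the writer emits the <MyObjects> root tag once, and each item is serialized into it one at a time. The MyObject class with its Id property, the StreamingXmlExport class and the WriteAll method are hypothetical names made up for illustration. The sketch relies on my understanding that XmlSerializer only writes an XML declaration when the writer has not started yet.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical element type standing in for the question's MyObject.
public class MyObject
{
    public int Id { get; set; }
}

// Hypothetical helper class; the names are illustrative only.
public static class StreamingXmlExport
{
    // Writes <MyObjects>...</MyObjects>, pulling one item at a time from the sequence.
    public static void WriteAll(Stream output, IEnumerable<MyObject> items)
    {
        var serializer = new XmlSerializer(typeof(MyObject));

        // Empty namespace map so each item does not repeat the xsi/xsd declarations.
        var noNamespaces = new XmlSerializerNamespaces();
        noNamespaces.Add("", "");

        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();            // single XML declaration
            writer.WriteStartElement("MyObjects");  // root tag from the question
            foreach (MyObject item in items)
            {
                // The writer is already inside the root element at this point.
                serializer.Serialize(writer, item, noNamespaces);
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Called as StreamingXmlExport.WriteAll(File.Create(filePath), myObjectsIEnumerable), this should enumerate the lazy sequence once, without holding on to items after they are written.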

Answer by Chris Doggett

Here's what I use:

using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;
using System.Text;
using System.IO;

namespace Utils
{
    public class XMLSerializer
    {
        public static Byte[] StringToUTF8ByteArray(String xmlString)
        {
            return new UTF8Encoding().GetBytes(xmlString);
        }

        public static String SerializeToXML<T>(T objectToSerialize)
        {
            StringBuilder sb = new StringBuilder();

            XmlWriterSettings settings = 
                new XmlWriterSettings {Encoding = Encoding.UTF8, Indent = true};

            using (XmlWriter xmlWriter = XmlWriter.Create(sb, settings))
            {
                if (xmlWriter != null)
                {
                    new XmlSerializer(typeof(T)).Serialize(xmlWriter, objectToSerialize);
                }
            }

            return sb.ToString();
        }

        public static void DeserializeFromXML<T>(string xmlString, out T deserializedObject) where T : class
        {
            XmlSerializer xs = new XmlSerializer(typeof (T));

            // Read the string directly: SerializeToXML writes to a StringBuilder, so its
            // declaration says utf-16, and re-encoding that text as UTF-8 bytes can fail to parse.
            using (StringReader stringReader = new StringReader(xmlString))
            {
                deserializedObject = xs.Deserialize(stringReader) as T;
            }
        }
    }
}

Then just call:

string xml = Utils.XMLSerializer.SerializeToXML(myObjectsIEnumerable);

I haven't tried it with, for example, an IEnumerable that fetches objects one at a time remotely, or any other weird use cases, but it works perfectly for List<T> and other collections that are in memory.

EDIT: Based on your comments in response to this, you could use XmlDocument.LoadXml to load the resulting XML string into an XmlDocument, save the first one to a file, and use that as your master XML file. For each item in the IEnumerable, use LoadXml again to create a new in-memory XmlDocument, grab the nodes you want, append them to the master document, and save it again, getting rid of the new one.

After you're finished, there may be a way to wrap all of the nodes in your root tag. You could also use XSL and XslCompiledTransform to write another XML file with the objects properly wrapped in the root tag.

Answer by John Saunders

You can do this by implementing the IXmlSerializable interface on the large class. The implementation of the WriteXml method can write the start tag, then simply loop over the IEnumerable<MyObject> and serialize each MyObject to the same XmlWriter, one at a time.

In this implementation, there won't be any in-memory data to get rid of (beyond what the garbage collector will collect).
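
A rough sketch of that idea follows, under some assumptions of mine: the wrapper class MyObjects and the MyObject class with its Id property are hypothetical, and only writing is covered. As far as I know, XmlSerializer writes the outer element itself (named after the class) before calling WriteXml, so the loop here only has to emit the items.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical element type standing in for the question's MyObject.
public class MyObject
{
    public int Id { get; set; }
}

// Hypothetical wrapper; its class name is used as the root element name.
public class MyObjects : IXmlSerializable
{
    private readonly IEnumerable<MyObject> items;

    // XmlSerializer requires a public parameterless constructor.
    public MyObjects() : this(new MyObject[0]) { }

    public MyObjects(IEnumerable<MyObject> items)
    {
        this.items = items;
    }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        throw new NotSupportedException("This sketch only covers writing.");
    }

    public void WriteXml(XmlWriter writer)
    {
        var itemSerializer = new XmlSerializer(typeof(MyObject));

        // Empty namespace map so each item does not repeat the xsi/xsd declarations.
        var noNamespaces = new XmlSerializerNamespaces();
        noNamespaces.Add("", "");

        // Items are pulled lazily and written to the same XmlWriter, one at a time.
        foreach (MyObject item in items)
        {
            itemSerializer.Serialize(writer, item, noNamespaces);
        }
    }
}
```

With this in place, new XmlSerializer(typeof(MyObjects)).Serialize(stream, new MyObjects(myObjectsIEnumerable)) should produce a single document with one header and one root element.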