Streaming XML serialization in C# .NET
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/721537/
streaming XML serialization in .net
Asked by Luca Martinetti
I'm trying to serialize a very large IEnumerable<MyObject> using an XmlSerializer without keeping all the objects in memory.
The IEnumerable<MyObject> is actually lazy.
I'm looking for a streaming solution that will:
- Take an object from the IEnumerable<MyObject>
- Serialize it to the underlying stream using the standard serialization (I don't want to handcraft the XML here!)
- Discard the in-memory data and move to the next
I'm trying with this code:
    using (var writer = new StreamWriter(filePath))
    {
        var xmlSerializer = new XmlSerializer(typeof(MyObject));
        foreach (var myObject in myObjectsIEnumerable)
        {
            xmlSerializer.Serialize(writer, myObject);
        }
    }
but I'm getting multiple XML headers and I cannot specify a root tag <MyObjects>, so my XML is invalid.
Any idea?
Thanks
Accepted answer by Dour High Arch
The XmlWriter class is a fast streaming API for XML generation. It is rather low-level; MSDN has an article on instantiating a validating XmlWriter using XmlWriter.Create().
Edit: link fixed. Here is sample code from the article:
    async Task TestWriter(Stream stream)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Async = true;
        using (XmlWriter writer = XmlWriter.Create(stream, settings)) {
            await writer.WriteStartElementAsync("pf", "root", "http://ns");
            await writer.WriteStartElementAsync(null, "sub", null);
            await writer.WriteAttributeStringAsync(null, "att", null, "val");
            await writer.WriteStringAsync("text");
            await writer.WriteEndElementAsync();
            await writer.WriteCommentAsync("cValue");
            await writer.WriteCDataAsync("cdata value");
            await writer.WriteEndElementAsync();
            await writer.FlushAsync();
        }
    }
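Applied to the question, one way to combine the two APIs is to let XmlWriter own the document and the root element, and hand each item to XmlSerializer in turn. The following is a minimal sketch under stated assumptions, not code from the MSDN article: MyObject, its Id property, the StreamingXml.WriteAll helper and the MyObjects root name are all placeholders. It relies on XmlSerializer emitting an XML declaration only when the writer is still in its start state, so items serialized inside an already open root element should not get their own header.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical item type standing in for the question's MyObject.
public class MyObject
{
    public int Id { get; set; }
}

public static class StreamingXml
{
    // Writes <MyObjects> ... </MyObjects>, serializing one item at a time.
    public static void WriteAll(IEnumerable<MyObject> items, TextWriter output)
    {
        var serializer = new XmlSerializer(typeof(MyObject));

        // An empty prefix/namespace pair suppresses the xmlns:xsi and
        // xmlns:xsd declarations that would otherwise repeat on every item.
        var noNamespaces = new XmlSerializerNamespaces();
        noNamespaces.Add("", "");

        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("MyObjects"); // single root element

            foreach (MyObject item in items)
            {
                // The writer is already inside the root element, so no
                // additional XML declaration is written for this item.
                serializer.Serialize(writer, item, noNamespaces);
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

If items is a lazy sequence (for example an iterator method using yield return), each object is produced, written and released before the next one is requested, so only the writer's buffer stays in memory. Passing a StreamWriter over a file as output gives the file-based variant from the question.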
Answer by Chris Doggett
Here's what I use:
    using System;
    using System.Collections.Generic;
    using System.Xml;
    using System.Xml.Serialization;
    using System.Text;
    using System.IO;

    namespace Utils
    {
        public class XMLSerializer
        {
            // Encodes a string as UTF-8 bytes (no byte order mark).
            public static Byte[] StringToUTF8ByteArray(String xmlString)
            {
                return new UTF8Encoding().GetBytes(xmlString);
            }

            // Serializes the whole object graph into one in-memory string.
            public static String SerializeToXML<T>(T objectToSerialize)
            {
                StringBuilder sb = new StringBuilder();
                // Note: output built in a StringBuilder is declared as utf-16
                // in the XML header, regardless of this Encoding setting.
                XmlWriterSettings settings =
                    new XmlWriterSettings {Encoding = Encoding.UTF8, Indent = true};
                using (XmlWriter xmlWriter = XmlWriter.Create(sb, settings))
                {
                    if (xmlWriter != null)
                    {
                        new XmlSerializer(typeof(T)).Serialize(xmlWriter, objectToSerialize);
                    }
                }
                return sb.ToString();
            }

            // Deserializes an XML string back into an instance of T.
            public static void DeserializeFromXML<T>(string xmlString, out T deserializedObject) where T : class
            {
                XmlSerializer xs = new XmlSerializer(typeof (T));
                using (MemoryStream memoryStream = new MemoryStream(StringToUTF8ByteArray(xmlString)))
                {
                    deserializedObject = xs.Deserialize(memoryStream) as T;
                }
            }
        }
    }
Then just call:
    string xml = Utils.XMLSerializer.SerializeToXML(myObjectsIEnumerable);
I haven't tried it with, for example, an IEnumerable that fetches objects one at a time remotely, or any other weird use cases, but it works perfectly for List<T> and other collections that are in memory.
EDIT: Based on your comments in response to this, you could use XmlDocument.LoadXml to load the resulting XML string into an XmlDocument, save the first one to a file, and use that as your master XML file. For each item in the IEnumerable, use LoadXml again to create a new in-memory XmlDocument, grab the nodes you want, append them to the master document, and save it again, getting rid of the new one.
After you're finished, there may be a way to wrap all of the nodes in your root tag. You could also use XSL and XslCompiledTransform to write another XML file with the objects properly wrapped in the root tag.
Answer by John Saunders
You can do this by implementing the IXmlSerializable interface on the large class. The implementation of the WriteXml method can write the start tag, then simply loop over the IEnumerable<MyObject> and serialize each MyObject to the same XmlWriter, one at a time.
In this implementation, there won't be any in-memory data to get rid of (past what the garbage collector will collect).
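A minimal sketch of that idea follows, under stated assumptions: MyObject with its Id property and the MyObjectSequence wrapper are hypothetical names, and reading the XML back is deliberately left unsupported. In this sketch the root element comes from the XmlRoot attribute, which XmlSerializer writes before calling WriteXml, rather than being written by hand inside WriteXml.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical item type standing in for the question's MyObject.
public class MyObject
{
    public int Id { get; set; }
}

// Hypothetical wrapper: XmlSerializer writes the <MyObjects> element
// itself and then calls WriteXml to fill in its content.
[XmlRoot("MyObjects")]
public class MyObjectSequence : IXmlSerializable
{
    private readonly IEnumerable<MyObject> items;

    // XmlSerializer requires a public parameterless constructor.
    public MyObjectSequence() : this(new MyObject[0]) { }

    public MyObjectSequence(IEnumerable<MyObject> items)
    {
        this.items = items;
    }

    public XmlSchema GetSchema()
    {
        return null; // no schema is provided in this sketch
    }

    public void ReadXml(XmlReader reader)
    {
        // Reading back is out of scope for this write-only sketch.
        throw new NotSupportedException();
    }

    public void WriteXml(XmlWriter writer)
    {
        var serializer = new XmlSerializer(typeof(MyObject));
        foreach (MyObject item in items)
        {
            // Each item goes to the same writer, one at a time.
            serializer.Serialize(writer, item);
        }
    }
}
```

The caller would then serialize the wrapper once, for example with new XmlSerializer(typeof(MyObjectSequence)).Serialize(stream, new MyObjectSequence(myObjectsIEnumerable)), and the lazy sequence is only enumerated while WriteXml runs.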