C#: How to put an encoding attribute other than utf-16 on XML with XmlWriter?
Original question: http://stackoverflow.com/questions/427725/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to put an encoding attribute other than utf-16 on XML with XmlWriter?
Asked by agnieszka
I've got a function creating some XmlDocument:
public string CreateOutputXmlString(ICollection<Field> fields)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.Encoding = Encoding.GetEncoding("windows-1250");

    StringBuilder builder = new StringBuilder();
    XmlWriter writer = XmlWriter.Create(builder, settings);

    writer.WriteStartDocument();
    writer.WriteStartElement("data");
    foreach (Field field in fields)
    {
        writer.WriteStartElement("item");
        writer.WriteAttributeString("name", field.Id);
        writer.WriteAttributeString("value", field.Value);
        writer.WriteEndElement();
    }
    writer.WriteEndElement();
    writer.Flush();
    writer.Close();

    return builder.ToString();
}
I set an encoding, but after I create the XmlWriter it still has utf-16 encoding. I know it's because strings (and StringBuilder, I suppose) are encoded in UTF-16 and you can't change that.
So how can I easily create this XML with the encoding attribute set to "windows-1250"? It doesn't even have to be encoded in this encoding; it just has to have the specified attribute.
Edit: it has to work in .NET 2.0, so newer framework features cannot be used.
Accepted answer by Jon Skeet
You need to use a StringWriter with the appropriate encoding. Unfortunately StringWriter doesn't let you specify the encoding directly, so you need a class like this:
public sealed class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding(Encoding encoding)
    {
        this.encoding = encoding;
    }

    // XmlWriter reads this property to decide what to put in the XML declaration.
    public override Encoding Encoding
    {
        get { return encoding; }
    }
}
(This question is similar but not quite a duplicate.)
EDIT: To answer the comment: pass the StringWriterWithEncoding to XmlWriter.Create instead of the StringBuilder, then call ToString() on it at the end.
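To make that edit concrete, here is a minimal sketch of how the pieces might fit together. The class name EncodingDeclarationDemo, the method BuildXml and its Encoding parameter are illustrative assumptions, not part of the original answer; the StringWriterWithEncoding class is repeated only so the sample compiles on its own. When XmlWriter.Create receives a TextWriter, the declaration should reflect that writer's Encoding property rather than settings.Encoding.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Same class as in the answer, repeated so this sample is self-contained.
public sealed class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding(Encoding encoding)
    {
        this.encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return encoding; }
    }
}

// Hypothetical helper, named here only for illustration.
public static class EncodingDeclarationDemo
{
    public static string BuildXml(Encoding declaredEncoding)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        // The writer reports declaredEncoding, so the XML declaration names it.
        // The result is still an ordinary .NET string; no bytes are produced.
        StringWriterWithEncoding output = new StringWriterWithEncoding(declaredEncoding);
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("data");
            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

For the question's case the call would be BuildXml(Encoding.GetEncoding("windows-1250")). Note that on newer runtimes such as .NET Core, code-page encodings like windows-1250 may need an encoding provider registered first; on the .NET Framework targeted by the question they are available directly.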
Answer by agnieszka
I actually solved the problem with MemoryStream:
public static string CreateOutputXmlString(ICollection<Field> fields)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.Encoding = Encoding.GetEncoding("windows-1250");

    MemoryStream memStream = new MemoryStream();
    XmlWriter writer = XmlWriter.Create(memStream, settings);

    writer.WriteStartDocument();
    writer.WriteStartElement("data");
    foreach (Field field in fields)
    {
        writer.WriteStartElement("item");
        writer.WriteAttributeString("name", field.Id);
        writer.WriteAttributeString("value", field.Value);
        writer.WriteEndElement();
    }
    writer.WriteEndElement();
    writer.Flush();
    writer.Close();

    // Decode the bytes back into a string using the same encoding they were written with.
    string xml = Encoding.GetEncoding("windows-1250").GetString(memStream.ToArray());
    memStream.Close();
    memStream.Dispose();
    return xml;
}
Answer by Laurent LA RIZZA
Just some extra explanation of why this is so.
Strings are sequences of characters, not bytes. Strings, per se, are not "encoded", because they are using characters, which are stored as Unicode codepoints. Encoding DOES NOT MAKE SENSE at String level.
An encoding is a mapping from a sequence of code points (characters) to a sequence of bytes (for storage on byte-based systems such as filesystems or memory). The framework does not let you specify an encoding unless there is a compelling reason to, such as making 16-bit code points fit into byte-based storage.
So when you're trying to write your XML into a StringBuilder, you're actually building an XML sequence of characters and writing them as a sequence of characters, so no encoding is performed. Therefore, no Encoding field.
If you want to use an encoding, the XmlWriter has to write to a Stream.
About the solution you found with the MemoryStream: no offense intended, but it's just flapping your arms and moving hot air. You're encoding your code points with 'windows-1250' and then decoding the bytes straight back into code points. The only change that may occur is that characters not defined in windows-1250 get converted to a '?' character in the process.
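The round trip described above can be sketched in isolation. This is an illustration under assumptions, not code from the thread: RoundTripDemo, RoundTrip and CreateLatin1WithQuestionMarks are hypothetical names, and iso-8859-1 is used instead of windows-1250 only because it is available on every runtime without extra setup. The replacement fallback is set explicitly, because some encodings default to a best-fit mapping that substitutes a similar-looking character instead of '?'.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the encode-then-decode round trip.
public static class RoundTripDemo
{
    // Encodes the text to bytes and immediately decodes it again.
    // Characters the encoding can represent come back unchanged;
    // the rest come back as whatever the encoder fallback substituted.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    // iso-8859-1 with an explicit '?' replacement for unmappable characters.
    public static Encoding CreateLatin1WithQuestionMarks()
    {
        return Encoding.GetEncoding(
            "iso-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
    }
}
```

With this encoding, RoundTrip("abc", ...) returns the input unchanged, while a character outside the code page, such as a CJK ideograph, comes back as '?'. Nothing else about the string changes, which is the point the answer is making.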
To me, the right solution might be the following one. Depending on what your function is used for, you could pass a Stream as a parameter to your function, so that the caller decides whether it should be written to memory or to a file. So it would be written like this:
public static void WriteFieldsAsXmlDocument(ICollection<Field> fields, Stream outStream)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.Encoding = Encoding.GetEncoding("windows-1250");

    // Disposing the writer flushes it; the caller still owns outStream.
    using (XmlWriter writer = XmlWriter.Create(outStream, settings))
    {
        writer.WriteStartDocument();
        writer.WriteStartElement("data");
        foreach (Field field in fields)
        {
            writer.WriteStartElement("item");
            writer.WriteAttributeString("name", field.Id);
            writer.WriteAttributeString("value", field.Value);
            writer.WriteEndElement();
        }
        writer.WriteEndElement();
    }
}
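As a rough sketch of how a caller might use a function shaped like this, the sample below writes the same document to memory and to a file. Several things here are assumptions made for illustration: the Field class is a stand-in (the thread never shows its definition), the FieldXml class and the ToBytes and ToFile helpers are hypothetical, and the function takes the encoding as a third parameter, which the answer's version does not, so the sample does not depend on windows-1250 being available.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical stand-in for the Field type used in the thread.
public class Field
{
    public string Id;
    public string Value;

    public Field(string id, string value)
    {
        Id = id;
        Value = value;
    }
}

public static class FieldXml
{
    // Variation on the answer's function: the encoding is a parameter (an assumption).
    public static void WriteFieldsAsXmlDocument(
        ICollection<Field> fields, Stream outStream, Encoding encoding)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.Encoding = encoding;

        using (XmlWriter writer = XmlWriter.Create(outStream, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("data");
            foreach (Field field in fields)
            {
                writer.WriteStartElement("item");
                writer.WriteAttributeString("name", field.Id);
                writer.WriteAttributeString("value", field.Value);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
    }

    // Caller chooses memory: the result is real encoded bytes, not a string.
    public static byte[] ToBytes(ICollection<Field> fields, Encoding encoding)
    {
        using (MemoryStream memory = new MemoryStream())
        {
            WriteFieldsAsXmlDocument(fields, memory, encoding);
            return memory.ToArray();
        }
    }

    // Caller chooses a file: same function, different stream.
    public static void ToFile(ICollection<Field> fields, string path, Encoding encoding)
    {
        using (FileStream file = File.Create(path))
        {
            WriteFieldsAsXmlDocument(fields, file, encoding);
        }
    }
}
```

The design point is that the bytes and the declared encoding stay in agreement, because the same Encoding object drives both. Passing Encoding.GetEncoding("windows-1250") reproduces the answer's behaviour on runtimes where that code page is available.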
Answer by EddiG
MemoryStream memoryStream = new MemoryStream();
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Encoding = Encoding.UTF8;
XmlWriter xmlWriter = XmlWriter.Create(memoryStream, xmlWriterSettings);
xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("root", "http://www.timvw.be/ns");
xmlWriter.WriteEndElement();
xmlWriter.WriteEndDocument();
xmlWriter.Flush();
xmlWriter.Close();
string xmlString = Encoding.UTF8.GetString(memoryStream.ToArray());
Answer by SEFL
I solved mine by outputting the string to a variable then replacing any references to utf-16 with utf-8 (my app needed UTF8 encoding). Since you're using a function, you could do something similar. I use VB.net mostly, but I think the C# would look something like this.
return builder.ToString().Replace("utf-16", "utf-8");
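One caveat with a plain Replace is that it also rewrites any "utf-16" text that happens to appear in element or attribute content. A possible refinement, sketched below under the assumption that the string starts with an XML declaration, limits the replacement to the declaration itself. The class and method names are hypothetical and not from the original answer.

```csharp
using System;

// Hypothetical helper: swaps the encoding name inside the XML declaration only.
public static class DeclarationPatcher
{
    public static string ReplaceDeclaredEncoding(string xml, string oldName, string newName)
    {
        // Only act when the text really begins with an XML declaration.
        if (!xml.StartsWith("<?xml", StringComparison.Ordinal))
        {
            return xml;
        }

        int declarationEnd = xml.IndexOf("?>", StringComparison.Ordinal);
        if (declarationEnd < 0)
        {
            return xml;
        }

        // Split into the declaration and everything after it,
        // then replace inside the declaration only.
        string declaration = xml.Substring(0, declarationEnd);
        string rest = xml.Substring(declarationEnd);
        return declaration.Replace(oldName, newName) + rest;
    }
}
```

As with the original one-liner, this only changes the label in the declaration. The string itself is not re-encoded, so whoever eventually writes it out as bytes has to use the encoding that the declaration names.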