asp.net-mvc: How to GetBytes() in C# with UTF8 encoding with BOM?
Original question: http://stackoverflow.com/questions/4414088/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Nebojsa Veron
I'm having a problem with UTF8 encoding in my asp.net mvc 2 application in C#. I'm trying to let the user download a simple text file built from a string. I am trying to get the byte array with the following line:
var x = Encoding.UTF8.GetBytes(csvString);
but when I return it for download using:
return File(x, ..., ...);
I get a file without a BOM, so Croatian characters are not displayed correctly. This is because my byte array does not include a BOM after encoding. I tried inserting those bytes manually and then it shows up correctly, but that's not the best way to do it.
I also tried creating a UTF8Encoding class instance and passing a boolean value (true) to its constructor to include the BOM, but it doesn't work either.
Does anyone have a solution? Thanks!
Answered by Darin Dimitrov
Try it like this:
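public ActionResult Download()
{
    // Encode the text, then prepend the UTF-8 preamble (BOM) explicitly.
    // Concat and ToArray require "using System.Linq;".
    var data = Encoding.UTF8.GetBytes("some data");
    var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
    return File(result, "application/csv", "foo.csv");
}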
The reason is that the UTF8Encoding constructor that takes a boolean parameter doesn't do what you would expect:
byte[] bytes = new UTF8Encoding(true).GetBytes("a");
The resulting array would contain a single byte with the value of 97. There's no BOM because GetBytes() never emits the preamble: the boolean passed to the constructor only controls what GetPreamble() returns, and UTF8 itself doesn't require a BOM.
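To make the difference concrete, here is a minimal sketch. The class and method names (BomDemo, WithoutBom, WithBom, Hex) are made up for illustration; only UTF8Encoding, GetBytes, GetPreamble and BitConverter.ToString are real .NET APIs.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class, not part of the original answer.
public static class BomDemo
{
    // GetBytes returns the encoded text only, even when the
    // encoding was constructed with the "emit BOM" flag set to true.
    public static byte[] WithoutBom(string text)
    {
        return new UTF8Encoding(true).GetBytes(text);
    }

    // Prepends the preamble reported by the encoding, then the encoded text.
    public static byte[] WithBom(string text)
    {
        var encoding = new UTF8Encoding(true);
        return encoding.GetPreamble().Concat(encoding.GetBytes(text)).ToArray();
    }

    // Formats bytes as hyphen-separated hex, e.g. "EF-BB-BF-61".
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes);
    }
}
```

With these helpers, BomDemo.Hex(BomDemo.WithoutBom("a")) should give "61", while BomDemo.Hex(BomDemo.WithBom("a")) should give "EF-BB-BF-61". An encoding created with new UTF8Encoding(false) reports an empty preamble, so the same concatenation would add nothing.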
Answered by Hovhannes Hakobyan
I created a simple extension that converts a string to the byte array it would produce when written to a file or stream in a given encoding:
public static class StreamExtensions
{
    public static byte[] ToBytes(this string value, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        using (var sw = new StreamWriter(stream, encoding))
        {
            // StreamWriter emits the encoding's preamble (if any) before the text.
            sw.Write(value);
            sw.Flush();
            return stream.ToArray();
        }
    }
}
Usage:
stringValue.ToBytes(Encoding.UTF8)
This will also work for other encodings such as UTF-16, which requires a BOM.
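If you would rather avoid the StreamWriter, a similar result can probably be had by concatenating the preamble and the encoded bytes directly. The sketch below is an assumption-laden alternative, not the answer's code: PreambleExtensions and ToBytesWithPreamble are hypothetical names, and for ordinary text it should produce the same bytes as the extension above.

```csharp
using System;
using System.Text;

// Hypothetical alternative to the StreamWriter-based extension.
public static class PreambleExtensions
{
    // Encodes the string and prepends the encoding's preamble, if it has one.
    public static byte[] ToBytesWithPreamble(this string value, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(value);

        // Copy preamble and body into one contiguous array.
        var result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }
}
```

For example, "a".ToBytesWithPreamble(Encoding.Unicode) should yield the four bytes FF FE 61 00, whereas Encoding.ASCII has no preamble and yields the single byte 61.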
Answered by yfeldblum
UTF-8 does not require a BOM, because it is a sequence of 1-byte words. UTF-8 = UTF-8BE = UTF-8LE.
In contrast, UTF-16 requires a BOM at the beginning of the stream to identify whether the remainder of the stream is UTF-16BE or UTF-16LE, because UTF-16 is a sequence of 2-byte words and the BOM identifies whether the bytes in the words are BE or LE.
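The byte-order point can be checked with a small sketch. ByteOrderDemo and Dump are hypothetical names introduced here; Encoding.Unicode (UTF-16LE) and Encoding.BigEndianUnicode (UTF-16BE) are the built-in .NET encodings.

```csharp
using System;
using System.Text;

// Hypothetical helper for inspecting preambles and byte order.
public static class ByteOrderDemo
{
    // Returns the preamble followed by the encoded text as hyphen-separated hex.
    public static string Dump(Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);

        var all = new byte[preamble.Length + body.Length];
        preamble.CopyTo(all, 0);
        body.CopyTo(all, preamble.Length);
        return BitConverter.ToString(all);
    }
}
```

ByteOrderDemo.Dump(Encoding.Unicode, "a") should give "FF-FE-61-00" and ByteOrderDemo.Dump(Encoding.BigEndianUnicode, "a") should give "FE-FF-00-61": both the BOM and the 2-byte word for "a" are swapped. With Encoding.UTF8 the result is "EF-BB-BF-61", and there is no alternative ordering of the body to distinguish.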
The problem does not lie with the Encoding.UTF8 class. The problem lies with whatever program you are using to view the files.
Answered by Daniel Peñalba
Remember that .NET strings are all Unicode while they stay in memory, so if you can see your csvString correctly in the debugger, the problem is in writing the file.
In my opinion you should return a FileResult with the same encoding as the file. Try setting the encoding of the returned file,