C#: How to convert a UTF-8 byte[] to a string?
Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/1003275/
How to convert UTF-8 byte[] to string?
Asked by BCS
I have a byte[] array that is loaded from a file that I happen to know contains UTF-8.
In some debugging code, I need to convert it to a string. Is there a one liner that will do this?
Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.
Accepted answer by Zanoni
string result = System.Text.Encoding.UTF8.GetString(byteArray);
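As a minimal sketch of how that one-liner behaves, the following wraps it in a helper and adds a round-trip check. The class and method names (Utf8Demo, Decode, RoundTrips) are made up for illustration and are not part of the original answer. The check should hold for bytes that are valid UTF-8 and fail for arbitrary binary data, because invalid sequences are replaced while decoding.

```csharp
using System;
using System.Text;

// Hypothetical helper names, used only for this illustration.
public static class Utf8Demo
{
    // Decodes UTF-8 bytes into a string; returns null for null input.
    public static string Decode(byte[] utf8Bytes)
    {
        return utf8Bytes == null ? null : Encoding.UTF8.GetString(utf8Bytes);
    }

    // Returns true when decoding and re-encoding yields the same bytes,
    // which is expected only when the input is valid UTF-8.
    public static bool RoundTrips(byte[] bytes)
    {
        byte[] again = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(bytes));
        if (again.Length != bytes.Length) return false;
        for (int i = 0; i < bytes.Length; i++)
        {
            if (again[i] != bytes[i]) return false;
        }
        return true;
    }
}
```

For example, Utf8Demo.Decode(new byte[] { 0x68, 0xC3, 0xA9 }) decodes the two-byte sequence C3 A9 into a single character.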
Answer by detale
There are at least four different ways to do this conversion.
1. Encoding's GetString, but you won't be able to get the original bytes back if those bytes are not valid text in that encoding (for example, arbitrary binary data), because invalid sequences are replaced during decoding.
2. BitConverter.ToString. The output is a "-" delimited string, but there's no .NET built-in method to convert that string back to a byte array.
3. Convert.ToBase64String. You can easily convert the output string back to a byte array by using Convert.FromBase64String. Note: the output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
4. HttpServerUtility.UrlTokenEncode. You can easily convert the output string back to a byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that it needs the System.Web assembly if your project is not a web project.
A full example:
byte[] bytes = { 130, 200, 234, 23 }; // Arbitrary binary data, not a valid UTF-8 sequence
string s1 = Encoding.UTF8.GetString(bytes); // invalid sequences become U+FFFD replacement characters
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1); // decBytes1.Length == 10 !!
// decBytes1 is not the same as bytes
// Other Encoding objects (for example ASCII) can be lossy in the same way
string s2 = BitConverter.ToString(bytes); // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes
string s3 = Convert.ToBase64String(bytes); // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes
string s4 = HttpServerUtility.UrlTokenEncode(bytes); // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
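If the System.Web dependency is a problem, a URL-friendly variant can be sketched on top of Convert.ToBase64String alone. This is an assumption-laden illustration, not part of the original answer: the class name UrlSafeBase64 is hypothetical, and the output uses the common "-" and "_" substitution with the padding stripped, so it is not the same format that UrlTokenEncode produces (that method appends a padding-count digit instead).

```csharp
using System;

// Hypothetical helper, not a framework type.
public static class UrlSafeBase64
{
    // Base64-encodes the data, then swaps the URL-unsafe characters
    // and drops the trailing '=' padding.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode: restores the standard alphabet and the padding
    // before handing the text to Convert.FromBase64String.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

With the bytes from the example above, UrlSafeBase64.Encode(bytes) gives gsjqFw, and UrlSafeBase64.Decode returns the original four bytes.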
Answer by Erçin Dedeoğlu
Definition:
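// Extension methods must be declared in a non-generic static class.
public static class ByteArrayExtensions
{
    public static string ConvertByteToString(this byte[] source)
    {
        return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
    }
}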
Usage:
string result = input.ConvertByteToString();
Answer by metadings
Using (byte)b.ToString("x2"), which outputs a hex string such as b4b5dfe475e58b67:
using System;
using System.Text;

public static class Ext {
    // Formats each byte as two lowercase hex digits.
    public static string ToHexString(this byte[] hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return string.Empty;
        var s = new StringBuilder();
        foreach (byte b in hex) {
            s.Append(b.ToString("x2"));
        }
        return s.ToString();
    }
    // Parses a hex string back into bytes; a trailing odd digit is ignored.
    public static byte[] ToHexBytes(this string hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return new byte[0];
        int l = hex.Length / 2;
        var b = new byte[l];
        for (int i = 0; i < l; ++i) {
            b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return b;
    }
    // Compares two byte arrays element by element.
    public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
    {
        if (bytes == null && bytesToCompare == null) return true; // ?
        if (bytes == null || bytesToCompare == null) return false;
        if (object.ReferenceEquals(bytes, bytesToCompare)) return true;
        if (bytes.Length != bytesToCompare.Length) return false;
        for (int i = 0; i < bytes.Length; ++i) {
            if (bytes[i] != bytesToCompare[i]) return false;
        }
        return true;
    }
}
Answer by AndrewJE
Converting a byte[] to a string seems simple, but any kind of encoding is likely to mess up the output string. This little function maps each byte directly to the char with the same numeric value, so it never fails, but note that it does not decode multi-byte UTF-8 sequences:
private string ToString(byte[] bytes)
{
    string response = string.Empty;
    foreach (byte b in bytes)
        response += (Char)b;
    return response;
}
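To make the behaviour of the byte-to-char cast concrete, here is a small sketch comparing it with ISO-8859-1 (Latin-1) decoding, which should give the same result since Latin-1 maps each byte to the code point of the same value. The class and method names are hypothetical, and a StringBuilder is swapped in for the repeated string concatenation.

```csharp
using System;
using System.Text;

// Hypothetical names, for illustration only.
public static class ByteCastDemo
{
    // Same idea as the function above: one char per byte.
    public static string CastEachByte(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
        {
            sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Decodes the bytes as ISO-8859-1 (Latin-1).
    public static string DecodeLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
    }
}
```

For the UTF-8 bytes C3 A9, both methods return two characters (U+00C3 and U+00A9), whereas Encoding.UTF8.GetString returns the single character U+00E9. So the cast is only a faithful conversion when the data is ASCII or Latin-1.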
Answer by P.K.
There is also the UnicodeEncoding class, which is quite simple to use (note that it encodes UTF-16, not UTF-8):
UnicodeEncoding ByteConverter = new UnicodeEncoding(); // UTF-16, little-endian by default
string stringDataForEncoding = "My?Secret?Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding); // string -> UTF-16 bytes
Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded)); // bytes -> string
Answer by Nir
A general solution to convert from byte array to string when you don't know the encoding. StreamReader detects the encoding from a byte order mark if one is present and otherwise assumes UTF-8:
static string BytesToStringConverted(byte[] bytes)
{
    using (var stream = new MemoryStream(bytes))
    {
        using (var streamReader = new StreamReader(stream))
        {
            return streamReader.ReadToEnd();
        }
    }
}
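The following sketch illustrates what the StreamReader approach can and cannot detect, assuming the default StreamReader(Stream) constructor, which checks for a byte order mark and otherwise falls back to UTF-8. The names BomDemo, ReadWithStreamReader and WithPreamble are invented for this example.

```csharp
using System.IO;
using System.Text;

// Hypothetical names, for illustration only.
public static class BomDemo
{
    // Same technique as above: let StreamReader pick the encoding.
    public static string ReadWithStreamReader(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Builds test data: the encoding's byte order mark followed by the text.
    public static byte[] WithPreamble(Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        var result = new byte[preamble.Length + body.Length];
        preamble.CopyTo(result, 0);
        body.CopyTo(result, preamble.Length);
        return result;
    }
}
```

UTF-16 bytes that start with a byte order mark are read back correctly, while the same bytes without the mark are treated as UTF-8 and come out wrong. A side effect worth knowing is that StreamReader drops a leading UTF-8 byte order mark, whereas Encoding.UTF8.GetString keeps it as a U+FEFF character at the start of the string.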
Answer by Fehr
Alternatively:
var byteStr = Convert.ToBase64String(bytes);
Answer by Nyerguds
A Linq one-liner for converting a byte array byteArrFilename read from a file to a pure ASCII, C-style zero-terminated string would be the following. It is handy for reading things like file index tables in old archive formats.
String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
.Select(x => x < 128 ? (Char)x : '?').ToArray());
I use '?' as the default char for anything that is not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.
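As a sketch of how the one-liner might be packaged for reuse, the method below takes the fallback character as a parameter. The names CStringDemo, ReadAsciiCString and the fallback parameter are assumptions made for this illustration.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper around the one-liner above.
public static class CStringDemo
{
    // Reads bytes up to (not including) the first zero byte and maps
    // anything outside the 7-bit ASCII range to the fallback character.
    public static string ReadAsciiCString(byte[] buffer, char fallback = '?')
    {
        return new string(buffer
            .TakeWhile(b => b != 0)
            .Select(b => b < 128 ? (char)b : fallback)
            .ToArray());
    }
}
```

For the bytes 41 42 E9 00 43 this yields "AB?", because reading stops at the zero byte and E9 is not ASCII. Passing '\0' as the fallback marks the replaced position with a character that cannot come from the input itself.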