C#: How can I convert extended ASCII to a System.String?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/666385/

Time: 2020-08-04 12:30:36  Source: igfitidea

How can I convert extended ascii to a System.String?

c# .net extended-ascii

Asked by rtremaine

For example: "½" or ASCII DEC 189. When I read the bytes from a text file, the byte[] contains the valid value, in this case 189.

Converting to Unicode results in the Unicode replacement character 65533.

UnicodeEncoding.Unicode.GetString(b);

Converting to ASCII results in 63 or "?"

ASCIIEncoding.ASCII.GetString(b);

If this isn't possible what is the best way to handle this data? I'd like to be able to perform string functions like Replace().

Accepted answer by Richard

Byte 189 represents a "½" in iso-8859-1 (aka "Latin-1"), so the following may be what you want:

var e = Encoding.GetEncoding("iso-8859-1");
var s = e.GetString(new byte[] { 189 });
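
A slightly fuller sketch of the same idea, showing that once the bytes are decoded the result is an ordinary string that supports Replace(), as the question asks. The names Latin1Demo and Decode are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // Hypothetical helper: decodes bytes as ISO-8859-1, where each byte maps
    // to the Unicode code point with the same numeric value.
    public static string Decode(byte[] data)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(data);
    }

    public static void Main()
    {
        byte[] raw = { 49, 189, 32, 99, 117, 112 }; // '1', byte 189, then " cup"
        string text = Decode(raw);

        // The decoded character keeps the numeric value of the original byte.
        Console.WriteLine((int)text[1]); // 189

        // Normal string functions now work on the decoded text.
        Console.WriteLine(text.Replace("\u00BD", " 1/2")); // 1 1/2 cup
    }
}
```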

All strings and chars in .NET are UTF-16 encoded, so you need an encoder/decoder to convert anything else. Sometimes the encoding is defaulted (e.g. UTF-8 for StreamReader instances), but good practice is to always specify it.

You will need some form of implicit or (better) explicit metadata to tell you which encoding the data uses.
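
As an illustration of "always specify", here is a minimal sketch that passes the encoding explicitly when reading a file instead of relying on a default. It assumes the file really is ISO-8859-1; the names ExplicitEncodingDemo and ReadAs are hypothetical, and the temporary file only exists to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Text;

public static class ExplicitEncodingDemo
{
    // Hypothetical helper: reads a whole file, decoding it with the named encoding.
    public static string ReadAs(string path, string encodingName)
    {
        return File.ReadAllText(path, Encoding.GetEncoding(encodingName));
    }

    public static void Main()
    {
        // Write a one-byte file containing 189 so the example is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 189 });

        string text = ReadAs(path, "iso-8859-1");
        Console.WriteLine((int)text[0]); // 189, i.e. U+00BD

        File.Delete(path);
    }
}
```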

Answer by Jon Skeet

It depends on exactly what the encoding is.

There's no such thing as "ASCII 189" - ASCII only goes up to 127. There are many 8-bit encodings which use ASCII for the first 128 values.
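
To make that concrete, the sketch below decodes the same two byte values with three encodings that ship with .NET. The names HighByteDemo and DecodeOne are illustrative only. Values below 128 come out the same everywhere, while 189 does not.

```csharp
using System;
using System.Text;

public static class HighByteDemo
{
    // Decodes a single byte with the given encoding and returns the
    // code point of the first resulting character.
    public static int DecodeOne(Encoding encoding, byte value)
    {
        return encoding.GetString(new[] { value })[0];
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // 65 ('A') is inside the ASCII range, so all three decoders agree.
        Console.WriteLine(DecodeOne(Encoding.ASCII, 65)); // 65
        Console.WriteLine(DecodeOne(latin1, 65));         // 65
        Console.WriteLine(DecodeOne(Encoding.UTF8, 65));  // 65

        // 189 is outside it, so the result depends on the encoding.
        Console.WriteLine(DecodeOne(Encoding.ASCII, 189)); // 63, '?'
        Console.WriteLine(DecodeOne(latin1, 189));         // 189, U+00BD
        Console.WriteLine(DecodeOne(Encoding.UTF8, 189));  // 65533, U+FFFD
    }
}
```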

You may want Encoding.Default (which is the default encoding for your particular system), but it's hard to know for sure. Where did your data come from?

Answer by Tom Wilson

The old PC-8 or Extended ASCII character set was around before IBM and Microsoft introduced the idea of Code Pages to the PC world. This WAS Extended ASCII - in 1982. In fact, it was the ONLY character set available on PCs at the time, up until the EGA card allowed you to load other fonts into VRAM.

This was also the default standard for ANSI terminals, and nearly every BBS I dialed up to in the 80's and early 90's used this character set for displaying menus and boxes.

Here's the code to turn 8-bit Extended ASCII into Unicode text. Note the key bit of code: the GetEncoding("437"). That uses Code Page 437 to translate the 8-bit ASCII text to the Unicode equivalent.

    string ASCII8ToString(byte[] ASCIIData)
    {
        var e = Encoding.GetEncoding("437");
        return e.GetString(ASCIIData);
    }
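
One caveat, stated as an assumption rather than as part of the original answer: on .NET Core and .NET 5+, code page 437 is typically not available until the code pages provider is registered, and on some targets that provider comes from the System.Text.Encoding.CodePages package. A sketch of the same function with that registration added (Cp437Demo and Ascii8ToString are illustrative names):

```csharp
using System;
using System.Text;

public static class Cp437Demo
{
    public static string Ascii8ToString(byte[] data)
    {
        // Assumption: needed on .NET Core / .NET 5+ so that code page 437
        // can be resolved. Registering the provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(437).GetString(data);
    }

    public static void Main()
    {
        // 201, 205 and 187 fall in the box-drawing range of code page 437.
        string box = Ascii8ToString(new byte[] { 201, 205, 187 });
        foreach (char c in box)
        {
            // Print each decoded character as a Unicode code point.
            Console.WriteLine("U+{0:X4}", (int)c);
        }
    }
}
```

Note that byte 189 decodes to a different character here than it does under iso-8859-1, which is why knowing the source encoding matters.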

Answer by Ritwik

System.String[] cannot store characters with ASCII > 127. If you are trying to work on any extended ASCII characters such as ? ¢ ? ?, here is the method to convert them into their binary and decimal equivalents
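
The method itself is not included in this copy of the answer. As a guess at the kind of conversion meant, here is a minimal sketch that prints a byte's decimal and binary forms. ByteFormatDemo and Describe are hypothetical names, not the answerer's code.

```csharp
using System;

public static class ByteFormatDemo
{
    // Hypothetical helper: formats a byte as "decimal / 8-digit binary".
    public static string Describe(byte value)
    {
        // Convert.ToString(value, 2) gives the base-2 digits; pad to a full byte.
        string binary = Convert.ToString(value, 2).PadLeft(8, '0');
        return value + " / " + binary;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(189)); // 189 / 10111101
    }
}
```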
