C++: System where 1 byte != 8 bit?
Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/5516044/
Asked by Xeo
All the time I read sentences like
don't rely on 1 byte being 8 bit in size
use CHAR_BIT instead of 8 as a constant to convert between bits and bytes
et cetera. What real life systems are there today, where this holds true? (I'm not sure if there are differences between C and C++ regarding this, or if it's actually language agnostic. Please retag if necessary.)
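For readers unsure what "use CHAR_BIT instead of 8" looks like in practice, here is a minimal sketch. The names kCharBits, bit_width_of, and bytes_for_bits are made up for illustration and do not come from the question or from any library.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// CHAR_BIT as a size_t, so the arithmetic below stays unsigned.
constexpr std::size_t kCharBits = CHAR_BIT;

// Bits in the object representation of T (padding bits included).
// Hypothetical helper name.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * kCharBits;
}

// Bytes needed to hold n_bits bits, rounding up. Hypothetical helper name.
constexpr std::size_t bytes_for_bits(std::size_t n_bits) {
    return (n_bits + kCharBits - 1) / kCharBits;
}

// sizeof(char) is 1 by definition, so a char is exactly CHAR_BIT bits wide.
static_assert(bit_width_of<char>() == kCharBits, "char is one byte");
```

On an implementation with 8-bit bytes, bytes_for_bits(12) is 2; with a CHAR_BIT of 16 it would be 1, which is the difference a hard-coded 8 hides.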
Accepted answer by Jerry Coffin
On older machines, codes smaller than 8 bits were fairly common, but most of those have been dead and gone for years now.
C and C++ have mandated a minimum of 8 bits for char, at least as far back as the C89 standard. [Edit: For example, C90, §5.2.4.2.1 requires CHAR_BIT >= 8 and UCHAR_MAX >= 255. C89 uses a different section number (I believe that would be §2.2.4.2.1) but identical content]. They treat "char" and "byte" as essentially synonymous [Edit: for example, CHAR_BIT is described as: "number of bits for the smallest object that is not a bitfield (byte)".]
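The minimums quoted above can be written down as compile-time checks. This is only a sketch of what those requirements mean in code; on a conforming implementation the assertions can never fire, so they document the guarantee more than they test anything.

```cpp
#include <climits>  // CHAR_BIT, UCHAR_MAX

// Minimums from the limits section quoted above.
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");
static_assert(UCHAR_MAX >= 255, "unsigned char covers at least 0..255");

// sizeof counts in units of char, which is why "char" and "byte"
// end up meaning the same thing.
static_assert(sizeof(char) == 1, "char is the byte");
```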
There are, however, current machines (mostly DSPs) where the smallest type is larger than 8 bits -- a minimum of 12, 14, or even 16 bits is fairly common. Windows CE does roughly the same: its smallest type (at least with Microsoft's compiler) is 16 bits. They do not, however, treat a char as 16 bits -- instead they take the (non-conforming) approach of simply not supporting a type named char at all.
Answer by John R. Strohm
TODAY, in the world of C++ on x86 processors, it is pretty safe to rely on one byte being 8 bits. Processors where the word size is not a power of 2 (8, 16, 32, 64) are very uncommon.
IT WAS NOT ALWAYS SO.
The Control Data 6600 (and its brothers) Central Processor used a 60-bit word, and could only address a word at a time. In one sense, a "byte" on a CDC 6600 was 60 bits.
The DEC-10 byte pointer hardware worked with arbitrary-size bytes. The byte pointer included the byte size in bits. I don't remember whether bytes could span word boundaries; I think they couldn't, which meant that you'd have a few waste bits per word if the byte size was not 3, 4, 9, or 18 bits. (The DEC-10 used a 36-bit word.)
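The waste-bit arithmetic is easy to check. The sketch below takes the answer's own (uncertain) assumption that bytes do not span word boundaries; kWordBits, bytes_per_word, and wasted_bits are illustrative names only.

```cpp
// 36-bit word, as stated in the answer.
constexpr int kWordBits = 36;

// Assumption (from the answer, not verified here): a byte never
// straddles two words, so leftover bits in each word go unused.
constexpr int bytes_per_word(int byte_bits) { return kWordBits / byte_bits; }
constexpr int wasted_bits(int byte_bits) { return kWordBits % byte_bits; }

// 9-bit bytes pack exactly: four per word, nothing left over.
static_assert(bytes_per_word(9) == 4 && wasted_bits(9) == 0, "");
// 7-bit bytes: five per word, one bit unused.
static_assert(bytes_per_word(7) == 5 && wasted_bits(7) == 1, "");
// 8-bit bytes: four per word, four bits unused.
static_assert(bytes_per_word(8) == 4 && wasted_bits(8) == 4, "");
```

By the same arithmetic any divisor of 36 packs without waste, so 6 and 12 behave like the sizes listed in the answer.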
Answer by R.. GitHub STOP HELPING ICE
Unless you're writing code that could be useful on a DSP, you're completely entitled to assume bytes are 8 bits. All the world may not be a VAX (or an Intel), but all the world has to communicate, share data, establish common protocols, and so on. We live in the internet age built on protocols built on octets, and any C implementation where bytes are not octets is going to have a really hard time using those protocols.
It's also worth noting that both POSIX and Windows have (and mandate) 8-bit bytes. That covers 100% of interesting non-embedded machines, and these days a large portion of non-DSP embedded systems as well.
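To make the point about octet-based protocols concrete, here is a hedged sketch of encoding a 32-bit value as four octets in big-endian (network) order. The function names put_u32_be and get_u32_be are made up for this example. Masking with 0xFF keeps one octet per array element, so the arithmetic itself does not care how wide a char is; what a non-octet machine still has to solve is getting those elements onto the wire.

```cpp
#include <cstdint>  // std::uint_least32_t

// Store value as four octets, most significant first.
// Each element carries one octet in its low 8 bits.
inline void put_u32_be(std::uint_least32_t value, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Rebuild the value from four octets, most significant first.
inline std::uint_least32_t get_u32_be(const unsigned char in[4]) {
    return (static_cast<std::uint_least32_t>(in[0] & 0xFFu) << 24) |
           (static_cast<std::uint_least32_t>(in[1] & 0xFFu) << 16) |
           (static_cast<std::uint_least32_t>(in[2] & 0xFFu) << 8) |
           static_cast<std::uint_least32_t>(in[3] & 0xFFu);
}
```

std::uint_least32_t is used instead of std::uint32_t because the exact-width type is optional and may not exist where bytes are not octets.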
Answer by Daniel A. White
From Wikipedia:
The size of a byte was at first selected to be a multiple of existing teletypewriter codes, particularly the 6-bit codes used by the U.S. Army (Fieldata) and Navy. In 1963, to end the use of incompatible teleprinter codes by different branches of the U.S. government, ASCII, a 7-bit code, was adopted as a Federal Information Processing Standard, making 6-bit bytes commercially obsolete. In the early 1960s, AT&T introduced digital telephony first on long-distance trunk lines. These used the 8-bit μ-law encoding. This large investment promised to reduce transmission costs for 8-bit data. The use of 8-bit codes for digital telephony also caused 8-bit data "octets" to be adopted as the basic data unit of the early Internet.
Answer by Alexander Gessler
As an average programmer on mainstream platforms, you do not need to worry too much about one byte not being 8 bit. However, I'd still use the CHAR_BIT constant in my code and assert (or better static_assert) any locations where you rely on 8 bit bytes. That should put you on the safe side.
(I am not aware of any relevant platform where it doesn't hold true).
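A minimal sketch of that advice: put the static_assert next to code that really does break with wider bytes. The helper byte_to_hex is a made-up example, not something from the answer.

```cpp
#include <climits>  // CHAR_BIT

// Fail the build instead of misbehaving on a platform with wider bytes.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Hypothetical helper: writes exactly two hex digits for one byte.
// Two digits only cover a byte when it has 8 bits; with a wider byte,
// b >> 4 could index past the end of the digit table.
inline void byte_to_hex(unsigned char b, char out[2]) {
    static const char digits[] = "0123456789abcdef";
    out[0] = digits[b >> 4];
    out[1] = digits[b & 0x0F];
}
```

A runtime assert would catch the same problem only when the code runs, which is why the compile-time form is usually the better choice here.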
Answer by AnT
Firstly, the number of bits in char does not formally depend on the "system" or on "machine", even though this dependency is usually implied by common sense. The number of bits in char depends only on the implementation (i.e. on the compiler). There's no problem implementing a compiler that will have more than 8 bits in char for any "ordinary" system or machine.
Secondly, there are several embedded platforms where sizeof(char) == sizeof(short) == sizeof(int), each having 16 bits (I don't remember the exact names of these platforms). Also, the well-known Cray machines had similar properties with all these types having 32 bits in them.
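The relationship described here can be spelled out in code. This is a sketch, assuming a conforming C++ implementation; kCharBits and all_same_size are illustrative names. Since sizeof(char) is always 1, the three sizes can only be equal when int is also one byte, which in turn requires CHAR_BIT to be at least 16.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

constexpr std::size_t kCharBits = CHAR_BIT;

// Always true: sizeof is measured in chars.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");

// short and int need at least 16 bits each, however wide a byte is.
static_assert(sizeof(short) * kCharBits >= 16, "short has at least 16 bits");
static_assert(sizeof(int) * kCharBits >= 16, "int has at least 16 bits");

// True on the kind of platform the answer describes, false on typical
// desktop implementations. Hypothetical helper name.
constexpr bool all_same_size() {
    return sizeof(char) == sizeof(short) && sizeof(short) == sizeof(int);
}
```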
Answer by John Leidegren
Historically, there have been a bunch of odd architectures that did not use native word sizes that were multiples of 8. If you ever come across any of these today, let me know.
- The first commercial CPU by Intel was the Intel 4004 (4-bit)
- PDP-8 (12-bit)
The size of the byte has historically been hardware dependent and no definitive standards exist that mandate the size.
It might just be a good thing to keep in mind if you're doing lots of embedded stuff.
Answer by dubnde
I do a lot of embedded work and am currently working on DSP code with a CHAR_BIT of 16.