Original question: http://stackoverflow.com/questions/19842215/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Wrap around explanation for signed and unsigned variables in C?
Asked by orustammanapov
I read in the C spec that unsigned variables (in particular unsigned short int) perform a so-called wrap-around on integer overflow, but I couldn't find anything on signed variables except that I am left with undefined behavior.
My professor told me that their values also get wrapped around (maybe he just meant gcc). I thought the bits just get truncated, and the bits I am left with give me some weird value!
What is wrap-around, and how is it different from just truncating bits?
Answered by AnT
Signed integer variables do not have wrap-around behavior in the C language. Signed integer overflow during arithmetic computations produces undefined behavior. Note, by the way, that the GCC compiler you mentioned is known for implementing strict overflow semantics in optimizations, meaning that it takes advantage of the freedom provided by such undefined behavior: GCC assumes that signed integer values never wrap around. That means GCC actually happens to be one of the compilers in which you cannot rely on wrap-around behavior of signed integer types.
For example, the GCC compiler can assume that for a variable int i the following condition
if (i > 0 && i + 1 > 0)
is equivalent to a mere
if (i > 0)
This is exactly what strict overflow semantics means.
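Because the compiler may fold a test such as `i + 1 > 0` away, an overflow check has to be written so that the overflowing expression is never evaluated in the first place. A minimal sketch of that idea, assuming a hypothetical helper named `safe_increment` (the name and interface are illustrative, not part of the original answer):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores i + 1 in *out and returns true,
 * or returns false without evaluating i + 1 when it would overflow. */
bool safe_increment(int i, int *out)
{
    if (i == INT_MAX) {
        return false;   /* i + 1 would be undefined behavior */
    }
    *out = i + 1;       /* cannot overflow here */
    return true;
}
```

The comparison against `INT_MAX` happens before any arithmetic, so there is no undefined behavior for the optimizer to reason about.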
Unsigned integer types implement modulo arithmetic. The modulus is equal to 2^N, where N is the number of bits in the value representation of the type. For this reason unsigned integer types do indeed appear to wrap around on overflow.
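The modulo rule can be observed directly with `unsigned int`. The sketch below uses a hypothetical helper name, `add_wrapping`, purely for illustration:

```c
#include <limits.h>

/* Hypothetical helper: unsigned addition. The result is always the
 * mathematical sum reduced modulo UINT_MAX + 1, so it is well-defined
 * for every pair of inputs. */
unsigned add_wrapping(unsigned a, unsigned b)
{
    return a + b;
}
```

For instance, `add_wrapping(UINT_MAX, 1u)` yields 0, because UINT_MAX + 1 is exactly the modulus 2^N.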
However, the C language never performs arithmetic computations in domains smaller than that of int/unsigned int. The type unsigned short int that you mention in your question will typically be promoted to type int in expressions before any computations begin (assuming that the range of unsigned short fits into the range of int). This means that 1) computations with unsigned short int will be performed in the domain of int, with overflow happening when int overflows, and 2) overflow during such computations will lead to undefined behavior, not to wrap-around behavior.
For example, this code produces a wrap-around
unsigned i = USHRT_MAX;
i *= INT_MAX; /* <- unsigned arithmetic, overflows, wraps around */
while this code
unsigned short i = USHRT_MAX;
i *= INT_MAX; /* <- signed arithmetic, overflows, produces undefined behavior */
leads to undefined behavior.
If no int overflow happens and the result is converted back to the unsigned short int type, it is again reduced modulo 2^N, which will appear as if the value has wrapped around.
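Both effects can be sketched in a few lines, assuming a typical platform where int is wider than unsigned short. The helper names `increment_ushort` and `multiply_ushort` are hypothetical and only serve the illustration:

```c
#include <limits.h>

/* Hypothetical helper: increments an unsigned short.
 * s is promoted (to int on typical platforms), 1 is added there, and
 * the assignment converts the result back modulo USHRT_MAX + 1. */
unsigned short increment_ushort(unsigned short s)
{
    s = s + 1;
    return s;
}

/* Hypothetical helper: multiplies two unsigned short values without
 * risking signed overflow, by forcing the arithmetic into unsigned int. */
unsigned short multiply_ushort(unsigned short a, unsigned short b)
{
    unsigned product = (unsigned)a * b;   /* unsigned arithmetic, wraps */
    return (unsigned short)product;       /* reduced modulo USHRT_MAX + 1 */
}
```

Under the stated assumption, `increment_ushort(USHRT_MAX)` returns 0 even though the addition itself never overflowed in int. The cast in `multiply_ushort` is one way to keep a product of two large unsigned short values out of signed arithmetic.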
Answered by John Bode
Imagine you have a data type that's only 3 bits wide. This allows you to represent 8 distinct values, from 0 through 7. If you add 1 to 7, you will "wrap around" back to 0, because you don't have enough bits to represent the value 8 (1000).
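C has no 3-bit type, but the idea can be imitated by masking an ordinary unsigned value. In this sketch the helper name `increment_3bit` is hypothetical:

```c
/* Hypothetical helper: adds 1 to a 3-bit unsigned value.
 * Masking with 7 (binary 111) keeps only the low three bits, which is
 * the same as reducing the sum modulo 8. */
unsigned increment_3bit(unsigned value)
{
    return (value + 1u) & 7u;
}
```

`increment_3bit(7u)` gives 0. For unsigned values, discarding the bits that do not fit and wrapping around modulo 2^N produce the same result.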
This behavior is well-defined for unsigned types. It is not well-defined for signed types, because there are multiple methods for representing signed values, and the result of an overflow will be interpreted differently based on that method.
Sign-magnitude: the uppermost bit represents the sign; 0 for positive, 1 for negative. If my type is three bits wide again, then I can represent signed values as follows:
000 = 0
001 = 1
010 = 2
011 = 3
100 = -0
101 = -1
110 = -2
111 = -3
Since one bit is taken up for the sign, I only have two bits to encode a value from 0 to 3. If I add 1 to 3, I'll overflow with -0 as the result. Yes, there are two representations for 0, one positive and one negative. You won't encounter sign-magnitude representation all that often.
One's-complement: the negative value is the bitwise-inverse of the positive value. Again, using the three-bit type:
000 = 0
001 = 1
010 = 2
011 = 3
100 = -3
101 = -2
110 = -1
111 = -0
I have three bits to encode my values, but the range is [-3, 3]. If I add 1 to 3, I'll overflow with -3 as the result. This is different from the sign-magnitude result above. Again, there are two encodings for 0 using this method.
Two's-complement: the negative value is the bitwise inverse of the positive value, plus 1. In the three-bit system:
000 = 0
001 = 1
010 = 2
011 = 3
100 = -4
101 = -3
110 = -2
111 = -1
If I add 1 to 3, I'll overflow with -4 as a result, which is different from the previous two methods. Note that we have a slightly larger range of values, [-4, 3], and only one representation for 0.
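The three tables can be reproduced by decoding the same 3-bit pattern in three ways. The following is a sketch only; the function names are hypothetical, and both zero encodings come out as the int value 0, since int has no separate negative zero here:

```c
/* Hypothetical helpers: interpret the low three bits of `bits`
 * under each of the three signed representations. */

int decode_sign_magnitude(unsigned bits)
{
    int magnitude = (int)(bits & 3u);            /* low two bits */
    return (bits & 4u) ? -magnitude : magnitude; /* top bit is the sign */
}

int decode_ones_complement(unsigned bits)
{
    bits &= 7u;
    /* A set top bit means: invert all three bits, then negate. */
    return (bits & 4u) ? -(int)(~bits & 7u) : (int)bits;
}

int decode_twos_complement(unsigned bits)
{
    bits &= 7u;
    /* The top bit has weight -4, so subtract 8 from the unsigned value. */
    return (bits & 4u) ? (int)bits - 8 : (int)bits;
}
```

The pattern 100, which is what 011 + 1 produces, decodes to 0, -3 and -4 respectively, matching the three overflow results described in the text.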
Two's complement is probably the most common method of representing signed values, but it's not the only one, hence the C standard can't make any guarantees about what will happen when you overflow a signed integer type. So it leaves the behavior undefined, so that the compiler doesn't have to deal with interpreting multiple representations.
Answered by diapir
The undefined behavior comes from early portability issues, when signed integer types could be represented as sign and magnitude, one's complement, or two's complement.
Nowadays, virtually all architectures represent integers as two's complement, which does wrap around in hardware. But be careful: since your compiler is right to assume you won't trigger undefined behavior, you might encounter weird bugs when optimisation is on.
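When wrap-around of a signed value is genuinely wanted, one common workaround is to do the arithmetic in unsigned and convert the result back. The sketch below assumes a two's-complement implementation; the helper name `add_wrapping_int` is hypothetical. Converting an out-of-range value back to int is implementation-defined rather than undefined, and GCC documents that conversion as reduction modulo 2^N.

```c
#include <limits.h>

/* Hypothetical helper: signed addition that wraps instead of invoking
 * undefined behavior, on implementations where the unsigned-to-int
 * conversion reduces modulo 2^N (an assumption, true for GCC). */
int add_wrapping_int(int a, int b)
{
    unsigned sum = (unsigned)a + (unsigned)b;  /* unsigned arithmetic, wraps */
    return (int)sum;  /* implementation-defined when sum > INT_MAX */
}
```

GCC also offers the `-fwrapv` option, which tells the compiler to treat signed overflow as wrapping, at the cost of making the code depend on that flag.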
Answered by NL - Apologize to Monica
In a signed 8-bit integer, the intuitive definition of wrap-around might look like going from +127 to -128, which in two's complement binary is 01111111 (127) to 10000000 (-128). As you can see, that is the natural progression of incrementing the binary data, without considering whether it represents a signed or an unsigned integer. Counterintuitively, the actual overflow in the unsigned integer's sense of wrap-around takes place when moving from -1 (11111111) to 0 (00000000).
This doesn't answer the deeper question of what the correct behavior is when a signed integer overflows because there is no "correct" behavior according to the standard.

