C: Does conversion between signed and unsigned integers preserve the exact bit pattern of a variable in memory?

Question

Asked by Flash

I want to pass a 32-bit signed integer x through a socket. So that the receiver knows which byte order to expect, I am calling htonl(x) before sending. htonl expects a uint32_t though, and I want to be sure of what happens when I cast my int32_t to a uint32_t.

int32_t x = something;
uint32_t u = (uint32_t) x;

Is it always the case that the bytes in x and u will be exactly the same? What about casting back:

uint32_t u = something;
int32_t x = (int32_t) u;

I realise that negative values cast to large unsigned values but that doesn't matter since I'm just casting back on the other end. However if the cast messes with the actual bytes then I can't be sure casting back will return the same value.

Answer 1

Accepted answer by Christoph

In general, casting in C is specified in terms of values, not bit patterns - the former will be preserved (if possible), but the latter not necessarily so. In case of two's complement representations without padding - which is mandatory for the fixed-width integer types - this distinction does not matter and the cast will indeed be a no-op.
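
The no-op claim can be checked by comparing the stored bytes before and after the cast. The following is only a minimal sketch; the helper name same_bytes_after_cast is made up for illustration and is not part of the original answer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if x and (uint32_t)x are stored as
 * exactly the same bytes, 0 otherwise. */
int same_bytes_after_cast(int32_t x)
{
    uint32_t u = (uint32_t)x;              /* value conversion: x modulo 2^32 */
    return memcmp(&x, &u, sizeof x) == 0;  /* compare object representations */
}
```

Since int32_t is required to be two's complement without padding, this should return 1 for every input, including negative ones.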

But even if the conversion from signed to unsigned would have changed the bit pattern, converting it back again would have restored the original value - with the caveat that out-of-range unsigned to signed conversion is implementation-defined and may raise a signal on overflow.
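
Applied to the socket case from the question, the round trip might look like the sketch below. It assumes a POSIX system that provides htonl and ntohl in <arpa/inet.h>; the function names encode_i32 and decode_i32 are invented for this example. The cast back to int32_t relies on the implementation-defined behaviour mentioned above, which mainstream compilers define as wrapping.

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htonl, ntohl (POSIX assumption) */

/* Hypothetical sender side: convert the value, then fix the byte order. */
uint32_t encode_i32(int32_t x)
{
    return htonl((uint32_t)x);
}

/* Hypothetical receiver side: restore host byte order, then convert back.
 * For values above INT32_MAX this conversion is implementation-defined. */
int32_t decode_i32(uint32_t wire)
{
    return (int32_t)ntohl(wire);
}
```

The value returned by encode_i32 is what would be written to the socket; the receiver passes the four bytes it read to decode_i32.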

For full portability (which will probably be overkill), you'll need to use type punning instead of conversion. This can be done in one of two ways:

Via pointer casts, i.e.

uint32_t u = *(uint32_t*)&x;

which you should be careful with, as it may violate the effective type rules (but is fine for signed/unsigned variants of integer types), or via unions, i.e.

uint32_t u = ((union { int32_t i; uint32_t u; }){ .i = x }).u;

which can also be used, e.g., to reinterpret a double as a uint64_t, which you may not do with pointer casts if you want to avoid undefined behaviour.
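
As a sketch of the double case, the two functions below read the bits of a double through a union and, as a commonly used alternative that the answer does not mention, through memcpy. The function names are hypothetical, and the code assumes that double is 64 bits wide, which the static assertion checks.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: double occupies exactly 64 bits on this platform. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double must be 64 bits wide for this sketch");

/* Type punning through a union, as described above. */
uint64_t double_bits_union(double d)
{
    union { double d; uint64_t u; } pun = { .d = d };
    return pun.u;
}

/* Alternative: copy the object representation byte by byte. */
uint64_t double_bits_memcpy(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

Both functions should return the same bit pattern for any input; neither dereferences a pointer of the wrong type.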

Answer 2

Answer by Filipe Gonçalves

Casts are used in C to mean both "type conversion" and "type disambiguation". If you have something like

(float) 3

Then it's a type conversion, and the actual bits change. If you say

(float) 3.0

it's a type disambiguation.
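
To see that an integer-to-float conversion really does change the stored bits, one can compare the bytes before and after, as in the sketch below. The helper name float_cast_keeps_bytes is made up for illustration, and the code assumes a 32-bit float, which the static assertion checks.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
_Static_assert(sizeof(float) == sizeof(int32_t),
               "float must be 32 bits wide for this sketch");

/* Hypothetical helper: returns 1 if (float)i is stored as the same bytes
 * as the integer i, 0 if the conversion produced a different pattern. */
int float_cast_keeps_bytes(int32_t i)
{
    float f = (float)i;                    /* value-preserving conversion */
    return memcmp(&f, &i, sizeof i) == 0;  /* compare object representations */
}
```

For a value such as 3 the helper should return 0, because the floating-point encoding of 3 is not the integer encoding of 3.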

Assuming a 2's complement representation (see comments below), when you cast an int to unsigned int, the bit pattern is not changed, only its semantic meaning; if you cast it back, the result will always be correct. It falls into the case of type disambiguation because no bits are changed, only the way that the computer interprets them.

Note that, in theory, 2's complement may not be used, and unsigned and signed can have very different representations, and the actual bit pattern can change in that case.

However, from C11 (the current C standard), you actually are guaranteed that sizeof(int) == sizeof(unsigned int):

(§6.2.5/6) For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements [...]

I would say that in practice, you can assume it is safe.

Answer 3

Answer by Jens Gustedt

This should always be safe, because the intXX_t types are guaranteed to be in two's complement if they exist:

(§7.20.1.1 Exact-width integer types) The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two's complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.

Theoretically, the back-conversion from uint32_t to int32_t is implementation-defined for values that do not fit in an int32_t, as for all unsigned to signed conversions. But I can't really imagine that a platform would do anything other than what you expect.

If you want to be really sure of this, you could still do that conversion manually. You'd just have to test whether the value is > INT32_MAX and then do a little bit of math. Even if you do that systematically, a decent compiler should be able to detect that and optimize it out.
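
One way to write that manual conversion is sketched below. The function name u32_to_i32 is made up for this example, and the arithmetic is just one possible formulation that avoids any out-of-range conversion.

```c
#include <stdint.h>

/* Hypothetical helper: maps a uint32_t to the int32_t with the same
 * two's complement bit pattern without relying on
 * implementation-defined conversion behaviour. */
int32_t u32_to_i32(uint32_t u)
{
    if (u <= INT32_MAX)
        return (int32_t)u;   /* value fits, the conversion preserves it */

    /* u lies in [2^31, 2^32 - 1]. Subtracting 2^31 in unsigned arithmetic
     * gives a value in [0, 2^31 - 1], which fits in int32_t; adding
     * INT32_MIN then lands in [-2^31, -1] without overflow. */
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

For example, passing UINT32_MAX yields -1 and passing 0x80000000 yields INT32_MIN.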
