Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and keep it under the same license.
Original question: http://stackoverflow.com/questions/1490092/
C/C++: Force Bit Field Order and Alignment
Asked by dewald
I read that the order of bit fields within a struct is platform specific. What about if I use different compiler-specific packing options, will this guarantee data is stored in the proper order as they are written? For example:
struct Message
{
    unsigned int version : 3;
    unsigned int type : 1;
    unsigned int id : 5;
    unsigned int data : 6;
} __attribute__ ((__packed__));
On an Intel processor with the GCC compiler, the fields were laid out in memory as they are shown. Message.version was the first 3 bits in the buffer, and Message.type followed. If I find equivalent struct packing options for various compilers, will this be cross-platform?
Answered by Stephen Canon
No, it will not be fully portable. Packing options for structs are compiler extensions and are themselves not fully portable. In addition, C99 §6.7.2.1, paragraph 10 says: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined."
Even a single compiler might lay the bit field out differently depending on the endianness of the target platform, for example.
Answered by Joshua
Bit fields vary widely from compiler to compiler, sorry.
With GCC, big endian machines lay out the bits big end first and little endian machines lay out the bits little end first.
K&R says "Adjacent [bit-]field members of structures are packed into implementation-dependent storage units in an implementation-dependent direction. When a field following another field will not fit ... it may be split between units or the unit may be padded. An unnamed field of width 0 forces this padding..."
Therefore, if you need machine independent binary layout you must do it yourself.
This last statement also applies to non-bitfields due to padding -- however all compilers seem to have some way of forcing byte packing of a structure, as I see you already discovered for GCC.
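One way to "do it yourself", as suggested above, is to keep an ordinary struct in memory and convert it to bytes with explicit shifts. The sketch below is only an illustration: the wire layout (fields packed most significant bit first into two bytes, with the last bit unused) and the names message_pack and message_unpack are assumptions, not something the question specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Plain struct: no bit-fields, so its in-memory layout does not matter. */
struct Message {
    unsigned version; /* 3 bits on the wire */
    unsigned type;    /* 1 bit */
    unsigned id;      /* 5 bits */
    unsigned data;    /* 6 bits */
};

/* Assumed wire layout (an illustration, not taken from the question):
   byte 0 = vvvtiiii, byte 1 = iddddddx, most significant bit first,
   x = unused. */
void message_pack(const struct Message *m, uint8_t out[2])
{
    /* Each value must fit its field width. */
    assert(m->version < 8 && m->type < 2 && m->id < 32 && m->data < 64);
    uint16_t w = (uint16_t)((m->version << 13) | (m->type << 12) |
                            (m->id << 7) | (m->data << 1));
    out[0] = (uint8_t)(w >> 8);   /* high byte first */
    out[1] = (uint8_t)(w & 0xFF);
}

void message_unpack(const uint8_t in[2], struct Message *m)
{
    uint16_t w = (uint16_t)((in[0] << 8) | in[1]);
    m->version = (w >> 13) & 0x7;
    m->type    = (w >> 12) & 0x1;
    m->id      = (w >> 7)  & 0x1F;
    m->data    = (w >> 1)  & 0x3F;
}
```

Because every byte is produced by arithmetic on values rather than by copying the struct's memory, the result should not depend on the compiler's bit-field order, padding, or the host's endianness.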
Answered by Michael Burr
Bit-fields should be avoided: they aren't very portable between compilers, even on the same platform. From the C99 standard, 6.7.2.1/10, "Structure and union specifiers" (there's similar wording in the C90 standard):
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
You cannot guarantee whether a bit-field will 'span' an int boundary or not, and you can't specify whether a bit-field starts at the low end of the int or the high end of the int (this is independent of whether the processor is big-endian or little-endian).
Prefer bitmasks. Use inlines (or even macros) to set, clear and test the bits.
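As a rough sketch of what such inline helpers might look like: the flag names and the function names bits_set, bits_clear and bits_test below are hypothetical, chosen only to illustrate the idea on a 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names, chosen only for illustration. */
#define FLAG_READY  (UINT32_C(1) << 0)
#define FLAG_ERROR  (UINT32_C(1) << 1)
#define FLAG_URGENT (UINT32_C(1) << 31)

/* Return word with every bit in mask set. */
static inline uint32_t bits_set(uint32_t word, uint32_t mask)
{
    return word | mask;
}

/* Return word with every bit in mask cleared. */
static inline uint32_t bits_clear(uint32_t word, uint32_t mask)
{
    return word & ~mask;
}

/* Return nonzero if all bits in mask are set in word. */
static inline int bits_test(uint32_t word, uint32_t mask)
{
    assert(mask != 0); /* testing an empty mask is almost certainly a bug */
    return (word & mask) == mask;
}
```

Since the position of each flag is spelled out as a shift, it is the same under every compiler; only the byte order of the word still needs to be fixed when the value leaves the machine.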
Answered by pierrotlefou
Endianness refers to byte order, not bit order. Nowadays you can be 99% sure that the bit order within a byte is fixed. However, when using bit-fields, endianness must be taken into account. See the example below.
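#include <stdio.h>
#include <string.h>

typedef struct tagT {
    /* unsigned: whether a plain int bit-field is signed is implementation-defined */
    unsigned int a:4;
    unsigned int b:4;
    unsigned int c:8;
    unsigned int d:16;
} T;

int main(void)
{
    unsigned char data[] = {0x12, 0x34, 0x56, 0x78};
    T t = {0};

    /* memcpy instead of a pointer cast avoids alignment and aliasing problems */
    memcpy(&t, data, sizeof data);
    printf("a =0x%x\n", (unsigned)t.a);
    printf("b =0x%x\n", (unsigned)t.b);
    printf("c =0x%x\n", (unsigned)t.c);
    printf("d =0x%x\n", (unsigned)t.d);
    return 0;
}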
// big endian: mips24k-linux-gcc (GCC) 4.2.3
a =0x1
b =0x2
c =0x34
d =0x5678
 1   2   3   4   5   6   7   8
\_/ \_/ \_____/ \_____________/
 a   b     c           d
// little endian: gcc (Ubuntu 4.3.2-1ubuntu11) 4.3.2
a =0x2
b =0x1
c =0x34
d =0x7856
 7   8   5   6   3   4   1   2
\_____________/ \_____/ \_/ \_/
       d           c     b   a
Answered by Bob Murphy
Most of the time, probably, but don't bet the farm on it, because if you're wrong, you'll lose big.
If you really, really need to have identical binary information, you'll need to create bitfields with bitmasks - e.g. you use an unsigned short (16 bit) for Message, and then make things like versionMask = 0xE000 to represent the three topmost bits.
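A minimal sketch of that approach follows. versionMask = 0xE000 comes from the answer; the remaining masks, the shift values, and the helper names field_get and field_set are assumptions that extend the same idea to the other fields of the question's Message.

```c
#include <assert.h>
#include <stdint.h>

/* versionMask is from the answer; the other masks and all shifts are
   assumptions that apply the same scheme to the remaining fields. */
enum {
    versionMask = 0xE000, versionShift = 13, /* 3 bits */
    typeMask    = 0x1000, typeShift    = 12, /* 1 bit  */
    idMask      = 0x0F80, idShift      = 7,  /* 5 bits */
    dataMask    = 0x007E, dataShift    = 1   /* 6 bits */
};

/* Read one field out of the 16-bit message word. */
static inline unsigned field_get(uint16_t msg, unsigned mask, unsigned shift)
{
    return (msg & mask) >> shift;
}

/* Return a copy of msg with one field replaced by value. */
static inline uint16_t field_set(uint16_t msg, unsigned mask, unsigned shift,
                                 unsigned value)
{
    assert(((value << shift) & ~mask) == 0); /* value must fit the field */
    return (uint16_t)((msg & ~mask) | (value << shift));
}
```

For example, field_set(0, versionMask, versionShift, 5) yields 0xA000. The 16-bit word itself still has to be written out in an agreed byte order when it is sent to another machine.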
There's a similar problem with alignment within structs. For instance, Sparc, PowerPC, and 680x0 CPUs are all big-endian, and the common default for Sparc and PowerPC compilers is to align struct members on 4-byte boundaries. However, one compiler I used for 680x0 only aligned on 2-byte boundaries - and there was no option to change the alignment!
So for some structs, the sizes on Sparc and PowerPC are identical but smaller on 680x0, and some of the members are at different memory offsets within the struct.
This was a problem with one project I worked on, because a server process running on Sparc would query a client and find out it was big-endian, and assume it could just squirt binary structs out on the network and the client could cope. And that worked fine on PowerPC clients, and crashed big-time on 680x0 clients. I didn't write the code, and it took quite a while to find the problem. But it was easy to fix once I did.
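The answer does not say how the problem was fixed. One common way to avoid this class of bug is to write each member to the network explicitly instead of sending the struct's raw memory. The sketch below assumes a hypothetical Record type and a big-endian wire format; none of these names come from the project described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a compiler aligning to 4 bytes typically inserts
   2 padding bytes after kind, one aligning to 2 bytes does not. */
struct Record {
    uint16_t kind;
    uint32_t length;
};

enum { RECORD_WIRE_SIZE = 6 }; /* 2 + 4 bytes, no padding on the wire */

/* Store a 16-bit value most significant byte first. */
static void put_u16_be(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Store a 32-bit value most significant byte first. */
static void put_u32_be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Write the record field by field; returns the number of bytes written. */
size_t record_encode(const struct Record *r, uint8_t *out, size_t cap)
{
    assert(cap >= RECORD_WIRE_SIZE);
    put_u16_be(out, r->kind);
    put_u32_be(out + 2, r->length);
    return RECORD_WIRE_SIZE;
}
```

With this approach the wire format is always 6 bytes, whether sizeof(struct Record) happens to be 6 or 8 on a given compiler.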
Answered by Duncan Roe
Thanks @BenVoigt for your very useful comment starting: "No, they were created to save memory."
The Linux source does use a bit-field to match an external structure: /usr/include/linux/ip.h has this code for the first byte of an IP datagram:
struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u8    ihl:4,
            version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
    __u8    version:4,
            ihl:4;
#else
#error "Please fix <asm/byteorder.h>"
#endif
However, in light of your comment I'm giving up trying to get this to work for the multi-byte bit field frag_off.
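For a multi-byte field like frag_off, masks on the two raw bytes are an alternative to bit-fields. The following is a sketch only: the FRAG_* names, struct FragInfo and frag_decode are local to this example, while the bit positions (a "don't fragment" flag, a "more fragments" flag, and a 13-bit offset counted in 8-byte units) follow the IPv4 header format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local names (assumptions for this sketch); the bit positions follow the
   IPv4 header format. */
#define FRAG_DF      0x4000u /* "don't fragment" flag */
#define FRAG_MF      0x2000u /* "more fragments" flag */
#define FRAG_OFFMASK 0x1FFFu /* 13-bit fragment offset, in 8-byte units */

struct FragInfo {
    int      dont_fragment;
    int      more_fragments;
    unsigned offset_bytes;
};

/* wire points at the two frag_off bytes exactly as they appear in the
   packet (network byte order, most significant byte first). */
struct FragInfo frag_decode(const uint8_t *wire)
{
    assert(wire != NULL);
    unsigned v = ((unsigned)wire[0] << 8) | wire[1];
    struct FragInfo f;
    f.dont_fragment  = (v & FRAG_DF) != 0;
    f.more_fragments = (v & FRAG_MF) != 0;
    f.offset_bytes   = (v & FRAG_OFFMASK) * 8u;
    return f;
}
```

Because the two bytes are combined arithmetically, the same code should decode the field correctly on both little-endian and big-endian hosts, with no #if on the bit-field order.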
Answered by 99999999
Of course, the best answer is to use a class that reads and writes bit-fields as a stream. The layout of a C bit-field structure is simply not guaranteed. Not to mention that using one in real-world code is considered unprofessional/lazy/stupid.
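The answer gives no code for such a stream. As one possible reading, here is a small sketch in C (a struct with functions rather than a C++ class) that writes and reads fields most significant bit first. The names BitStream, bits_write and bits_read, and the MSB-first convention, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit stream over a caller-supplied buffer. */
struct BitStream {
    uint8_t *buf;
    size_t   cap_bits; /* buffer capacity in bits */
    size_t   pos;      /* next bit position to read or write */
};

/* Append the low `width` bits of value, most significant bit first. */
void bits_write(struct BitStream *s, uint32_t value, unsigned width)
{
    assert(width <= 32 && s->pos + width <= s->cap_bits);
    for (unsigned i = width; i-- > 0; ) {
        size_t  byte = s->pos / 8;
        uint8_t mask = (uint8_t)(0x80u >> (s->pos % 8));
        if ((value >> i) & 1u)
            s->buf[byte] |= mask;
        else
            s->buf[byte] &= (uint8_t)~mask;
        s->pos++;
    }
}

/* Read `width` bits, most significant bit first. */
uint32_t bits_read(struct BitStream *s, unsigned width)
{
    assert(width <= 32 && s->pos + width <= s->cap_bits);
    uint32_t value = 0;
    while (width-- > 0) {
        unsigned bit = (s->buf[s->pos / 8] >> (7 - s->pos % 8)) & 1u;
        value = (value << 1) | bit;
        s->pos++;
    }
    return value;
}
```

Writing the question's fields in order (3, 1, 5 and 6 bits) and reading them back with the same widths round-trips the values, and the byte contents depend only on this code, not on the compiler's bit-field rules.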