Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6968468/

Time: 2020-09-02 09:19:27  Source: igfitidea

Padding in structures in C

Tags: c, structure, padding

Asked by letsc

This is an interview question. Till now, I used to think such questions were purely compiler dependent and shouldn't worry me, but now, I am rather curious about it.

Suppose you are given two structures as:

struct A {
    int* a;
    char b;
};

and

struct B {
    char a;
    int* b;
};

So which one would you prefer, and why? My answer (though I was somewhat shooting in the dark) was that the first structure should be preferred, since the compiler allocates space for a structure in multiples of the word size (which is the size of a pointer: 4 bytes on 32-bit machines and 8 bytes on 64-bit ones). So for both structures the compiler would allocate 8 bytes (assuming it's a 32-bit machine). But in the first case, the padding would come after all my variables (i.e. after a and b). So even if, by some chance, b gets a value that overflows and destroys the padding bytes that follow it, my a is still safe.

He didn't seem very pleased and asked for one disadvantage of the first structure over the second. I didn't have much to say. :D

Please help me with the answers.

Accepted answer by MByD

I don't think either of these structures has an advantage. There is one(!) constant in this equation: the order of the members of the struct is guaranteed to be as declared.

So in a case like the following, the second structure might have an advantage, since it probably has a smaller size, but not in your example, where both will probably have the same size:

struct {
    char a;
    int b;
    char c;
} X;

Vs.

struct {
    char a;
    char b;
    int c;
} Y;

A little more explanation regarding comments below:

None of the below is 100% guaranteed, but it is the common way the structs will be laid out on a 32-bit system where int is 32 bits:

struct X:

|     |     |     |     |     |     |     |     |     |     |     |     |
 char  pad    pad   pad   ---------int---------- char   pad   pad   pad   = 12 bytes

struct Y:

|     |     |     |     |     |     |     |     |
 char  char  pad   pad   ---------int----------        = 8 bytes

Answer by cnicutar

Some machines access data more efficiently when the values are aligned to some boundary. Some require data to be aligned.

On modern 32-bit machines like the SPARC or the Intel [34]86, or any Motorola chip from the 68020 up, each data item must usually be "self-aligned", beginning on an address that is a multiple of its type size. Thus, 32-bit types must begin on a 32-bit boundary, 16-bit types on a 16-bit boundary, 8-bit types may begin anywhere, and struct/array/union types have the alignment of their most restrictive member.

So you could have

struct B {
    char a;
    /* 3 bytes of padding ? More ? */
    int* b;
};

A simple rule that minimizes padding in the "self-aligned" case (and does no harm in most others) is to order your struct members by decreasing size.

Personally, I see no disadvantage in the first struct when compared to the second.

Answer by Steve Jessop

I can't think of a disadvantage of the first structure over the second in this particular case, but it's possible to come up with examples where there are disadvantages to the general rule of putting the largest members first:

struct A {
    int* a;
    short b;
    A(short num) : b(2*num+1), a(new int[b]) {}
    // OOPS, `b` is used uninitialized, and a good compiler will warn.
    // The only way to get `b` initialized before `a` is to declare
    // it first in the class, or of course we could repeat `2*num+1`.
};

I also heard about quite a complicated case for large structs, where the CPU has fast addressing modes for accessing pointer+offset, for small values of offset (up to 8 bits, for example, or some other limit of an immediate value). You best micro-optimize a large structure by putting as many of the most commonly-used fields as possible within range of the fastest instructions.

The CPU might even have fast addressing for pointer+offset and pointer+4*offset. Then suppose you had 64 char fields and 64 int fields: if you put the char fields first then all fields of both types can be addressed using the best instructions, whereas if you put the int fields first then the char fields that aren't 4-aligned will just have to be accessed differently, perhaps by loading a constant into a register rather than with an immediate value, because they're outside the 256-byte limit.


Never had to do it myself, and for instance x86 allows big immediate values anyway. It's not the sort of optimization that anyone would normally think about unless they spend a lot of time staring at assembly.

Answer by alecov

Briefly, there's no advantage in choosing either in the general case. The only situation where the choice would matter in practice is if structure packing is enabled, in which case struct A would be the better choice (since both fields would be aligned in memory, while in struct B the b field would be located at an odd offset). Structure packing means that no padding bytes are inserted inside the structure.

However, this is a rather uncommon scenario: structure packing is generally only enabled in specific situations. It is not a concern for most programs, and it cannot be controlled through any portable construct in the C standard.

Answer by Kevin

This is also something of a guess, but most compilers have a misalign option that explicitly does not add padding bytes. This then requires (on some platforms) a runtime fixup (hardware trap) to align accesses on the fly (with a corresponding performance penalty). If I remember right, HP-UX fell into this category. So with the first struct the fields are still aligned even when misalign compiler options are used (because, as you said, the padding would be at the end).