C++: Why isn't sizeof for a struct equal to the sum of sizeof of each member?

Question

Asked by Kevin

Why does the sizeof operator return a size for a structure that is larger than the total size of the structure's members?

Answer 1

Accepted answer by Kevin

This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both the performance and the correctness of programs:

Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
- Either corrected in hardware, for a modest performance-degradation.
- Or corrected by emulation in software, for a severe performance-degradation.
- In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.

Here's an example using typical settings for an x86 processor (the same in both 32-bit and 64-bit modes):

struct X
{
    short s; /* 2 bytes */
             /* 2 padding bytes */
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 3 padding bytes */
};

struct Y
{
    int   i; /* 4 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
    short s; /* 2 bytes */
};

struct Z
{
    int   i; /* 4 bytes */
    short s; /* 2 bytes */
    char  c; /* 1 byte */
             /* 1 padding byte */
};

const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */

One can minimize the size of a structure by sorting its members by alignment, as structure Z in the example above does (for basic types, sorting by size suffices).

IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.

Answer 2

Answer by EmmEff

Packing and byte alignment, as described in the C FAQ:

It's for alignment. Many processors can't access 2- and 4-byte quantities (e.g. ints and long ints) if they're crammed in every-which-way.
Suppose you have this structure:
struct {
    char a[3];
    short int b;
    long int c;
    char d[3];
};
Now, you might think that it ought to be possible to pack this structure into memory like this:
+-------+-------+-------+-------+
|           a           |   b   |
+-------+-------+-------+-------+
|   b   |           c           |
+-------+-------+-------+-------+
|   c   |           d           |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges it like this:
+-------+-------+-------+
|           a           |
+-------+-------+-------+
|       b       |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for you and me to see how the b and c fields wrap around? In a nutshell, it's hard for the processor, too. Therefore, most compilers will pad the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
|           a           | pad1  |
+-------+-------+-------+-------+
|       b       |     pad2      |
+-------+-------+-------+-------+
|               c               |
+-------+-------+-------+-------+
|           d           | pad3  |
+-------+-------+-------+-------+

Answer 3

Answer by INS

If you want the structure to have a specific size, with GCC for example, use __attribute__((packed)).

On Windows, you can set the alignment to one byte when using the cl.exe compiler with the /Zp option (for example, /Zp1).

Usually it is easier for the CPU to access data at an address that is a multiple of 4 (or 8), depending on the platform and also on the compiler.

So it is a matter of alignment basically.

You need to have good reasons to change it.

Answer 4

Answer by Kyle Burton

This can be due to byte alignment and padding, which round the structure up to a whole number of bytes (or words) on your platform. For example, in C on Linux, the following 3 structures:

#include <stdio.h>


struct oneInt {
  int x;
};

struct twoInts {
  int x;
  int y;
};

struct someBits {
  int x:2;
  int y:6;
};


int main (int argc, char** argv) {
  printf("oneInt=%zu\n",sizeof(struct oneInt));
  printf("twoInts=%zu\n",sizeof(struct twoInts));
  printf("someBits=%zu\n",sizeof(struct someBits));
  return 0;
}

have members whose sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2 x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4, where the last structure is padded so that it is a single word (4 x 8-bit bytes on my 32-bit platform).

oneInt=4
twoInts=8
someBits=4

Answer 5

Answer by lkanab

Answer 6

Answer by sid1138

The size of a structure is greater than the sum of its parts because of padding, which is controlled by what is called packing. A particular processor has a preferred data size that it works with. For many processors the preferred size is 32 bits (4 bytes). Accessing memory when data sits on this kind of boundary is more efficient than accessing data that straddles that boundary.

For example, consider this simple structure:

struct myStruct
{
   int a;
   char b;
   int c;
} data;

If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem if the compiler inserts no padding. In this example, assume that the structure data starts at address 1024 (0x400; note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary, 0x400. The access to data.b will also work fine, because it is at address 0x404, another 32-bit boundary. But an unpadded structure would put data.c at address 0x405. The 4 bytes of data.c are then at 0x405, 0x406, 0x407 and 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is past the next boundary). So the system would have to do a second memory access to get the 4th byte.

Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.

Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".

On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.

For example:

#pragma pack(1)   /* pack members on 1-byte boundaries, i.e. no padding */
struct MyStruct
{
    int a;
    char b;
    int c;
    short d;
} myData;

int I = sizeof(myData);

The variable I should now have the value 11. Without the pragma, I could be anything from 11 to 16 (and for some systems, as much as 32), depending on the default packing of the compiler; with the 4-byte int alignment typical of x86 it is 16.

Answer 7

Answer by Orion Adrian

It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned to 4 bytes will always have a size that is a multiple of 4 bytes, even if the total size of its members is not a multiple of 4 bytes.

Also, a library may have been compiled for x86 with 32-bit ints while you are comparing its components in a 64-bit process, which would give you a different result if you were working out the sizes by hand.

Answer 8

Answer by JohnMcG

In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl pointer.

Answer 9

Answer by bruziuz

The C language leaves the compiler some freedom about where structure members are placed in memory:

Memory holes (padding) may appear between any two members and after the last member, because certain types of objects on the target machine may be restricted to particular address boundaries.
The size of the "memory holes" is included in the result of the sizeof operator. The only thing sizeof does not include is the size of a flexible array member, which is standard in C since C99 and available in C++ only as a compiler extension.
Some implementations of the language allow you to control the memory layout of structures through pragmas and compiler options.

The C language gives the programmer some guarantees about the layout of members in a structure:

Compilers are required to assign members increasing memory addresses, in the order in which they are declared.
The address of the first member coincides with the start address of the structure.
Unnamed bit-fields may be included in the structure to achieve the required address alignment of adjacent members.

Problems related to member alignment:

Different computers align the boundaries of objects in different ways.
There are different restrictions on the width of bit-fields.
Computers differ in how they store the bytes of a word (for example, little-endian Intel 80x86 versus big-endian Motorola 68000).

How alignment works:

The space occupied by a structure is calculated as the size of a single aligned element of an array of such structures. The structure must end in such a way that the first member of the next structure in the array does not violate its alignment requirements.

P.S. More detailed information is available in Samuel P. Harbison and Guy L. Steele, "C: A Reference Manual", sections 5.6.2 - 5.6.7.

Answer 10

Answer by DigitalRoss

The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.

struct pixel {
    unsigned char red;   // 0
    unsigned char green; // 1
    unsigned int alpha;  // 4 (gotta skip to an aligned offset)
    unsigned char blue;  // 8 (then skip 9 10 11)
};

// next offset: 12

The x86 architecture has always been able to fetch from misaligned addresses. However, it is slower, and when a misaligned access straddles two different cache lines, it evicts two cache lines where an aligned access would only evict one.

Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data for those. (They ignored the low-order bits.)

Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.

TL;DR: alignment is important.
