C: Structure padding and packing

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4306186/

Date: 2020-09-02 07:14:39  Source: igfitidea

Structure padding and packing

Tags: c, struct, structure, padding, packing

Asked by Manu

Consider:

struct mystruct_A
{
   char a;
   int b;
   char c;
} x;

struct mystruct_B
{
   int b;
   char a;
} y;

The sizes of the structures are 12 and 8 respectively.

Are these structures padded or packed?

When does padding or packing take place?

Answered by Nikolai Fetissov

Padding aligns structure members to "natural" address boundaries - say, int members would have offsets that are mod(4) == 0 on a 32-bit platform. Padding is on by default. It inserts the following "gaps" into your first structure:

struct mystruct_A {
    char a;
    char gap_0[3]; /* inserted by compiler: for alignment of b */
    int b;
    char c;
    char gap_1[3]; /* -"-: for alignment of the whole struct in an array */
} x;

Packing, on the other hand, prevents the compiler from doing padding - this has to be explicitly requested - under GCC it's __attribute__((__packed__)), so the following:

struct __attribute__((__packed__)) mystruct_A {
    char a;
    int b;
    char c;
};

would produce a structure of size 6 on a 32-bit architecture.

A note though - unaligned memory access can be slower on architectures that allow it (like x86 and amd64), and is explicitly prohibited on strict alignment architectures like SPARC.

Answered by Eric Wang

(The above answers explained the reason quite clearly, but seem not totally clear about the size of padding, so I will add an answer according to what I learned from The Lost Art of Structure Packing. It has evolved to not be limited to C, but is also applicable to Go and Rust.)

Memory alignment (for struct)

Rules:

  • Before each individual member, there will be padding so that it starts at an address that is divisible by its size (more precisely, by its alignment requirement, which for basic types usually equals the size).
    e.g. on a 64 bit system (LP64, such as 64-bit Linux), int should start at an address divisible by 4, long by 8, and short by 2.
  • char and char[] are special: they can be at any memory address, so they don't need padding before them.
  • For a struct, other than the alignment need of each individual member, the size of the whole struct itself will be rounded up to a multiple of the size of its largest individual member, by padding at the end.
    e.g. if the struct's largest member is long then divisible by 8, int then by 4, short then by 2.

Order of members:

  • The order of members might affect the actual size of the struct, so keep that in mind. e.g. stu_c and stu_d from the example below have the same members, but in a different order, which results in different sizes for the 2 structs.


Address in memory (for struct)

Rules:

  • 64 bit system
    Struct address starts from (n * 16) bytes. (You can see in the example below, all printed hex addresses of structs end with 0.)
    Reason: the possible largest individual struct member is 16 bytes (long double).
    This is the behavior observed in the example below (compiled with gcc); the C language itself only requires a struct to be aligned as strictly as its most strictly aligned member.
  • (Update) If a struct only contains a char as member, its address could start at any address.

Empty space:

  • Empty space between 2 structs could be used by non-struct variables that fit in.
    e.g. in test_struct_address() below, the variable x resides between the adjacent structs g and h.
    No matter whether x is declared, h's address won't change; x just reuses the empty space that g wasted.
    Similar case for y.


Example

(for 64 bit system)

memory_align.c:

/**
 * Memory align & padding - for struct.
 * compile: gcc memory_align.c
 * execute: ./a.out
 */ 
#include <stdio.h>

// size is 8, 4 + 1, then round to multiple of 4 (int's size),
struct stu_a {
    int i;
    char c;
};

// size is 16, 8 + 1, then round to multiple of 8 (long's size),
struct stu_b {
    long l;
    char c;
};

// size is 24, l needs 4 bytes of padding before it, then round to multiple of 8 (long's size),
struct stu_c {
    int i;
    long l;
    char c;
};

// size is 16, 8 + 4 + 1, then round to multiple of 8 (long's size),
struct stu_d {
    long l;
    int i;
    char c;
};

// size is 16, 8 + 4 + 1, then round to multiple of 8 (double's size),
struct stu_e {
    double d;
    int i;
    char c;
};

// size is 24, d needs to be aligned to 8, then round to multiple of 8 (double's size),
struct stu_f {
    int i;
    double d;
    char c;
};

// size is 4,
struct stu_g {
    int i;
};

// size is 8,
struct stu_h {
    long l;
};

// test - padding within a single struct,
int test_struct_padding(void) {
    // sizeof yields a size_t, so it is printed with %zu,
    printf("%s: %zu\n", "stu_a", sizeof(struct stu_a));
    printf("%s: %zu\n", "stu_b", sizeof(struct stu_b));
    printf("%s: %zu\n", "stu_c", sizeof(struct stu_c));
    printf("%s: %zu\n", "stu_d", sizeof(struct stu_d));
    printf("%s: %zu\n", "stu_e", sizeof(struct stu_e));
    printf("%s: %zu\n", "stu_f", sizeof(struct stu_f));

    printf("%s: %zu\n", "stu_g", sizeof(struct stu_g));
    printf("%s: %zu\n", "stu_h", sizeof(struct stu_h));

    return 0;
}

// test - address of struct,
int test_struct_address(void) {
    printf("%s: %zu\n", "stu_g", sizeof(struct stu_g));
    printf("%s: %zu\n", "stu_h", sizeof(struct stu_h));
    printf("%s: %zu\n", "stu_f", sizeof(struct stu_f));

    struct stu_g g;
    struct stu_h h;
    struct stu_f f1;
    struct stu_f f2;
    int x = 1;
    long y = 1;

    // %p expects a void pointer, hence the casts,
    printf("address of %s: %p\n", "g", (void *)&g);
    printf("address of %s: %p\n", "h", (void *)&h);
    printf("address of %s: %p\n", "f1", (void *)&f1);
    printf("address of %s: %p\n", "f2", (void *)&f2);
    printf("address of %s: %p\n", "x", (void *)&x);
    printf("address of %s: %p\n", "y", (void *)&y);

    // g is only 4 bytes itself, but distance to next struct is 16 bytes(on 64 bit system) or 8 bytes(on 32 bit system),
    printf("space between %s and %s: %ld\n", "g", "h", (long)(&h) - (long)(&g));

    // h is only 8 bytes itself, but distance to next struct is 16 bytes(on 64 bit system) or 8 bytes(on 32 bit system),
    printf("space between %s and %s: %ld\n", "h", "f1", (long)(&f1) - (long)(&h));

    // f1 is only 24 bytes itself, but distance to next struct is 32 bytes(on 64 bit system) or 24 bytes(on 32 bit system),
    printf("space between %s and %s: %ld\n", "f1", "f2", (long)(&f2) - (long)(&f1));

    // x is not a struct, and it reuses the empty space between structs, which exists due to padding, e.g between g & h,
    printf("space between %s and %s: %ld\n", "x", "f2", (long)(&x) - (long)(&f2));
    printf("space between %s and %s: %ld\n", "g", "x", (long)(&x) - (long)(&g));

    // y is not a struct, and it reuses the empty space between structs, which exists due to padding, e.g between h & f1,
    printf("space between %s and %s: %ld\n", "x", "y", (long)(&y) - (long)(&x));
    printf("space between %s and %s: %ld\n", "h", "y", (long)(&y) - (long)(&h));

    return 0;
}

int main(void) {
    test_struct_padding();
    // test_struct_address();

    return 0;
}

Execution result - test_struct_padding():

stu_a: 8
stu_b: 16
stu_c: 24
stu_d: 16
stu_e: 16
stu_f: 24
stu_g: 4
stu_h: 8

Execution result - test_struct_address():

stu_g: 4
stu_h: 8
stu_f: 24
address of g: 0x7fffd63a95d0  // struct variable - address divisible by 16,
address of h: 0x7fffd63a95e0  // struct variable - address divisible by 16,
address of f1: 0x7fffd63a95f0 // struct variable - address divisible by 16,
address of f2: 0x7fffd63a9610 // struct variable - address divisible by 16,
address of x: 0x7fffd63a95dc  // non-struct variable - resides within the empty space between struct variable g & h.
address of y: 0x7fffd63a95e8  // non-struct variable - resides within the empty space between struct variable h & f1.
space between g and h: 16
space between h and f1: 16
space between f1 and f2: 32
space between x and f2: -52
space between g and x: 12
space between x and y: 12
space between h and y: 8

Thus the start address of each variable is g:d0 x:dc h:e0 y:e8

Answered by IanC

I know this question is old and most answers here explain padding really well, but while trying to understand it myself I figured that having a "visual" image of what is happening helped.

The processor reads the memory in "chunks" of a definite size (word). Say the processor word is 8 bytes long. It will look at the memory as a big row of 8-byte building blocks. Every time it needs to get some information from the memory, it will reach one of those blocks and get it.

[Image: Variables Alignment]

As seen in the image above, it doesn't matter where a char (1 byte long) is, since it will be inside one of those blocks, requiring the CPU to process only 1 word.

When we deal with data larger than one byte, like a 4-byte int or an 8-byte double, the way they are aligned in memory makes a difference in how many words will have to be processed by the CPU. If 4-byte chunks are aligned so that they always fit inside a block (memory address being a multiple of 4), only one word will have to be processed. Otherwise a 4-byte chunk could have part of itself in one block and part in another, requiring the processor to process 2 words to read this data.

The same applies to an 8-byte double, except now it must be at a memory address that is a multiple of 8 to guarantee it will always be inside a block.

This assumes a processor with an 8-byte word, but the concept applies to other word sizes.

The padding works by filling the gaps between those data to make sure they are aligned with those blocks, thus improving the performance while reading the memory.

However, as stated in other answers, sometimes space matters more than performance itself. Maybe you are processing lots of data on a computer that doesn't have much RAM (swap space could be used but it is MUCH slower). You could rearrange the variables in the program until the least padding is done (as was well exemplified in some other answers), but if that's not enough you could explicitly disable padding, which is what packing is.

Answered by user2083050

Structure packing suppresses structure padding: padding is used when alignment matters most, packing is used when space matters most.

Some compilers provide #pragma to suppress padding or to make it packed to n number of bytes. Some provide keywords to do this. Generally the pragma used for modifying structure padding will be in the below format (depends on compiler):

#pragma pack(n)

For example, ARM provides the __packed keyword to suppress structure padding. Go through your compiler manual to learn more about this.

So a packed structure is a structure without padding.

Generally, packed structures will be used:

  • to save space

  • to format a data structure to transmit over a network using some protocol (this is not a good practice of course, because you need to deal with endianness)

Answered by casablanca

Padding and packing are just two aspects of the same thing:

  • packing or alignment is the size to which each member is rounded off
  • padding is the extra space added to match the alignment

In mystruct_A, assuming a default alignment of 4, each member is aligned on a multiple of 4 bytes. Since the size of char is 1, the padding for a and c is 4 - 1 = 3 bytes, while no padding is required for int b, which is already 4 bytes. It works the same way for mystruct_B.

Answered by nmichaels

Structure packing is only done when you tell your compiler explicitly to pack the structure. Padding is what you're seeing. Your 32-bit system is padding each field to word alignment. If you had told your compiler to pack the structures, they'd be 6 and 5 bytes, respectively. Don't do that though. It's not portable and makes compilers generate much slower (and sometimes even buggy) code.

Answered by snr

There are no buts about it! Anyone who wants to grasp the subject must do the following,

Answered by AlphaGoku

Rules for padding:

  1. Every member of the struct should be at an address divisible by its size. Padding is inserted between elements or at the end of the struct to make sure this rule is met. This is done for easier and more efficient bus access by the hardware.
  2. Padding at the end of the struct is decided based on the size of the largest member of the struct.

Why Rule 2: Consider the following struct:

[Image: Struct 1]

If we were to create an array (of 2 structs) of this struct, no padding would be required at the end:

[Image: Struct1 array]

Therefore, size of struct = 8 bytes

Assume we were to create another struct as below:

[Image: Struct 2]

If we were to create an array of this struct, there are 2 possibilities for the number of padding bytes required at the end.

A. If we add 3 bytes at the end and align it for int and not long:

[Image: Struct2 array aligned to int]

B. If we add 7 bytes at the end and align it for long:

[Image: Struct2 array aligned to long]

The start address of the second array element is a multiple of 8 (i.e. 24). The size of the struct = 24 bytes

Therefore, by aligning the start address of the next element in an array of the struct to a multiple of the largest member (i.e. if we were to create an array of this struct, the second element must start at an address which is a multiple of the largest member of the struct; here it is 24 (3 * 8)), we can calculate the number of padding bytes required at the end.

Answered by manoj yadav

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks (e.g. 4 byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.

  1. In order to align the data in memory, one or more empty bytes (addresses) are inserted (or left empty) between the memory addresses allocated for other structure members during memory allocation. This concept is called structure padding.
  2. The architecture of a computer processor is such that it can read 1 word (4 bytes on a 32-bit processor) from memory at a time.
  3. To make use of this advantage of the processor, data are aligned as 4-byte packages, which leads to inserting empty addresses between other members' addresses.
  4. Because of this structure padding concept in C, the size of a structure is not always the same as what we expect.