C: Struct memory layout in C

Notice: This page is adapted from a popular StackOverflow question (originally published as a Chinese-English side-by-side translation) and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/2748995/

Time: 2020-09-02 05:20:16  Source: igfitidea

Struct memory layout in C

Tags: c, struct, memory-layout

Asked by Eonil

I have a C# background. I am very much a newbie to a low-level language like C.

In C#, a struct's memory is laid out by the compiler by default. The compiler can reorder data fields or implicitly insert padding between fields. So I had to specify a special attribute to override this behavior and get an exact layout.

AFAIK, C does not reorder or align the memory layout of a struct by default. However, I have heard there is a small exception that is very hard to find.

What is C's memory layout behavior? What gets reordered or aligned, and what does not?

Answer by dan04

It's implementation-specific, but in practice the rule (in the absence of #pragma pack or the like) is:

  • Struct members are stored in the order they are declared. (This is required by the C99 standard, as mentioned here earlier.)
  • If necessary, padding is added before each struct member, to ensure correct alignment.
  • Each primitive type T requires an alignment of sizeof(T) bytes.

So, given the following struct:

struct ST
{
   char ch1;
   short s;
   char ch2;
   long long ll;
   int i;
};
  • ch1 is at offset 0
  • a padding byte is inserted to align...
  • s at offset 2
  • ch2 is at offset 4, immediately after s
  • 3 padding bytes are inserted to align...
  • ll at offset 8
  • i is at offset 16, right after ll
  • 4 padding bytes are added at the end so that the overall struct size is a multiple of 8 bytes. I checked this on a 64-bit system; 32-bit systems may allow structs to have 4-byte alignment.

So sizeof(struct ST) is 24.
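
To check these offsets on a particular compiler, a minimal sketch using the standard offsetof macro from <stddef.h> could look like the following. The helper name print_st_layout is made up for illustration, and the numbers it prints depend on the ABI, so treat the list above as one platform's result rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* Same member order as the struct discussed above. */
struct ST
{
   char ch1;
   short s;
   char ch2;
   long long ll;
   int i;
};

/* Hypothetical helper: prints each member's offset and the total size
   as chosen by the compiler that builds this file. */
void print_st_layout(void)
{
   printf("ch1 @ %zu\n", offsetof(struct ST, ch1));
   printf("s   @ %zu\n", offsetof(struct ST, s));
   printf("ch2 @ %zu\n", offsetof(struct ST, ch2));
   printf("ll  @ %zu\n", offsetof(struct ST, ll));
   printf("i   @ %zu\n", offsetof(struct ST, i));
   printf("sizeof(struct ST) = %zu\n", sizeof(struct ST));

   /* These two facts hold on every conforming implementation:
      no padding before the first member, and declaration order is kept. */
   assert(offsetof(struct ST, ch1) == 0);
   assert(offsetof(struct ST, s) < offsetof(struct ST, ch2));
}
```

Calling print_st_layout() from main on a 64-bit system like the one the answer describes should report the offsets listed above; other ABIs may print different values.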

It can be reduced to 16 bytes by rearranging the members to avoid padding:

struct ST
{
   long long ll; // @ 0
   int i;        // @ 8
   short s;      // @ 12
   char ch1;     // @ 14
   char ch2;     // @ 15
};
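
As a rough way to compare the two orderings, the sketch below counts the bytes that no member occupies in each version. The names ST_original, ST_reordered, MEMBER_BYTES, and the helper functions are assumptions made for this example; the actual padding depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Declaration order from the first example. */
struct ST_original  { char ch1; short s; char ch2; long long ll; int i; };

/* Largest members first, as in the rearranged example. */
struct ST_reordered { long long ll; int i; short s; char ch1; char ch2; };

/* Bytes taken by the members themselves, ignoring any padding. */
#define MEMBER_BYTES \
   (2 * sizeof(char) + sizeof(short) + sizeof(long long) + sizeof(int))

/* Padding is whatever the struct size adds on top of the members. */
size_t padding_original(void)  { return sizeof(struct ST_original)  - MEMBER_BYTES; }
size_t padding_reordered(void) { return sizeof(struct ST_reordered) - MEMBER_BYTES; }

void print_padding(void)
{
   /* A struct can never be smaller than the sum of its members. */
   assert(sizeof(struct ST_original)  >= MEMBER_BYTES);
   assert(sizeof(struct ST_reordered) >= MEMBER_BYTES);

   printf("original:  size %zu, padding %zu\n",
          sizeof(struct ST_original), padding_original());
   printf("reordered: size %zu, padding %zu\n",
          sizeof(struct ST_reordered), padding_reordered());
}
```

Sorting members from largest to smallest alignment is a common rule of thumb for this kind of saving, although the compiler will never apply it for you.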

Answer by Potatoswatter

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type. But it's entirely implementation-specific.
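
One way to see what a given implementation chose is to print the size and alignment of a few primitive types side by side. This sketch assumes a C11 compiler, since it relies on the _Alignof operator, which C99 does not have; print_alignments is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reports size and alignment for a few primitive
   types. Requires C11 for _Alignof. */
void print_alignments(void)
{
   printf("char:      size %zu, align %zu\n", sizeof(char),      _Alignof(char));
   printf("short:     size %zu, align %zu\n", sizeof(short),     _Alignof(short));
   printf("int:       size %zu, align %zu\n", sizeof(int),       _Alignof(int));
   printf("long long: size %zu, align %zu\n", sizeof(long long), _Alignof(long long));
   printf("double:    size %zu, align %zu\n", sizeof(double),    _Alignof(double));

   /* char has size 1 and the weakest alignment, so this always holds. */
   assert(_Alignof(char) == 1);
}
```

On many 64-bit targets the two columns match for each type, but that is a property of the ABI; some 32-bit ABIs, for instance, align 8-byte types to 4 bytes.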

Padding bytes are introduced so every object is properly aligned. Reordering is not allowed.

Possibly every remotely modern compiler implements #pragma pack, which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)
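
A minimal sketch of the effect, assuming a compiler that accepts the push/pop form of #pragma pack (GCC, Clang, and MSVC do). The struct names Natural and Packed and the helper print_pack_effect are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Laid out with the compiler's default alignment rules. */
struct Natural { char ch1; short s; char ch2; long long ll; int i; };

#pragma pack(push, 1)   /* nonstandard: cap member alignment at 1 byte */
struct Packed  { char ch1; short s; char ch2; long long ll; int i; };
#pragma pack(pop)       /* restore the setting that was active before */

void print_pack_effect(void)
{
   printf("natural: size %zu, s @ %zu, ll @ %zu\n",
          sizeof(struct Natural),
          offsetof(struct Natural, s), offsetof(struct Natural, ll));
   printf("packed:  size %zu, s @ %zu, ll @ %zu\n",
          sizeof(struct Packed),
          offsetof(struct Packed, s), offsetof(struct Packed, ll));

   /* Removing padding can only shrink the struct or leave it unchanged. */
   assert(sizeof(struct Packed) <= sizeof(struct Natural));
}
```

Packed members may end up misaligned, which can be slower or even fault on some architectures, so this is usually reserved for matching an external binary format.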

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
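
The guarantee in paragraph 13 that a struct pointer, suitably converted, points to the initial member can be sketched as follows. The types Header and Message and the function first_member_shares_address are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Header  { int kind; };
struct Message { struct Header header; double payload; };

/* Returns 1 if the struct and its first member share an address,
   which paragraph 13 requires (no padding at the beginning). */
int first_member_shares_address(void)
{
   struct Message m = { { 7 }, 1.5 };

   /* Converting the struct pointer yields a pointer to the first member. */
   struct Header *h = (struct Header *)&m;
   assert(h->kind == 7);

   return (void *)&m == (void *)&m.header
       && offsetof(struct Message, header) == 0;
}
```

This is the property that C code relies on when it embeds a common header struct as the first member of several larger structs.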

Answer by jschmier

You can start by reading the data structure alignment Wikipedia article to get a better understanding of data alignment.

From the Wikipedia article:

Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.
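
The padding described here comes down to a small piece of arithmetic: round the current offset up to the next multiple of the required alignment. A sketch, with padding_for and align_up as assumed helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Number of padding bytes needed so that `offset` becomes a multiple
   of `align`. `align` must be nonzero. */
size_t padding_for(size_t offset, size_t align)
{
   assert(align != 0);
   return (align - offset % align) % align;
}

/* Smallest multiple of `align` that is greater than or equal to `offset`. */
size_t align_up(size_t offset, size_t align)
{
   return offset + padding_for(offset, align);
}
```

Applied to the struct ST example earlier on this page, padding_for(1, 2) gives the 1 byte before s, padding_for(5, 8) gives the 3 bytes before ll, and padding_for(20, 8) gives the 4 bytes at the end.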

From 6.54.8 Structure-Packing Pragmas of the GCC documentation:

For compatibility with Microsoft Windows compilers, GCC supports a set of #pragma directives which change the maximum alignment of members of structures (other than zero-width bitfields), unions, and classes subsequently defined. The n value below always is required to be a small power of two and specifies the new alignment in bytes.

  1. #pragma pack(n) simply sets the new alignment.
  2. #pragma pack() sets the alignment to the one that was in effect when compilation started (see also command line option -fpack-struct[=] see Code Gen Options).
  3. #pragma pack(push[,n]) pushes the current alignment setting on an internal stack and then optionally sets the new alignment.
  4. #pragma pack(pop) restores the alignment setting to the one saved at the top of the internal stack (and removes that stack entry). Note that #pragma pack([n]) does not influence this internal stack; thus it is possible to have #pragma pack(push) followed by multiple #pragma pack(n) instances and finalized by a single #pragma pack(pop).
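
The note in item 4 can be illustrated with a single push, two plain pack(n) changes, and a single pop. This is only a sketch for compilers that implement these pragmas; the struct names and the function pop_restored_default are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Declared before any pragma: uses the default alignment. */
struct Pair_default  { char c; int i; };

#pragma pack(push)      /* save the current setting on the internal stack */
#pragma pack(1)         /* change the setting; the stack is untouched */
struct Pair_pack1    { char c; int i; };
#pragma pack(2)         /* change it again; still only one stack entry */
struct Pair_pack2    { char c; int i; };
#pragma pack(pop)       /* one pop returns to the saved setting */

/* Declared after the pop: should match the default layout again. */
struct Pair_restored { char c; int i; };

/* Returns 1 if the layout after pop equals the layout before push. */
int pop_restored_default(void)
{
   /* With pack(1) there is no padding between c and i. */
   assert(offsetof(struct Pair_pack1, i) == 1);

   return sizeof(struct Pair_restored) == sizeof(struct Pair_default)
       && offsetof(struct Pair_restored, i) == offsetof(struct Pair_default, i);
}
```

Wrapping packed declarations in a push/pop pair like this keeps the setting from leaking into headers included later.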

Some targets, e.g. i386 and powerpc, support the ms_struct #pragma which lays out a structure as the documented __attribute__ ((ms_struct)).

  1. #pragma ms_struct on turns on the layout for structures declared.
  2. #pragma ms_struct off turns off the layout for structures declared.
  3. #pragma ms_struct reset goes back to the default layout.
