The size of a union in C/C++

Disclaimer: this page is a side-by-side Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/740577/

Time: 2020-08-27 17:03:04  Source: igfitidea

sizeof a union in C/C++

Tags: c++, c, sizeof, unions

Asked by Naveen

What is the sizeof a union in C/C++? Is it the sizeof the largest datatype inside it? If so, how does the compiler calculate how to move the stack pointer if one of the smaller datatypes of the union is active?

Accepted answer by Johannes Schaub - litb

The Standard answers all questions in section 9.5 of the C++ standard, or section 6.5.2.3 paragraph 5 of the C99 standard (or paragraph 6 of the C11 standard, or section 6.7.2.1 paragraph 16 of the C18 standard):

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time. [Note: one special guarantee is made in order to simplify the use of unions: If a POD-union contains several POD-structs that share a common initial sequence (9.2), and if an object of this POD-union type contains one of the POD-structs, it is permitted to inspect the common initial sequence of any of POD-struct members; see 9.2. ] The size of a union is sufficient to contain the largest of its data members. Each data member is allocated as if it were the sole member of a struct.

That means all members share the same memory region. There is at most one member active, but you can't find out which one. You will have to store the information about the currently active member yourself somewhere else. Storing such a flag in addition to the union (for example, having a struct with an integer as the type flag and a union as the data store) gives you a so-called "discriminated union": a union that knows which of its types is currently the "active one".

One common use is in lexers, where you can have different tokens and, depending on the token, different information to store (line is put into each struct to show what a common initial sequence is):

struct tokeni {
    int token; /* type tag */
    union {
        struct { int line; } noVal;
        struct { int line; int val; } intVal;
        struct { int line; struct string val; } stringVal;
    } data;
};

The Standard allows you to access line through any of the members, because it is the common initial sequence of each one.

There exist compiler extensions that allow accessing all members regardless of which one currently has its value stored. That allows efficient reinterpretation of the stored bits as the different types of the members. For example, the following may be used to dissect a float variable into 2 unsigned shorts:

union float_cast { unsigned short s[2]; float f; };

That can come in quite handy when writing low-level code. If the compiler does not support that extension but you do it anyway, you write code whose results are not defined. So be certain your compiler supports it if you use that trick.

Answer by Mehrdad Afshari

A union always takes up as much space as the largest member. It doesn't matter what is currently in use.

union {
  short x;
  int y;
  long long z;
};

An instance of the above union will always take at least a long long for storage.

Side note: As noted by Stefano, the actual space any type (union, struct, class) takes does depend on other issues such as alignment by the compiler. I left this out for simplicity, as I just wanted to say that a union's size is driven by its biggest item. It's important to know that the actual size does depend on alignment.

Answer by Stefano Borini

It depends on the compiler, and on the options.

#include <stdio.h>

int main(void) {
  union {
    char all[13];
    int foo;
  } record;

  /* sizeof yields a size_t, so print it with %zu */
  printf("%zu\n", sizeof(record.all));
  printf("%zu\n", sizeof(record.foo));
  printf("%zu\n", sizeof(record));

  return 0;
}

This outputs:

13
4
16

If I remember correctly, it depends on the alignment that the compiler puts into the allocated space. So, unless you use some special option, the compiler will put padding into your union space.

edit: with gcc you need to use a pragma directive

#include <stdio.h>

int main(void) {
#pragma pack(push, 1)
      union {
           char all[13];
           int foo;
      } record;
#pragma pack(pop)

      /* sizeof yields a size_t, so print it with %zu */
      printf("%zu\n", sizeof(record.all));
      printf("%zu\n", sizeof(record.foo));
      printf("%zu\n", sizeof(record));

      return 0;
}

This outputs:

13
4
13

You can also see it in the disassembly (some printf calls removed for clarity):

  0x00001fd2 <main+0>:    push   %ebp              |  0x00001fd2 <main+0>:    push   %ebp
  0x00001fd3 <main+1>:    mov    %esp,%ebp         |  0x00001fd3 <main+1>:    mov    %esp,%ebp
  0x00001fd5 <main+3>:    push   %ebx              |  0x00001fd5 <main+3>:    push   %ebx
  0x00001fd6 <main+4>:    sub    $0x24,%esp        |  0x00001fd6 <main+4>:    sub    $0x24,%esp
  0x00001fd9 <main+7>:    call   0x1fde <main+12>  |  0x00001fd9 <main+7>:    call   0x1fde <main+12>
  0x00001fde <main+12>:   pop    %ebx              |  0x00001fde <main+12>:   pop    %ebx
  0x00001fdf <main+13>:   movl   $0xd,0x4(%esp)    |  0x00001fdf <main+13>:   movl   $0x10,0x4(%esp)
  0x00001fe7 <main+21>:   lea    0x1d(%ebx),%eax   |  0x00001fe7 <main+21>:   lea    0x1d(%ebx),%eax
  0x00001fed <main+27>:   mov    %eax,(%esp)       |  0x00001fed <main+27>:   mov    %eax,(%esp)
  0x00001ff0 <main+30>:   call   0x3005 <printf>   |  0x00001ff0 <main+30>:   call   0x3005 <printf>
  0x00001ff5 <main+35>:   add    $0x24,%esp        |  0x00001ff5 <main+35>:   add    $0x24,%esp
  0x00001ff8 <main+38>:   pop    %ebx              |  0x00001ff8 <main+38>:   pop    %ebx
  0x00001ff9 <main+39>:   leave                    |  0x00001ff9 <main+39>:   leave
  0x00001ffa <main+40>:   ret                      |  0x00001ffa <main+40>:   ret

The only difference is at main+13, where the compiler stores 0xd (13, the packed build on the left) instead of 0x10 (16, on the right) on the stack as the argument for printf.

Answer by mouviciel

There is no notion of an active datatype for a union. You are free to read and write any 'member' of the union: it is up to you to interpret what you get.

Therefore, the sizeof a union is always the sizeof its largest datatype.

Answer by mouviciel

The size will be at least that of the largest composing type. There is no concept of an "active" type.

Answer by amo-ej1

You should really look at a union as a container for the largest datatype inside it, combined with a shortcut for a cast. When you use one of the smaller members, the unused space is still there, but it simply stays unused.

You often see this used in combination with ioctl() calls under Unix: all ioctl() calls pass the same struct, which contains a union of all possible responses. For example, the following comes from /usr/include/linux/if.h; this struct is used in ioctl() calls for configuring/querying the state of an ethernet interface, and the request parameter defines which part of the union is actually in use:

struct ifreq 
{
#define IFHWADDRLEN 6
    union
    {
        char    ifrn_name[IFNAMSIZ];        /* if name, e.g. "en0" */
    } ifr_ifrn;

    union {
        struct  sockaddr ifru_addr;
        struct  sockaddr ifru_dstaddr;
        struct  sockaddr ifru_broadaddr;
        struct  sockaddr ifru_netmask;
        struct  sockaddr ifru_hwaddr;
        short   ifru_flags;
        int ifru_ivalue;
        int ifru_mtu;
        struct  ifmap ifru_map;
        char    ifru_slave[IFNAMSIZ];   /* Just fits the size */
        char    ifru_newname[IFNAMSIZ];
        void *  ifru_data;
        struct  if_settings ifru_settings;
    } ifr_ifru;
};

Answer by pyon

  1. The size of the largest member.

  2. This is why unions usually make sense inside a struct that has a flag indicating which member is the "active" one.

Example:

struct ONE_OF_MANY {
    enum FLAG { FLAG_SHORT, FLAG_INT, FLAG_LONG_LONG } flag;
    union { short x; int y; long long z; };
};

Answer by msc

What is the sizeof the union in C/C++? Is it the sizeof the largest datatype inside it?

Yes, the size of the union is the size of its biggest member.

For example:

#include <stdio.h>

union un
{
    char c;
    int i;
    float f;
    double d;
};

int main(void)
{
    union un u1;
    /* sizeof yields a size_t, so print it with %zu */
    printf("sizeof union u1 : %zu\n", sizeof(u1));
    printf("sizeof double d : %zu\n", sizeof(u1.d));
    return 0;
}

Output:

sizeof union u1 : 8
sizeof double d : 8

Here the biggest member is double. The union and the double both have size 8. So, as sizeof correctly told you, the size of the union is indeed 8.

how does the compiler calculate how to move the stack pointer if one of the smaller datatype of the union is active?

This is handled internally by the compiler. All data members of a union share the same memory, so while one member holds a value, the others cannot hold theirs at the same time. This sharing is how a union can save a lot of space.
