When would anyone use a union? Is it a remnant from the C-only days?
Source: http://stackoverflow.com/questions/4788965/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Russel
I have learned but don't really get unions. Every C or C++ text I go through introduces them (sometimes in passing), but they tend to give very few practical examples of why or where to use them. When would unions be useful in a modern (or even legacy) case? My only two guesses would be programming microprocessors when you have very limited space to work with, or when you're developing an API (or something similar) and you want to force the end user to have only one instance of several objects/types at one time. Are these two guesses even close to right?
Answered by vz0
Unions are usually used in the company of a discriminator: a variable indicating which of the fields of the union is valid. For example, let's say you want to create your own Variant type:
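    /* Discriminator values. VAR_INT and VAR_FLOAT are used by the functions
       below; the remaining names are assumed by analogy. */
    enum {
        VAR_CHAR, VAR_SHORT, VAR_INT, VAR_LONG,
        VAR_FLOAT, VAR_DOUBLE, VAR_PTR
    };

    struct my_variant_t {
        int type;                 /* which member of the union is valid */
        union {                   /* anonymous union (C11) */
            char char_value;
            short short_value;
            int int_value;
            long long_value;
            float float_value;
            double double_value;
            void* ptr_value;
        };
    };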
Then you would use it like this:
    /* construct a new float variant instance */
    void init_float(struct my_variant_t* v, float initial_value) {
        v->type = VAR_FLOAT;
        v->float_value = initial_value;
    }

    /* Increments the value of the variant by the given int */
    void inc_variant_by_int(struct my_variant_t* v, int n) {
        switch (v->type) {
        case VAR_FLOAT:
            v->float_value += n;
            break;
        case VAR_INT:
            v->int_value += n;
            break;
        ...
        }
    }
This is actually a pretty common idiom, especially in Visual Basic internals.
For a real example see SDL's SDL_Event union. There is a type field at the top of the union, and the same field is repeated in every SDL_*Event struct. Then, to handle the correct event you need to check the value of the type field.
The benefits are simple: there is one single data type to handle all event types without using unnecessary memory.
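The layout described above can be sketched in a few lines of C. This is not SDL's actual definition: the event structs, the EV_* constants and handle_event are hypothetical names chosen for illustration. The only point is that every struct starts with the same type member, so the union can be inspected through type before the matching member is read.

```c
#include <stdint.h>

/* Hypothetical event kinds; these are not SDL's real constants. */
enum { EV_KEY = 1, EV_MOUSE = 2 };

/* Every event struct begins with the same `type` member. */
struct key_event   { uint32_t type; int keycode; };
struct mouse_event { uint32_t type; int x; int y; };

union event {
    uint32_t type;              /* shared discriminator, valid for any event */
    struct key_event key;
    struct mouse_event mouse;
};

/* Dispatches on the discriminator and returns a small summary value. */
int handle_event(const union event *ev) {
    switch (ev->type) {
    case EV_KEY:   return ev->key.keycode;
    case EV_MOUSE: return ev->mouse.x + ev->mouse.y;
    default:       return -1;   /* unknown event type */
    }
}
```

A caller fills in one member (including its type field) and passes the union around; the union is only as large as its largest member.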
Answered by jrsala
I find C++ unions pretty cool. It seems that people usually only think of the use case where one wants to change the value of a union instance "in place" (which, it seems, serves only to save memory or perform doubtful conversions).
In fact, unions can be of great power as a software engineering tool, even when you never change the value of any union instance.
Use case 1: the chameleon
With unions, you can regroup a number of arbitrary classes under one denomination, which is not unlike the case of a base class and its derived classes. What changes, however, is what you can and can't do with a given union instance:
    // Placeholder types. A union member must be a complete type,
    // so the structs are defined (with their members left out) rather than
    // only forward-declared.
    struct Batman { /* */ };
    struct BaseballBat { /* */ };

    union Bat
    {
        Batman brucewayne;
        BaseballBat club;
    };

    ReturnType1 f(void)
    {
        BaseballBat bb = {/* */};
        Bat b;
        b.club = bb;
        // do something with b.club
    }

    ReturnType2 g(Bat& b)
    {
        // do something with b, but how do we know what's inside?
    }

    Bat returnsBat(void);

    ReturnType3 h(void)
    {
        Bat b = returnsBat();
        // do something with b, but how do we know what's inside?
    }
It appears that the programmer has to be certain of the type of the content of a given union instance when he wants to use it. This is the case in function f above. However, if a function were to receive a union instance as an argument, as g does above, then it wouldn't know what to do with it. The same applies to functions returning a union instance, see h: how does the caller know what's inside?
If a union instance never gets passed as an argument or as a return value, then it's bound to have a very monotonous life, with spikes of excitement when the programmer chooses to change its content:
    Batman bm = {/* */};
    BaseballBat bb = {/* */};
    Bat b;
    b.brucewayne = bm;
    // stuff
    b.club = bb;
And that's the most (un)popular use case of unions. Another use case is when a union instance comes along with something that tells you its type.
Use case 2: "Nice to meet you, I'm object, from Class"
Suppose a programmer elected to always pair up a union instance with a type descriptor (I'll leave it to the reader's discretion to imagine an implementation for one such object). This defeats the purpose of the union itself if what the programmer wants is to save memory and the size of the type descriptor is not negligible with respect to that of the union. But let's suppose that it's crucial that the union instance can be passed as an argument or as a return value with the callee or caller not knowing what's inside.
Then the programmer has to write a switch control flow statement to tell Bruce Wayne apart from a wooden stick, or something equivalent. It's not too bad when there are only two types of contents in the union but obviously, the union doesn't scale anymore.
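The remark about the descriptor's size can be made concrete with a small C sketch. The types below are hypothetical and deliberately tiny, and the descriptor is assumed to be a plain int tag; the function simply reports how many bytes the pairing adds on top of the bare union.

```c
#include <stddef.h>

/* Hypothetical small payload: the union itself needs very little room. */
union small_payload {
    char c;
    short s;
};

/* The same payload paired with an int type descriptor (an assumption;
   a real descriptor could be larger or smaller). */
struct tagged_small {
    int tag;
    union small_payload value;
};

/* Bytes added by the descriptor, including any padding the compiler inserts. */
size_t tag_overhead(void) {
    return sizeof(struct tagged_small) - sizeof(union small_payload);
}
```

With a payload this small the descriptor costs at least as much as the data it describes, which is the situation the paragraph above warns about.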
Use case 3:
As the authors of a recommendation for the ISO C++ Standard put it back in 2008,
Many important problem domains require either large numbers of objects or limited memory resources. In these situations conserving space is very important, and a union is often a perfect way to do that. In fact, a common use case is the situation where a union never changes its active member during its lifetime. It can be constructed, copied, and destructed as if it were a struct containing only one member. A typical application of this would be to create a heterogeneous collection of unrelated types which are not dynamically allocated (perhaps they are in-place constructed in a map, or members of an array).
And now, an example (the UML class diagram from the original answer is not reproduced here).
The situation in plain English: an object of class A can have objects of any class among B1, ..., Bn, and at most one of each type, with n being a pretty big number, say at least 10.
We don't want to add fields (data members) to A like so:
    private:
        B1 b1;
        .
        .
        .
        Bn bn;
because n might vary (we might want to add Bx classes to the mix), because this would cause a mess with constructors, and because A objects would take up a lot of space.
We could use a wacky container of void* pointers to Bx objects with casts to retrieve them, but that's fugly and so C-style... but more importantly that would leave us with the lifetimes of many dynamically allocated objects to manage.
Instead, what can be done is this:
    union Bee
    {
        B1 b1;
        .
        .
        .
        Bn bn;
    };

    enum BeesTypes { TYPE_B1, ..., TYPE_BN };

    class A
    {
    private:
        std::unordered_map<int, Bee> data; // C++11, otherwise use std::map
    public:
        Bee get(int); // the implementation is obvious: get from the unordered map
    };
Then, to get the content of a union instance from data, you use a.get(TYPE_B2).b2 and the like, where a is a class A instance.
This is all the more powerful since unions are unrestricted in C++11. See the document linked to above or this article for details.
Answered by Kevin
One example is in the embedded realm, where each bit of a register may mean something different. For example, a union of an 8-bit integer and a structure with 8 separate 1-bit bitfields allows you to either change one bit or the entire byte.
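A minimal C sketch of such a register overlay follows. The union name, the bit names b0..b7 and the helper function are hypothetical. Two things are implementation-defined and assumed here: that the compiler accepts uint8_t as a bit-field type (GCC and Clang do), and the order in which the bits are packed into the byte, so the code never relies on which bit lands where.

```c
#include <stdint.h>

/* Hypothetical 8-bit register; the bit names are placeholders. */
union reg8 {
    uint8_t byte;             /* whole-register view */
    struct {
        uint8_t b0 : 1;
        uint8_t b1 : 1;
        uint8_t b2 : 1;
        uint8_t b3 : 1;
        uint8_t b4 : 1;
        uint8_t b5 : 1;
        uint8_t b6 : 1;
        uint8_t b7 : 1;
    } bits;                   /* one-bit-at-a-time view */
};

/* Counts the bits that are set, reading through the per-bit view. */
int reg8_count_set(const union reg8 *r) {
    return r->bits.b0 + r->bits.b1 + r->bits.b2 + r->bits.b3 +
           r->bits.b4 + r->bits.b5 + r->bits.b6 + r->bits.b7;
}
```

Writing r.byte = 0 clears every flag at once, while r.bits.b5 = 1 changes a single bit. Reading one member after writing the other is permitted in C; in ISO C++ it is formally undefined, although common compilers support it.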
Answered by Joseph Quinsey
Herb Sutter wrote in GOTW about six years ago, with emphasis added:
"But don't think that unions are only a holdover from earlier times. Unions are perhaps most useful for saving space by allowing data to overlap, and this is still desirable in C++ and in today's modern world. For example, some of the most advanced C++ standard library implementations in the world now use just this technique for implementing the "small string optimization," a great optimization alternative that reuses the storage inside a string object itself: for large strings, space inside the string object stores the usual pointer to the dynamically allocated buffer and housekeeping information like the size of the buffer; for small strings, the same space is instead reused to store the string contents directly and completely avoid any dynamic memory allocation. For more about the small string optimization (and other string optimizations and pessimizations in considerable depth), see... ."
And for a less useful example, see the long but inconclusive question gcc, strict-aliasing, and casting through a union.
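The small string optimization described in the quote can be illustrated with a short C sketch. This is not how any particular standard library implements it; the struct layout, the 16-byte inline capacity and all function names are assumptions made for the example. The union lets the inline buffer and the heap bookkeeping share the same bytes, and is_small records which of the two is in use.

```c
#include <stdlib.h>
#include <string.h>

#define SMALL_CAP 16   /* assumed inline capacity, terminator included */

struct small_string {
    size_t length;
    int is_small;                     /* which union member is in use */
    union {
        char inline_buf[SMALL_CAP];   /* short strings: no allocation */
        struct {
            char *data;               /* long strings: heap buffer */
            size_t capacity;
        } heap;
    } storage;
};

/* Copies text into *s. Returns 0 on success, -1 if allocation fails. */
int small_string_init(struct small_string *s, const char *text) {
    size_t n = strlen(text);
    s->length = n;
    if (n < SMALL_CAP) {
        s->is_small = 1;
        memcpy(s->storage.inline_buf, text, n + 1);
        return 0;
    }
    s->is_small = 0;
    s->storage.heap.data = malloc(n + 1);
    if (s->storage.heap.data == NULL) {
        return -1;
    }
    s->storage.heap.capacity = n + 1;
    memcpy(s->storage.heap.data, text, n + 1);
    return 0;
}

/* Returns the stored characters, wherever they live. */
const char *small_string_cstr(const struct small_string *s) {
    return s->is_small ? s->storage.inline_buf : s->storage.heap.data;
}

/* Releases the heap buffer, if there is one. */
void small_string_free(struct small_string *s) {
    if (!s->is_small) {
        free(s->storage.heap.data);
    }
}
```

Strings of up to 15 characters never touch the allocator in this sketch; longer ones fall back to malloc and reuse the same storage for the pointer and capacity.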
Answered by Joseph Quinsey
Well, one example use case I can think of is this:
    typedef union
    {
        struct
        {
            uint8_t a;
            uint8_t b;
            uint8_t c;
            uint8_t d;
        };
        uint32_t x;
    } some32bittype;
You can then access the 8-bit separate parts of that 32-bit block of data; however, prepare to potentially be bitten by endianness.
This is just one hypothetical example, but whenever you want to split data in a field into component parts like this, you could use a union.
That said, there is also a method which is endian-safe:
    uint32_t x;
    uint8_t a = (x & 0xFF000000) >> 24;
This works on any host because masks and shifts operate on the value of x, not on the order of its bytes in memory, so the result does not depend on endianness.
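The difference between the two approaches can be seen side by side in a small C sketch. The union and function names are hypothetical. The first function returns whatever byte happens to be stored first in memory, which depends on the host; the second always returns the most significant byte.

```c
#include <stdint.h>

union word32 {
    uint32_t x;
    uint8_t bytes[4];
};

/* Byte stored at the lowest address: depends on host endianness. */
uint8_t first_byte_in_memory(uint32_t value) {
    union word32 w;
    w.x = value;
    return w.bytes[0];
}

/* Most significant byte: the same on every host. */
uint8_t most_significant_byte(uint32_t value) {
    return (uint8_t)((value & 0xFF000000u) >> 24);
}
```

For the value 0x11223344, most_significant_byte gives 0x11 everywhere, whereas first_byte_in_memory gives 0x44 on a little-endian host and 0x11 on a big-endian one.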
Answered by wallyk
Some uses for unions:
- Provide a general endianness interface to an unknown external host.
- Manipulate foreign CPU architecture floating point data, such as accepting VAX G_FLOATS from a network link and converting them to IEEE 754 long reals for processing.
- Provide straightforward bit twiddling access to a higher-level type.
    union { unsigned char byte_v[16]; long double ld_v; }
With this declaration, it is simple to display the hex byte values of a long double, change the exponent's sign, determine if it is a denormal value, or implement long double arithmetic for a CPU which does not support it, etc.
- Saving storage space when fields are dependent on certain values:
    class person {
        string name;
        char gender;   // M = male, F = female, O = other
        union {
            date vasectomized;  // for males
            int pregnancies;    // for females
        } gender_specific_data;
    }
- Grep the include files for use with your compiler. You'll find dozens to hundreds of uses of union:
    [wally@zenetfedora ~]$ cd /usr/include
    [wally@zenetfedora include]$ grep -w union *
    a.out.h: union
    argp.h: parsing options, getopt is called with the union of all the argp
    bfd.h: union
    bfd.h: union
    bfd.h:union internal_auxent;
    bfd.h: (bfd *, struct bfd_symbol *, int, union internal_auxent *);
    bfd.h: union {
    bfd.h: /* The value of the symbol. This really should be a union of a
    bfd.h: union
    bfd.h: union
    bfdlink.h: /* A union of information depending upon the type. */
    bfdlink.h: union
    bfdlink.h: this field. This field is present in all of the union element
    bfdlink.h: the union; this structure is a major space user in the
    bfdlink.h: union
    bfdlink.h: union
    curses.h: union
    db_cxx.h:// 4201: nameless struct/union
    elf.h: union
    elf.h: union
    elf.h: union
    elf.h: union
    elf.h:typedef union
    _G_config.h:typedef union
    gcrypt.h: union
    gcrypt.h: union
    gcrypt.h: union
    gmp-i386.h: union {
    ieee754.h:union ieee754_float
    ieee754.h:union ieee754_double
    ieee754.h:union ieee854_long_double
    ifaddrs.h: union
    jpeglib.h: union {
    ldap.h: union mod_vals_u {
    ncurses.h: union
    newt.h: union {
    obstack.h: union
    pi-file.h: union {
    resolv.h: union {
    signal.h:extern int sigqueue (__pid_t __pid, int __sig, __const union sigval __val)
    stdlib.h:/* Lots of hair to allow traditional BSD use of `union wait'
    stdlib.h: (__extension__ (((union { __typeof(status) __in; int __i; }) \
    stdlib.h:/* This is the type of the argument to `wait'. The funky union
    stdlib.h: causes redeclarations with either `int *' or `union wait *' to be
    stdlib.h:typedef union
    stdlib.h: union wait *__uptr;
    stdlib.h: } __WAIT_STATUS __attribute__ ((__transparent_union__));
    thread_db.h: union
    thread_db.h: union
    tiffio.h: union {
    wchar.h: union
    xf86drm.h:typedef union _drmVBlank {
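The bit-twiddling declaration from this answer (byte_v overlaying ld_v) is hard to demonstrate portably because the layout of long double varies between platforms. The C sketch below uses the same idea on a float instead, assuming float is IEEE 754 binary32 and the same size as uint32_t; the union and function names are hypothetical. Note that flip_sign toggles the sign of the value, which is a simpler operation than the exponent manipulation mentioned in the answer.

```c
#include <stdint.h>

/* Overlay of a float and its raw bits (assumes IEEE 754 binary32). */
union float_bits {
    float value;
    uint32_t bits;
};

/* Returns the raw bit pattern of f. */
uint32_t float_to_bits(float f) {
    union float_bits u;
    u.value = f;
    return u.bits;
}

/* Builds a float from a raw bit pattern. */
float bits_to_float(uint32_t bits) {
    union float_bits u;
    u.bits = bits;
    return u.value;
}

/* Toggles the sign bit (bit 31) of the value. */
float flip_sign(float f) {
    return bits_to_float(float_to_bits(f) ^ 0x80000000u);
}

/* A denormal has an all-zero exponent field and a non-zero fraction. */
int is_denormal(float f) {
    uint32_t b = float_to_bits(f);
    return (b & 0x7F800000u) == 0 && (b & 0x007FFFFFu) != 0;
}
```

Reading bits after writing value is allowed in C. In ISO C++ the same access is formally undefined, so C++ code usually copies the bytes instead.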
Answered by YeenFei
Unions are useful when dealing with byte-level (low level) data.
One of my recent uses was IP address modeling, which looks like this:
    // Composite structure for IP address storage
    union
    {
        // IPv4 @ 32-bit identifier
        // Padded 12-bytes for IPv6 compatibility
        union
        {
            struct
            {
                unsigned char _reserved[12];
                unsigned char _IpBytes[4];
            } _Raw;
            struct
            {
                unsigned char _reserved[12];
                unsigned char _o1;
                unsigned char _o2;
                unsigned char _o3;
                unsigned char _o4;
            } _Octet;
        } _IPv4;
        // IPv6 @ 128-bit identifier
        // Next generation internet addressing
        union
        {
            struct
            {
                unsigned char _IpBytes[16];
            } _Raw;
            struct
            {
                unsigned short _w1;
                unsigned short _w2;
                unsigned short _w3;
                unsigned short _w4;
                unsigned short _w5;
                unsigned short _w6;
                unsigned short _w7;
                unsigned short _w8;
            } _Word;
        } _IPv6;
    } _IP;
Answered by DannyK
An example when I've used a union:
    class Vector
    {
        union
        {
            double _coord[3];
            struct
            {
                double _x;
                double _y;
                double _z;
            };
        };
        ...
    };
This allows me to access my data as an array or as individual elements.
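A C counterpart of this array-or-elements overlay might look like the sketch below. The names are hypothetical, the anonymous struct requires C11, and the code assumes the compiler inserts no padding between x, y and z (true on common ABIs but not guaranteed by the standard).

```c
/* Hypothetical C counterpart of the Vector class above. */
union vec3 {
    double coord[3];          /* indexed view, convenient for loops */
    struct {
        double x, y, z;       /* named view (anonymous struct, C11) */
    };
};

/* Sums the components through the array view. */
double vec3_sum(const union vec3 *v) {
    double total = 0.0;
    for (int i = 0; i < 3; ++i) {
        total += v->coord[i];
    }
    return total;
}
```

Code that fills in v.x, v.y and v.z by name can then hand the same object to loop-based helpers such as vec3_sum. As with other overlays, mixing the two views is permitted in C but formally undefined in ISO C++.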
I've used a union to have the different terms point to the same value. In image processing, whether I was working on columns or width or the size in the X direction, it can become confusing. To alleviate this problem, I use a union so I know which descriptions go together.
    union { // dimension from left to right
        uint32_t m_width;
        uint32_t m_sizeX;
        uint32_t m_columns;
    };
    union { // dimension from top to bottom
        uint32_t m_height;
        uint32_t m_sizeY;
        uint32_t m_rows;
    };
Answered by Null Set
Unions provide polymorphism in C.
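One way to read this claim is that a tagged union lets a single C function operate on values of several different shapes. The sketch below is an illustration under that reading; the shape types, the enum and shape_area are hypothetical names, not taken from the answer.

```c
/* Hypothetical shape hierarchy expressed as a tagged union. */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

struct shape {
    enum shape_kind kind;               /* selects the valid union member */
    union {
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } as;
};

/* One function handles every kind, much like a virtual call would. */
double shape_area(const struct shape *s) {
    switch (s->kind) {
    case SHAPE_CIRCLE:
        return 3.141592653589793 * s->as.circle.radius * s->as.circle.radius;
    case SHAPE_RECT:
        return s->as.rect.width * s->as.rect.height;
    }
    return 0.0;   /* unknown kind */
}
```

An array of struct shape can therefore hold circles and rectangles together, with shape_area dispatching on kind at run time.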
Answered by Matthieu M.
The union keyword, while still used in C++03 [1], is mostly a remnant of the C days. The most glaring issue is that it only works with POD [1].
The idea of the union, however, is still present, and indeed the Boost libraries feature a union-like class:
    boost::variant<std::string, Foo, Bar>
Which has most of the benefits of the union (if not all) and adds:
- ability to correctly use non-POD types
- static type safety
In practice, it has been demonstrated that it is equivalent to a combination of union + enum, and benchmarks showed it to be as fast (while boost::any is more of the realm of dynamic_cast, since it uses RTTI).
[1] Unions were upgraded in C++11 (unrestricted unions), and can now contain objects with destructors, although the user has to invoke the destructor manually (on the currently active union member). It's still much easier to use variants.