C++: Why doesn't the compiler allow std::string inside a union?
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/3521914/
Asked by bjskishore123
I want to use a string inside a union. If I write it as below:
union U
{
int i;
float f;
string s;
};
The compiler gives an error saying that U::s has a copy constructor.
I have read some other posts about alternative ways of solving this issue, but I want to know why the compiler doesn't allow this in the first place.
EDIT: @KennyTM: In any union, if one member is initialized the others will hold garbage values, and if none is initialized they all hold garbage values. I think a tagged union just provides some convenience for accessing the valid value in the union. Your question was: how do you or the compiler write a copy constructor for the union above without extra information? sizeof(string) gives 4 bytes. Based on this, the compiler can compare the sizes of the other members and allocate the largest (4 bytes in our example). The internal string length doesn't matter, because the characters are stored in a separate location, so the string can be of any length. All the union has to know is to invoke the string class's copy constructor with a string parameter. Whatever way the compiler uses to decide that the copy constructor has to be invoked in the normal case, a similar method could be followed when the string is inside a union. So I am thinking the compiler could allocate 4 bytes; then, if any string is assigned to s, the string class will take care of allocating and copying that string using its own allocator. So there is no chance of memory corruption either.
Did string not exist at the time unions were developed in the compiler? The answer is still not clear to me. I am new to this site, so please excuse me if anything is wrong.
Accepted answer by Puppy
Think about it. How does the compiler know what type is in the union?
It doesn't. The fundamental operation of a union is essentially a bitwise cast. Operations on values contained within unions are only safe when each type can essentially be filled with garbage. std::string can't, because that would result in memory corruption. Use boost::variant or boost::any.
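To make the "shared storage" idea concrete, here is a minimal sketch using only trivial types. The names IntOrFloat, members_share_address and check_shared_storage are made up for this illustration and do not come from the answer.

```cpp
#include <cassert>

// A union of trivial types: every member starts at the same address,
// and the union is at least as large as its largest member.
union IntOrFloat {
    int   i;
    float f;
};

static_assert(sizeof(IntOrFloat) >= sizeof(int) &&
              sizeof(IntOrFloat) >= sizeof(float),
              "the union must be able to hold either member");

// Returns true if both members occupy the same starting address.
bool members_share_address() {
    IntOrFloat v;
    v.i = 42;  // i is now the active member
    const void* address_of_i = &v.i;
    const void* address_of_f = &v.f;  // taking the address does not read f
    return address_of_i == address_of_f;
}

// Small self-check that can be called from main().
void check_shared_storage() {
    assert(members_share_address());
}
```

Because int and float need no construction or destruction, overwriting one with the other is harmless. A std::string in the same storage would have its internal bookkeeping overwritten, which is the corruption the answer refers to.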
Answer by kennytm
Because having a class with a non-trivial (copy) constructor in a union doesn't make sense. Suppose we have
union U {
string x;
vector<int> y;
};
U u; // <--
If U were a struct, u.x and u.y would be initialized to an empty string and an empty vector respectively. But members of a union share the same address. So, if u.x is initialized, u.y will contain invalid data, and vice versa. If neither of them is initialized, then neither can be used. In any case, having such data in a union cannot be handled easily, so C++98 chooses to forbid it (§9.5/1):
An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects.
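The triviality conditions quoted above can be approximated with C++11 type traits. This is only a rough sketch: fits_cpp98_union_rule and PlainPoint are hypothetical names invented here, and the trait combination is an approximation of the C++98 wording, not an exact restatement of it.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true when T has a trivial default constructor,
// copy constructor, copy assignment operator and destructor.
template <typename T>
struct fits_cpp98_union_rule
    : std::integral_constant<bool,
          std::is_trivially_default_constructible<T>::value &&
          std::is_trivially_copy_constructible<T>::value &&
          std::is_trivially_copy_assignable<T>::value &&
          std::is_trivially_destructible<T>::value> {};

// A plain aggregate of built-in types (hypothetical example type).
struct PlainPoint {
    int   x;
    float y;
};

static_assert(fits_cpp98_union_rule<int>::value,
              "built-in types are trivial");
static_assert(fits_cpp98_union_rule<PlainPoint>::value,
              "an aggregate of built-in types is trivial");
static_assert(!fits_cpp98_union_rule<std::string>::value,
              "std::string has non-trivial special member functions");
```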
In C++0x (later published as C++11) this rule has been relaxed (§9.5/2):
At most one non-static data member of a union may have a brace-or-equal-initializer. [Note:if any non-static data member of a union has a non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move assignment operator (12.8), or destructor (12.4), the corresponding member function of the union must be user-provided or it will be implicitly deleted (8.4.3) for the union. — end note]
But it is still not possible for the compiler to create (correct) constructors and destructors for the union. For example, how would you or the compiler write a copy constructor for the union above without extra information? To know which member of the union is active, you need a tagged union, and you need to handle construction and destruction manually, e.g.
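#include <new>     // placement new
#include <string>

enum { TU_INT, TU_FLOAT, TU_STRING };  // tags naming the active member

struct TU {
    int type;
    union U {
        int i;
        float f;
        std::string s;
        U() {}   // members are constructed manually by TU
        ~U() {}  // members are destroyed manually by TU
    } u;
    TU(const TU& tu) : type(tu.type) {
        switch (tu.type) {
            case TU_STRING: new (&u.s) std::string(tu.u.s); break;  // copy-construct in place
            case TU_INT:    u.i = tu.u.i; break;
            case TU_FLOAT:  u.f = tu.u.f; break;
        }
    }
    ~TU() {
        if (type == TU_STRING)
            u.s.~basic_string();  // explicitly destroy the active string
    }
    ...
};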
But, as @DeadMG has mentioned, this is already implemented as boost::variant or boost::any.
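As a rough illustration of the library approach, the sketch below uses std::variant from the C++17 standard library instead of boost::variant. That is a substitution made here so the example needs no external dependency; it is not what the answer itself uses. The names Value, describe and variant_demo are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <variant>

// A type-safe tagged union: the variant tracks which alternative is active
// and constructs, copies and destroys it correctly.
using Value = std::variant<int, float, std::string>;

// Reports the name of the currently active alternative.
std::string describe(const Value& v) {
    if (std::holds_alternative<int>(v))   return "int";
    if (std::holds_alternative<float>(v)) return "float";
    return "string";
}

// Small usage example that can be called from main().
void variant_demo() {
    Value v = 1;                       // holds an int
    assert(describe(v) == "int");

    v = std::string("hello");          // the int is replaced by a string
    assert(std::get<std::string>(v) == "hello");

    Value copy = v;                    // copying works without extra code
    assert(std::get<std::string>(copy) == "hello");

    v = 2.5f;                          // the string is destroyed automatically
    assert(describe(v) == "float");
}
```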
Answer by KeatsPeeks
In C++98/03, members of a union can't have constructors, destructors, virtual member functions, or base classes.
So basically, you can only use built-in data types or PODs.
Note that this changes in C++0x with unrestricted unions:
union {
int z;
double w;
string s; // Illegal in C++98, legal in C++0x.
};
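As a rough check of what "legal" means here, the sketch below assumes a C++11 or later compiler. The union is accepted, but its implicitly declared special member functions are deleted, so it cannot be created, copied or destroyed until they are written by hand. The name Value is chosen only for this illustration.

```cpp
#include <string>
#include <type_traits>

union Value {
    int z;
    double w;
    std::string s;  // accepted since C++11
};

// The declaration compiles, yet the compiler refuses to generate the
// special member functions, because it cannot know which member is active.
static_assert(!std::is_default_constructible<Value>::value,
              "the default constructor is implicitly deleted");
static_assert(!std::is_copy_constructible<Value>::value,
              "the copy constructor is implicitly deleted");
static_assert(!std::is_copy_assignable<Value>::value,
              "the copy assignment operator is implicitly deleted");
static_assert(!std::is_destructible<Value>::value,
              "the destructor is implicitly deleted");

// The storage is still sized for the largest member.
static_assert(sizeof(Value) >= sizeof(std::string),
              "the union must be able to hold a std::string");
```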
Answer by FireAphis
From the C++ spec §9.5.1:
An object of a class with a non-trivial constructor, a non-trivial copy constructor, a non-trivial destructor, or a non-trivial copy assignment operator cannot be a member of a union.
The reason for this rule is that the compiler will never know which of the destructors or constructors to call, since it never really knows which of the possible objects is inside the union.
Answer by Robert Risack
Garbage is introduced if you
- assign a string
- then assign an int or float
- then assign a string again
A string manages its memory somewhere else, and the information about where is most likely a pointer. This pointer is overwritten with garbage when the int is assigned. Assigning a new string should then destroy the old string, which is not possible.
The second step should destroy the string, but the union does not know whether a string is there.
They have evidently found a solution to this problem in the meantime.
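The string, then int, then string sequence can be handled safely once something records which member is active. The following is a minimal sketch assuming C++11 or later; the class IntOrString and all of its member names are hypothetical and only illustrate the bookkeeping.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

class IntOrString {
public:
    IntOrString() : is_string_(false), i_(0) {}
    ~IntOrString() { reset(); }
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    void set_int(int value) {
        reset();      // step 2: destroy a live string before reusing the bytes
        i_ = value;
    }

    void set_string(const std::string& value) {
        std::string copy(value);  // copy first, in case value refers to s_
        reset();
        new (&s_) std::string(std::move(copy));  // construct in place
        is_string_ = true;
    }

    bool holds_string() const { return is_string_; }
    int as_int() const { assert(!is_string_); return i_; }
    const std::string& as_string() const { assert(is_string_); return s_; }

private:
    // Destroys the string if it is the active member and falls back to int.
    void reset() {
        if (is_string_) {
            s_.~basic_string();
            is_string_ = false;
            i_ = 0;
        }
    }

    bool is_string_;  // the tag that a bare union lacks
    union {
        int i_;
        std::string s_;
    };
};
```

The tag is what lets the second step know that a string is present and must be destroyed before the int overwrites its bytes.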
Answer by TrisT
You can now do it.
Of course, if you initialize any other member of the union first, or simply don't initialize the string at all, then there's a problem.
Since the string class overloads the assignment operator, you can't then initialize the string with an assignment operation:
this->union_string = std::string("whatever");
This will fail because you're still using the assignment operator, which expects an already-constructed string on the left-hand side.
To properly initialize a union string after you've put something else in the union, or when the union was not initialized in the first place, you have to call the constructor directly on that memory:
new(&this->union_string) std::string("whatever");
This way you're not using the assignment operator at all.
Another concern is that your compiler should force you to write a destructor, and if for some reason it doesn't, you should write one anyway. Since it's a union, at the end of your class's lifetime the compiler can't know whether the union's memory is used by the string or by something else, so your destructor should call the string's destructor if the string is the active member.
If you don't, you'll have a memory leak, since the destructor for the string is never called and it never gets the chance to release the memory it's using.
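The effect of the hand-written destructor can be observed with a small stand-in type. In the sketch below, Tracked is a hypothetical class that plays the role of std::string and simply counts how many instances are alive; Holder, make_tracked and the two helper functions are likewise invented for this illustration.

```cpp
#include <cassert>
#include <new>

// Stand-in for std::string: counts live instances so that a missing
// destructor call would show up as a non-zero count.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

struct Holder {
    bool has_tracked;
    union {
        int number;
        Tracked tracked;
    };

    Holder() : has_tracked(false), number(0) {}
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

    // Construct the non-trivial member directly in the union's memory.
    void make_tracked() {
        assert(!has_tracked);
        new (&tracked) Tracked();
        has_tracked = true;
    }

    // Without this explicit call the Tracked object would never be destroyed.
    ~Holder() {
        if (has_tracked) tracked.~Tracked();
    }
};

// Number of live Tracked objects while a Holder that contains one exists.
int live_while_holder_exists() {
    Holder h;
    h.make_tracked();
    return Tracked::live;
}

// Number of live Tracked objects after such a Holder has gone out of scope.
int live_after_holder_destroyed() {
    {
        Holder h;
        h.make_tracked();
    }
    return Tracked::live;
}
```

If the body of ~Holder() were left empty, the count would stay above zero after the Holder goes out of scope, which for a real std::string corresponds to leaked heap memory.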