Why can't you use offsetof on non-POD structures in C++?

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1129894/

Date: 2020-08-27 18:56:38. Source: igfitidea

Why can't you use offsetof on non-POD structures in C++?

Tags: c++, offsetof

Asked by Alex

I was researching how to get the memory offset of a member of a class in C++ and came across this on Wikipedia:

In C++ code, you can not use offsetof to access members of structures or classes that are not Plain Old Data Structures.

I tried it out and it seems to work fine.

#include <cstddef>   // offsetof
#include <iostream>
using namespace std;

class Foo
{
private:
    int z;
    int func() {cout << "this is just filler" << endl; return 0;}

public:
    int x;
    int y;
    Foo* f;

    bool returnTrue() { return false; }
};

int main()
{
    cout << offsetof(Foo, x)  << " " << offsetof(Foo, y) << " " << offsetof(Foo, f);
    return 0;
}

I got a few warnings, but it compiled and when run it gave reasonable output:

Laptop:test alex$ ./test
4 8 12

I think I'm either misunderstanding what a POD data structure is or I'm missing some other piece of the puzzle. I don't see what the problem is.

Accepted answer by Bluehorn

Short answer: offsetof is a feature that is in the C++ standard only for legacy C compatibility. Therefore it is basically restricted to the stuff that can be done in C. C++ supports only what it must for C compatibility.

As offsetof is basically a hack (implemented as a macro) that relies on the simple memory model of C, supporting it for arbitrary classes would take away a lot of the freedom C++ compiler implementors have in organizing class instance layout.

The effect is that offsetof will often work (depending on source code and compiler used) in C++ even where not backed by the standard - except where it doesn't. So you should be very careful with offsetof usage in C++, especially since I did not originally know of a single compiler that would generate a warning for non-POD use. Update: modern GCC and Clang will emit a warning if offsetof is used outside the standard (-Winvalid-offsetof).

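A minimal sketch of how one might check at compile time whether offsetof is backed by the standard for a given type, assuming a C++11 compiler. The types Plain and Poly and the constants kOffsetOfX and kOffsetOfY are hypothetical names invented for this illustration; they are not part of the answer.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical types for illustration only.
struct Plain { int x; int y; };
struct Poly  { virtual ~Poly() {} int x; };

// From C++11 on, offsetof is only guaranteed for standard-layout types,
// so the property can be verified before relying on the macro.
static_assert(std::is_standard_layout<Plain>::value,
              "offsetof(Plain, ...) is backed by the standard");
static_assert(!std::is_standard_layout<Poly>::value,
              "Poly has a virtual function, so offsetof is not guaranteed");

// Well-defined, because Plain is standard-layout.
constexpr std::size_t kOffsetOfX = offsetof(Plain, x);
constexpr std::size_t kOffsetOfY = offsetof(Plain, y);
```

Putting the static_assert next to each offsetof use turns a silent portability problem into a compile error, independent of which warnings the compiler happens to enable.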

Edit: As you asked for an example, the following might clarify the problem:

#include <iostream>
using namespace std;

struct A { int a; };
struct B : public virtual A   { int b; };
struct C : public virtual A   { int c; };
struct D : public B, public C { int d; };

#define offset_d(i,f)    (long(&(i)->f) - long(i))
#define offset_s(t,f)    offset_d((t*)1000, f)

#define dyn(inst,field) {\
    cout << "Dynamic offset of " #field " in " #inst ": "; \
    cout << offset_d(&i##inst, field) << endl; }

#define stat(type,field) {\
    cout << "Static offset of " #field " in " #type ": "; \
    cout.flush(); \
    cout << offset_s(type, field) << endl; }

int main() {
    A iA; B iB; C iC; D iD;
    dyn(A, a); dyn(B, a); dyn(C, a); dyn(D, a);
    stat(A, a); stat(B, a); stat(C, a); stat(D, a);
    return 0;
}

This will crash when trying to locate the field a inside type B statically, while it works when an instance is available. This is because of the virtual inheritance, where the location of the base class is stored in a lookup table.

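A small sketch of the same point that avoids the crash by only ever measuring live objects. It reuses the A, B, C, D hierarchy from the example; distance_to_a and example_usage are hypothetical helper names. The inequality asserted at the end is an assumption about common ABIs, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

struct A { int a; };
struct B : public virtual A   { int b; };
struct C : public virtual A   { int c; };
struct D : public B, public C { int d; };

// Byte distance from the start of a B subobject to the inherited field a,
// measured on a live object so the compiler can consult its lookup data.
std::ptrdiff_t distance_to_a(const B& b) {
    return reinterpret_cast<const char*>(&b.a) -
           reinterpret_cast<const char*>(&b);
}

// Assumption: holds on common ABIs (e.g. GCC/Clang on x86-64); the standard
// does not require any particular layout.
void example_usage() {
    B standalone;
    D derived;  // binds to its B subobject when passed as const B&
    assert(distance_to_a(standalone) != distance_to_a(derived));
}
```

If a B on its own and a B inside a D keep the field a at different distances, no single compile-time constant could serve as "the offset of a in B".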

While this is a contrived example, an implementation could use a lookup table also to find the public, protected and private sections of a class instance. Or make the lookup completely dynamic (use a hash table for fields), etc.

The standard just leaves all possibilities open by restricting offsetof to POD (in other words, there is no way to use a hash table for POD structs :)

Just another note: I had to reimplement offsetof (here: offset_s) for this example as GCC actually errors out when I call offsetof for a field of a virtual base class.

Answer by Steve Jessop

Bluehorn's answer is correct, but for me it doesn't explain the reason for the problem in simplest terms. The way I understand it is as follows:

If NonPOD is a non-POD class, then when you do:

NonPOD np;
np.field;

the compiler does not necessarily access the field by adding some offset to the base pointer and dereferencing. For a POD class, the C++ Standard constrains it to do that (or something equivalent), but for a non-POD class it does not. The compiler might instead read a pointer out of the object, add an offset to that value to give the storage location of the field, and then dereference. This is a common mechanism with virtual inheritance if the field is a member of a virtual base of NonPOD. But it is not restricted to that case. The compiler can do pretty much anything it likes. It could call a hidden compiler-generated virtual member function if it wants.

In the complex cases, it is obviously not possible to represent the location of the field as an integer offset. So offsetof is not valid on non-POD classes.

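The answer does not propose an alternative, but for comparison, here is a hedged sketch of the language-level way to name "which field" without assuming a fixed byte offset: a pointer to data member. The class NonPOD below is a hypothetical stand-in, and read_member and example_usage are illustrative names.

```cpp
#include <cassert>

// Hypothetical non-POD class: the virtual destructor makes it polymorphic.
struct NonPOD {
    virtual ~NonPOD() {}
    int field = 0;
    int other = 0;
};

// A pointer to data member identifies a field; how the compiler turns it
// into an address for a given object is left entirely to the compiler.
int read_member(const NonPOD& object, int NonPOD::* which) {
    return object.*which;
}

void example_usage() {
    NonPOD np;
    np.field = 3;
    np.other = 4;
    assert(read_member(np, &NonPOD::field) == 3);
    assert(read_member(np, &NonPOD::other) == 4);
}
```

This only helps when a typed object is available. It does not replace the C idiom of reaching into raw memory through a void*.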

In cases where your compiler just so happens to store the object in a simple way (such as single inheritance, and normally even non-virtual multiple inheritance, and normally fields defined right in the class that you're referencing the object by as opposed to in some base class), then it will just so happen to work. There are probably cases which just so happen to work on every single compiler there is. This doesn't make it valid.

Appendix: how does virtual inheritance work?

With simple inheritance, if B is derived from A, the usual implementation is that a pointer to B is just a pointer to A, with B's additional data stuck on the end:

A* ---> field of A  <--- B*
        field of A
        field of B

With simple multiple inheritance, you generally assume that B's base classes (call 'em A1 and A2) are arranged in some order peculiar to B. But the same trick with the pointers can't work:

A1* ---> field of A1
         field of A1
A2* ---> field of A2
         field of A2

A1 and A2 "know" nothing about the fact that they're both base classes of B. So if you cast a B* to A1*, it has to point to the fields of A1, and if you cast it to A2* it has to point to the fields of A2. The pointer conversion operator applies an offset. So you might end up with this:

A1* ---> field of A1 <---- B*
         field of A1
A2* ---> field of A2
         field of A2
         field of B
         field of B

Then casting a B* to A1* doesn't change the pointer value, but casting it to A2* adds sizeof(A1) bytes. This is the "other" reason why, in the absence of a virtual destructor, deleting B through a pointer to A2 goes wrong. It doesn't just fail to call the destructor of B and A1, it doesn't even free the right address.

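The pointer adjustment described here can be observed directly. The following is a sketch under the assumption that the implementation lays out the bases in declaration order, as in the diagram; the field names, adjustment_to, and example_usage are hypothetical.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

struct A1 { int a1_first; int a1_second; };
struct A2 { int a2_first; int a2_second; };
struct B : A1, A2 { int b_first; int b_second; };

// How many bytes the compiler adds when converting a B* to a pointer to
// the given base subobject.
template <typename Base>
std::ptrdiff_t adjustment_to(B& b) {
    Base* as_base = &b;  // implicit derived-to-base conversion
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&b);
}

// Assumption: bases are stored in declaration order, A1 first.
void example_usage() {
    B b;
    assert(adjustment_to<A1>(b) == 0);
    assert(adjustment_to<A2>(b) == static_cast<std::ptrdiff_t>(sizeof(A1)));
}
```

Because the adjustment is the same for every B, the compiler can bake it into the conversion, which is why offsetof tends to keep working in this arrangement.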

Anyway, B "knows" where all its base classes are, they're always stored at the same offsets. So in this arrangement offsetof would still work. The standard doesn't require implementations to do multiple inheritance this way, but they often do (or something like it). So offsetof might work in this case on your implementation, but it is not guaranteed to.

Now, what about virtual inheritance? Suppose B1 and B2 both have A as a virtual base. This makes them single-inheritance classes, so you might think that the first trick will work again:

A* ---> field of A   <--- B1*        A* ---> field of A   <--- B2*
        field of A                           field of A
        field of B1                          field of B2

But hang on. What happens when C derives (non-virtually, for simplicity) from both B1 and B2? C must only contain 1 copy of the fields of A. Those fields can't immediately precede the fields of B1, and also immediately precede the fields of B2. We're in trouble.

So what implementations might do instead is:

// an instance of B1 looks like this, and B2 similar
A* --->  field of A
         field of A
B1* ---> pointer to A 
         field of B1

Although I've indicated B1* pointing to the first part of the object after the A subobject, I suspect (without bothering to check) the actual address won't be there; it'll be the start of A. It's just that, unlike simple inheritance, the offset between the actual address in the pointer and the address I've indicated in the diagram will never be used unless the compiler is certain of the dynamic type of the object. Instead, it will always go through the meta-information to reach A correctly. So my diagrams will point there, since that offset will always be applied for the uses we're interested in.

The "pointer" to A could be a pointer or an offset, it doesn't really matter. In an instance of B1, created as a B1, it points to (char*)this - sizeof(A), and the same in an instance of B2. But if we create a C, it can look like this:

A* --->  field of A
         field of A
B1* ---> pointer to A    // points to (char*)(this) - sizeof(A) as before
         field of B1
B2* ---> pointer to A    // points to (char*)(this) - sizeof(A) - sizeof(B1)
         field of B2
C* ----> pointer to A    // points to (char*)(this) - sizeof(A) - sizeof(B1) - sizeof(B2)
         field of C
         field of C

So to access a field of A using a pointer or reference to B2 requires more than just applying an offset. We must read the "pointer to A" field of B2, follow it, and only then apply an offset, because depending what class B2 is a base of, that pointer will have different values. There is no such thing as offsetof(B2, field of A): there can't be. offsetof will never work with virtual inheritance, on any implementation.

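The diagrams can be modelled by hand with plain structs. This is only an illustration of the idea, not what any compiler actually generates: every name below (AFields, B1Model, B2Model, StandaloneB2, CompleteC, field_a, distance_to_a) is hypothetical, and the "pointer to A" is stored as a real pointer.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Hand-written model of the diagrams; all names are hypothetical.
// Copying these models would leave pointer_to_a aimed at the source
// object, so the sketch never copies them.
struct AFields { int a; };
struct B1Model { AFields* pointer_to_a; int b1; };
struct B2Model { AFields* pointer_to_a; int b2; };

// A B2 created on its own: the A fields sit right in front of it.
struct StandaloneB2 {
    AFields a_part;
    B2Model b2_part;
    StandaloneB2() : a_part{0}, b2_part{&a_part, 0} {}
};

// A C object: one shared copy of A, then B1, then B2, then C's own field.
struct CompleteC {
    AFields a_part;
    B1Model b1_part;
    B2Model b2_part;
    int c;
    CompleteC() : a_part{0}, b1_part{&a_part, 0}, b2_part{&a_part, 0}, c(0) {}
};

// Reaching A's field from a B2 means following the stored pointer first.
int& field_a(B2Model& self) { return self.pointer_to_a->a; }

// Byte distance from the B2 part to the A part it refers to.
std::ptrdiff_t distance_to_a(const B2Model& self) {
    return reinterpret_cast<const char*>(self.pointer_to_a) -
           reinterpret_cast<const char*>(&self);
}

void example_usage() {
    StandaloneB2 alone;
    CompleteC complete;
    field_a(alone.b2_part) = 7;
    field_a(complete.b2_part) = 9;
    assert(alone.a_part.a == 7 && complete.a_part.a == 9);
    // Same B2Model type, different distances: no single offsetof value fits.
    assert(distance_to_a(alone.b2_part) != distance_to_a(complete.b2_part));
}
```

The access path is identical in both objects (read the pointer, then the field), but the distance it covers depends on the enclosing object, which is exactly what a static offset cannot express.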

Answer by AProgrammer

In general, when you ask "why is something undefined", the answer is "because the standard says so". Usually, the rationale is one or more reasons like:

  • it is difficult to detect statically which case you are in;

  • corner cases are difficult to define and nobody took the pain of defining special cases;

  • its use is mostly covered by other features;

  • existing practices at the time of standardization varied, and breaking existing implementations and programs depending on them was deemed more harmful than standardization.

Back to offsetof, the second reason is probably a dominant one. If you look at C++0X, where the standard was previously using POD, it is now using "standard layout", "layout compatible", "POD" allowing more refined cases. And offsetof now needs "standard layout" classes, which are the cases where the committee didn't want to force a layout.

You also have to consider the common use of offsetof(), which is to get the value of a field when you have a void* pointer to the object. Multiple inheritance -- virtual or not -- is problematic for that use.

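A minimal sketch of that common use, assuming a standard-layout type so that offsetof is guaranteed. Record and read_weight are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>      // offsetof
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_standard_layout

// Hypothetical record type for illustration.
struct Record { int id; double weight; };
static_assert(std::is_standard_layout<Record>::value,
              "offsetof is only guaranteed for standard-layout types");

// Reads Record::weight through an untyped pointer, the classic C idiom.
// The caller must pass the address of a complete Record object.
double read_weight(const void* object) {
    assert(object != nullptr);
    const char* bytes = static_cast<const char*>(object);
    double value;
    std::memcpy(&value, bytes + offsetof(Record, weight), sizeof value);
    return value;
}
```

With multiple inheritance the void* might hold the address of a base subobject rather than of the complete object, and then the same arithmetic would land on the wrong bytes.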

Answer by KitsuneYMG

I think your class fits the c++0x definition of a POD. g++ has implemented some of c++0x in their latest releases. I think that VS2008 also has some c++0x bits in it.

From wikipedia's c++0x article

C++0x will relax several rules with regard to the POD definition.

A class/struct is considered a POD if it is trivial, standard-layout, and if all of its non-static members are PODs.

A trivial class or struct is defined as one that:

  1. Has a trivial default constructor. This may use the default constructor syntax (SomeConstructor() = default;).
  2. Has a trivial copy constructor, which may use the default syntax.
  3. Has a trivial copy assignment operator, which may use the default syntax.
  4. Has a trivial destructor, which must not be virtual.

A standard-layout class or struct is defined as one that:

  1. Has only non-static data members that are of standard-layout type
  2. Has the same access control (public, private, protected) for all non-static members
  3. Has no virtual functions
  4. Has no virtual base classes
  5. Has only base classes that are of standard-layout type
  6. Has no base classes of the same type as the first defined non-static member
  7. Either has no base classes with non-static members, or has no non-static data members in the most derived class and at most one base class with non-static members. In essence, there may be only one class in this class's hierarchy that has non-static members.

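The rules quoted above can be checked mechanically with the C++11 type traits. A sketch, using the question's Foo with the stream output removed so that no I/O headers are needed (func returning z is a simplification made here): on a conforming compiler the class comes out trivial but not standard-layout, because rule 2 fails (z is private while x, y and f are public).

```cpp
#include <type_traits>  // std::is_trivial, std::is_standard_layout

// The class from the question, with the filler output removed.
class Foo {
private:
    int z;
    int func() { return z; }

public:
    int x;
    int y;
    Foo* f;

    bool returnTrue() { return false; }
};

// No user-provided constructors, assignment operators or destructor.
static_assert(std::is_trivial<Foo>::value, "Foo is trivial");

// Mixed access control for non-static data members breaks rule 2.
static_assert(!std::is_standard_layout<Foo>::value,
              "Foo is not standard-layout");
```

Making all four data members share one access level would be enough to satisfy the standard-layout rules for this class.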

Answer by Roopesh Majeti

For the definition of a POD data structure, here is the explanation (already posted in another Stack Overflow question):

What are POD types in C++?

Now, coming to your code: it works as expected. This is because you are trying to find the offsetof() of the public members of your class, which is valid.

Please let me know the correct question if my viewpoint above does not clarify your doubt.

Answer by Braxton Nunnally

This works every time, and it's the most portable version to be used in both C and C++:

#include <cstdint>   // int64_t
#include <iostream>
using namespace std;

#define offset_start(s) s
#define offset_end(e) e
#define relative_offset(obj, start, end) ((int64_t)&obj->offset_end(end)-(int64_t)&obj->offset_start(start))

struct Test {
    int a;
    double b;
    Test* c;
    long d;
};

int main() {
    Test t;
    // Each line prints the distance in bytes from field a to the named field.
    cout << "a " << relative_offset((&t), a, a) << endl;
    cout << "b " << relative_offset((&t), a, b) << endl;
    cout << "c " << relative_offset((&t), a, c) << endl;
    cout << "d " << relative_offset((&t), a, d) << endl;
    return 0;
}

The above code simply requires you to hold an instance of some object, be it a struct or a class. You then need to pass a pointer to the class or struct to gain access to its fields. To make sure you get the right offset, never pick a "start" field that is declared after the "end" field. We use the compiler to figure out what the address offset is at run-time.

This allows you to not have to worry about the problems with compiler padding data, etc.

Answer by Hamdi Hamdi

Works for me

   #define get_offset(type, member) ((size_t)(&((type*)(1))->member)-1)
   #define get_container(ptr, type, member) ((type *)((char *)(ptr) - get_offset(type, member)))

Answer by Ropez

If you add, for instance, a virtual empty destructor:

virtual ~Foo() {}

Your class will become "polymorphic", i.e. it will have a hidden member field which is a pointer to a "vtable" that contains pointers to virtual functions.

Due to the hidden member field, the size of an object and the offsets of its members will not be trivial. Thus, you should expect trouble using offsetof.

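A short sketch of the effect, using two hypothetical stand-ins for the question's Foo reduced to a single field. The type-trait checks follow from the language rules; the runtime assertions in example_usage are assumptions about typical compilers, which store the hidden vtable pointer inside the object.

```cpp
#include <cassert>
#include <cstddef>      // std::ptrdiff_t
#include <type_traits>  // std::is_polymorphic, std::is_standard_layout

// Hypothetical stand-ins, reduced to one field each.
struct WithoutVirtual { int x; };
struct WithVirtual    { virtual ~WithVirtual() {} int x; };

static_assert(!std::is_polymorphic<WithoutVirtual>::value,
              "no virtual functions, no vtable pointer needed");
static_assert(std::is_polymorphic<WithVirtual>::value,
              "a virtual destructor makes the class polymorphic");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "so offsetof is not guaranteed for it");

// Where x actually sits inside a live WithVirtual object.
std::ptrdiff_t runtime_offset_of_x(const WithVirtual& object) {
    return reinterpret_cast<const char*>(&object.x) -
           reinterpret_cast<const char*>(&object);
}

// Assumption: the hidden pointer takes up space, usually at the start of
// the object; the standard does not say where (or whether) it is stored.
void example_usage() {
    WithVirtual object;
    assert(sizeof(WithVirtual) > sizeof(WithoutVirtual));
    assert(runtime_offset_of_x(object) != 0);
}
```

The field x no longer starts the object, even though nothing in the visible member list comes before it.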

Answer by Pavel Minaev

I bet you compile this with VC++. Now try it with g++, and see how it works...

Long story short, it's undefined, but some compilers may allow it. Others do not. In any case, it's non-portable.

Answer by Greg Slepak

This seems to work fine for me:

#define myOffset(Class,Member) ({Class o; (size_t)&(o.Member) - (size_t)&o;})