C++: STL vectors with uninitialized storage?

Question

Asked by Jim Hunziker

I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.

Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?

(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)

Also, see my comments below for a clarification about when the initialization occurs.

SOME CODE:

void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();
    memberVector.resize(mvSize + count); // causes 0-initialization

    for (int i = 0; i < count; ++i) {
        memberVector[mvSize + i].d1 = data1[i];
        memberVector[mvSize + i].d2 = data2[i];
    }
}

Answer 1

Accepted answer by Lloyd

std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you access the uninitialized section of the array as if it were initialized.

The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.

Using your example code:

struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

std::vector<YourData> memberVector;

void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();

    // Does not initialize the extra elements
    memberVector.reserve(mvSize + count);

    // Note: consider using std::generate_n or std::copy instead of this loop.
    for (int i = 0; i < count; ++i) {
        // Copy construct using a temporary.
        memberVector.push_back(YourData(data1[i], data2[i]));
    }
}

The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size, though, at least the number of copies will be minimal on average.

In the current version of C++, the inner loop is a bit inefficient: a temporary value is constructed on the stack, copy-constructed into the vector's memory, and finally the temporary is destroyed. However, the next version of C++ has a feature called rvalue references (T&&) which will help.

The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:

template <typename T>
class my_vector_replacement {

    // ...

    template <typename F>
    void push_back_using_factory(F factory) {
        // ... check size of array, and resize if needed.

        // Copy construct using placement new (requires <new>).
        new (arrayData + end) T(factory());
        end += sizeof(T); // arrayData is char*, so end is a byte offset
    }

    char* arrayData;
    size_t end; // Of initialized data in arrayData
};

// One of many possible implementations
struct MyFactory {
    MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
    YourData operator()() const {
        return YourData(*d1,*d2);
    }
    int* d1;
    int* d2;
};

void GetsCalledALot(int* data1, int* data2, int count) {
    // ... Still will need the same call to a reserve() type function.

    // Note: consider using std::generate_n or std::copy instead of this loop.
    for (int i = 0; i < count; ++i) {
        // Copy construct using a factory
        memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
    }
}

Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.

Answer 2

Answered by fredoverflow

C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:

memberVector.emplace_back(data1[i], data2[i]);
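
To show how this fits with the reserve() advice from the accepted answer, here is a minimal sketch, assuming C++11 or later. YourData is copied from the accepted answer; append_pairs is a hypothetical name standing in for GetsCalledALot, and it takes the vector as a parameter rather than using a member.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

// Hypothetical helper: appends count elements built from data1[i] and data2[i].
void append_pairs(std::vector<YourData>& out,
                  const int* data1, const int* data2, std::size_t count) {
    assert(data1 != nullptr && data2 != nullptr);

    // Allocates capacity only; no elements are constructed here.
    out.reserve(out.size() + count);

    for (std::size_t i = 0; i < count; ++i) {
        // Constructs the element directly in the vector's storage.
        out.emplace_back(data1[i], data2[i]);
    }
}
```

One caveat with this sketch: reserving exactly size() + count on every call may trigger a reallocation per call on some implementations. If the function is called many times with small counts, dropping the reserve() and letting the vector grow geometrically may be cheaper; profiling should decide.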

Answer 3

Answered by goertzenator

In C++11 (and Boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an STL container, but it is still memory-managed and C++-ish, which will be good enough for many applications.

auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
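
A slightly fuller sketch of the same idea, assuming mystruct is a plain struct of two ints like the one in the question. The helper name make_filled is hypothetical. Note that the array has a fixed size once allocated, so this only fits cases where count is known before the fill.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct mystruct {
    int d1;
    int d2;
};

// Hypothetical helper: builds an array of count elements from two int arrays.
std::unique_ptr<mystruct[]> make_filled(const int* data1, const int* data2,
                                        std::size_t count) {
    assert(data1 != nullptr && data2 != nullptr);

    // new mystruct[count] default-initializes: for this struct the members
    // are left indeterminate. Writing new mystruct[count]() would zero them.
    std::unique_ptr<mystruct[]> arr(new mystruct[count]);

    for (std::size_t i = 0; i < count; ++i) {
        arr[i].d1 = data1[i];
        arr[i].d2 = data2[i];
    }
    return arr;
}
```

With a C++20 compiler, std::make_unique_for_overwrite<mystruct[]>(count) should give the same default-initialized array without the explicit new.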

Answer 4

Answered by goertzenator

To clarify the reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on the stack and then copying it to the vector. On the other hand, it's possible that if you use

vect.push_back(MyStruct(fieldValue1, fieldValue2))

the compiler will construct the new instance directly in the memory that belongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.

Answer 5

Answered by Don Neufeld

So here's the problem: resize is calling insert, which does a copy construction from a default-constructed element for each of the newly added elements. To get this to 0 cost you need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea because it will break std::vector's internal reallocation algorithms.

Summary: You're not going to be able to do this with std::vector.

Answer 6

Answered by deonb

You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:

#include <type_traits> // std::is_standard_layout
#include <utility>     // std::move

template <typename T>
struct no_init
{
    T value;

    // Deliberately empty body: value is left uninitialized for trivial T.
    no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }

    no_init(const T& v) { value = v; }
    T& operator=(const T& v) { value = v; return value; }

    no_init(const no_init<T>& n) { value = n.value; }
    no_init(no_init<T>&& n) { value = std::move(n.value); }
    T& operator=(const no_init<T>& n) { value = n.value; return value; }
    T& operator=(no_init<T>&& n) { value = std::move(n.value); return value; }

    T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};

To use:

std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);

Answer 7

Answered by paercebal

Err...

Try this method:

std::vector<T>::reserve(x)

It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be any reallocation until you go over x.
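
The difference between size and capacity after reserve() can be checked with a small sketch. MyStruct here is a stand-in with two int members, and make_reserved is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MyStruct {
    int d1;
    int d2;
};

// Hypothetical helper: returns an empty vector with room for at least x elements.
std::vector<MyStruct> make_reserved(std::size_t x) {
    std::vector<MyStruct> v;
    v.reserve(x);               // allocates storage, constructs nothing
    assert(v.empty());          // size() is still 0
    assert(v.capacity() >= x);  // but the storage covers x elements
    return v;
}
```

Because size() stays at 0, the reserved slots are not elements yet: indexing them with v[i] before a push_back is undefined behavior, which is why reserve() alone does not replace the resize() call in the question.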

The second point is that vector won't initialize the values to zero. Are you testing your code in debug mode?

After verification on g++, the following code:

#include <iostream>
#include <vector>

struct MyStruct
{
   int m_iValue00 ;
   int m_iValue01 ;
} ;

int main()
{
   MyStruct aaa, bbb, ccc ;

   std::vector<MyStruct> aMyStruct ;

   aMyStruct.push_back(aaa) ;
   aMyStruct.push_back(bbb) ;
   aMyStruct.push_back(ccc) ;

   aMyStruct.resize(6) ; // [EDIT] double the size

   for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
   {
      std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
   }

   return 0 ;
}

gives the following results:

[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856

The initialization you saw was probably an artifact.

[EDIT] After the comment on resize, I modified the code to add the resize line. The resize does call the default constructor of the objects inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (the first time, I managed to get the whole vector zeroed with the following code):

aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;

So... :-/

[EDIT 2] As Arkadiy already suggested, the solution is to use an inline constructor taking the desired parameters. Something like:

struct MyStruct
{
   MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
   int d1, d2 ;
} ;

This will probably get inlined in your code.

In any case, you should study your code with a profiler to be sure this piece of code is the bottleneck of your application.

Answer 8

Answered by nsanders

Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.

Answer 9

Answered by fizzer

From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
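
For illustration only, a hand-rolled buffer on top of malloc() and friends might look like the sketch below. Pair and RawBuffer are hypothetical names, and the sketch assumes the element type is trivially copyable so that realloc() may move the bytes. Error handling is minimal and the growth factor of 2 is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

struct Pair {
    int d1;
    int d2;
};
static_assert(std::is_trivially_copyable<Pair>::value,
              "realloc may only relocate trivially copyable types");

// Hypothetical growable buffer: contiguous, resizable, never zero-fills.
struct RawBuffer {
    Pair* data = nullptr;
    std::size_t size = 0;      // number of constructed elements
    std::size_t capacity = 0;  // number of elements the storage can hold

    RawBuffer() = default;
    RawBuffer(const RawBuffer&) = delete;
    RawBuffer& operator=(const RawBuffer&) = delete;
    ~RawBuffer() { std::free(data); }

    // Grows the raw storage without touching the newly added bytes.
    void reserve(std::size_t n) {
        if (n <= capacity) return;
        void* p = std::realloc(data, n * sizeof(Pair));
        if (p == nullptr) throw std::bad_alloc();
        data = static_cast<Pair*>(p);
        capacity = n;
    }

    void push_back(int a, int b) {
        if (size == capacity) reserve(capacity == 0 ? 16 : capacity * 2);
        assert(size < capacity);
        // Construct the element in place; nothing is default-initialized first.
        ::new (static_cast<void*>(data + size)) Pair{a, b};
        ++size;
    }
};
```

Whether this beats reserve() plus push_back() on a std::vector is something only a profiler can answer. The sketch also gives up everything else vector provides (iterators, copying, exception safety), so it is probably a last resort.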

Answer 10

Answered by fizzer

From your code, it looks like you have a vector of structs, each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then:

copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));

Now you don't pay for copying a struct each time.
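
A self-contained version of the two-vector idea, as a sketch. The function name append_columns is hypothetical, and the vectors are passed in as parameters rather than being members.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: appends count ints from each source array to its own vector.
void append_columns(std::vector<int>& v1, std::vector<int>& v2,
                    const int* data1, const int* data2, std::size_t count) {
    assert(data1 != nullptr && data2 != nullptr);
    std::copy(data1, data1 + count, std::back_inserter(v1));
    std::copy(data2, data2 + count, std::back_inserter(v2));
}
```

The range insert v1.insert(v1.end(), data1, data1 + count) appends the same elements and lets the vector size its storage once per call, so it may be worth comparing against the back_inserter form. The trade-off of this layout is that d1 and d2 for the same index no longer sit next to each other in memory.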
