C++ std::vector 是如何实现的?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2096571/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How is C++ std::vector implemented?
提问by Arun
I have been using std::vector
a lot, and recently I asked myself this question: "How is std::vector
implemented?"
一直在用std::vector
,最近问自己这个问题:“是怎么std::vector
实现的?”
I had two alternatives:
我有两个选择:
1) Linked list, and then making the API feel like random access (i.e. overloading operator[]
).
1)链表,然后让API感觉像随机访问(即重载operator[]
)。
2) Using new
, e.g. Foo* temp = new Foo[20]
: I believe they do something like this, but then it raises one more question. Do they always allocate a maximum (uint32_t
) storage to give random access? (This is inefficient in terms of memory.)
2)使用new
,例如Foo* temp = new Foo[20]
:我相信他们会做这样的事情,但随后又提出了一个问题。他们是否总是分配最大 ( uint32_t
) 存储空间来提供随机访问?(这在内存方面是低效的。)
Or is there something else that I should be aware of?
或者还有什么我应该注意的吗?
回答by JaredPar
It's implemented by using an underlying array.
它是通过使用底层数组实现的。
It's not possible to implement a std::vector<T>
with a linked list because the standard guarantees the elements in the list will be held in contiguous memory.
std::vector<T>
使用链表实现 a是不可能的,因为标准保证列表中的元素将保存在连续内存中。
回答by UncleBens
I believe it is the third option. It can't just use new T[n]
because then it would actually have to construct as many objects as it allocates. E.g
我相信这是第三种选择。它不能只是使用,new T[n]
因为它实际上必须构造与分配的对象一样多的对象。例如
std::vector<Foo> v;
v.reserve(10);
If your implementation simply ended up doing new Foo[10]
then you'd just have constructed 10 instances of Foo.
如果您的实现只是简单地结束了,new Foo[10]
那么您只需要构造 10 个 Foo 实例。
Instead it uses its allocator to allocate and deallocate raw memory (without constructing objects), and as needed (for example, when you actually push_back
objects) places copy-constructed instances into correct memory locations in its reserve using placement newand removes them with explicit calls to the destructor(something you'd only do in combination with placement new). The allocator class provides following methods for that which I presume vector's implementations use
相反,它使用它的分配器来分配和释放原始内存(不构造对象),并根据需要(例如,当你实际push_back
对象时)使用新的布局将复制构造的实例放置到其保留中的正确内存位置,并通过显式调用删除它们到析构函数(您只会在与放置新位置结合时做的事情)。分配器类为我认为向量的实现使用的方法提供了以下方法
void construct(pointer p, const_reference val);
Returns:
new((void *)p) T(val)
void destroy(pointer p);
Returns:
((T*)p)->~T()
(The "returns" probably should read "effect" or similar.)
(“回报”可能应该读作“效果”或类似的。)
More about placement new
详细了解新版位
回答by jason
They use a dynamically allocated array that is regrown as needed. It is necessary to use something like an array so that the elements are contiguous in memory which is guaranteed by the standard.
它们使用动态分配的数组,该数组根据需要重新增长。有必要使用类似数组的东西,以便元素在内存中是连续的,这是由标准保证的。
Incidentally, one common way of regrowing an array is to double the size as needed. This is so that if you are inserting n
items at most only O(log n)
regrowths are performed and at most O(n)
space is wasted.
顺便提一下,重新增长数组的一种常见方法是根据需要将大小加倍。这样一来,如果您n
最多只插入项目,只会O(log n)
执行再生长,最多O(n)
会浪费空间。
You can read one implementation for yourself at SGI(where STL was originally conceived).
您可以在SGI(最初构想 STL 的地方)为自己阅读一个实现。
回答by bmargulies
There is no one way it is implemented. Different implementations can be different, so long as the preserve the semantics and satisfy the requirements.
没有一种实现方式。不同的实现可以不同,只要保留语义并满足要求即可。
At any given time, there has to be a primitive array of T to satisfy the requirements of contiguity. However, how it is allocated, grown, shrunk, and freed is up to the implementor.
在任何给定的时间,都必须有一个 T 的原始数组来满足连续性的要求。然而,如何分配、增长、收缩和释放它取决于实现者。
You can read the implementation for yourself, it's right there in the header file.
您可以自己阅读实现,它就在头文件中。
I can tell you that noimplementations use linked lists. They aren't consistent with the requirements of the standard.
我可以告诉你,没有任何实现使用链表。它们不符合标准的要求。
回答by Potatoswatter
Section 23.2.4, ?1 of the standard requires that arithmetic on pointers into a vector work the same as with pointers into an array.
标准的第 23.2.4 节,?1 要求对指向向量的指针的算术与指向数组的指针相同。
The elements of a vector are stored contiguously, meaning that if v is a vector where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
向量的元素是连续存储的,这意味着如果 v 是一个向量,其中 T 是 bool 以外的某种类型,那么对于所有 0 <= n < v,它都遵循恒等式 &v[n] == &v[0] + n 。尺寸()。
This guarantees that the storage is in an array. Of course, if you resize the array to be bigger, it might get moved in memory.
这保证了存储在一个数组中。当然,如果您将数组调整得更大,它可能会在内存中移动。
回答by Alexandros Gezerlis
A pedagogic (and thus simplified) version of a container called "Vec" is discussed in Chapter 11 of the wonderful (introductory) book "Accelerated C++". What they describe is a stripped-down version of std::vector, but I think it is still worth noting that:
在精彩(介绍)书籍“Accelerated C++”的第 11 章中讨论了一个名为“Vec”的容器的教学(因此简化)版本。他们描述的是 std::vector 的精简版本,但我认为仍然值得注意的是:
1) they implement their template class in terms of an array,
1) 他们以数组的形式实现他们的模板类,
2) they discuss push_back in terms of the trick (mentioned above) of allocating more storage than is needed, and coming back for more when they run out, and
2) 他们讨论 push_back 的技巧(如上所述),分配比需要更多的存储,并在用完时返回更多,并且
3) they use allocator<T
> for memory management. The new operator is not flexible enough in this context, since it both allocates and initializes memory.
3) 他们使用 allocator <T
> 进行内存管理。new 运算符在这种情况下不够灵活,因为它既分配又初始化内存。
I repeat, though, that this doesn't mean that actual implementations out there are this simple. But since "Accelerated C++" is quite widespread, those interested can find in the relevant chapter one way vector-like objects can be created, copied, assigned, and destroyed.
不过,我重复一遍,这并不意味着实际的实现就这么简单。但是由于“Accelerated C++”非常普遍,感兴趣的人可以在相关章节中找到一种创建、复制、分配和销毁类向量对象的方法。
EDIT: On a related note, I just found the following blog post by Herb Sutter in which he comments on an earlier blog post by Andrew Koenig, regarding whether or not one should be worried about vector elements being contiguous in memory: Cringe not: Vectors are guaranteed to be contiguous.
编辑:在相关说明中,我刚刚找到了 Herb Sutter 的以下博客文章,其中他评论了 Andrew Koenig 较早的一篇博客文章,关于是否应该担心向量元素在内存中是连续的:Cringe not: Vectors保证是连续的。
回答by Charles Eli Cheese
There's no actual array at all in any decent implementation (if there is, you can't use any object in it without a default constructor), but just raw memory that gets allocated. It gets allocated in a manner that's usually along the lines of doubling every time you need to expand it.
在任何体面的实现中根本没有实际的数组(如果有,你不能在没有默认构造函数的情况下使用任何对象),而只是分配的原始内存。它以一种通常在每次需要扩展时加倍的方式分配。
The vector then uses in place allocation to call the constructors of the class in the proper location once each slot actually gets used actually used.
一旦每个插槽被实际使用,向量就使用就地分配在适当的位置调用类的构造函数。
When there is expansion it will try to reallocate in place (but this is a bit silly and doesn't normally work, think windows 98 heap compaction) but usually will end up making a whole new allocation and copying over.
当有扩展时,它会尝试就地重新分配(但这有点愚蠢并且通常不起作用,想想 Windows 98 堆压缩)但通常最终会进行全新的分配和复制。
A standard stl vector is always all together, but not all implementations work like that (I know, having written some of them). Probably none are exactly a linked list, though, either.
一个标准的 stl 向量总是在一起,但并不是所有的实现都是这样工作的(我知道,已经写了一些)。不过,也可能没有一个完全是链表。
回答by paxos1977
I believe the STL uses option #2 (or something similar) because a std::vector<> is guaranteed to store the elements in contiguous memory.
我相信 STL 使用选项 #2(或类似的东西),因为 std::vector<> 保证将元素存储在连续内存中。
If you're looking for a memory structure that doesn't need to use contiguous memory, look at std::deque.
如果您正在寻找不需要使用连续内存的内存结构,请查看 std::deque。
回答by Yogesh Arora
From what i have read in books and from the functionality of reserve and and the requirement that elements of vectors be contiguous, This is what i think could be a possible way to implement Vector.
从我在书中读到的内容以及保留的功能以及向量元素连续的要求,我认为这可能是实现 Vector 的一种可能方式。
1) Elements of vectors be contiguous , supporting O(1) random access and vectors should be compatible with C arrays. This just implies there are no linked lists.
1) 向量的元素是连续的,支持 O(1) 随机访问,向量应该与 C 数组兼容。这只是意味着没有链表。
2) When you call reserve it reserves additional memory. But reserve does call
2)当你调用reserve时,它会保留额外的内存。但储备确实打电话
new T[newSize]
to reserve more memory. Otherwise it will call default constructor. As uncleben explained whenever reserve is called the vector class just allocates more uninitialized memory usin its allocator (if required) and copy construct new objects into that memory using placement new(if more memory has been allocated)
以保留更多内存。否则它将调用默认构造函数。正如 uncleben 所解释的,每当调用 Reserve 时,vector 类只会在其分配器中分配更多未初始化的内存(如果需要),并使用 newplacement new 将构造新对象复制到该内存中(如果已分配更多内存)
3) Initially vector has some default capacity. for which uninitialized memory is allocated when the vector object is constructed
3) 最初矢量有一些默认容量。在构造向量对象时为其分配未初始化的内存
4) push_back copy constructs the object into the first available location. If required more memory has to be allocated in similar manner as reserve
4) push_back copy 将对象构造到第一个可用位置。如果需要,必须以与保留类似的方式分配更多内存