C++中vector是如何实现的
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3064559/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How is vector implemented in C++
提问by John Smith
I am thinking of how I can implement std::vector
from the ground up.
我正在考虑如何std::vector
从头开始实施。
How does it resize the vector?
它如何调整向量的大小?
realloc
only seems to work for plain old stucts, or am I wrong?
realloc
似乎只适用于普通的旧结构,还是我错了?
回答by Evan Teran
it is a simple templated class which wraps a native array. It does notuse malloc
/realloc
. Instead, it uses the passed allocator (which by default is std::allocator
).
它是一个简单的模板化类,它包装了一个本机数组。它不使用malloc
/ realloc
。相反,它使用传递的分配器(默认为std::allocator
)。
Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, often they follow a non-linear growth pattern.
调整大小是通过分配一个新数组并从旧数组复制构造新数组中的每个元素来完成的(这样对于非 POD 对象是安全的)。为了避免频繁分配,它们通常遵循非线性增长模式。
UPDATE:in C++11, the elements will be moved instead of copy constructed if it is possible for the stored type.
更新:在 C++11 中,如果存储类型可能,元素将被移动而不是复制构造。
In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many couldbe in the vector.
除此之外,它还需要存储当前的“大小”和“容量”。大小是向量中实际有多少元素。容量是向量中可以有多少。
So as a starting point a vector will need to look somewhat like this:
因此,作为起点,向量需要看起来像这样:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_;
typename A::size_type capacity_;
typename A::size_type size_;
A allocator_;
};
The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end()
(which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size()
call (which now needs a subtraction). In which case it could look like this:
另一种常见的实现是存储指向数组不同部分的指针。这end()
以稍微昂贵的size()
调用(现在需要减法)为代价降低了(不再需要加法)的成本。在这种情况下,它可能如下所示:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_; // points to first element
T* end_capacity_; // points to one past internal storage
T* end_; // points to one past last element
A allocator_;
};
I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.
我相信 gcc 的 libstdc++ 使用后一种方法,但这两种方法同样有效且符合要求。
NOTE:This is ignoring a common optimization where the empty base class optimization is used for the allocator. I think that is a quality of implementation detail, and not a matter of correctness.
注意:这忽略了一个常见的优化,其中空基类优化用于分配器。我认为这是实现细节的质量,而不是正确性的问题。
回答by Jerry Coffin
Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).
调整向量大小需要分配新的空间块,并将现有数据复制到新空间(因此,可以复制放入向量中的项目)。
Note that it does notuse new []
either -- it uses the allocator that's passed, but that's required to allocate rawmemory, not an array of objects like new []
does. You then need to use placement new
to construct objects in place. [Edit: well, you could technically use new char[size]
, and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]
请注意,它并没有使用new []
任何-它使用则传递分配器,但是这需要分配的原始内存,而不是对象的数组一样new []
呢。然后您需要使用placement new
来构造对象。[编辑:好吧,从技术上讲,您可以使用new char[size]
, 并将其用作原始内存,但我无法想象有人会编写这样的分配器。]
When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factorcompared to the old size to meet the requirement for amortized constant complexity for push_back
. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.
当当前分配用完并且需要分配新的内存块时,大小必须比旧大小增加一个常数因子,以满足 的摊销常数复杂度的要求push_back
。尽管许多网站(以及其他网站)称其为两倍大小,但 1.5 到 1.6 左右的系数通常效果更好。特别是,这通常会提高重新使用已释放块以进行未来分配的机会。
回答by Serapth
From Wikipedia, as good an answer as any.
来自Wikipedia,与任何答案一样好。
A typical vector implementation consists, internally, of a pointer to a dynamically allocated array,[2] and possibly data members holding the capacity and size of the vector. The size of the vector refers to the actual number of elements, while the capacity refers to the size of the internal array. When new elements are inserted, if the new size of the vector becomes larger than its capacity, reallocation occurs.[2][4] This typically causes the vector to allocate a new region of storage, move the previously held elements to the new region of storage, and free the old region. Because the addresses of the elements change during this process, any references or iterators to elements in the vector become invalidated.[5] Using an invalidated reference causes undefined behaviour
典型的向量实现在内部由一个指向动态分配数组的指针 [2] 和可能包含向量容量和大小的数据成员组成。向量的大小是指实际的元素个数,容量是指内部数组的大小。当插入新元素时,如果向量的新大小变得大于其容量,则会发生重新分配。 [2] [4] 这通常会导致向量分配一个新的存储区域,将之前持有的元素移动到新的存储区域,并释放旧区域。由于在此过程中元素的地址会发生变化,因此向量中元素的任何引用或迭代器都会失效。 [5] 使用无效的引用会导致未定义的行为
回答by log0
Like this: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_vector.h
像这样:https: //github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_vector.h
(official gcc mirror on github)
(github上的官方gcc镜像)
回答by Macke
It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. Use reserve() if you have to use push_back().
它分配一个新数组并复制所有内容。因此,如果您必须经常进行扩展,则扩展它的效率非常低。如果必须使用 push_back(),请使用 Reserve()。
回答by Owen S.
You'd need to define what you mean by "plain old structs."
您需要定义“普通旧结构”的含义。
realloc by itself only creates a block of uninitialized memory. It does no object allocation. For C structs, this suffices, but for C++ it does not.
realloc 本身只创建一块未初始化的内存。它不进行对象分配。对于 C 结构,这已经足够了,但对于 C++ 则不然。
That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector
exactly in this case!), you'd need to:
这并不是说您不能使用 realloc。但是,如果您要使用它(请注意,std::vector
在这种情况下您不会完全重新实现!),您需要:
- Make sure you're consistently using
malloc/realloc/free
throughout your class. - Use "placement
new
" to initialize objects in your memory chunk. - Explicitly call destructors to clean up objects before freeing your memory chunk.
- 确保您在
malloc/realloc/free
整个课程中始终如一地使用。 - 使用“放置
new
”来初始化内存块中的对象。 - 在释放内存块之前显式调用析构函数来清理对象。
This is actually pretty close to what vector does in my implementation (GCC/glib), except it uses the C++ low-level routines ::operator new
and ::operator delete
to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.
这实际上与 vector 在我的实现(GCC/glib)中所做的非常接近,除了它使用 C++ 低级例程::operator new
并::operator delete
执行原始内存管理而不是 malloc 和 free,使用这些原语重写 realloc 例程,并委托所有这些行为都指向一个分配器对象,该对象可以用自定义实现替换。
Since vector is a template, you actually should have its source to look at if you want a reference – if you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector
or thereabouts.
由于 vector 是一个模板,如果您想要参考,您实际上应该查看其来源——如果您可以克服下划线的优势,那么阅读它应该不会太难。如果您在使用 GCC 的 Unix 机器上,请尝试查找或附近。/usr/include/c++/version/vector
回答by Manish Chandra Joshi
You can implement them with resizing array implementation. When the array becomes full, create an array with twice as much the size and copy all the content to the new array. Do not forget to delete the old array.
您可以通过调整大小数组实现来实现它们。当数组变满时,创建一个两倍大小的数组并将所有内容复制到新数组中。不要忘记删除旧数组。
As for deleting the elements from vector, do resizing when your array becomes a quarter full. This strategy makes prevents any performance glitches when one might try repeated insertion and deletion at half the array size.
至于从向量中删除元素,请在数组满四分之一时调整大小。当人们可能尝试以数组大小的一半重复插入和删除时,此策略可防止出现任何性能故障。
It can be mathematically proved that the amortized time (Average time) for insertions is still linear for n insertions which is asymptotically the same as you will get with a normal static array.
可以从数学上证明,插入的摊销时间(平均时间)对于 n 次插入仍然是线性的,这与使用普通静态数组所得到的渐近相同。
回答by Edward Strange
realloc only works on heap memory. In C++ you usually want to use the free store.
realloc 仅适用于堆内存。在 C++ 中,您通常希望使用免费商店。