C++ 关于向量增长

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/5232198/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-28 17:44:06  来源:igfitidea点击:

About Vectors growth

c++visual-c++

提问by Sadique

Had been going through the Book: C++ Primer, Third Edition By Stanley B. Lippman, Josée Lajoie

一直在阅读这本书:C++ 入门,第三版 作者:Stanley B. Lippman,Josée Lajoie

Found 1 mistake until now. ...In the program given under the Article 6.3 How a vector Grows Itself, this Program misses a "<" in the couts!! The program given is:

到现在为止发现了 1 个错误。...在第 6.3 条向量如何自我增长下给出的程序中,该程序在 couts 中遗漏了一个“<”!!给出的程序是:

#include <vector>
#include <iostream>

int main(){
vector< int > ivec;
cout < "ivec: size: " < ivec.size()
< " capacity: "  < ivec.capacity() < endl;

for ( int ix = 0; ix < 24; ++ix ) {
ivec.push_back( ix );
cout < "ivec: size: " < ivec.size()
< " capacity: "  < ivec.capacity() < endl;
}    
}

Now i corrected the problem. Later within that article the book says the following: " Under the Rogue Wave implementation, both the size and the capacity of ivec after its definition are 0. On inserting the first element, however, ivec's capacity is 256 and its size is 1."

现在我纠正了这个问题。在那篇文章的后面,该书写道:“在 Rogue Wave 实现下,ivec 定义后的大小和容量均为 0。然而,在插入第一个元素时,ivec 的容量为 256,其大小为 1。”

But, on correcting and running the code i get the following output:

但是,在更正和运行代码时,我得到以下输出:



ivec: size: 0 capacity: 0
ivec[0]=0 ivec: size: 1 capacity: 1
ivec[1]=1 ivec: size: 2 capacity: 2
ivec[2]=2 ivec: size: 3 capacity: 4
ivec[3]=3 ivec: size: 4 capacity: 4
ivec[4]=4 ivec: size: 5 capacity: 8
ivec[5]=5 ivec: size: 6 capacity: 8
ivec[6]=6 ivec: size: 7 capacity: 8
ivec[7]=7 ivec: size: 8 capacity: 8
ivec[8]=8 ivec: size: 9 capacity: 16
ivec[9]=9 ivec: size: 10 capacity: 16
ivec[10]=10 ivec: size: 11 capacity: 16
ivec[11]=11 ivec: size: 12 capacity: 16
ivec[12]=12 ivec: size: 13 capacity: 16
ivec[13]=13 ivec: size: 14 capacity: 16
ivec[14]=14 ivec: size: 15 capacity: 16
ivec[15]=15 ivec: size: 16 capacity: 16
ivec[16]=16 ivec: size: 17 capacity: 32
ivec[17]=17 ivec: size: 18 capacity: 32
ivec[18]=18 ivec: size: 19 capacity: 32
ivec[19]=19 ivec: size: 20 capacity: 32
ivec[20]=20 ivec: size: 21 capacity: 32
ivec[21]=21 ivec: size: 22 capacity: 32
ivec[22]=22 ivec: size: 23 capacity: 32
ivec[23]=23 ivec: size: 24 capacity: 32


Hence the initial capacity is increasing with the formula 2^N isn't it??Where Nis the initial capacity. Please explain.

因此初始容量随着公式 2^N 增加不是吗??其中N是初始容量。请解释。

回答by Omnifarious

The rate at which the capacity of a vector grows is implementation dependent. Implementations almost invariably choose exponential growth, in order to meet the amortized constant timerequirement for the push_backoperation. What amortized constant timemeans and how exponential growth achieves this is interesting.

向量容量增长的速度取决于实现。实现几乎总是选择指数增长,以满足操作的摊销恒定时间要求push_back摊销固定时间意味着什么以及指数增长如何实现这一点很有趣。

Every time a vector's capacity is grown the elements need to be copied. If you 'amortize' this cost out over the lifetime of the vector, it turns out that if you increase the capacity by an exponential factor you end up with an amortized constant cost.

每次向量的容量增加时,都需要复制元素。如果您在向量的整个生命周期内“摊销”此成本,结果是如果您以指数因子增加容量,您最终会得到摊销的恒定成本。

This probably seems a bit odd, so let me explain to you how this works...

这可能看起来有点奇怪,所以让我向您解释这是如何工作的......

  • size: 1 capacity 1 - No elements have been copied, the cost per element for copies is 0.
  • size: 2 capacity 2 - When the vector's capacity was increased to 2, the first element had to be copied. Average copies per element is 0.5
  • size: 3 capacity 4 - When the vector's capacity was increased to 4, the first two elements had to be copied. Average copies per element is (2 + 1 + 0) / 3 = 1.
  • size: 4 capacity 4 - Average copies per element is (2 + 1 + 0 + 0) / 4 = 3 / 4 = 0.75.
  • size: 5 capacity 8 - Average copies per element is (3 + 2 + 1 + 1 + 0) / 5 = 7 / 5 = 1.4
  • ...
  • size: 8 capacity 8 - Average copies per element is (3 + 2 + 1 + 1 + 0 + 0 + 0 + 0) / 8 = 7 / 8 = 0.875
  • size: 9 capacity 16 - Average copies per element is (4 + 3 + 2 + 2 + 1 + 1 + 1 + 1 + 0) / 9 = 15 / 9 = 1.67
  • ...
  • size 16 capacity 16 - Average copies per element is 15 / 16 = 0.938
  • size 17 capacity 32 - Average copies per element is 31 / 17 = 1.82
  • size: 1 capacity 1 - 没有元素被复制,复制的每个元素的成本是 0。
  • size: 2 capacity 2 - 当向量的容量增加到 2 时,必须复制第一个元素。每个元素的平均副本为 0.5
  • size: 3 capacity 4 - 当向量的容量增加到 4 时,必须复制前两个元素。每个元素的平均副本为 (2 + 1 + 0) / 3 = 1。
  • 大小:4 容量 4 - 每个元素的平均副本数为 (2 + 1 + 0 + 0) / 4 = 3 / 4 = 0.75。
  • 大小:5 容量 8 - 每个元素的平均副本数为 (3 + 2 + 1 + 1 + 0) / 5 = 7 / 5 = 1.4
  • ...
  • 大小:8 容量 8 - 每个元素的平均副本数为 (3 + 2 + 1 + 1 + 0 + 0 + 0 + 0) / 8 = 7 / 8 = 0.875
  • 大小:9 容量 16 - 每个元素的平均副本数为 (4 + 3 + 2 + 2 + 1 + 1 + 1 + 1 + 0) / 9 = 15 / 9 = 1.67
  • ...
  • 大小 16 容量 16 - 每个元素的平均副本数为 15 / 16 = 0.938
  • 大小 17 容量 32 - 每个元素的平均副本数为 31 / 17 = 1.82

As you can see, every time the capacity jumps, the number of copies goes up by the previous size of the array. But because the array has to double in size before the capacity jumps again, the number of copies per element always stays less than 2.

如您所见,每次容量跳跃时,副本数量都会增加数组的先前大小。但是因为在容量再次跳跃之前数组的大小必须加倍,所以每个元素的副本数始终保持小于 2。

If you increased the capacity by 1.5 * N instead of by 2 * N, you would end up with a very similar effect, except the upper bound on the copies per element would be higher (I think it would be 3).

如果您将容量增加 1.5 * N 而不是 2 * N,您最终会得到非常相似的效果,除了每个元素的副本上限会更高(我认为应该是 3)。

I suspect an implementation would choose 1.5 over 2 both to save a bit of space, but also because 1.5 is closer to the golden ratio. I have an intuition (that is currently not backed up by any hard data) that a growth rate in line with the golden ratio (because of its relationship to the fibonacci sequence) will prove to be the most efficient growth rate for real-world loads in terms of minimizing both extra space used and time.

我怀疑一个实现会选择 1.5 而不是 2 来节省一点空间,但也因为 1.5 更接近黄金比例。我有一个直觉(目前没有任何硬数据支持),符合黄金比例(因为它与斐波那契数列的关系)的增长率将被证明是现实世界负载最有效的增长率在最大限度地减少使用的额外空间和时间方面。

回答by David Rodríguez - dribeas

To be able to provide amortized constant timeinsertions at the end of the std::vector, the implementation must grow the size of the vector (when needed) by a factor K>1(*), such that when trying to append to a vector of size Nthat is full, the vector grows to be K*N.

为了能够在 的末尾提供分摊的常量时间插入std::vector,实现必须将向量的大小(在需要时)增加一个因子K>1(*),这样当尝试附加到一个大小N为满的向量时,向量增长为K*N

Different implementations use different constants Kthat provide different benefits, in particular most implementations go for either K = 2or K = 1.5. A higher Kwill make it faster as it will require less grows, but it will at the same time have a greater memory impact. As an example, in gcc K = 2, while in VS (Dinkumware) K = 1.5.

不同的实现使用不同的常量K来提供不同的好处,特别是大多数实现都使用K = 2K = 1.5。更高的K将使它更快,因为它需要更少的增长,但同时它会产生更大的内存影响。例如,在 gcc 中K = 2,而在 VS (Dinkumware) 中K = 1.5

(*) If the vector grew by a constant quantity, then the complexity of push_backwould become linear instead of amortized constant. For example, if the vector grew by 10 elements when needed, the cost of growing (copy of all element to the new memory address) would be O( N / 10 )(every 10 elements, move everything) or O( N ).

(*) 如果向量以一个常数增长,那么 的复杂度push_back将变为线性而不是摊销常数。例如,如果向量在需要时增加 10 个元素,则增长(将所有元素复制到新内存地址)的成本将为O( N / 10 )(每 10 个元素,移动所有内容)或O( N )

回答by Baiyan Huang

Just to add some mathematic proof on the time complexity on vector::push_back, say the size of vector is n, what we care about here is the number of copies happened so far, say y, notice the copy happens every time you grow the vector.

只是在 上添加一些关于时间复杂度的数学证明vector::push_back,比如向量的大小是n,我们在这里关心的是到目前为止发生的副本数量,例如y,注意每次增加向量时都会发生副本。

Grow by factor ofK

按因子增长K

  y = K^1 + K^2 + K^3 ... K^log(K, n)
K*y =     + K^2 + K^3 ... K^log(K, n) + K*K^log(K, n)

K*y-y = K*K^log(K, n) - K
y = K(n-1)/(K-1) = (K/(K-1))(n-1)

T(n) = y/n = (K/(K-1)) * (n-1)/n < K/(K-1) = O(1)

K/(K-1)is a constant, and see the most common cases:

K/(K-1)是一个常数,看看最常见的情况:

  • K=2, T(n) = 2/(2-1) = 2
  • K=1.5, T(n) = 1.5/(1.5-1) = 3
  • K=2,T(n) = 2/(2-1) = 2
  • K=1.5,T(n) = 1.5/(1.5-1) = 3

and actually there is a reason of choosing K as 1.5 or 2 in different implementations, see this graph: as T(n)reaching the minimum when Kis around 2, there is not much benefit on using a larger K, at the cost of allocating more memory

实际上,在不同的实现中选择 K 作为 1.5 或 2 是有原因的,请参见此T(n)K在 2 左右时达到最小值,使用更大的 没有太大好处K,但代价是分配更多内存

Grow by constant quantity ofC

以恒定数量增长C

y = C + 2*C + 3*C + 4*C +  ... (n/C) * C
  = C(1+2+3+...+n/C), say m = n/C
  = C*(m*(m-1)/2)
  = n(m-1)/2

T(n) = y/n = (n(m-1)/2)/n = (m-1)/2 = n/2C - 1/2 = O(n)

As we could see it is liner

正如我们所看到的,它是班轮

回答by Kiril Kirov

The capacity of the vectoris completely implementation-dependent, no one can tell how it's growing..

的容量vector完全依赖于实现,没有人知道它是如何增长的。

回答by Erik

Are you using the "Rogue Wave" implementation?

您在使用“Rogue Wave”实现吗?

How capacity grows is up to the implementation. Yours use 2^N.

容量如何增长取决于实施。你使用 2^N。

回答by swegi

Yes, the capacity doubles each time it is exceeded. This is implementation dependent.

是的,每次超过容量都会增加一倍。这是依赖于实现的。