Using arrays or std::vectors in C++, what's the performance gap?
Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/381621/
Asked by tunnuz
In our C++ course they suggest not using C++ arrays on new projects anymore. As far as I know, Stroustrup himself suggests not using arrays. But are there significant performance differences?
Accepted answer by Johannes Schaub - litb
Using C++ arrays with new (that is, using dynamic arrays) should be avoided. The problem is that you have to keep track of the size, and you need to delete them manually and do all sorts of housekeeping.
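To make that housekeeping concrete, here is a minimal sketch that does the same job twice, once with a new-allocated array and once with a std::vector. The function names sum_with_new and sum_with_vector are made up for this illustration and do not come from the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Manual version: the size has to be carried around separately, and
// delete[] must be reached on every path out of the function.
long sum_with_new(std::size_t n) {
    int* data = new int[n];              // the pointer itself does not know n
    for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<int>(i);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    delete[] data;                       // forgetting this leaks the array
    return total;
}

// std::vector version: the size travels with the object and the memory
// is released automatically when v goes out of scope.
long sum_with_vector(std::size_t n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);    // fill with 0, 1, 2, ...
    assert(v.size() == n);               // the vector knows its own size
    return std::accumulate(v.begin(), v.end(), 0L);
}
```

Both functions return the same sum; the difference is only in how much bookkeeping the caller has to get right.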
Using arrays on the stack is also discouraged because you don't have range checking, and passing the array around will lose any information about its size (array to pointer conversion). You should use boost::array in that case, which wraps a C++ array in a small class and provides a size function and iterators to iterate over it.
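The loss of size information can be sketched as follows. This example assumes C++11 and uses std::array, the standard-library counterpart of boost::array, rather than boost::array itself; the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// A built-in array passed this way decays to const int*: the callee cannot
// recover the element count, so it has to be passed separately.
int sum_raw(const int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

// std::array carries its size in its type, so no extra parameter is needed.
template <std::size_t N>
int sum_array(const std::array<int, N>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) total += values[i];
    return total;
}

// at() performs the range check that operator[] and built-in arrays skip.
bool at_rejects_out_of_range() {
    std::array<int, 3> values = {{1, 2, 3}};
    try {
        (void)values.at(3);   // index 3 is one past the end
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With sum_raw, passing the wrong count compiles silently; with sum_array, the count cannot be wrong because it is deduced from the argument's type.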
Now std::vector vs. native C++ arrays (taken from the internet):
// Comparison of assembly code generated for basic indexing, dereferencing,
// and increment operations on vectors and arrays/pointers.
// Assembly code was generated by gcc 4.1.0 invoked with g++ -O3 -S on a
// x86_64-suse-linux machine.
#include <vector>
struct S
{
    int padding;
    std::vector<int> v;
    int * p;
    std::vector<int>::iterator i;
};
int pointer_index (S & s) { return s.p[3]; }
// movq 32(%rdi), %rax
// movl 12(%rax), %eax
// ret
int vector_index (S & s) { return s.v[3]; }
// movq 8(%rdi), %rax
// movl 12(%rax), %eax
// ret
// Conclusion: Indexing a vector is the same damn thing as indexing a pointer.
int pointer_deref (S & s) { return *s.p; }
// movq 32(%rdi), %rax
// movl (%rax), %eax
// ret
int iterator_deref (S & s) { return *s.i; }
// movq 40(%rdi), %rax
// movl (%rax), %eax
// ret
// Conclusion: Dereferencing a vector iterator is the same damn thing
// as dereferencing a pointer.
void pointer_increment (S & s) { ++s.p; }
// addq $4, 32(%rdi)
// ret
void iterator_increment (S & s) { ++s.i; }
// addq $4, 40(%rdi)
// ret
// Conclusion: Incrementing a vector iterator is the same damn thing as
// incrementing a pointer.
Note: If you allocate arrays with new and allocate non-class objects (like plain int) or classes without a user-defined constructor, and you don't want your elements initialized up front, new-allocated arrays can have a performance advantage, because std::vector initializes all elements to default values (0 for int, for example) on construction (credits to @bernie for reminding me).
Answer by paercebal
Preamble for micro-optimizer people
Remember:
"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%".
(Thanks to metamorphosis for the full quote)
Don't use a C array instead of a vector (or whatever) just because you believe it's faster because it is supposed to be lower-level. You would be wrong.
Use vector by default (or the safe container adapted to your need), and then, if your profiler says it is a problem, see if you can optimize it, either by using a better algorithm or by changing the container.
That said, we can go back to the original question.
Static/Dynamic Array?
The C++ array classes are better behaved than the low-level C array because they know a lot about themselves, and can answer questions C arrays can't. They are able to clean up after themselves. And more importantly, they are usually written using templates and/or inlining, which means that what appears to be a lot of code in a debug build resolves to little or no code in a release build, meaning no difference from their built-in, less safe competition.
All in all, it falls into two categories:
Dynamic arrays
Using a pointer to a malloc-ed/new-ed array will be at best as fast as the std::vector version, and a lot less safe (see litb's post).
So use a std::vector.
Static arrays
Using a static array will be at best:
- as fast as the std::array version
- and a lot less safe.
So use a std::array.
Uninitialized memory
Sometimes, using a vector instead of a raw buffer incurs a visible cost because the vector will initialize the buffer at construction, while the code it replaces didn't, as bernie remarked in his answer.
If this is the case, you can handle it by using a unique_ptr instead of a vector or, if the case is not exceptional in your codeline, by actually writing a class buffer_owner that will own that memory and give you easy and safe access to it, including bonuses like resizing it (using realloc?), or whatever you need.
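The unique_ptr option might look like the sketch below, assuming C++11. The function names are hypothetical, and this shows only the simplest case (a char buffer used as a memcpy destination), not the buffer_owner class the answer mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Copies src into a freshly allocated buffer that is NOT zeroed first.
// new char[n] default-initializes, which for char leaves the bytes
// uninitialized, so the only write to the buffer is the memcpy itself.
std::unique_ptr<char[]> copy_to_raw_buffer(const char* src, std::size_t n) {
    assert(src != nullptr);
    std::unique_ptr<char[]> buffer(new char[n]);
    std::memcpy(buffer.get(), src, n);
    return buffer;            // ownership moves to the caller; delete[] is automatic
}

// Same job with a vector: vector<char>(n) value-initializes every element
// to 0 before memcpy overwrites it, so the buffer is written twice.
std::vector<char> copy_to_vector(const char* src, std::size_t n) {
    assert(src != nullptr);
    std::vector<char> buffer(n);
    std::memcpy(buffer.data(), src, n);
    return buffer;
}
```

Unlike the vector, the unique_ptr does not remember n, so the caller still has to track the size. When the source data is already available, constructing the vector directly from the range, as in std::vector<char>(src, src + n), is another way to avoid the zero fill. Whether the extra write matters at all is something only a profiler can tell you.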
Answer by EvilTeach
Vectors are arrays under the hood. The performance is the same.
One place where you can run into a performance issue is not sizing the vector correctly to begin with.
As a vector fills, it will resize itself, and that can imply a new array allocation, followed by n copy constructors, followed by about n destructor calls, followed by an array delete.
If your construct/destruct is expensive, you are much better off making the vector the correct size to begin with.
There is a simple way to demonstrate this. Create a simple class that shows when it is constructed/destroyed/copied/assigned. Create a vector of these things, and start pushing them on the back end of the vector. When the vector fills, there will be a cascade of activity as the vector resizes. Then try it again with the vector sized to the expected number of elements. You will see the difference.
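One possible version of that experiment is sketched below, assuming C++11. Instead of printing on every event, the hypothetical Tracker class just counts copy constructions, which is enough to see the cascade; "sized to the expected number of elements" is done here with reserve().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instrumented class: counts how often it is copy-constructed.
// It declares no move constructor, so a reallocation has to copy.
struct Tracker {
    static int copies;
    int value;
    explicit Tracker(int v) : value(v) {}
    Tracker(const Tracker& other) : value(other.value) { ++copies; }
};
int Tracker::copies = 0;

// Appends n elements and returns how many copy constructions that took.
// With reserve_first == true the storage is sized up front.
int copies_while_filling(std::size_t n, bool reserve_first) {
    Tracker::copies = 0;
    std::vector<Tracker> v;
    if (reserve_first) v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back(static_cast<int>(i));   // constructed in place, no copy
    assert(v.size() == n);
    return Tracker::copies;
}
```

With the storage reserved, the count stays at zero because the vector never reallocates. Without it, every growth step copies all existing elements into the new block; the exact number depends on the implementation's growth factor.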
Answer by Frank Krueger
To respond to something Mehrdad said:
"However, there might be cases where you still need arrays. When interfacing with low level code (i.e. assembly) or old libraries that require arrays, you might not be able to use vectors."
Not true at all. Vectors degrade nicely into arrays/pointers if you use:
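std::vector<double> values;   // renamed so the variable does not hide the vector template
values.push_back(42);
double *array = &(*values.begin());   // only valid while values is non-empty
// pass the array to whatever low-level code you have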
This works for all major STL implementations. In the next standard, it will be required to work (even though it does just fine today).
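As a sketch of what this looks like with a current compiler: since C++11, std::vector also provides data(), which returns a pointer to the same contiguous storage. The legacy_sum and legacy_scale functions below are hypothetical stand-ins for whatever low-level code expects a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a legacy routine that only understands pointer + length.
double legacy_sum(const double* values, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

// Stand-in for a legacy routine that modifies the array in place.
void legacy_scale(double* values, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) values[i] *= 2.0;
}

double call_legacy_code(std::vector<double>& v) {
    assert(!v.empty());                 // &v[0] and *v.begin() are invalid on an empty vector
    legacy_scale(v.data(), v.size());   // C++11: data() points at the contiguous storage
    return legacy_sum(&v[0], v.size()); // older spelling of the same pointer
}
```

The pointer stays valid only until the vector reallocates (for example after a push_back that exceeds capacity), so it should be fetched right before the call rather than stored.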
Answer by Germán Diago
You have even fewer reasons to use plain arrays in C++11.
There are 3 kinds of arrays in nature, from fastest to slowest, depending on the features they have (of course, the quality of implementation can make things really fast even for case 3 in the list):
1. Static, with size known at compile time: std::array<T, N>
2. Dynamic, with size known at runtime and never resized. The typical optimization here is that the array can be allocated directly on the stack. Not available; maybe dynarray in a C++ TS after C++14. In C there are VLAs.
3. Dynamic and resizable at runtime: std::vector<T>
For 1., plain static arrays with a fixed number of elements, use std::array<T, N> in C++11.
For 2., fixed-size arrays whose size is specified at runtime but never changes, there was discussion for C++14, but it was moved to a technical specification and ultimately left out of C++14.
For 3., std::vector<T> will usually ask for memory on the heap. This could have performance consequences, though you could use std::vector<T, MyAlloc<T>> to improve the situation with a custom allocator. The advantage compared to T* myarray = new T[n]; is that you can resize it and that it will not decay to a pointer, as plain arrays do.
Use the standard library types mentioned to avoid arrays decaying to pointers. You will save debugging time, and the performance is exactly the same as with plain arrays if you use the same set of features.
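To show where a custom allocator plugs in, here is a minimal C++11 sketch. CountingAlloc is a hypothetical stand-in for the MyAlloc<T> mentioned above: it does nothing clever, it only forwards to operator new and counts the requests, so it illustrates the mechanism rather than an actual speed-up.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator: forwards to operator new/delete and counts
// how many blocks it hands out.
template <class T>
struct CountingAlloc {
    using value_type = T;

    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocations();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    // One counter per element type T.
    static int& allocations() {
        static int count = 0;
        return count;
    }
};

template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Fills a vector that uses the counting allocator and reports how many
// heap blocks were requested for the int storage.
int heap_requests_for(std::size_t n, bool reserve_first) {
    CountingAlloc<int>::allocations() = 0;
    std::vector<int, CountingAlloc<int>> v;
    if (reserve_first) v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    assert(v.size() == n);
    return CountingAlloc<int>::allocations();
}
```

A real allocator would replace the bodies of allocate and deallocate with something like a memory pool or an arena; the vector code using it would not change.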
Answer by John D. Cook
Go with STL. There's no performance penalty. The algorithms are very efficient and they do a good job of handling the kinds of details that most of us would not think about.
Answer by lalebarde
The conclusion is that arrays of integers are faster than vectors of integers (5 times in my example). However, arrays and vectors are around the same speed for more complex / not aligned data.
Answer by Mehrdad Afshari
STL is a heavily optimized library. In fact, it's even suggested to use STL in games, where high performance might be needed. Arrays are too error-prone to be used in day-to-day tasks. Today's compilers are also very smart and can really produce excellent code with STL. If you know what you are doing, STL can usually provide the necessary performance. For example, by initializing vectors to the required size (if you know it from the start), you can basically achieve array performance. However, there might be cases where you still need arrays. When interfacing with low-level code (e.g. assembly) or old libraries that require arrays, you might not be able to use vectors.
Answer by Mehrdad Afshari
If you compile the software in debug mode, many compilers will not inline the accessor functions of the vector. This will make the STL vector implementation much slower in circumstances where performance is an issue. It will also make the code easier to debug, since you can see in the debugger how much memory was allocated.
In optimized mode, I would expect the STL vector to approach the efficiency of an array. This is because many of the vector methods are now inlined.
Answer by bernie
There is definitely a performance impact to using an std::vector vs a raw array when you want an uninitialized buffer (e.g. to use as a destination for memcpy()). An std::vector will initialize all its elements using the default constructor. A raw array will not.
The C++ spec for the std::vector constructor taking a count argument (it's the third form) states:
"Constructs a new container from a variety of data sources, optionally using a user supplied allocator alloc.
3) Constructs the container with count default-inserted instances of T. No copies are made.
Complexity
2-3) Linear in count"
A raw array does not incur this initialization cost.
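The "linear in count" behaviour can be observed directly. The sketch below assumes C++11 or later (where the count constructor default-inserts each element) and uses a hypothetical Counted type whose default constructor increments a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts default constructions.
struct Counted {
    static int default_constructions;
    Counted() { ++default_constructions; }
};
int Counted::default_constructions = 0;

// vector(count): every element is default-inserted, so the constructor
// runs count times.
int constructions_with_count(std::size_t count) {
    Counted::default_constructions = 0;
    std::vector<Counted> v(count);
    assert(v.size() == count);
    return Counted::default_constructions;
}

// reserve(count): storage is allocated but no element is constructed;
// size() stays 0 until elements are actually added.
int constructions_with_reserve(std::size_t count) {
    Counted::default_constructions = 0;
    std::vector<Counted> v;
    v.reserve(count);
    assert(v.empty() && v.capacity() >= count);
    return Counted::default_constructions;
}
```

Note that reserve() is not a drop-in replacement for an uninitialized buffer: the reserved storage holds no elements yet, so writing into it with memcpy() before the elements exist is undefined behaviour.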
See also How can I avoid std::vector<> to initialize all its elements?