C++: vector or map, which one to use?
Original question: http://stackoverflow.com/questions/454762/
Note: this page is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Naveen
I've heard many people say that if the expected number of elements in the container is relatively small, it is better to use std::vector instead of std::map, even if you use the container only for lookups and never iterate over it.
What is the real reason behind this?
Obviously the lookup performance of std::map cannot be worse than that of std::vector (although it may differ by nanoseconds or microseconds), so does it have something to do with memory usage?
Does std::vector fare any better or worse than std::map in fragmenting the virtual address space?
I am using the STL library that comes along with Visual Studio (i.e. Microsoft's implementation). Does that make any difference compared to other implementations?
Answer by Doug
I presume you're comparing map<A, B> with vector<pair<A, B> >.
Firstly, finding an item in a very small vector can easily be faster than finding it in a map, because all the memory in a vector is always contiguous (and so plays more nicely with computers' caches and such things), and the number of comparisons needed to find something in a small vector might be about the same as for a map. Only in the limit of very large containers does finding an element in a map need fewer operations.
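To make the comparison concrete, here is a minimal sketch of the two lookups being discussed. The aliases `PairVector` and `IntMap` and the helper names are hypothetical, chosen only for this illustration, and the code assumes a C++11 compiler.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, std::string> > PairVector;  // hypothetical alias
typedef std::map<int, std::string> IntMap;                      // hypothetical alias

// Linear search: walks one contiguous array front to back, O(n) comparisons.
// Returns a pointer to the value, or nullptr when the key is absent.
const std::string* find_in_vector(const PairVector& items, int key) {
    for (PairVector::const_iterator it = items.begin(); it != items.end(); ++it) {
        if (it->first == key) return &it->second;
    }
    return nullptr;
}

// Tree lookup: O(log n) comparisons, but each step follows a pointer to a
// separately allocated node.
const std::string* find_in_map(const IntMap& items, int key) {
    IntMap::const_iterator it = items.find(key);
    return it == items.end() ? nullptr : &it->second;
}

// Usage: both containers hold the same three entries and must agree.
void check_small_lookup() {
    PairVector v;
    v.push_back(std::make_pair(1, std::string("one")));
    v.push_back(std::make_pair(2, std::string("two")));
    v.push_back(std::make_pair(3, std::string("three")));
    IntMap m(v.begin(), v.end());

    assert(*find_in_vector(v, 2) == "two");
    assert(*find_in_map(m, 2) == "two");
    assert(find_in_vector(v, 7) == nullptr);
    assert(find_in_map(m, 7) == nullptr);
}
```

Both helpers give the same answers; which one is faster for a given size is something to measure rather than assume.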
The point where maps become faster than vectors depends on the implementation, on your processor, what data is in the map, and subtle things like what memory is in the processor's cache. Typically, the point where map becomes faster would be about 5-30 elements.
An alternative is to use a hash container. They are often named hash_map or unordered_map. Classes named hash_map are not part of the official standard (and there are a few variants out there); std::tr1::unordered_map is. A hash map is often faster than a normal map for lookups, regardless of how many elements are in it, but whether it is actually faster depends on what the key is, how it is hashed, what values you have to deal with, and how the key is compared in std::map. It doesn't keep things in a specific order like std::map, but you've said that you don't care about that. I'd recommend hash maps particularly if the keys are integers or pointers, because these hash very quickly.
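As a rough illustration of the hash-container option, the sketch below uses std::unordered_map with integer keys. It assumes a C++11 compiler, where the container lives in namespace std rather than std::tr1; the function names and sample data are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

typedef std::unordered_map<int, std::string> NameTable;  // hypothetical alias

// Hypothetical sample data: integer IDs mapped to names.
NameTable make_names() {
    NameTable names;
    names[42] = "answer";
    names[7] = "seven";
    return names;
}

// Average-case constant-time lookup; iteration order is unspecified.
// Returns `fallback` when the id is not present.
std::string name_or(const NameTable& names, int id, const std::string& fallback) {
    NameTable::const_iterator it = names.find(id);
    return it == names.end() ? fallback : it->second;
}

// Usage: one hit and one miss.
void check_hash_lookup() {
    NameTable names = make_names();
    assert(name_or(names, 42, "unknown") == "answer");
    assert(name_or(names, 8, "unknown") == "unknown");
}
```

Integer keys are the easy case here because hashing an int is cheap; with string keys the hash cost grows with the key length.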
Answer by Crashworks
Maps are usually implemented as binary search trees, and walking a binary tree always comes with a little overhead (performing comparisons, walking links, etc.). Vectors are basically just arrays. For very small amounts of data, maybe 8 or 12 elements, sometimes it's faster just to do a linear search over an array than to walk a binary search tree.
You can run some timings yourself to see where the break-even point is: time a search over four elements, then eight, then sixteen, and so on to find the sweet spot for your particular implementation of the STL.
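One possible shape for such a timing run is sketched below. It is only a starting point: the function names are hypothetical, the keys are plain ints, and the numbers it prints depend heavily on the compiler, optimization flags, and hardware, so treat the results as indicative at best.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

typedef std::pair<int, int> IntPair;  // hypothetical alias

// Looks up every key 0..n-1 by linear search, `repeats` times; returns the hit count.
std::size_t hits_in_vector(const std::vector<IntPair>& v, int repeats) {
    std::size_t hits = 0;
    for (int r = 0; r < repeats; ++r) {
        for (int key = 0; key < static_cast<int>(v.size()); ++key) {
            std::vector<IntPair>::const_iterator it = std::find_if(
                v.begin(), v.end(),
                [key](const IntPair& p) { return p.first == key; });
            if (it != v.end()) ++hits;
        }
    }
    return hits;
}

// Same workload against a std::map.
std::size_t hits_in_map(const std::map<int, int>& m, int repeats) {
    std::size_t hits = 0;
    for (int r = 0; r < repeats; ++r) {
        for (int key = 0; key < static_cast<int>(m.size()); ++key) {
            if (m.find(key) != m.end()) ++hits;
        }
    }
    return hits;
}

// Runs a callable once and returns the elapsed wall-clock time in nanoseconds.
template <typename F>
long long time_ns(F f) {
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    f();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Times both containers at 4, 8, 16, 32 and 64 elements and prints the results.
void report_break_even(int repeats) {
    for (int n = 4; n <= 64; n *= 2) {
        std::vector<IntPair> v;
        std::map<int, int> m;
        for (int i = 0; i < n; ++i) {
            v.push_back(IntPair(i, i));
            m[i] = i;
        }
        std::size_t vec_hits = 0, map_hits = 0;
        long long vec_ns = time_ns([&] { vec_hits = hits_in_vector(v, repeats); });
        long long map_ns = time_ns([&] { map_hits = hits_in_map(m, repeats); });
        assert(vec_hits == map_hits);  // both containers must find every key
        std::cout << n << " elements: vector " << vec_ns
                  << " ns, map " << map_ns << " ns\n";
    }
}
```

Calling `report_break_even(100000)` in an optimized build prints one line per size. Real key types (strings, structs) and real access patterns may shift the crossover point considerably.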
Maps do tend to have a bunch of small allocations all over the heap, whereas vectors are contiguous so the cache-hit rate of vectors can sometimes be a little better in cases where you're iterating over all the elements from front to back.
Answer by Stefan Rådström
"By default, use vector when you need a container" - Bjarne Stroustrup.
Otherwise, I find this little flow chart to be of very good help:
Answer by Mark Ransom
If you're doing all your insertions at once and then doing lots of lookups, you can use a vector and sort it when you're through inserting; then use lower_bound to do a quick lookup. It might be faster than using a map, even for large numbers of items.
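The insert-then-sort-then-search pattern might look roughly like this. The `Entry` alias and the helper names are hypothetical; std::sort, std::is_sorted, and std::lower_bound are the standard algorithms from <algorithm>.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int, std::string> Entry;  // hypothetical alias

// Orders entries by key only.
bool key_less(const Entry& a, const Entry& b) { return a.first < b.first; }

// Call once, after all insertions are done.
void finish_inserting(std::vector<Entry>& entries) {
    std::sort(entries.begin(), entries.end(), key_less);
}

// Binary search over the sorted vector, O(log n) comparisons.
// Returns a pointer to the value, or nullptr when the key is absent.
const std::string* lookup(const std::vector<Entry>& entries, int key) {
    // Debug-only precondition check; note that it is O(n) when enabled.
    assert(std::is_sorted(entries.begin(), entries.end(), key_less));
    std::vector<Entry>::const_iterator it = std::lower_bound(
        entries.begin(), entries.end(), key,
        [](const Entry& e, int k) { return e.first < k; });
    if (it != entries.end() && it->first == key) return &it->second;
    return nullptr;
}

// Usage: insert in any order, sort once, then look up.
void check_sorted_lookup() {
    std::vector<Entry> entries;
    entries.push_back(Entry(3, "three"));
    entries.push_back(Entry(1, "one"));
    entries.push_back(Entry(2, "two"));
    finish_inserting(entries);
    assert(*lookup(entries, 2) == "two");
    assert(lookup(entries, 9) == nullptr);
}
```

The catch is that any later insertion has to keep the vector sorted, which costs O(n) per element, so this fits the "build once, query many times" case described above.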
Answer by danieldk
I think you should first and foremost use the container that fits the data. std::vector is used in situations where you would use an array in C or pre-STL C++: you want a contiguous block of memory to store values with fast constant-time lookup by index. std::map should be used to map keys to values. The primary overlap here is a vector versus a map with a size_t as the key. In that case there are two concerns. First, are the indexes contiguous? If not, you will probably be wasting memory with a vector. Second, what lookup time do you want? A vector has constant-time lookup, while std::map is usually implemented as a red-black tree, which has O(log n) lookup time, and even a hash map (such as TR1 unordered_map) usually has a worse complexity than a vector, because the index (or a hash thereof) will be mapped to a bucket that can contain multiple values.
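The dense-versus-sparse index point can be illustrated with a small sketch. The data and function names are invented for the example; the only claim is about the shape of the two containers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Dense indexes 0..n-1: the index is the position, so lookup is O(1)
// and no slot is wasted.
std::vector<std::string> make_dense() {
    std::vector<std::string> names;
    names.push_back("zero");
    names.push_back("one");
    names.push_back("two");
    return names;
}

// Sparse indexes: a vector would need 1000001 slots to hold these two
// values, while the map stores only the keys that exist.
std::map<std::size_t, std::string> make_sparse() {
    std::map<std::size_t, std::string> names;
    names[3] = "three";
    names[1000000] = "million";
    return names;
}

// Usage: index the vector directly, search the map by key.
void check_index_lookup() {
    std::vector<std::string> dense = make_dense();
    std::map<std::size_t, std::string> sparse = make_sparse();
    assert(dense[1] == "one");
    assert(sparse.size() == 2);
    assert(sparse.find(1000000)->second == "million");
    assert(sparse.count(4) == 0);
}
```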
If you were aiming at a vector of pairs: you could sort the elements of the vector and use a binary search such as std::lower_bound to find elements (std::find itself is a linear search). But a binary search will practically be only as fast as a std::map.
Anyway, try to model the data in the obvious manner. Premature optimization often doesn't help much.
Answer by teeks99
Another way to look at this: if we're talking about small containers, then neither one is going to take very long to search. Unless you're searching this container in a very tight loop, the difference in time will probably be negligible.
In that case, I would look for which container more closely matches what you want to do. If you're looking for a particular value, map's built-in find() method will be a lot easier (and less complex to use) than creating a for loop and iterating over a vector.
Your time is probably worth a lot more than a few nanoseconds here and there.
Answer by fury.slay
Basically, maps are used for lookup.
But sometimes std::vector can be used instead of std::map, even for lookups.
If there are going to be very few elements in your key-value pairs, then you can do an iterative search by key, even in a std::vector<std::pair<x,y>>.
This is because map-like containers carry overhead of their own: hashing takes time (in a hash map, especially when hashing strings), and so do other operations such as storing each element on the heap.
You will only see a clear benefit from std::map if you have many elements to search through and you do frequent lookups among them.