What is the difference between set and hashset in C++ STL?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2518305/

Time: 2020-08-27 23:46:09  Source: igfitidea

Tags: c++, performance, stl, set, hashset

Asked by kal

When should I choose one over the other? Are there any pointers that you would recommend for using the right STL containers?

Answered by Mark Ransom

hash_set is an extension that is not part of the C++ standard. Its lookups are O(1) on average, compared with O(log n) for set, so it will be faster in most circumstances.

Another difference shows up when you iterate through the containers: set delivers its contents in sorted order, while the iteration order of hash_set is essentially random (thanks, Lou Franco).

Edit: The C++11 standard introduced unordered_set, which should be preferred over hash_set. Its performance is similar, and its complexity is guaranteed by the standard. The "unordered" in the name stresses that iterating over it yields elements in no particular order.
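
The iteration-order difference can be seen with a small sketch. The helper names iterate_set and iterate_unordered_set are made up for this illustration; only std::set and std::unordered_set come from the standard library.

```cpp
#include <set>
#include <unordered_set>
#include <vector>

// Collect the elements of a std::set in iteration order (always ascending).
std::vector<int> iterate_set(const std::vector<int>& values) {
    std::set<int> s(values.begin(), values.end());
    return std::vector<int>(s.begin(), s.end());
}

// Collect the elements of a std::unordered_set in iteration order.
// The order is unspecified and may differ between standard libraries.
std::vector<int> iterate_unordered_set(const std::vector<int>& values) {
    std::unordered_set<int> s(values.begin(), values.end());
    return std::vector<int>(s.begin(), s.end());
}
```

Calling iterate_set with the values 5, 1, 4, 1, 3 yields 1, 3, 4, 5. iterate_unordered_set yields the same four values, but in whatever order the implementation's hash table happens to produce.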

Answered by Alex

std::set is typically implemented as a balanced binary search tree. hash_set is implemented as a hash table.

The main issue here is that many people use std::set thinking it is a hash table with O(1) lookup. It isn't: lookups are O(log n). Other than that, read about binary trees vs hash tables to get a better idea of the two data structures.
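
One way to see the logarithmic lookup cost is to count comparator calls. This is a rough sketch: CountingLess and comparisons_for_find are hypothetical names invented for the illustration, and the exact count depends on the standard library's tree implementation.

```cpp
#include <cstddef>
#include <set>

// Comparator that counts how many times it is called.
// The counter lives outside the comparator because std::set copies it.
struct CountingLess {
    std::size_t* calls;
    bool operator()(int a, int b) const {
        ++*calls;
        return a < b;
    }
};

// Build a set holding 0..n-1 and return the number of comparisons
// that a single find() performs.
std::size_t comparisons_for_find(int n, int key) {
    std::size_t calls = 0;
    std::set<int, CountingLess> s(CountingLess{&calls});
    for (int i = 0; i < n; ++i) s.insert(i);
    calls = 0;  // ignore the comparisons made while inserting
    (void)s.find(key);
    return calls;
}
```

With 65536 elements, a lookup should need a few dozen comparisons at most, not thousands, which is what O(log n) predicts for a balanced tree.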

Answered by ronys

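Another thing to keep in mind is that with hash_set you have to provide a hash function, whereas a set only requires a comparison function ('<'), which is easier to define (and predefined for native types). With the standard unordered_set, std::hash is likewise predefined for built-in types, so this mainly matters for user-defined key types.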
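
As a sketch of what each container asks of a user-defined key, consider the following. Point and PointHash are hypothetical types made up for this example, and the hash combination is a simple assumption chosen for brevity, not a tuned hash.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <unordered_set>

// Hypothetical user-defined key type, used only for illustration.
struct Point {
    int x;
    int y;
};

// std::set only needs a strict weak ordering.
bool operator<(const Point& a, const Point& b) {
    return a.x != b.x ? a.x < b.x : a.y < b.y;
}

// std::unordered_set needs equality plus a hash function.
bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

struct PointHash {
    std::size_t operator()(const Point& p) const {
        // Simple combination of the two member hashes (an assumption
        // for this sketch, not a recommended general-purpose hash).
        return std::hash<int>()(p.x) * 31u + std::hash<int>()(p.y);
    }
};

using PointSet = std::set<Point>;
using PointHashSet = std::unordered_set<Point, PointHash>;
```

PointSet works with operator< alone, while PointHashSet needs both operator== and the PointHash functor passed as a template argument.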

Answered by Zan Lynx

I don't think anyone has answered the other part of the question yet.

The reason to use hash_set or unordered_set is the usually O(1) lookup time. I say "usually" because every so often, depending on the implementation, the table may have to be copied into a larger bucket array, or a single hash bucket may end up containing thousands of entries.
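
Both caveats can be observed through the bucket interface of std::unordered_set. In this sketch, ConstantHash is a deliberately bad hash written for the illustration, and bucket_size_for and bucket_count_after are hypothetical helper names.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Deliberately bad hash (an illustration, not something to use):
// every key gets the same hash value, so every key shares one bucket.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Size of the bucket that holds `key` after inserting 0..n-1.
template <typename Hash>
std::size_t bucket_size_for(int n, int key) {
    std::unordered_set<int, Hash> s;
    for (int i = 0; i < n; ++i) s.insert(i);
    return s.bucket_size(s.bucket(key));
}

// Number of buckets after inserting 0..n-1; it grows as the table rehashes.
std::size_t bucket_count_after(int n) {
    std::unordered_set<int> s;
    for (int i = 0; i < n; ++i) s.insert(i);
    return s.bucket_count();
}
```

With ConstantHash, all n keys land in one bucket, so a lookup degrades to a linear scan of that bucket. With the default hash, the bucket count keeps growing so that the load factor stays at or below max_load_factor().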

The reason to use a set is if you often need its largest or smallest member. A hash table has no order, so there is no quick way to find the smallest item. A tree is ordered, so finding the largest or smallest is very quick: O(log n) for a simple tree, O(1) if it holds pointers to the ends.
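
A minimal sketch of that difference, assuming non-empty containers (the function names smallest and largest are made up for the example):

```cpp
#include <algorithm>
#include <set>
#include <unordered_set>

// Smallest and largest element of a non-empty std::set:
// both ends of the sorted sequence are available directly.
int smallest(const std::set<int>& s) { return *s.begin(); }
int largest(const std::set<int>& s) { return *s.rbegin(); }

// For a non-empty std::unordered_set there is no order to exploit,
// so finding the minimum means scanning every element.
int smallest(const std::unordered_set<int>& s) {
    return *std::min_element(s.begin(), s.end());
}
```

For std::set, begin() and rbegin() are constant-time operations, while the std::min_element scan over an unordered_set is linear in the number of elements.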

Answered by Alex Gaynor

A hash_set is implemented with a hash table, whose operations are mostly O(1), whereas a set is implemented with a tree of some sort (AVL, red-black, etc.), whose operations are O(log n) but which keeps its elements in sorted order.

Edit: I had originally written that tree operations are O(n). That was completely wrong.
