C++: Why would anyone use set instead of unordered_set?
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/1349734/
Asked by AraK
C++0x is introducing unordered_set, which is available in boost and many other places. What I understand is that unordered_set is a hash table with O(1) lookup complexity. On the other hand, set is nothing but a tree with O(log n) lookup complexity. Why on earth would anyone use set instead of unordered_set? I.e., is there a need for set anymore?
Answer by sth
Unordered sets have to pay for their O(1) average access time in a few ways:
- set uses less memory than unordered_set to store the same number of elements.
- For a small number of elements, lookups in a set might be faster than lookups in an unordered_set.
- Even though many operations are faster in the average case for unordered_set, they are often guaranteed to have better worst-case complexities for set (for example insert).
- That set sorts the elements is useful if you want to access them in order.
- You can lexicographically compare different sets with <, <=, > and >=. unordered_sets are not required to support these operations.
Answer by moonshadow
When the order matters to someone who wants to iterate over the items of the set.
Answer by Mehrdad Afshari
Whenever you prefer a tree to a hash table.
For instance, hash tables are O(n) in the worst case; O(1) is only the average case. Trees are O(log n) at worst.
Answer by Jayhello
Use set when:
- We need ordered data (distinct elements).
- We have to print or access the data in sorted order.
- We need the predecessor or successor of an element.
Use unordered_set when:
- We need to keep a set of distinct elements and no ordering is required.
- We need single-element access only, i.e. no traversal.
Examples:
set:
Input : 1, 8, 2, 5, 3, 9
Output : 1, 2, 3, 5, 8, 9
unordered_set:
Input : 1, 8, 2, 5, 3, 9
Output : 9 3 1 8 2 5 (possibly this order; it depends on the hash function)
Note: in some cases set is more convenient, for example when using vector as the key:
set<vector<int>> s;
s.insert({1, 2});
s.insert({1, 3});
s.insert({1, 2});  // duplicate, ignored
for (const auto& vec : s)
    cout << vec << endl;  // assumes operator<< has been overloaded for vector (not shown)
// 1 2
// 1 3
vector<int> can be used as a key in set because vector provides a lexicographic operator<.
But if you use unordered_set<vector<int>>, you have to supply a hash function for vector<int>, because the standard library does not provide one. You have to define one yourself, for example:
struct VectorHash {
    size_t operator()(const std::vector<int>& v) const {
        std::hash<int> hasher;
        size_t seed = 0;
        for (int i : v) {
            // Mix each element into the seed (same formula as Boost's hash_combine)
            seed ^= hasher(i) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

void two() {
    // unordered_set<vector<int>> s;  // error: no std::hash specialization for vector<int>
    unordered_set<vector<int>, VectorHash> s;
    s.insert({1, 2});
    s.insert({1, 3});
    s.insert({1, 2});  // duplicate, ignored
    for (const auto& vec : s)
        cout << vec << endl;  // assumes operator<< has been overloaded for vector (not shown)
    // prints "1 2" and "1 3", in unspecified order
}
As you can see, in some cases unordered_set is more complicated to use.
Mainly cited from: https://www.geeksforgeeks.org/set-vs-unordered_set-c-stl/ and https://stackoverflow.com/a/29855973/6329006
Answer by Jayhello
Because std::set is part of Standard C++ and unordered_set isn't. C++0x is NOT a standard, and neither is Boost. For many of us, portability is essential, and that means sticking to the standard.
Answer by ldog
Consider sweepline algorithms. These algorithms would fail utterly with hash tables, but work beautifully with balanced trees. For a concrete example of a sweepline algorithm, consider Fortune's algorithm: http://en.wikipedia.org/wiki/Fortune%27s_algorithm
Answer by Blargle
One more thing, in addition to what other people already mentioned. While the expected amortized complexity for inserting an element into an unordered_set is O(1), every now and then it will take O(n) because the hash table needs to be restructured (the number of buckets needs to change), even with a 'good' hash function. Just like inserting an element into a vector takes O(n) every now and then because the underlying array needs to be reallocated.
Inserting into a set always takes at most O(log n). This might be preferable in some applications.
Answer by Spectral
Pardon me, one more thing worth noticing about the sorted property:
If you want a range of data from the container, for example: you stored times in a set, and you want the times from 2013-01-01 to 2014-01-01.
For unordered_set this is impossible without examining every element, because the elements are not kept in order.
Of course, this example would be more convincing for use cases comparing map and unordered_map.
Answer by mic_e
While this answer might be 10 years late, it's worth pointing out that std::unordered_set also has security downsides.
If the hash function is predictable (this is typically the case unless it applies counter-measures such as a randomized salt), attackers can hand-craft data that produces hash collisions and causes all insertions and look-ups to take O(n) time.
This can be used for very efficient and elegant denial-of-service attacks.
Many (most?) implementations of languages that internally employ hash maps have run into this:
Answer by Rushyo
Offhand, I would say it is convenient to have things in a relationship if you're looking to convert them into a different format.
It is also possible that, while one is faster to access, the time to build the index or the memory used when creating and/or accessing it is greater.