C++: Why would anyone use set instead of unordered_set?

Question

Asked by AraK

C++0x is introducing unordered_set, which is available in boost and many other places. What I understand is that unordered_set is a hash table with O(1) lookup complexity. On the other hand, set is nothing but a tree with O(log n) lookup complexity. Why on earth would anyone use set instead of unordered_set? I.e., is there a need for set anymore?

Answer 1

Answered by sth

Unordered sets have to pay for their O(1) average access time in a few ways:

set uses less memory than unordered_set to store the same number of elements.
For a small number of elements, lookups in a set might be faster than lookups in an unordered_set.
Even though many operations are faster in the average case for unordered_set, they are often guaranteed to have better worst-case complexities for set (for example, insert).
That set sorts the elements is useful if you want to access them in order.
You can lexicographically compare different sets with <, <=, > and >=. unordered_sets are not required to support these operations.
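
The last point can be sketched in a few lines. This is only an illustration: the helper name smallest is made up for the example, and it works because std::min_element falls back on the operator< that std::set provides.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Returns the lexicographically smallest set in the list.
// Relies on std::set's operator<, which compares element by element
// in sorted order.
std::set<int> smallest(const std::vector<std::set<int>>& sets) {
    assert(!sets.empty());  // precondition: at least one set
    return *std::min_element(sets.begin(), sets.end());
}
```

For example, given {1, 2, 4}, {1, 2, 3} and {2}, the call returns {1, 2, 3}. The same function would not compile for std::unordered_set<int>, which has no operator<.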

Answer 2

Answered by moonshadow

When the order matters to someone who wants to iterate over the items of the set.

Answer 3

Answered by Mehrdad Afshari

Whenever you prefer a tree to a hash table.

For instance, hash tables are O(n) in the worst case; O(1) is only the average case. Trees are O(log n) at worst.
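
A rough way to see that worst case (a sketch, not a benchmark): ConstantHash below is a deliberately bad hasher invented for this example, so every key collides and ends up in the same bucket. A lookup then has to walk that whole bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Deliberately bad hash (made up for illustration): every key collides.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Inserts 0..n-1 and returns the number of elements in the fullest bucket.
std::size_t largest_bucket(int n) {
    assert(n >= 0);
    std::unordered_set<int, ConstantHash> s;
    for (int i = 0; i < n; ++i) s.insert(i);

    std::size_t largest = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b)
        largest = std::max(largest, s.bucket_size(b));
    return largest;
}
```

With this hasher, largest_bucket(n) equals n, which is the O(n) case; a std::set holding the same keys would still answer a lookup in O(log n).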

Answer 4

Answered by Jayhello

Use set when:

We need ordered data (distinct elements).
We have to print/access the data in sorted order.
We need the predecessor/successor of elements.
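
The predecessor/successor case might look like the sketch below. The function names are made up for the example; both return s.end() when no such element exists.

```cpp
#include <cassert>
#include <iterator>
#include <set>

using SetIt = std::set<int>::const_iterator;

// Largest element strictly less than x, or s.end() if there is none.
SetIt predecessor(const std::set<int>& s, int x) {
    SetIt it = s.lower_bound(x);  // first element >= x
    return it == s.begin() ? s.end() : std::prev(it);
}

// Smallest element strictly greater than x, or s.end() if there is none.
SetIt successor(const std::set<int>& s, int x) {
    SetIt it = s.upper_bound(x);  // first element > x
    assert(it == s.end() || *it > x);
    return it;
}
```

Both calls are O(log n) on a std::set. An unordered_set has no lower_bound or upper_bound, so the same question would need a scan over all elements.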

Use unordered_set when:

We need to keep a set of distinct elements and no ordering is required.
We need single-element access, i.e. no traversal.

Examples:

set:

Input: 1, 8, 2, 5, 3, 9

Output: 1, 2, 3, 5, 8, 9

unordered_set:

Input: 1, 8, 2, 5, 3, 9

Output: 9 3 1 8 2 5 (possibly in this order; it depends on the hash function)
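
The two examples can be reproduced with a small sketch like this one. The function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <unordered_set>
#include <vector>

// Elements in std::set iteration order, which is always ascending.
std::vector<int> via_set(const std::vector<int>& input) {
    std::set<int> s(input.begin(), input.end());
    std::vector<int> out(s.begin(), s.end());
    assert(std::is_sorted(out.begin(), out.end()));  // guaranteed by std::set
    return out;
}

// Elements in std::unordered_set iteration order, which is unspecified
// and may differ between standard library implementations.
std::vector<int> via_unordered_set(const std::vector<int>& input) {
    std::unordered_set<int> s(input.begin(), input.end());
    return std::vector<int>(s.begin(), s.end());
}
```

Calling via_set on 1, 8, 2, 5, 3, 9 gives 1, 2, 3, 5, 8, 9. via_unordered_set returns the same six values, but nothing can be assumed about their order.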

Main difference:

Note: in some cases set is more convenient, for example when using a vector as the key:

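#include <iostream>
#include <set>
#include <vector>
using namespace std;
// operator<< overloaded so that a vector<int> can be printed directly
ostream& operator<<(ostream& os, const vector<int>& v) {
    for (int x : v) os << x << ' ';
    return os;
}

int main() {
    set<vector<int>> s;
    s.insert({1, 2});
    s.insert({1, 3});
    s.insert({1, 2});  // duplicate, not inserted again
    for (const auto& vec : s)
        cout << vec << endl;
    // 1 2
    // 1 3
}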
The reason vector<int> can be used as a key in set is that vector provides operator<.

But if you use unordered_set<vector<int>>, you have to supply a hash function for vector<int>, because the standard library doesn't provide one. You have to define one like this:

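#include <functional>
#include <iostream>
#include <unordered_set>
#include <vector>

// Combines the element hashes with the same mixing step as boost::hash_combine.
struct VectorHash {
    size_t operator()(const std::vector<int>& v) const {
        std::hash<int> hasher;
        size_t seed = 0;
        for (int i : v) {
            seed ^= hasher(i) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

void two() {
    // std::unordered_set<std::vector<int>> s;  // error: no std::hash for vector<int>
    std::unordered_set<std::vector<int>, VectorHash> s;
    s.insert({1, 2});
    s.insert({1, 3});
    s.insert({1, 2});  // duplicate, not inserted again

    for (const auto& vec : s)
        std::cout << vec << std::endl;  // relies on an operator<< overload for vector<int>
    // prints "1 2" and "1 3", in unspecified order
}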

You can see that in some cases unordered_set is more complicated to use.

Mainly cited from: https://www.geeksforgeeks.org/set-vs-unordered_set-c-stl/ and https://stackoverflow.com/a/29855973/6329006

Answer 5

Answered by Jayhello

Because std::set is part of Standard C++ and unordered_set isn't. C++0x is NOT a standard, and neither is Boost. For many of us, portability is essential, and that means sticking to the standard.

Answer 6

Answered by ldog

Consider sweepline algorithms. These algorithms would fail utterly with hash tables, but work beautifully with balanced trees. For a concrete example of a sweepline algorithm, consider Fortune's algorithm: http://en.wikipedia.org/wiki/Fortune%27s_algorithm

Answer 7

Answered by Blargle

One more thing, in addition to what other people already mentioned. While the expected amortized complexity of inserting an element into an unordered_set is O(1), every now and then it will take O(n) because the hash table needs to be restructured (the number of buckets needs to change), even with a 'good' hash function. This is just like inserting an element into a vector, which takes O(n) every now and then because the underlying array needs to be reallocated.

Inserting in a set always takes at most O(log n). This might be preferable in some applications.
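
One way to observe those occasional restructurings is to watch bucket_count() while inserting. This is a sketch under assumptions: the function name is made up, and the reserve() call is an addition to the answer, showing the usual way to avoid rehashes when the final size is known in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Inserts 0..n-1 and counts how many insertions changed bucket_count(),
// i.e. triggered a rehash of the whole table.
std::size_t count_rehashes(int n, bool reserve_first) {
    assert(n >= 0);
    std::unordered_set<int> s;
    if (reserve_first) s.reserve(static_cast<std::size_t>(n));

    std::size_t rehashes = 0;
    std::size_t buckets = s.bucket_count();
    for (int i = 0; i < n; ++i) {
        s.insert(i);
        if (s.bucket_count() != buckets) {  // the table was rebuilt
            ++rehashes;
            buckets = s.bucket_count();
        }
    }
    return rehashes;
}
```

Without the reserve, a run with 100000 elements should report at least one rehash, and each of those insertions costs O(n). The exact number depends on the standard library's growth policy.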

Answer 8

Answered by Spectral

Pardon me, one more thing worth noticing about the sorted property:

If you want a range of data from the container, for example: you stored times in a set, and you want the times from 2013-01-01 to 2014-01-01.

For unordered_set this is impossible without checking every element.

Of course, this example would be more convincing for use cases involving map and unordered_map.
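
A minimal sketch of such a range query follows. It assumes the times are stored as ISO "YYYY-MM-DD" strings, whose lexicographic order matches date order; the function name is made up for the example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns every stored date d with first <= d <= last.
// Assumes ISO "YYYY-MM-DD" strings, so string order equals date order.
std::vector<std::string> dates_between(const std::set<std::string>& dates,
                                       const std::string& first,
                                       const std::string& last) {
    assert(first <= last);
    auto lo = dates.lower_bound(first);  // first element >= first
    auto hi = dates.upper_bound(last);   // first element > last
    return std::vector<std::string>(lo, hi);
}
```

Finding both ends of the range takes O(log n), and walking it costs only as much as the number of matches. With an unordered_set the only option is to test every element.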

Answer 9

Answered by mic_e

While this answer might be 10 years late, it's worth pointing out that std::unordered_set also has security downsides.

If the hash function is predictable (this is typically the case unless it applies counter-measures such as a randomized salt), attackers can hand-craft data that produces hash collisions and causes all insertions and look-ups to take O(n) time.

This can be used for very efficient and elegant denial-of-service attacks.
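
The "randomized salt" counter-measure mentioned above could look roughly like the sketch below. SaltedHash and SaltedSet are hypothetical names, the mixing step is borrowed from splitmix64, and this is an illustration of the idea rather than a vetted defense. It also assumes std::random_device is non-deterministic on the target platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>

// Hypothetical hasher for illustration: the salt is drawn once per process,
// so an attacker cannot precompute keys that collide in every run.
struct SaltedHash {
    static std::uint64_t salt() {
        static const std::uint64_t value = [] {
            std::random_device rd;  // assumed to be non-deterministic here
            return (static_cast<std::uint64_t>(rd()) << 32) | rd();
        }();
        return value;
    }

    std::size_t operator()(std::uint64_t x) const {
        // splitmix64-style mixing of the salted key
        x += salt() + 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return static_cast<std::size_t>(x ^ (x >> 31));
    }
};

using SaltedSet = std::unordered_set<std::uint64_t, SaltedHash>;
```

A SaltedSet is used like any other unordered_set. A std::set sidesteps the issue entirely, since its O(log n) bound does not depend on a hash function.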

Many (most?) implementations of languages that internally employ hash maps have run into this:

Answer 10

Answered by Rushyo

Offhand, I would say it is convenient to have things in a relationship if you're looking to convert them into a different format.

It is also possible that whilst one is faster to access, the time to build the index or the memory used when creating and/or accessing it is greater.
