C++: How to select a random element in std::set?
Original source: http://stackoverflow.com/questions/3052788/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Frank
How can I select a random element in a std::set?
I naively tried this:
int GetSample(const std::set<int>& s) {
    double r = rand() % s.size();
    return *(s.begin() + r); // compile error
}
But operator+ is not allowed in this way.
Answered by xtofl
You could use the std::advance function.
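#include <cstdlib>   // rand
#include <iterator>  // std::begin, std::advance
#include <set>
int main() {
    using namespace std;
    // generate a set...
    set<int> s;
    for (int i = 0; i != 10; ++i) s.insert(i);
    auto r = rand() % s.size(); // not _really_ random
    auto n = *select_random(s, r); // select_random is defined below; declare it before main when compiling
}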
Where select_random is:
template<typename S>
auto select_random(const S &s, size_t n) { // deduced return type needs C++14
    auto it = std::begin(s);
    // 'advance' the iterator n times
    std::advance(it, n);
    return it;
}
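As the comment in the answer notes, rand() % s.size() is not really random. A minimal sketch of the same idea using the <random> header might look like the following. The helper name random_element and its Rng parameter are assumptions made for illustration and are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <set>

// Hypothetical helper (not from the answer): returns a uniformly chosen
// element of a non-empty set. The cost is O(n), because set iterators are
// bidirectional and std::next has to step one element at a time.
template <class T, class Rng>
const T& random_element(const std::set<T>& s, Rng& rng) {
    assert(!s.empty());  // there is nothing to pick from an empty set
    // Draw an index uniformly from [0, size - 1].
    std::uniform_int_distribution<std::size_t> index(0, s.size() - 1);
    return *std::next(s.begin(), static_cast<std::ptrdiff_t>(index(rng)));
}
```

A caller would typically create one generator, for example std::mt19937 rng(std::random_device{}()), and reuse it for every call to random_element(s, rng).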
Answered by davidhigh
If random access is important and you can live with O(N) average effort for insertion, then the workaround given in this paper might be convenient.
The main idea there is to use a sorted vector, with std::lower_bound for lookup. Thus, lookup takes O(log N), just as in a normal set. (Random) insertion takes O(N), as all following elements must be shifted, just like in a normal vector (and a reallocation may be performed). Insertion at the back, however, is constant (except for the reallocation; you can avoid this by calling reserve() with a large enough capacity).
Finally, the main point of the question: random access is O(1). Just draw a random number i from a uniform distribution in [0, V.size()-1], and return the corresponding element V[i].
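The two operations described above (sorted insertion through std::lower_bound and an O(1) uniform pick by index) can be sketched on a plain std::vector<int>. The function names sorted_insert and random_pick are hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: insert into an already sorted vector, keeping it
// sorted and free of duplicates. O(log N) search plus O(N) shifting.
inline bool sorted_insert(std::vector<int>& v, int value) {
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    if (pos != v.end() && *pos == value) return false;  // already present
    v.insert(pos, value);
    return true;
}

// Hypothetical helper: O(1) uniform pick by drawing an index
// from [0, v.size() - 1].
template <class Rng>
int random_pick(const std::vector<int>& v, Rng& rng) {
    assert(!v.empty());
    std::uniform_int_distribution<std::size_t> index(0, v.size() - 1);
    return v[index(rng)];
}
```

The trade-off is the one the answer describes: picking is cheap, while inserting anywhere but at the back costs O(N).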
Here is the code basis from the paper, which implements this sorted vector. Extend it as needed:
#include <algorithm>   // std::sort, std::lower_bound
#include <functional>  // std::less
#include <vector>
template <class T, class Compare = std::less<T> >
struct sorted_vector {
    std::vector<T> V;
    Compare cmp;
    typedef typename std::vector<T>::iterator iterator;
    typedef typename std::vector<T>::const_iterator const_iterator;
    iterator begin() { return V.begin(); }
    iterator end() { return V.end(); }
    const_iterator begin() const { return V.begin(); }
    const_iterator end() const { return V.end(); }
    //...if needed, implement more by yourself
    sorted_vector(const Compare& c = Compare()) : V(), cmp(c) {}
    template <class InputIterator>
    sorted_vector(InputIterator first, InputIterator last, const Compare& c = Compare())
        : V(first, last), cmp(c)
    {
        std::sort(begin(), end(), cmp);
    }
    //...
    iterator insert(const T& t) {
        iterator i = std::lower_bound(begin(), end(), t, cmp);
        if (i == end() || cmp(t, *i))
            i = V.insert(i, t); // insertion may invalidate i, so keep the iterator it returns
        return i;
    }
    const_iterator find(const T& t) const {
        const_iterator i = std::lower_bound(begin(), end(), t, cmp);
        return i == end() || cmp(t, *i) ? end() : i;
    }
};
For a more sophisticated implementation, you might also consider this page.
EDIT: or even better, use boost::container::flat_set, which implements the set using the idea above, i.e. as a sorted vector.
Answered by matovitch
First Solution: O(log n) in time / O(1) in space (not uniform!)
As hypothesized in a comment above, it can be done in O(log(n)) (vs. O(n) for std::advance) without a vector (which would use O(n) more space) by using the method I describe here.
Essentially, you:
- check if the set is empty (if it is, there is no hope)
- generate a random value
- if it is already in the set, return it; otherwise insert it
- get an iterator it on the inserted value
- get the random element as *(++it), or as *(set.begin()) if it was the last element
- return that element, but not before deleting the value you inserted
N.B.: As pointed out by Aaron, the element is not chosen uniformly at random. You need to draw the random value from the same distribution as the elements in the set to approach uniform sampling.
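The steps of the first solution can be sketched roughly as follows for a std::set<int>. The function name sample_by_probe and the parameters lo and hi (the range the probe value is drawn from) are assumptions for this illustration; the method the answer links to may differ in detail. As the note says, the result is not uniform: an element that follows a large gap is returned more often.

```cpp
#include <cassert>
#include <iterator>
#include <random>
#include <set>

// Hypothetical helper: returns an element of s chosen by probing with a
// random value in [lo, hi]. O(log n) time, not uniform.
template <class Rng>
int sample_by_probe(std::set<int>& s, Rng& rng, int lo, int hi) {
    assert(!s.empty());  // an empty set has nothing to return
    std::uniform_int_distribution<int> dist(lo, hi);
    const int probe = dist(rng);
    auto inserted = s.insert(probe);      // pair<iterator, bool>
    if (!inserted.second) return probe;   // probe was already an element
    // Take the element after the probe, wrapping around to the first one.
    auto next = std::next(inserted.first);
    const int result = (next == s.end()) ? *s.begin() : *next;
    s.erase(inserted.first);              // restore the original set
    return result;
}
```

For a read-only set, s.upper_bound(probe) would find the same successor without the temporary insertion.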
Second Solution: O(1) in time / O(n) in space (uniform)
davidhigh already gave the solution with a vector, but there is a problem: when you remove an element from your container, you have to perform a linear search in O(n), or you can rebuild your vector each time you want to retrieve a random element, but that is O(n) too.
To avoid this problem and keep insert/delete at O(log n), you can keep an std::unordered_set and use a method similar to the first solution to get a random element in O(1).
P.S.: If your elements are large, you can use an unordered set of pointers (with a modified hasher) to save some memory.
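The answer does not spell out its unordered_set method, so the sketch below shows a different, commonly used technique with the same goal: a vector of values plus a hash map from value to index, with removal done by swapping in the last element. The class IndexedIntSet and all of its member names are hypothetical. Insert, erase, and sample are O(1) on average, at the cost of O(n) extra space and of losing the sorted order that std::set provides.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_map>
#include <vector>

// Hypothetical container (not the answer's method): uniform O(1) sampling
// with average O(1) insertion and removal.
class IndexedIntSet {
public:
    bool insert(int value) {
        if (index_.count(value) != 0) return false;  // already present
        index_.emplace(value, values_.size());
        values_.push_back(value);
        return true;
    }
    bool erase(int value) {
        auto found = index_.find(value);
        if (found == index_.end()) return false;
        const std::size_t pos = found->second;
        const int last = values_.back();
        values_[pos] = last;   // move the last value into the hole
        index_[last] = pos;    // and record its new position
        values_.pop_back();
        index_.erase(value);
        return true;
    }
    bool contains(int value) const { return index_.count(value) != 0; }
    std::size_t size() const { return values_.size(); }
    // Uniform pick: every stored value has the same chance.
    template <class Rng>
    int sample(Rng& rng) const {
        assert(!values_.empty());
        std::uniform_int_distribution<std::size_t> pick(0, values_.size() - 1);
        return values_[pick(rng)];
    }
private:
    std::vector<int> values_;                     // dense storage for sampling
    std::unordered_map<int, std::size_t> index_;  // value -> position in values_
};
```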
Answered by Amir Rachum
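#include <cstddef>  // std::size_t
#include <cstdlib>  // rand
#include <set>
int GetSample(const std::set<int>& s) {
    std::size_t r = rand() % s.size();  // integer index; s must not be empty
    std::set<int>::const_iterator it = s.begin();
    for (; r != 0; r--) it++;
    return *it;
}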
would be one way of doing it, although not pretty.