从 C# 中的 List<T> 中选择 N 个随机元素

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/48087/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-03 10:00:18  来源:igfitidea点击:

Select N random elements from a List<T> in C#

提问by JC Grubbs

I need a quick algorithm to select 5 random elements from a generic list. For example, I'd like to get 5 random elements from a List<string>.

我需要一个快速算法来从通用列表中选择 5 个随机元素。例如,我想从 a 中获取 5 个随机元素List<string>

采纳答案by Kyle Cronin

Iterate through and for each element make the probability of selection = (number needed)/(number left)

迭代并为每个元素使选择概率=(所需数量)/(剩余数量)

So if you had 40 items, the first would have a 5/40 chance of being selected. If it is, the next has a 4/39 chance, otherwise it has a 5/39 chance. By the time you get to the end you will have your 5 items, and often you'll have all of them before that.

因此,如果您有 40 个项目,第一个将有 5/40 的机会被选中。如果是,则下一个机会为 4/39,否则为 5/39。到最后,您将拥有 5 个项目,而且通常在此之前您将拥有所有项目。

回答by Tyler

This is actually a harder problem than it sounds like, mainly because many mathematically-correct solutions will fail to actually allow you to hit all the possibilities (more on this below).

这实际上是一个比听起来更难的问题,主要是因为许多数学上正确的解决方案实际上无法让您找到所有可能性(更多内容见下文)。

First, here are some easy-to-implement, correct-if-you-have-a-truly-random-number generator:

首先,这里有一些易于实现的正确随机数生成器:

(0) Kyle's answer, which is O(n).

(0) 凯尔的答案,即 O(n)。

(1) Generate a list of n pairs [(0, rand), (1, rand), (2, rand), ...], sort them by the second coordinate, and use the first k (for you, k=5) indices to get your random subset. I think this is easy to implement, although it is O(n log n) time.

(1) 生成n对[(0, rand), (1, rand), (2, rand), ...]的列表,按第二个坐标排序,使用前k个(对你来说,k =5) 索引以获取您的随机子集。我认为这很容易实现,尽管它是 O(n log n) 时间。

(2) Init an empty list s = [] that will grow to be the indices of k random elements. Choose a number r in {0, 1, 2, ..., n-1} at random, r = rand % n, and add this to s. Next take r = rand % (n-1) and stick in s; add to r the # elements less than it in s to avoid collisions. Next take r = rand % (n-2), and do the same thing, etc. until you have k distinct elements in s. This has worst-case running time O(k^2). So for k << n, this can be faster. If you keep s sorted and track which contiguous intervals it has, you can implement it in O(k log k), but it's more work.

(2) 初始化一个空列表 s = [],它将增长为 k 个随机元素的索引。在 {0, 1, 2, ..., n-1} 中随机选择一个数 r,r = rand % n,并将其添加到 s。接下来取 r = rand % (n-1) 并插入 s;在 s 中添加小于它的 # 个元素以避免冲突。接下来取 r = rand % (n-2),然后做同样的事情,等等,直到在 s 中有 k 个不同的元素。这有最坏情况的运行时间 O(k^2)。所以对于 k << n,这可以更快。如果您保持 s 排序并跟踪它具有哪些连续间隔,您可以在 O(k log k) 中实现它,但它的工作量更大。

@Kyle - you're right, on second thought I agree with your answer. I hastily read it at first, and mistakenly thought you were indicating to sequentially choose each element with fixed probability k/n, which would have been wrong - but your adaptive approach appears correct to me. Sorry about that.

@Kyle - 你是对的,再想一想我同意你的回答。起初我仓促地阅读它,并错误地认为您指示以固定概率 k/n 顺序选择每个元素,这本来是错误的 - 但您的自适应方法对我来说似乎是正确的。对于那个很抱歉。

Ok, and now for the kicker: asymptotically (for fixed k, n growing), there are n^k/k! choices of k element subset out of n elements [this is an approximation of (n choose k)]. If n is large, and k is not very small, then these numbers are huge. The best cycle length you can hope for in any standard 32 bit random number generator is 2^32 = 256^4. So if we have a list of 1000 elements, and we want to choose 5 at random, there's no way a standard random number generator will hit all the possibilities. However, as long as you're ok with a choice that works fine for smaller sets, and always "looks" random, then these algorithms should be ok.

好的,现在是踢球者:渐近地(对于固定 k,n 增长),有 n^k/k!从 n 个元素中选择 k 个元素子集[这是(n 选择 k)的近似值]。如果 n 很大,而 k 不是很小,那么这些数字很大。在任何标准 32 位随机数生成器中,您希望的最佳循环长度是 2^32 = 256^4。因此,如果我们有一个包含 1000 个元素的列表,并且我们想随机选择 5 个,那么标准的随机数生成器不可能满足所有可能性。但是,只要您可以选择适用于较小集合的选择,并且始终“看起来”随机,那么这些算法应该没问题。

Addendum: After writing this, I realized that it's tricky to implement idea (2) correctly, so I wanted to clarify this answer. To get O(k log k) time, you need an array-like structure that supports O(log m) searches and inserts - a balanced binary tree can do this. Using such a structure to build up an array called s, here is some pseudopython:

附录:写完这篇之后,我意识到正确实施想法(2)很棘手,所以我想澄清这个答案。要获得 O(k log k) 时间,您需要一个支持 O(log m) 搜索和插入的类数组结构——平衡二叉树可以做到这一点。使用这样的结构来构建一个名为 s 的数组,这里是一些伪python:

# Returns a container s with k distinct random numbers from {0, 1, ..., n-1}
def ChooseRandomSubset(n, k):
  for i in range(k):
    r = UniformRandom(0, n-i)                 # May be 0, must be < n-i
    q = s.FirstIndexSuchThat( s[q] - q > r )  # This is the search.
    s.InsertInOrder(q ? r + q : r + len(s))   # Inserts right before q.
  return s

I suggest running through a few sample cases to see how this efficiently implements the above English explanation.

我建议运行几个示例案例,看看它如何有效地实现上述英文解释。

回答by IanStallings

This is the best I could come up with on a first cut:

这是我在第一次剪辑时能想到的最好的方法:

public List<String> getRandomItemsFromList(int returnCount, List<String> list)
{
    List<String> returnList = new List<String>();
    Dictionary<int, int> randoms = new Dictionary<int, int>();

    while (randoms.Count != returnCount)
    {
        //generate new random between one and total list count
        int randomInt = new Random().Next(list.Count);

        // store this in dictionary to ensure uniqueness
        try
        {
            randoms.Add(randomInt, randomInt);
        }
        catch (ArgumentException aex)
        {
            Console.Write(aex.Message);
        } //we can assume this element exists in the dictonary already 

        //check for randoms length and then iterate through the original list 
        //adding items we select via random to the return list
        if (randoms.Count == returnCount)
        {
            foreach (int key in randoms.Keys)
                returnList.Add(list[randoms[key]]);

            break; //break out of _while_ loop
        }
    }

    return returnList;
}

Using a list of randoms within a range of 1 - total list count and then simply pulling those items in the list seemed to be the best way, but using the Dictionary to ensure uniqueness is something I'm still mulling over.

使用 1 范围内的随机数列表 - 列表总数,然后简单地将这些项目拉到列表中似乎是最好的方法,但使用字典来确保唯一性是我仍在考虑的事情。

Also note I used a string list, replace as needed.

另请注意,我使用了字符串列表,根据需要进行替换。

回答by spoulson

From Dragons in the Algorithm, an interpretation in C#:

来自算法中的龙,在 C# 中的解释:

int k = 10; // items to select
var items = new List<int>(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 });
var selected = new List<int>();
double needed = k;
double available = items.Count;
var rand = new Random();
while (selected.Count < k) {
   if( rand.NextDouble() < needed / available ) {
      selected.Add(items[(int)available-1])
      needed--;
   }
   available--;
}

This algorithm will select unique indicies of the items list.

该算法将选择项目列表的唯一索引。

回答by hitec

I recently did this on my project using an idea similar to Tyler's point 1.
I was loading a bunch of questions and selecting five at random. Sorting was achieved using an IComparer.
aAll questions were loaded in the a QuestionSorter list, which was then sorted using the List's Sort functionand the first k elements where selected.

我最近使用类似于Tyler's point 1的想法在我的项目中做了这个。
我正在加载一堆问题并随机选择五个。排序是使用IComparer实现的。
a所有问题都加载到一个 QuestionSorter 列表中,然后使用列表的 Sort 函数和选择的前 k 个元素对其进行排序

    private class QuestionSorter : IComparable<QuestionSorter>
    {
        public double SortingKey
        {
            get;
            set;
        }

        public Question QuestionObject
        {
            get;
            set;
        }

        public QuestionSorter(Question q)
        {
            this.SortingKey = RandomNumberGenerator.RandomDouble;
            this.QuestionObject = q;
        }

        public int CompareTo(QuestionSorter other)
        {
            if (this.SortingKey < other.SortingKey)
            {
                return -1;
            }
            else if (this.SortingKey > other.SortingKey)
            {
                return 1;
            }
            else
            {
                return 0;
            }
        }
    }

Usage:

用法:

    List<QuestionSorter> unsortedQuestions = new List<QuestionSorter>();

    // add the questions here

    unsortedQuestions.Sort(unsortedQuestions as IComparer<QuestionSorter>);

    // select the first k elements

回答by Cameron A. Ellis

why not something like this:

为什么不是这样的:

 Dim ar As New ArrayList
    Dim numToGet As Integer = 5
    'hard code just to test
    ar.Add("12")
    ar.Add("11")
    ar.Add("10")
    ar.Add("15")
    ar.Add("16")
    ar.Add("17")

    Dim randomListOfProductIds As New ArrayList

    Dim toAdd As String = ""
    For i = 0 To numToGet - 1
        toAdd = ar(CInt((ar.Count - 1) * Rnd()))

        randomListOfProductIds.Add(toAdd)
        'remove from id list
        ar.Remove(toAdd)

    Next
'sorry i'm lazy and have to write vb at work :( and didn't feel like converting to c#

回答by Frank Schwieterman

I think the selected answer is correct and pretty sweet. I implemented it differently though, as I also wanted the result in random order.

我认为所选的答案是正确的,而且非常甜蜜。不过,我以不同的方式实现了它,因为我也想要随机顺序的结果。

    static IEnumerable<SomeType> PickSomeInRandomOrder<SomeType>(
        IEnumerable<SomeType> someTypes,
        int maxCount)
    {
        Random random = new Random(DateTime.Now.Millisecond);

        Dictionary<double, SomeType> randomSortTable = new Dictionary<double,SomeType>();

        foreach(SomeType someType in someTypes)
            randomSortTable[random.NextDouble()] = someType;

        return randomSortTable.OrderBy(KVP => KVP.Key).Take(maxCount).Select(KVP => KVP.Value);
    }

回答by Tobias Hertkorn

It is a lot harder than one would think. See the great Article "Shuffling"from Jeff.

这比人们想象的要困难得多。请参阅Jeff的精彩文章“洗牌”

I did write a very short article on that subject including C# code:
Return random subset of N elements of a given array

我确实写了一篇关于该主题的非常简短的文章,包括 C# 代码:
返回给定数组的 N 个元素的随机子集

回答by Ers

Using linq:

使用 linq:

YourList.OrderBy(x => rnd.Next()).Take(5)

回答by Tine M.

The simple solution I use (probably not good for large lists): Copy the list into temporary list, then in loop randomly select Item from temp list and put it in selected items list while removing it form temp list (so it can't be reselected).

我使用的简单解决方案(可能不适用于大型列表):将列表复制到临时列表中,然后在循环中从临时列表中随机选择项目并将其放入选定项目列表中,同时将其从临时列表中删除(因此它不能重新选择)。

Example:

例子:

List<Object> temp = OriginalList.ToList();
List<Object> selectedItems = new List<Object>();
Random rnd = new Random();
Object o;
int i = 0;
while (i < NumberOfSelectedItems)
{
            o = temp[rnd.Next(temp.Count)];
            selectedItems.Add(o);
            temp.Remove(o);
            i++;
 }