C# 使用 .NET 随机化数组的最佳方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/108819/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-03 14:17:16  来源:igfitidea点击:

Best way to randomize an array with .NET

提问by Mats

What is the best way to randomize an array of strings with .NET? My array contains about 500 strings and I'd like to create a new Arraywith the same strings but in a random order.

使用 .NET 随机化字符串数组的最佳方法是什么?我的数组包含大约 500 个字符串,我想创建一个Array具有相同字符串但顺序随机的新字符串。

Please include a C# example in your answer.

请在您的答案中包含一个 C# 示例。

采纳答案by mdb

If you're on .NET 3.5, you can use the following IEnumerable coolness (VB.NET, not C#, but the idea should be clear...):

如果您使用的是 .NET 3.5,则可以使用以下 IEnumerable 酷(VB.NET,不是 C#,但想法应该很清楚...):

Random rnd=new Random();
string[] MyRandomArray = MyArray.OrderBy(x => rnd.Next()).ToArray();    

Edit: OK and here's the corresponding VB.NET code:

编辑:好的,这里是相应的 VB.NET 代码:

Dim rnd As New System.Random
Dim MyRandomArray = MyArray.OrderBy(Function() rnd.Next()).ToArray()

Second edit, in response to remarks that System.Random "isn't threadsafe" and "only suitable for toy apps" due to returning a time-based sequence: as used in my example, Random() is perfectly thread-safe, unless you're allowing the routine in which you randomize the array to be re-entered, in which case you'll need something like lock (MyRandomArray)anyway in order not to corrupt your data, which will protect rndas well.

第二次编辑,针对 System.Random “不是线程安全的”和“仅适用于玩具应用程序”的评论,由于返回基于时间的序列:如我的示例中所使用的,Random() 是完全线程安全的,除非您允许重新输入随机化数组的例程,在这种情况下,您将需要类似的东西lock (MyRandomArray),以免损坏您的数据,这也将提供保护rnd

Also, it should be well-understood that System.Random as a source of entropy isn't very strong. As noted in the MSDN documentation, you should use something derived from System.Security.Cryptography.RandomNumberGeneratorif you're doing anything security-related. For example:

此外,应该很好理解 System.Random 作为熵的来源并不是很强大。正如MSDN 文档中所指出的,System.Security.Cryptography.RandomNumberGenerator如果您正在做任何与安​​全相关的事情,您应该使用派生的东西。例如:

using System.Security.Cryptography;

...

...

RNGCryptoServiceProvider rnd = new RNGCryptoServiceProvider();
string[] MyRandomArray = MyArray.OrderBy(x => GetNextInt32(rnd)).ToArray();

...

...

static int GetNextInt32(RNGCryptoServiceProvider rnd)
    {
        byte[] randomInt = new byte[4];
        rnd.GetBytes(randomInt);
        return Convert.ToInt32(randomInt[0]);
    }

回答by Nick

Generate an array of random floats or ints of the same length. Sort that array, and do corresponding swaps on your target array.

生成相同长度的随机浮点数或整数数组。对该数组进行排序,并在目标数组上进行相应的交换。

This yields a truly independent sort.

这产生了一个真正独立的排序。

回答by stimms

Randomizing the array is intensive as you have to shift around a bunch of strings. Why not just randomly read from the array? In the worst case you could even create a wrapper class with a getNextString(). If you really do need to create a random array then you could do something like

随机化数组是密集的,因为您必须在一堆字符串周围移动。为什么不从数组中随机读取?在最坏的情况下,您甚至可以使用 getNextString() 创建一个包装类。如果您确实需要创建一个随机数组,那么您可以执行以下操作

for i = 0 -> i= array.length * 5
   swap two strings in random places

The *5 is arbitrary.

*5 是任意的。

回答by Sklivvz

This algorithm is simple but not efficient, O(N2). All the "order by" algorithms are typically O(N log N). It probably doesn't make a difference below hundreds of thousands of elements but it would for large lists.

该算法简单但效率不高,O(N 2)。所有“排序依据”算法通常都是 O(N log N)。它可能不会在数十万个元素下产生影响,但对于大型列表却会产生影响。

var stringlist = ... // add your values to stringlist

var r = new Random();

var res = new List<string>(stringlist.Count);

while (stringlist.Count >0)
{
   var i = r.Next(stringlist.Count);
   res.Add(stringlist[i]);
   stringlist.RemoveAt(i);
}

The reason why it's O(N2) is subtle: List.RemoveAt()is a O(N) operation unless you remove in order from the end.

它是 O(N 2) 的原因很微妙:List.RemoveAt()是一个 O(N) 操作,除非您从末尾按顺序删除。

回答by Tarsier

Just thinking off the top of my head, you could do this:

只是想想我的头顶,你可以这样做:

public string[] Randomize(string[] input)
{
  List<string> inputList = input.ToList();
  string[] output = new string[input.Length];
  Random randomizer = new Random();
  int i = 0;

  while (inputList.Count > 0)
  {
    int index = r.Next(inputList.Count);
    output[i++] = inputList[index];
    inputList.RemoveAt(index);
  }

  return (output);
}

回答by nullDev

Random r = new Random();
List<string> list = new List(originalArray);
List<string> randomStrings = new List();

while(list.Count > 0)
{
int i = r.Random(list.Count);
randomStrings.Add(list[i]);
list.RemoveAt(i);
}

回答by Pitarou

You're looking for a shuffling algorithm, right?

您正在寻找一种改组算法,对吗?

Okay, there are two ways to do this: the clever-but-people-always-seem-to-misunderstand-it-and-get-it-wrong-so-maybe-its-not-that-clever-after-all way, and the dumb-as-rocks-but-who-cares-because-it-works way.

好的,有两种方法可以做到这一点:聪明但人总是似乎误解它并弄错了所以也许它毕竟不是那么聪明方式,以及愚蠢的岩石,但谁在乎,因为它的工作方式。

Dumb way

笨办法

  • Create a duplicate of your first array, but tag each string should with a random number.
  • Sort the duplicate array with respect to the random number.
  • 创建第一个数组的副本,但应使用随机数标记每个字符串。
  • 根据随机数对重复数组进行排序。

This algorithm works well, but make sure that your random number generator is unlikely to tag two strings with the same number. Because of the so-called Birthday Paradox, this happens more often than you might expect. Its time complexity is O(nlog n).

该算法运行良好,但请确保您的随机数生成器不太可能用相同的数字标记两个字符串。由于所谓的生日悖论,这种情况发生的频率比您预期的要高。它的时间复杂度是 O( nlog n)。

Clever way

聪明的方式

I'll describe this as a recursive algorithm:

我将其描述为递归算法:

To shuffle an array of size n(indices in the range [0..n-1]):

if n= 0
  • do nothing
if n> 0
  • (recursive step)shuffle the first n-1 elements of the array
  • choose a random index, x, in the range [0..n-1]
  • swap the element at index n-1 with the element at index x

打乱大小为n的数组(范围 [0.. n-1]中的索引):

如果n= 0
  • 没做什么
如果n> 0
  • (递归步骤)洗牌数组的前n-1 个元素
  • 在 [0.. n-1]范围内选择一个随机索引x
  • 将索引n-1 处的元素与索引x处的元素交换

The iterative equivalent is to walk an iterator through the array, swapping with random elements as you go along, but notice that you cannot swap with an element afterthe one that the iterator points to. This is a very common mistake, and leads to a biased shuffle.

迭代等价物是遍历数组的迭代器,在进行过程中与随机元素进行交换,但请注意,不能与迭代器指向的元素之后的元素进行交换。这是一个非常常见的错误,会导致有偏见的 shuffle。

Time complexity is O(n).

时间复杂度为 O( n)。

回答by Seth Morris

Here's a simple way using OLINQ:

这是使用 OLINQ 的简单方法:

// Input array
List<String> lst = new List<string>();
for (int i = 0; i < 500; i += 1) lst.Add(i.ToString());

// Output array
List<String> lstRandom = new List<string>();

// Randomize
Random rnd = new Random();
lstRandom.AddRange(from s in lst orderby rnd.Next(100) select s);

回答by Matt Howells

The following implementation uses the Fisher-Yates algorithmAKA the Knuth Shuffle. It runs in O(n) time and shuffles in place, so is better performing than the 'sort by random' technique, although it is more lines of code. See herefor some comparative performance measurements. I have used System.Random, which is fine for non-cryptographic purposes.*

以下实现使用Fisher-Yates 算法,即 Knuth Shuffle。它在 O(n) 时间内运行并原地打乱,因此比“按随机排序”技术性能更好,尽管它的代码行数更多。有关一些比较性能测量,请参见此处。我使用了 System.Random,它适用于非加密目的。*

static class RandomExtensions
{
    public static void Shuffle<T> (this Random rng, T[] array)
    {
        int n = array.Length;
        while (n > 1) 
        {
            int k = rng.Next(n--);
            T temp = array[n];
            array[n] = array[k];
            array[k] = temp;
        }
    }
}

Usage:

用法:

var array = new int[] {1, 2, 3, 4};
var rng = new Random();
rng.Shuffle(array);
rng.Shuffle(array); // different order from first call to Shuffle

* For longer arrays, in order to make the (extremely large) number of permutations equally probable it would be necessary to run a pseudo-random number generator (PRNG) through many iterations for each swap to produce enough entropy. For a 500-element array only a very small fraction of the possible 500! permutations will be possible to obtain using a PRNG. Nevertheless, the Fisher-Yates algorithm is unbiased and therefore the shuffle will be as good as the RNG you use.

* 对于较长的数组,为了使(非常大的)排列数量相等,有必要通过每次交换的多次迭代运行伪随机数生成器 (PRNG) 以产生足够的熵。对于 500 个元素的数组,只有可能的 500 个元素的很小一部分!使用 PRNG 可以获得排列。尽管如此,Fisher-Yates 算法是无偏的,因此 shuffle 将与您使用的 RNG 一样好。

回答by Matt Howells

Jacco, your solution ising a custom IComparer isn't safe. The Sort routines require the comparer to conform to several requirements in order to function properly. First among them is consistency. If the comparer is called on the same pair of objects, it must always return the same result. (the comparison must also be transitive).

Jacco,您的解决方案是自定义 IComparer 不安全。Sort 例程要求比较器符合几个要求才能正常运行。其中首先是一致性。如果在同一对对象上调用比较器,则它必须始终返回相同的结果。(比较也必须是可传递的)。

Failure to meet these requirements can cause any number of problems in the sorting routine including the possibility of an infinite loop.

不满足这些要求可能会导致排序例程中出现许多问题,包括无限循环的可能性。

Regarding the solutions that associate a random numeric value with each entry and then sort by that value, these are lead to an inherent bias in the output because any time two entries are assigned the same numeric value, the randomness of the output will be compromised. (In a "stable" sort routine, whichever is first in the input will be first in the output. Array.Sort doesn't happen to be stable, but there is still a bias based on the partitioning done by the Quicksort algorithm).

关于将随机数值与每个条目关联然后按该值排序的解决方案,这些会导致输出中的固有偏差,因为任何时候两个条目分配相同的数值,输出的随机性都会受到影响。(在“稳定”排序例程中,输入中的第一个将在输出中的第一个。Array.Sort 碰巧不是稳定的,但仍然存在基于快速排序算法完成的分区的偏差)。

You need to do some thinking about what level of randomness you require. If you are running a poker site where you need cryptographic levels of randomness to protect against a determined attacker you have very different requirements from someone who just wants to randomize a song playlist.

您需要考虑一下您需要什么级别的随机性。如果您运行的扑克网站需要加密级别的随机性来抵御确定的攻击者,那么您的要求与只想随机化歌曲播放列表的人有很大不同。

For song-list shuffling, there's no problem using a seeded PRNG (like System.Random). For a poker site, it's not even an option and you need to think about the problem a lot harder than anyone is going to do for you on stackoverflow. (using a cryptographic RNG is only the beginning, you need to ensure that your algorithm doesn't introduce a bias, that you have sufficient sources of entropy, and that you don't expose any internal state that would compromise subsequent randomness).

对于歌曲列表改组,使用种子 PRNG(如 System.Random)没有问题。对于扑克网站来说,它甚至不是一个选项,你需要比任何人在 stackoverflow 上为你做的更难思考这个问题。(使用加密 RNG 只是开始,您需要确保您的算法不会引入偏差,您有足够的熵源,并且您不会暴露任何会损害后续随机性的内部状态)。