java.util.Random 真的那么随机吗?我怎么能生成52!(阶乘)可能的序列?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/51771206/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Is java.util.Random really that random? How can I generate 52! (factorial) possible sequences?
提问by Serj Ardovic
I've been using Random (java.util.Random)
to shuffle a deck of 52 cards. There are 52! (8.0658175e+67) possibilities. Yet, I've found out that the seed for java.util.Random
is a long
, which is much smaller at 2^64 (1.8446744e+19).
我一直Random (java.util.Random)
用来洗一副 52 张牌。有52个!(8.0658175e+67) 可能性。然而,我发现种子java.util.Random
是 a long
,它在 2^64 (1.8446744e+19) 处要小得多。
From here, I'm suspicious whether java.util.Random
is really that random; is it actually capable of generating all 52! possibilities?
从这里开始,我怀疑是否java.util.Random
真的那么随机;它真的能够生成所有52个吗!可能性?
If not, how can I reliably generate a better random sequence that can produce all 52! possibilities?
如果没有,我怎样才能可靠地生成一个可以产生所有 52 个的更好的随机序列!可能性?
采纳答案by NPE
Selecting a random permutation requires simultaneously more and less randomness than what your question implies. Let me explain.
选择随机排列同时需要比您的问题所暗示的更多和更少的随机性。让我解释。
The bad news: need more randomness.
坏消息:需要更多的随机性。
The fundamental flaw in your approach is that it's trying to choose between ~2226possibilities using 64 bits of entropy (the random seed). To fairly choose between ~2226possibilities you're going to have to find a way to generate 226 bits of entropy instead of 64.
您的方法的根本缺陷是它试图使用 64 位熵(随机种子)在 ~2 226 种可能性之间进行选择。为了在 ~2 226 种可能性之间公平地进行选择,您将不得不找到一种方法来生成 226 位熵而不是 64 位。
There are several ways to generate random bits: dedicated hardware, CPU instructions, OS interfaces, online services. There is already an implicit assumption in your question that you can somehow generate 64 bits, so just do whatever you were going to do, only four times, and donate the excess bits to charity. :)
产生随机位的方法有几种:专用硬件、CPU指令、OS接口、在线服务。您的问题中已经有一个隐含的假设,即您可以以某种方式生成 64 位,所以只需做任何您想做的事,只做四次,并将多余的位捐赠给慈善机构。:)
The good news: need less randomness.
好消息:需要更少的随机性。
Once you have those 226 random bits, the rest can be done deterministically and so the properties of java.util.Random
can be made irrelevant. Here is how.
一旦您拥有那 226 个随机位,其余的就可以确定性地完成,因此的属性java.util.Random
可以变得无关紧要。这是如何。
Let's say we generate all 52! permutations (bear with me) and sort them lexicographically.
假设我们生成了所有 52 个!排列(请耐心等待)并按字典顺序对它们进行排序。
To choose one of the permutations all we need is a single random integer between 0
and 52!-1
. That integer is our 226 bits of entropy. We'll use it as an index into our sorted list of permutations. If the random index is uniformly distributed, not only are you guaranteed that all permutations can be chosen, they will be chosen equiprobably(which is a stronger guarantee than what the question is asking).
要选择一种排列,我们只需要一个介于0
和之间的随机整数52!-1
。该整数是我们的 226 位熵。我们将使用它作为我们排列的排序列表的索引。如果随机索引是均匀分布的,则不仅可以保证可以选择所有排列,而且可以等概率地选择它们(这是比问题所要求的更有力的保证)。
Now, you don't actually need to generate all those permutations. You can produce one directly, given its randomly chosen position in our hypothetical sorted list. This can be done in O(n2) time using the Lehmer[1]code(also see numbering permutationsand factoriadic number system). The n here is the size of your deck, i.e. 52.
现在,您实际上不需要生成所有这些排列。考虑到它在我们假设的排序列表中随机选择的位置,您可以直接生成一个。这可以使用Lehmer [1]代码在 O(n 2) 时间内完成(另请参阅编号排列和阶乘数系统)。这里的 n 是你的牌组的大小,即 52。
There is a C implementation in this StackOverflow answer. There are several integer variables there that would overflow for n=52, but luckily in Java you can use java.math.BigInteger
. The rest of the computations can be transcribed almost as-is:
这个StackOverflow answer 中有一个 C 实现。有几个整数变量会在 n=52 时溢出,但幸运的是在 Java 中你可以使用java.math.BigInteger
. 其余的计算几乎可以按原样转录:
public static int[] shuffle(int n, BigInteger random_index) {
int[] perm = new int[n];
BigInteger[] fact = new BigInteger[n];
fact[0] = BigInteger.ONE;
for (int k = 1; k < n; ++k) {
fact[k] = fact[k - 1].multiply(BigInteger.valueOf(k));
}
// compute factorial code
for (int k = 0; k < n; ++k) {
BigInteger[] divmod = random_index.divideAndRemainder(fact[n - 1 - k]);
perm[k] = divmod[0].intValue();
random_index = divmod[1];
}
// readjust values to obtain the permutation
// start from the end and check if preceding values are lower
for (int k = n - 1; k > 0; --k) {
for (int j = k - 1; j >= 0; --j) {
if (perm[j] <= perm[k]) {
perm[k]++;
}
}
}
return perm;
}
public static void main (String[] args) {
System.out.printf("%s\n", Arrays.toString(
shuffle(52, new BigInteger(
"7890123456789012345678901234567890123456789012345678901234567890"))));
}
[1]Not to be confused with Lehrer. :)
[1]不要与Lehrer混淆。:)
回答by Peter O.
In general, a pseudorandom number generator (PRNG) can't choose from among all permutations of a 52-item list if its state length is less than 226 bits.
通常,如果状态长度小于 226 位,伪随机数生成器 (PRNG) 无法从 52 项列表的所有排列中进行选择。
java.util.Random
implements an algorithm with a modulus of 248; thus its state length is only 48 bits, so much less than the 226 bits I referred to. You will need to use another PRNG with a bigger state length — specifically, one with a period of 52 factorial or greater.
java.util.Random
实现一个模数为 2 48的算法;因此它的状态长度只有 48 位,比我提到的 226 位少得多。您将需要使用另一个具有更大状态长度的 PRNG——特别是一个周期为 52 阶乘或更大的 PRNG。
See also "Shuffling" in my article on random number generators.
另请参阅我关于随机数生成器的文章中的“洗牌” 。
This consideration is independent of the nature of the PRNG; it applies equally to cryptographic and noncryptographic PRNGs (of course, noncryptographic PRNGs are inappropriate whenever information security is involved).
这种考虑与 PRNG 的性质无关;它同样适用于加密和非加密 PRNG(当然,当涉及信息安全时,非加密 PRNG 是不合适的)。
Although java.security.SecureRandom
allows seeds of unlimited length to be passed in, the SecureRandom
implementation could use an underlying PRNG (e.g., "SHA1PRNG" or "DRBG"). And it depends on that PRNG's period (and to a lesser extent, state length) whether it's capable of choosing from among 52 factorial permutations. (Note that I define "state length"as the "maximum size of the seed a PRNG can take to initialize its state without shortening or compressing that seed").
虽然java.security.SecureRandom
允许传入无限长度的种子,但SecureRandom
实现可以使用底层 PRNG(例如,“SHA1PRNG”或“DRBG”)。并且它是否能够从 52 个阶乘排列中进行选择取决于该 PRNG 的周期(以及在较小程度上,状态长度)。(请注意,我将“状态长度”定义为“PRNG 可以用来初始化其状态而不缩短或压缩该种子的种子的最大大小”)。
回答by dasblinkenlight
Your analysis is correct: seeding a pseudo-random number generator with any specific seed must yield the same sequence after a shuffle, limiting the number of permutations that you could obtain to 264. This assertion is easy to verify experimentallyby calling Collection.shuffle
twice, passing a Random
object initialized with the same seed, and observing that the two random shuffles are identical.
您的分析是正确的:使用任何特定种子播种伪随机数生成器必须在 shuffle 后产生相同的序列,将您可以获得的排列数量限制为 2 64。通过调用两次,传递一个用相同种子初始化的对象,并观察到两次随机洗牌是相同的,这个断言很容易通过实验验证。Collection.shuffle
Random
A solution to this, then, is to use a random number generator that allows for a larger seed. Java provides SecureRandom
class that could be initialized with byte[]
array of virtually unlimited size. You could then pass an instance of SecureRandom
to Collections.shuffle
to complete the task:
因此,对此的解决方案是使用允许更大种子的随机数生成器。Java 提供SecureRandom
了可以用byte[]
几乎无限大小的数组初始化的类。然后,您可以传递SecureRandom
to的实例Collections.shuffle
来完成任务:
byte seed[] = new byte[...];
Random rnd = new SecureRandom(seed);
Collections.shuffle(deck, rnd);
回答by IvanK
If you consider the number as just an array of bits (or bytes) then maybe you could use the (Secure)Random.nextBytes
solutions suggested in this Stack Overflowquestion, and then map the array into a new BigInteger(byte[])
.
如果您认为该数字只是一个位(或字节)数组,那么也许您可以使用Random.nextBytes
此堆栈溢出问题中建议的(安全)解决方案,然后将该数组映射到new BigInteger(byte[])
.
回答by Matt Timmermans
Let me apologize in advance, because this is a little tough to understand...
让我提前道歉,因为这有点难以理解......
First of all, you already know that java.util.Random
is not completely random at all. It generates sequences in a perfectly predictable way from the seed. You are completely correct that, since the seed is only 64 bits long, it can only generate 2^64 different sequences. If you were to somehow generate 64 real random bits and use them to select a seed, you could not use that seed to randomly choose between allof the 52! possible sequences with equal probability.
首先,您已经知道这java.util.Random
根本不是完全随机的。它以完全可预测的方式从种子生成序列。您完全正确,因为种子只有 64 位长,所以它只能生成 2^64 个不同的序列。如果您要以某种方式生成 64 个真正的随机位并使用它们来选择种子,则您不能使用该种子在所有52 个中随机选择!等概率的可能序列。
However, this fact is of no consequenceas long as you're not actually going to generate more than 2^64 sequences, as long as there is nothing 'special' or 'noticeably special' about the 2^64 sequences that it cangenerate.
然而,这其实是无关紧要的,只要你不是真的要产生超过2 ^ 64个序列,只要有什么“特殊”或“显着特殊的”大约2 ^ 64个序列,它可以产生.
Lets say you had a much better PRNG that used 1000-bit seeds. Imagine you had two ways to initialize it -- one way would initialize it using the whole seed, and one way would hash the seed down to 64 bits before initializing it.
假设您有一个使用 1000 位种子的更好的 PRNG。想象一下,你有两种方法来初始化它——一种方法是使用整个种子来初始化它,另一种方法是在初始化之前将种子散列到 64 位。
If you didn't know which initializer was which, could you write any kind of test to distinguish them? Unless you were (un)lucky enough to end up initializing the bad one with the same64 bits twice, then the answer is no. You could not distinguish between the two initializers without some detailed knowledge of some weakness in the specific PRNG implementation.
如果您不知道哪个初始化程序是哪个,您可以编写任何类型的测试来区分它们吗?除非你(不)幸运地最终用相同的64 位初始化坏的两次,否则答案是否定的。如果不了解特定 PRNG 实现中的某些弱点,您将无法区分这两个初始化程序。
Alternatively, imagine that the Random
class had an array of 2^64 sequences that were selected completely and random at some time in the distant past, and that the seed was just an index into this array.
或者,假设Random
该类有一个由 2^64 个序列组成的数组,这些序列是在遥远的过去某个时间完全随机选择的,而种子只是该数组的索引。
So the fact that Random
uses only 64 bits for its seed is actually notnecessarily a problem statistically, as long as there is no significant chance that you will use the same seed twice.
因此,事实上,Random
使用仅64位它的种子其实并不一定是统计上的问题,只要有您将使用相同的种子两次没有显著的机会。
Of course, for cryptographicpurposes, a 64 bit seed is just not enough, because getting a system to use the same seed twice is computationally feasible.
当然,出于加密目的,64 位种子是不够的,因为让系统使用相同的种子两次在计算上是可行的。
EDIT:
编辑:
I should add that, even though all of the above is correct, that the actual implementation of java.util.Random
is not awesome. If you are writing a card game, maybe use the MessageDigest
API to generate the SHA-256 hash of "MyGameName"+System.currentTimeMillis()
, and use those bits to shuffle the deck. By the above argument, as long as your users are not really gambling, you don't have to worry that currentTimeMillis
returns a long. If your users arereally gambling, then use SecureRandom
with no seed.
我应该补充一点,即使上述所有内容都是正确的,但实际的实现java.util.Random
并不出色。如果您正在编写纸牌游戏,也许可以使用MessageDigest
API 生成 的 SHA-256 哈希值"MyGameName"+System.currentTimeMillis()
,并使用这些位来洗牌。由上面的说法,只要你的用户不是真的赌博,你就不用担心currentTimeMillis
返回一个long。如果你的用户是真正的赌博,然后用SecureRandom
无种子。
回答by Kevin
I'm going to take a bit of a different tack on this. You're right on your assumptions - your PRNG isn't going to be able to hit all 52! possibilities.
我将对此采取一些不同的策略。您的假设是正确的 - 您的 PRNG 无法达到全部 52 个!可能性。
The question is: what's the scale of your card game?
问题是:你的纸牌游戏的规模是多少?
If you're making a simple klondike-style game?Then you definitely don't needall 52! possibilities. Instead, look at it like this: a player will have 18 quintilliondistinct games. Even accounting for the 'Birthday Problem', they'd have to play billions of hands before they'd run into the first duplicate game.
如果你正在制作一个简单的克朗代克风格的游戏?那么你绝对不需要全部 52 个!可能性。相反,看它是这样的:一个玩家将有18个百万的三次不同的游戏。即使考虑到“生日问题”,他们在遇到第一个重复游戏之前也必须玩数十亿手牌。
If you're making a monte-carlo simulation?Then you're probablyokay. You might have to deal with artifacts due to the 'P' in PRNG, but you're probably not going to run into problems simply due to a low seed space (again, you're looking at quintillions of unique possibilities.) On the flip side, if you're working with large iteration count, then, yeah, your low seed space might be a deal-breaker.
如果您要进行蒙特卡罗模拟?那你可能没问题。由于 PRNG 中的“P”,您可能不得不处理伪像,但您可能不会仅仅因为种子空间低而遇到问题(同样,您正在寻找成千上万种独特的可能性。)另一方面,如果您正在处理大量迭代,那么,是的,您的低种子空间可能会破坏交易。
If you're making a multiplayer card game, particularly if there's money on the line?Then you're going to need to do some googling on how the online poker sites handled the same problem you're asking about. Because while the low seed space issue isn't noticeableto the average player, it is exploitableif it's worth the time investment. (The poker sites all went through a phase where their PRNGs were 'hacked', letting someone see the hole cards of all the other players, simply by deducing the seed from exposed cards.) If this is the situation you're in, don'tsimply find a better PRNG - you'll need to treat it as seriously as a Crypto problem.
如果您正在制作多人纸牌游戏,尤其是在线上有钱的时候?然后,您将需要对在线扑克网站如何处理您所询问的相同问题进行一些谷歌搜索。因为虽然普通玩家不会注意到低种子空间问题,但如果值得花时间投资,它是可以利用的。(扑克网站都经历过一个阶段,他们的 PRNG 被“黑客攻击”,让某人看到所有其他玩家的底牌,只需从暴露的牌中推断出种子。)如果您处于这种情况,请不要不是简单地找到更好的 PRNG - 您需要像对待加密问题一样认真对待它。
回答by Thorsten S.
Short solution which is essentially the same of dasblinkenlight:
与 dasblinkenlight 基本相同的简短解决方案:
// Java 7
SecureRandom random = new SecureRandom();
// Java 8
SecureRandom random = SecureRandom.getInstanceStrong();
Collections.shuffle(deck, random);
You don't need to worry about the internal state. Long explanation why:
您无需担心内部状态。详细解释为什么:
When you create a SecureRandom
instance this way, it accesses an OS specific
true random number generator. This is either an entropy pool where values are
accessed which contain random bits (e.g. for a nanosecond timer the nanosecond
precision is essentially random) or an internal hardware number generator.
当您以SecureRandom
这种方式创建实例时,它会访问操作系统特定的真随机数生成器。这是一个熵池,其中包含随机位的值被访问(例如,对于纳秒计时器,纳秒精度基本上是随机的)或内部硬件数字生成器。
This input (!) which may still contain spurious traces are fed into a
cryptographically strong hash which removes those traces. That is the reason those CSPRNGs are used, not for creating those numbers themselves! The SecureRandom
has a counter which traces how many bits were used (getBytes()
, getLong()
etc.) and refills the SecureRandom
with entropy bits when necessary.
这个可能仍然包含虚假痕迹的输入 (!) 被输入到一个加密的强散列中,以删除这些痕迹。这就是使用这些 CSPRNG 的原因,而不是用于创建这些数字本身!在SecureRandom
具有跟踪多少位被用于(一个计数器getBytes()
,getLong()
等)和笔芯的SecureRandom
熵比特必要时。
In short: Simply forget objections and use SecureRandom
as true random number generator.
简而言之:只需忘记反对意见并SecureRandom
用作真正的随机数生成器。
回答by Artelius
A very simple algorithm is to apply SHA-256 to a sequence of integers incrementing from 0 upwards. (A salt can be appended if desired to "get a different sequence".) If we assume that the output of SHA-256 is "as good as" uniformly distributed integers between 0 and 2256- 1 then we have enough entropy for the task.
一个非常简单的算法是将 SHA-256 应用于从 0 向上递增的整数序列。(如果需要,可以附加盐以“获得不同的序列”。)如果我们假设 SHA-256 的输出与 0 到 2 256- 1之间的均匀分布整数“一样好”,那么我们有足够的熵任务。
To get a permutation from the output of SHA256 (when expressed as an integer) one simply needs to reduce it modulo 52, 51, 50... as in this pseudocode:
要从 SHA256 的输出(当表示为整数时)中获得排列,只需将其以 52、51、50...
deck = [0..52]
shuffled = []
r = SHA256(i)
while deck.size > 0:
pick = r % deck.size
r = floor(r / deck.size)
shuffled.append(deck[pick])
delete deck[pick]