What does numpy.random.seed(0) do?
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/21494489/
Asked by covariance
What does np.random.seed do in the below code from a Scikit-Learn tutorial? I'm not very familiar with NumPy's random state generator stuff, so I'd really appreciate a layman's terms explanation of this.
np.random.seed(0)
indices = np.random.permutation(len(iris_X))
Accepted answer by John1024
np.random.seed(0) makes the random numbers predictable
>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55, 0.72, 0.6 , 0.54])
>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55, 0.72, 0.6 , 0.54])
With the seed reset (every time), the same set of numbers will appear every time.
If the random seed is not reset, different numbers appear with every invocation:
>>> numpy.random.rand(4)
array([ 0.42, 0.65, 0.44, 0.89])
>>> numpy.random.rand(4)
array([ 0.96, 0.38, 0.79, 0.53])
In the simplest case, a (pseudo-)random number generator works by starting with a number (the seed), multiplying it by a large number, adding an offset, then taking the result modulo some fixed number. The resulting number is then used as the seed to generate the next "random" number. When you set the seed (every time), it does the same thing every time, giving you the same numbers. NumPy's generator uses a more elaborate algorithm (the Mersenne Twister), but the principle is the same.
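The multiply, add, and take-modulo recipe described above can be sketched in a few lines of Python. This is only a toy illustration: the function name `lcg` and the constants are assumptions chosen for the example, and it is not the algorithm NumPy actually uses.

```python
from itertools import islice


def lcg(seed, multiplier=1103515245, increment=12345, modulus=2**31):
    """Toy generator: yields floats in [0, 1) from an integer seed.

    The constants are illustrative only; this is not NumPy's algorithm.
    """
    state = seed
    while True:
        # The new state depends only on the previous state, so the
        # whole sequence is fixed once the seed is chosen.
        state = (multiplier * state + increment) % modulus
        yield state / modulus


first_run = list(islice(lcg(0), 4))   # seed 0
second_run = list(islice(lcg(0), 4))  # seed 0 again: same numbers
other_seed = list(islice(lcg(1), 4))  # different seed: different numbers

print(first_run == second_run)
print(first_run == other_seed)
```

Re-creating the generator with the same seed replays the same sequence, which is the behaviour np.random.seed(0) gives you for NumPy's global generator.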
If you want seemingly random numbers, do not set the seed. If you have code that uses random numbers that you want to debug, however, it can be very helpful to set the seed before each run so that the code does the same thing every time you run it.
To get the most random numbers for each run, call numpy.random.seed(). This will cause numpy to set the seed to a random number obtained from /dev/urandom or its Windows analog or, if neither of those is available, it will use the clock.
For more information on using seeds to generate pseudo-random numbers, see Wikipedia.
Answer by ntg
As noted, numpy.random.seed(0) sets the random seed to 0, so the pseudo-random numbers you get from numpy.random will start from the same point. This can be good for debugging in some cases. HOWEVER, after some reading, this seems to be the wrong way to go about it if you have threads, because it is not thread-safe.
From differences-between-numpy-random-and-random-random-in-python:
For numpy.random.seed(), the main difficulty is that it is not thread-safe - that is, it's not safe to use if you have many different threads of execution, because it's not guaranteed to work if two different threads are executing the function at the same time. If you're not using threads, and if you can reasonably expect that you won't need to rewrite your program this way in the future, numpy.random.seed() should be fine for testing purposes. If there's any reason to suspect that you may need threads in the future, it's much safer in the long run to do as suggested, and to make a local instance of the numpy.random.Random class [the actual class name is numpy.random.RandomState, as used in the example below]. As far as I can tell, random.random.seed() is thread-safe (or at least, I haven't found any evidence to the contrary).
Example of how to go about this:
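from numpy.random import RandomState
# Unseeded: each instance is seeded from the OS, so the results differ.
prng = RandomState()
print(prng.permutation(10))
prng = RandomState()
print(prng.permutation(10))
# Seeded with 42: both instances produce the same permutation.
prng = RandomState(42)
print(prng.permutation(10))
prng = RandomState(42)
print(prng.permutation(10))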
May give:
[3 0 4 6 8 2 1 9 7 5]
[1 6 9 0 2 7 8 3 5 4]
[8 1 5 0 7 2 9 4 3 6]
[8 1 5 0 7 2 9 4 3 6]
Lastly, note that there might be cases where initializing to 0 (as opposed to a seed whose bits are not all 0) may result in non-uniform distributions for the first few iterations because of the way xor works, but this depends on the algorithm, and is beyond my current worries and the scope of this question.
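The advice to use a local generator instead of the global seed can be sketched as follows. This is a minimal illustration, assuming each thread gets its own RandomState; the helper names `worker` and `run_workers` are made up for this example.

```python
import threading

from numpy.random import RandomState


def worker(seed, results, index):
    # Each thread owns a private generator, so no thread touches
    # the global state that numpy.random.seed() would modify.
    prng = RandomState(seed)
    results[index] = prng.permutation(10)


def run_workers(seeds):
    """Start one thread per seed and collect one permutation from each."""
    results = [None] * len(seeds)
    threads = [
        threading.Thread(target=worker, args=(seed, results, i))
        for i, seed in enumerate(seeds)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


first = run_workers([42, 43])
second = run_workers([42, 43])

# The same seeds give the same permutations, whatever the thread scheduling.
print(all((a == b).all() for a, b in zip(first, second)))
```

Because every thread draws from its own instance, the result for a given seed does not depend on the order in which the threads happen to run.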
Answer by Zhun Chen
If you call np.random.seed(a_fixed_number) every time before you call one of numpy's other random functions, the result will be the same:
>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135 0.71518937 0.60276338 0.54488318]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135 0.71518937 0.60276338 0.54488318]
However, if you set the seed only once and then call the random functions several times, each later call still gives a different result:
>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> print(np.random.permutation(10))
[3 5 1 2 9 8 0 6 7 4]
>>> print(np.random.permutation(10))
[2 3 8 4 5 1 0 6 9 7]
>>> print(np.random.rand(4))
[0.64817187 0.36824154 0.95715516 0.14035078]
>>> print(np.random.rand(4))
[0.87008726 0.47360805 0.80091075 0.52047748]
Answer by sunidhi mittal
A random seed specifies the start point when a computer generates a random number sequence.
For example, let's say you wanted to generate a random number in Excel (Note: Excel sets a limit of 9999 for the seed). If you enter a number into the Random Seed box during the process, you'll be able to use the same set of random numbers again. If you typed “77” into the box, and typed “77” the next time you run the random number generator, Excel will display that same set of random numbers. If you type “99”, you'll get an entirely different set of numbers. But if you revert back to a seed of 77, then you'll get the same set of random numbers you started with.
A pseudo-random number generator follows a fixed rule. As a toy example: “take a number x, compute 900 + x, then subtract 52.” In order for the process to start, you have to specify a starting number, x (the seed). Let's take the starting number 77:
Add 900 + 77 = 977
Subtract 52 = 925
Following the same algorithm, the second “random” number would be:
900 + 925 = 1825
Subtract 52 = 1773
This simple example follows a pattern, but the algorithms behind computer number generation are much more complicated.
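The arithmetic above can be written out as a short Python sketch. The function name `toy_sequence` is made up for this illustration, and the rule is the toy one from the text, not a real random number generator.

```python
def toy_sequence(seed, count):
    """Apply the toy rule x -> 900 + x - 52 repeatedly, starting from seed."""
    numbers = []
    x = seed
    for _ in range(count):
        x = 900 + x - 52  # each result becomes the input for the next step
        numbers.append(x)
    return numbers


print(toy_sequence(77, 2))  # [925, 1773]
print(toy_sequence(77, 2))  # the same seed gives the same numbers again
print(toy_sequence(99, 2))  # a different seed gives different numbers
```

Starting from 77 always reproduces 925 and then 1773, which is why re-using a seed gives back the same "random" numbers.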
Answer by Prashant
All the random numbers generated after setting a particular seed value are the same across all platforms/systems.
Answer by A Santosh
I have used this very often in neural networks. It is well known that when we start training a neural network, we randomly initialise the weights. The model is then trained from these weights on a particular dataset. After a number of epochs, you get a trained set of weights.
Now suppose you want to train again from scratch, or you want to pass the model to others to reproduce your results. The weights will again be initialised to random numbers, which will mostly be different from the earlier ones. The trained weights obtained after the same number of epochs (keeping the same data and other parameters) will differ from the earlier ones. The problem is that your model is no longer reproducible: every time you train it from scratch, it gives you a different set of weights. This is because the model is initialised with different random numbers every time.
What if, every time you start training from scratch, the model is initialised to the same set of random initial weights? In this case your model becomes reproducible. This is achieved by numpy.random.seed(0). By setting seed() to a particular number, you always get the same set of random numbers.
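As a minimal sketch of the idea, assuming the weights are drawn with NumPy's global generator: the helper `init_weights`, the layer sizes, and the 0.01 scale are hypothetical choices for this illustration, not part of any particular framework.

```python
import numpy as np


def init_weights(n_inputs, n_outputs, seed=0):
    """Return a small random weight matrix, reproducible for a given seed."""
    np.random.seed(seed)  # fix the starting point of the global generator
    # The 0.01 scale is an arbitrary choice for this illustration.
    return np.random.randn(n_inputs, n_outputs) * 0.01


run_a = init_weights(4, 3)           # first "training run"
run_b = init_weights(4, 3)           # a fresh run with the same seed
run_c = init_weights(4, 3, seed=1)   # a run with a different seed

print(np.array_equal(run_a, run_b))  # True: identical starting weights
print(np.array_equal(run_a, run_c))  # False: a different seed gives different weights
```

Frameworks that keep their own random generators usually need their own seed set as well, so seeding NumPy alone may not make a whole training run reproducible.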
Answer by cjHerold
Imagine you are showing someone how to code something with a bunch of "random" numbers. By using numpy's seed, they can use the same seed number and get the same set of "random" numbers.
So it's not exactly random, because an algorithm spits out the numbers, but it looks like a randomly generated bunch.
Answer by Ruslan S.
There is a nice explanation in the NumPy docs: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.RandomState.html
It refers to the Mersenne Twister pseudo-random number generator. More details on the algorithm here: https://en.wikipedia.org/wiki/Mersenne_Twister
Answer by Humayun Ahmad Rajib
numpy.random.seed(0)
numpy.random.randint(10, size=5)
This produces the following output:
array([5, 0, 3, 3, 7])
Again, if we run the same code we will get the same result.
Now, if we change the seed value from 0 to 1 (or any other value):
numpy.random.seed(1)
numpy.random.randint(10, size=5)
This produces the following output:
array([5, 8, 9, 5, 0])
Now the output is not the same as above.

