Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/4697836/

Time: 2020-08-18 17:00:40  Source: igfitidea

python 3.1 - Creating normal distribution

python

Asked by jimy

I have scipy and numpy, Python v3.1

I need to create a 1D array of length 3 million, using random numbers between (and including) 100 and 60,000. It has to fit a normal distribution.

Using 'a = numpy.random.standard_normal(3000000)', I get a normal distribution for that required length; not sure how to achieve the required range.

Accepted answer by Apalala

A standard normal distribution has mean 0 and standard deviation 1. What I understand from your requirements is that you need one with a (mean, standard deviation) of ((60000+100)/2, (60000-100)/2). Take each value from the standard_normal() result, multiply it by the new standard deviation, and add the new mean.

I haven't used NumPy, but a quick search of the docs says that you can achieve what you want directly by using numpy.random.normal().
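
As a rough sketch of the two routes described in this answer: the first rescales standard_normal() samples by hand, the second passes the same parameters to numpy.random.normal(). The value of std here is only an illustrative assumption (half the range divided by six); the question does not fix it.

```python
import numpy

n = 3000000
mean = (60000 + 100) / 2      # midpoint of the requested range
std = (60000 - 100) / 2 / 6   # assumption for illustration, not given in the question

# Route 1: rescale standard normal samples by hand (multiply by std, add mean).
a = numpy.random.standard_normal(n) * std + mean

# Route 2: let numpy.random.normal apply the same shift and scale.
b = numpy.random.normal(loc=mean, scale=std, size=n)

# Both arrays should have a sample mean and standard deviation close to the targets.
print(a.mean(), a.std())
print(b.mean(), b.std())
```

Both routes draw from the same distribution; the second is simply shorter.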

One last tidbit: normal distributions are not bounded. That means there isn't a value with probability zero. Your requirements should be in terms of means and variances (or standard deviations), and not of limits.

Answer by Eric Pauley

Try this nice little method:

You'll want a method that just makes one random number.

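import random
min_value, max_value, numitems = 100, 60000, 3000000  # bounds and count taken from the question
# randint includes both endpoints; note that these values are uniform, not normally distributed
numbers = [random.randint(min_value, max_value) for _ in range(numitems)]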

This will give you a list with numitems random numbers between min and max.

Of course, 3000000 is a lot of items to have in memory. Consider making the random numbers as they are needed by the program.
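
One way to make the numbers only as they are needed is a generator. This is a minimal sketch; the function name random_numbers and its parameters are made up for illustration, and like the list version it draws uniform integers rather than normally distributed ones.

```python
import random

def random_numbers(count, low, high):
    """Yield `count` uniform random integers in [low, high], one at a time."""
    for _ in range(count):
        yield random.randint(low, high)

# Nothing is generated until values are requested, so no 3,000,000-item list is built.
stream = random_numbers(3000000, 100, 60000)
first_five = [next(stream) for _ in range(5)]
print(first_five)
```

The remaining values stay ungenerated until the program iterates further over stream.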

Answer by fmark

If you want a truly random normal distribution, you can't guarantee how far the numbers will spread. You can reduce the probability of outliers, however, by specifying the standard deviation.

>>> n = 3000000
>>> sigma5 = 1.0 / 1744278  # Probability of a value falling more than 5 std. dev. from the mean
>>> n * sigma5
1.7199093263803131  # Expect about 1.7 values in 3 million outside the range at 5 std. dev.
>>> sigma6 = 1.0 / 506800000
>>> n * sigma6
0.0059194948697711127  # Expect about 0.006 values in 3 million outside the range at 6 std. dev.
>>> sigma7 = 1.0 / 390600000000
>>> n * sigma7
7.6804915514592934e-06

Therefore, in this case, ensuring that the standard deviation is only 1/6 or 1/7 of half the range will give you reasonable confidence that your data will not exceed the range.
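
The "1 in N" tail probabilities for 5, 6 and 7 standard deviations can be reproduced, at least approximately, with the complementary error function from the standard library. This is a sketch for checking the arithmetic; the helper name prob_outside is made up for illustration.

```python
import math

def prob_outside(k):
    """Two-sided probability that a normal sample lies more than k std. dev. from the mean."""
    return math.erfc(k / math.sqrt(2))

n = 3000000
for k in (5, 6, 7):
    p = prob_outside(k)
    # Print k, the "1 in N" odds, and the expected number of outliers among n samples.
    print(k, round(1 / p), n * p)
```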

>>> import numpy
>>> value_range = 60000 - 100
>>> spread = (value_range / 2) / 6  # Anything outside the range will be at least six std. dev. from the mean
>>> mean = (60000 + 100) / 2
>>> a = numpy.random.normal(loc=mean, scale=spread, size=n)
>>> min(a)
6320.0238199673404
>>> max(a)
55044.015566089176

Of course, you can still get values that fall outside the range here.
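
If the limits really must hold, one possible workaround (not part of this answer, and strictly speaking it yields a truncated normal rather than a true normal distribution) is to redraw whichever samples land outside the range. A minimal sketch, assuming the same mean and six-sigma spread as above:

```python
import numpy

low, high = 100, 60000
n = 3000000
mean = (high + low) / 2
spread = (high - low) / 2 / 6  # assumption: half the range equals six std. dev.

a = numpy.random.normal(loc=mean, scale=spread, size=n)

# Redraw only the out-of-range entries until none remain.
outside = (a < low) | (a > high)
while outside.any():
    a[outside] = numpy.random.normal(loc=mean, scale=spread, size=outside.sum())
    outside = (a < low) | (a > high)

print(a.min(), a.max())
```

With a spread this narrow the loop will usually not run at all; it matters more when the standard deviation is a larger fraction of the range.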
