Original source: http://stackoverflow.com/questions/36894191/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to get a normal distribution within a range in numpy?
Asked by maple
In a machine learning task, we need to draw a group of random numbers from a normal distribution with bounds. We can draw a normally distributed number with np.random.normal(), but it doesn't offer any bound parameter. How can I do that?
Answered by toto_tico
The parametrization of truncnorm is complicated, so here is a function that translates the parametrization to something more intuitive:
from scipy.stats import truncnorm

def get_truncated_normal(mean=0, sd=1, low=0, upp=10):
    # truncnorm expects its bounds in standard-normal units, so rescale low and upp
    return truncnorm(
        (low - mean) / sd, (upp - mean) / sd, loc=mean, scale=sd)
How to use it?
Instantiate the generator with the parameters mean, standard deviation, and truncation range:
>>> X = get_truncated_normal(mean=8, sd=2, low=1, upp=10)
Then, you can use X to generate a value:
>>> X.rvs()
6.0491227353928894
Or, a numpy array with N generated values:
>>> X.rvs(10)
array([ 7.70231607, 6.7005871 , 7.15203887, 6.06768994, 7.25153472,
        5.41384242, 7.75200702, 5.5725888 , 7.38512757, 7.47567455])
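As a rough sketch of how the generator above could be checked, the snippet below draws a larger sample and confirms that every value stays inside the truncation range. It repeats get_truncated_normal so that it runs on its own. The sample size and the random_state seed are illustrative choices, not part of the original answer.

```python
from scipy.stats import truncnorm

def get_truncated_normal(mean=0, sd=1, low=0, upp=10):
    # Same helper as above: bounds are rescaled to standard-normal units
    a, b = (low - mean) / sd, (upp - mean) / sd
    return truncnorm(a, b, loc=mean, scale=sd)

X = get_truncated_normal(mean=8, sd=2, low=1, upp=10)

# random_state is an illustrative seed so that repeated runs give the same draws
samples = X.rvs(size=10_000, random_state=0)

# Every sample should fall inside [low, upp]
in_bounds = bool((samples >= 1).all() and (samples <= 10).all())
print(in_bounds)

# mean=8 describes the untruncated normal; cutting off the upper tail at 10
# pulls the mean of the truncated distribution below 8
print(X.mean())
```

Note that mean and sd describe the underlying normal before truncation, so the truncated distribution generally has a different mean and a smaller spread.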
A Visual Example
Here is the plot of three different truncated normal distributions:
import matplotlib.pyplot as plt

X1 = get_truncated_normal(mean=2, sd=1, low=1, upp=10)
X2 = get_truncated_normal(mean=5.5, sd=1, low=1, upp=10)
X3 = get_truncated_normal(mean=8, sd=1, low=1, upp=10)

fig, ax = plt.subplots(3, sharex=True)
# density=True replaces normed=True, which newer Matplotlib releases no longer accept
ax[0].hist(X1.rvs(10000), density=True)
ax[1].hist(X2.rvs(10000), density=True)
ax[2].hist(X3.rvs(10000), density=True)
plt.show()
Answered by bakkal
If you're looking for the truncated normal distribution, SciPy has a function for it called truncnorm.
The standard form of this distribution is a standard normal truncated to the range [a, b] — notice that a and b are defined over the domain of the standard normal. To convert clip values for a specific mean and standard deviation, use:
a, b = (myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std
truncnorm takes a and b as shape parameters.
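A minimal sketch of that conversion with a nonzero mean follows. The numeric values are made up for illustration. It also shows that loc and scale have to be passed along with a and b; without them the bounds apply to a standard normal.

```python
from scipy.stats import truncnorm

# Illustrative values, not taken from the original answer
my_mean, my_std = 5.0, 2.0
myclip_a, myclip_b = 2.0, 9.0

# Convert the clip values to standard-normal units
a, b = (myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std

# Without loc/scale the samples live in [a, b] on the standard-normal scale
standard = truncnorm(a, b).rvs(size=1000, random_state=1)

# With loc/scale the samples live in [myclip_a, myclip_b]
shifted = truncnorm(a, b, loc=my_mean, scale=my_std).rvs(size=1000, random_state=1)

print(standard.min(), standard.max())
print(shifted.min(), shifted.max())
```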
>>> from scipy.stats import truncnorm
>>> truncnorm(a=-2/3., b=2/3., scale=3).rvs(size=10)
array([-1.83136675, 0.77599978, -0.01276925, 1.87043384, 1.25024188,
0.59336279, -0.39343176, 1.9449987 , -1.97674358, -0.31944247])
The above example is bounded by -2 and 2 and returns 10 random variates (using the .rvs() method).
>>> min(truncnorm(a=-2/3., b=2/3., scale=3).rvs(size=10000))
-1.9996074381484044
>>> max(truncnorm(a=-2/3., b=2/3., scale=3).rvs(size=10000))
1.9998486576228549
Here's a histogram plot for -6, 6:
Answered by armatita
Besides @bakkal's suggestion (+1), you might also want to take a look at Vincent Mazet's recipe for achieving this, rewritten as the py-rtnorm module by Christoph Lassner.
Answered by Fayçal BENAHMED
You can subdivide your target range (by convention) into equal partitions, integrate the density over each partition and over the whole range, and then call a uniform sampler on each partition in proportion to its area. It's implemented in Python:
import scipy.stats
from scipy.integrate import quad_vec
# Integrate the standard normal pdf over [1, 4], with extra breakpoints in points
quad_vec(scipy.stats.norm.pdf, 1, 4, points=[0.5, 2.5, 3, 4], full_output=True)
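The answer only shows the integration call, so the following is one possible reading of the partition idea, not the author's code. It assumes the per-partition areas can be taken from differences of norm.cdf instead of numerical integration with quad_vec. The function name sample_piecewise_uniform and all of its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def sample_piecewise_uniform(low, high, n_bins, size, mean=0.0, sd=1.0, seed=None):
    """Approximate a truncated normal by uniform draws inside equal-width bins."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(low, high, n_bins + 1)
    # Area under the normal pdf in each bin, from CDF differences
    # (a swapped-in shortcut for integrating the pdf numerically)
    areas = np.diff(norm.cdf(edges, loc=mean, scale=sd))
    probs = areas / areas.sum()
    # Choose a bin for every sample in proportion to its area
    bins = rng.choice(n_bins, size=size, p=probs)
    # Draw uniformly inside the chosen bin
    return rng.uniform(edges[bins], edges[bins + 1])

samples = sample_piecewise_uniform(1, 4, n_bins=200, size=5000, seed=0)
print(samples.min(), samples.max())
```

The result is only piecewise uniform, so it approximates the truncated normal more closely as n_bins grows.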
Answered by Amuoeba
If you just want to work with numpy, you could also do something like this:
import numpy as np
int(np.clip(int(np.random.normal(mean, std)), min_size, max_size))
This just clips values below and above the range to your specified min and max. Note that the result is not a truncated normal: every out-of-range draw lands exactly on a bound.
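To illustrate the difference, here is a numpy-only sketch that compares clipping with rejection sampling, a different technique from the one in this answer that redraws out-of-range values instead of moving them to the bounds. The function name truncated_normal_rejection and the chosen bounds are hypothetical.

```python
import numpy as np

def truncated_normal_rejection(mean, std, low, high, size, seed=None):
    """Keep drawing normal samples and discard those outside [low, high]."""
    rng = np.random.default_rng(seed)
    out = np.empty(0)
    while out.size < size:
        draw = rng.normal(mean, std, size=size)
        out = np.concatenate([out, draw[(draw >= low) & (draw <= high)]])
    return out[:size]

rng = np.random.default_rng(0)
clipped = np.clip(rng.normal(0, 1, size=10_000), -1, 1)
rejected = truncated_normal_rejection(0, 1, -1, 1, size=10_000, seed=0)

# Fraction of samples sitting exactly on a bound
clipped_on_bound = np.mean((clipped == -1) | (clipped == 1))
rejected_on_bound = np.mean((rejected == -1) | (rejected == 1))
print(clipped_on_bound, rejected_on_bound)
```

With bounds one standard deviation from the mean, roughly a third of the clipped samples end up on a bound, while the rejection sampler leaves none there. The loop can be slow when the range covers very little of the distribution, in which case scipy.stats.truncnorm is likely the better choice.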