Python: generating a list of random numbers that sum to 1
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/18659858/
Generating a list of random numbers, summing to 1
Asked by Tom Kealy
How would I make a list of N (say 100) random numbers, so that their sum is 1?
I can make a list of random numbers with
r = [ran.random() for i in range(1,100)]
How would I modify this so that the list sums to 1? (This is for a probability simulation.)
Accepted answer by sega_sai
The simplest solution is indeed to take N random values and divide by the sum.
A more generic solution is to use the Dirichlet distribution (http://en.wikipedia.org/wiki/Dirichlet_distribution), which is available in numpy.
By changing the parameters of the distribution you can change the "randomness" of the individual numbers:
>>> import numpy as np, numpy.random
>>> print(np.random.dirichlet(np.ones(10),size=1))
[[ 0.01779975 0.14165316 0.01029262 0.168136 0.03061161 0.09046587
   0.19987289 0.13398581 0.03119906 0.17598322]]
>>> print(np.random.dirichlet(np.ones(10)/1000.,size=1))
[[ 2.63435230e-115 4.31961290e-209 1.41369771e-212 1.42417285e-188
   0.00000000e+000 5.79841280e-143 0.00000000e+000 9.85329725e-005
   9.99901467e-001 8.37460207e-246]]
>>> print(np.random.dirichlet(np.ones(10)*1000.,size=1))
[[ 0.09967689 0.10151585 0.10077575 0.09875282 0.09935606 0.10093678
   0.09517132 0.09891358 0.10206595 0.10283501]]
Depending on the main parameter, the Dirichlet distribution will give vectors where all the values are close to 1./N (where N is the length of the vector), or vectors where most of the values are ~0 and a single value is close to 1, or something in between those possibilities.
EDIT (5 years after the original answer): Another useful fact about the Dirichlet distribution is that you get it naturally if you generate a set of Gamma-distributed random variables and then divide them by their sum.
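The Gamma construction mentioned in the edit can be sketched as follows. This is a minimal illustration, not code from the answer: the function name dirichlet_via_gamma and its rng argument are assumptions made for the example, and it uses numpy's Generator.gamma with scale 1.

```python
import numpy as np

def dirichlet_via_gamma(alpha, rng=None):
    """Draw one Dirichlet(alpha) sample by normalizing independent Gamma draws.

    Hypothetical helper for illustration; not part of numpy.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = np.asarray(alpha, dtype=float)
    # One Gamma(shape=alpha_i, scale=1) draw per component.
    g = rng.gamma(shape=alpha, scale=1.0)
    # Dividing by the sum puts the vector on the simplex (entries sum to 1).
    return g / g.sum()

sample = dirichlet_via_gamma(np.ones(10))
print(sample, sample.sum())
```

With very small alpha values the Gamma draws can all underflow to zero, so in practice np.random.dirichlet is probably the safer choice; the sketch is only meant to show where the distribution comes from.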
Answer by Paul Evans
You could easily do this with:
r.append(1 - sum(r))
Answer by askewchan
The best way to do this is to simply make a list of as many numbers as you wish, then divide them all by the sum. They are totally random this way.
import random as ran  # assumed alias, matching the name used in the question
r = [ran.random() for i in range(100)]  # range(100) gives 100 values; range(1,100) would give only 99
s = sum(r)
r = [i/s for i in r]
or, as suggested by @TomKealy, keep the sum and creation in one loop:
rs = []
s = 0
for i in range(100):
    r = ran.random()
    s += r
    rs.append(r)
rs = [x/s for x in rs]  # normalize so the list sums to 1 (this step is not part of the timed version below)
For the fastest performance, use numpy:
import numpy as np
a = np.random.random(100)
a /= a.sum()
You can also draw the random numbers from any distribution you want before normalizing:
a = np.random.normal(size=100)
a /= a.sum()  # note: normal draws can be negative, so the entries sum to 1 but are not all valid probabilities
---- Timing ----
In [52]: %%timeit
...: r = [ran.random() for i in range(1,100)]
...: s = sum(r)
...: r = [ i/s for i in r ]
....:
1000 loops, best of 3: 231 μs per loop
In [53]: %%timeit
....: rs = []
....: s = 0
....: for i in range(100):
....:     r = ran.random()
....:     s += r
....:     rs.append(r)
....:
10000 loops, best of 3: 39.9 μs per loop
In [54]: %%timeit
....: a = np.random.random(100)
....: a /= a.sum()
....:
10000 loops, best of 3: 21.8 μs per loop
Answer by guessing
Generate 100 random numbers; the range doesn't matter. Sum the numbers generated, then divide each individual number by the total.
Answer by Mike Housky
Dividing each number by the total may not give you the distribution you want. For example, with two numbers, the pair x,y = random.random(), random.random() picks a point uniformly on the square 0<=x<1, 0<=y<1. Dividing by the sum "projects" that point (x,y) onto the line x+y=1 along the line from (x,y) to the origin. Points near (0.5,0.5) will be much more likely than points near (0.1,0.9).
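The bias described above can be checked empirically for the two-number case. The sketch below is an illustration only: the helper fraction_near and its parameters (trials, width, seed) are made up for this example. It estimates how often the first coordinate lands near a target value, either after dividing by the sum or when drawn directly as x = random.random().

```python
import random

def fraction_near(target, normalize, trials=100_000, width=0.05, seed=0):
    """Estimate P(|x - target| < width) for the first coordinate x.

    normalize=True:  x = a / (a + b) with a, b uniform on [0, 1)
    normalize=False: x uniform on [0, 1) directly (y would be 1 - x)
    Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.random()
        if normalize:
            y = rng.random()
            total = x + y
            if total == 0.0:
                continue  # extremely unlikely; avoids division by zero
            x = x / total
        if abs(x - target) < width:
            hits += 1
    return hits / trials

for target in (0.5, 0.1):
    print(target,
          "scaled:", fraction_near(target, normalize=True),
          "uniform:", fraction_near(target, normalize=False))
```

If the argument above is right, the scaled estimate near 0.5 should come out clearly larger than the one near 0.1, while the two uniform estimates should be roughly equal.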
For two variables, then, x = random.random(), y=1-x gives a uniform distribution along the geometrical line segment.
With 3 variables, you are picking a random point in a cube and projecting it (radially, through the origin) onto a triangle in the plane x+y+z=1, but points near the center of the triangle will be more likely than points near the vertices. If you need an unbiased choice of points in that triangle, scaling is no good.
The problem gets complicated in n dimensions, but you can get a low-precision (but high-accuracy, for all you laboratory science fans!) estimate by picking uniformly from the set of all n-tuples of non-negative integers adding up to N, and then dividing each of them by N.
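One way to pick such an n-tuple uniformly is the standard "stars and bars" construction, sketched below. This is not necessarily the algorithm the answer refers to; the function name random_composition and its rng argument are assumptions made for the example.

```python
import random

def random_composition(n, total, rng=random):
    """Return n non-negative integers summing to `total`, uniform over all such tuples.

    Stars and bars: choose n-1 distinct "bar" positions among total + n - 1 slots;
    the numbers of "stars" between consecutive bars are the parts.
    Hypothetical helper for illustration.
    """
    bars = sorted(rng.sample(range(total + n - 1), n - 1))
    # Sentinel bars just before the first slot and just after the last one.
    edges = [-1] + bars + [total + n - 1]
    return [edges[i + 1] - edges[i] - 1 for i in range(n)]

N = 1_000_000
counts = random_composition(100, N)
probs = [c / N for c in counts]  # 6-digit values that sum to 1 up to float rounding
print(sum(counts), sum(probs))
```

Because every set of bar positions corresponds to exactly one tuple, sampling the bars uniformly without replacement should give each tuple the same probability.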
I recently came up with an algorithm to do that for modest-sized n, N. It should work for n=100 and N = 1,000,000 to give you 6-digit randoms. See my answer at:
Answer by pjs
Create a list consisting of 0 and 1, then add 99 random numbers. Sort the list. Successive differences will be the lengths of intervals that add up to 1.
I'm not fluent in Python, so forgive me if there's a more Pythonic way of doing this. I hope the intent is clear though:
import random
values = [0.0, 1.0]
for i in range(99):
    values.append(random.random())
values.sort()
results = []
for i in range(1,101):
    results.append(values[i] - values[i-1])
print results  # Python 2 print statement
Here's an updated implementation in Python 3:
import random
def sum_to_one(n):
    values = [0.0, 1.0] + [random.random() for _ in range(n - 1)]
    values.sort()
    return [values[i+1] - values[i] for i in range(n)]
print(sum_to_one(100))
Answer by litepresence
In the spirit of "divide each element in list by sum of list", this definition will create a list of random numbers of length = PARTS, sum = TOTAL, with each element rounded to PLACES (or None):
import random
import time
PARTS = 5
TOTAL = 10
PLACES = 3
# Note: log() and info are not defined in this snippet; they appear to be
# provided by the environment the author ran it in.
def random_sum_split(parts, total, places):
    a = []
    for n in range(parts):
        a.append(random.random())
    b = sum(a)
    c = [x/b for x in a]
    d = sum(c)
    e = c
    if places is not None:
        e = [round(x*total, places) for x in c]
    f = e[-(parts-1):]
    g = total - sum(f)
    if places is not None:
        g = round(g, places)
    f.insert(0, g)
    log(a)
    log(b)
    log(c)
    log(d)
    log(e)
    log(f)
    log(g)
    return f
def tick():
    if info.tick == 1:
        start = time.time()
        alpha = random_sum_split(PARTS, TOTAL, PLACES)
        log('********************')
        log('***** RESULTS ******')
        log('alpha: %s' % alpha)
        log('total: %.7f' % sum(alpha))
        log('parts: %s' % PARTS)
        log('places: %s' % PLACES)
        end = time.time()
        log('elapsed: %.7f' % (end-start))
result:
Waiting...
Saved successfully.
[2014-06-13 00:01:00] [0.33561018369775897, 0.4904215932650632, 0.20264927800402832, 0.118862130636748, 0.03107818050878819]
[2014-06-13 00:01:00] 1.17862136611
[2014-06-13 00:01:00] [0.28474809073311597, 0.41609766067850096, 0.17193755673414868, 0.10084844382959707, 0.02636824802463724]
[2014-06-13 00:01:00] 1.0
[2014-06-13 00:01:00] [2.847, 4.161, 1.719, 1.008, 0.264]
[2014-06-13 00:01:00] [2.848, 4.161, 1.719, 1.008, 0.264]
[2014-06-13 00:01:00] 2.848
[2014-06-13 00:01:00] ********************
[2014-06-13 00:01:00] ***** RESULTS ******
[2014-06-13 00:01:00] alpha: [2.848, 4.161, 1.719, 1.008, 0.264]
[2014-06-13 00:01:00] total: 10.0000000
[2014-06-13 00:01:00] parts: 5
[2014-06-13 00:01:00] places: 3
[2014-06-13 00:01:00] elapsed: 0.0054131
Answer by litepresence
In the spirit of pjs's method:
a = [0, total] + [random.random()*total for i in range(parts-1)]
a.sort()
b = [(a[i] - a[i-1]) for i in range(1, (parts+1))]
If you want them rounded to decimal places:
if places is None:
    return b
else:
    b.pop()
    c = [round(x, places) for x in b]
    c.append(round(total-sum(c), places))
    return c
Answer by Caner Erden
In addition to @pjs's solution we can define a function with two parameters as well.
import numpy as np
def sum_to_x(n, x):
    values = [0.0, x] + list(np.random.uniform(low=0.0,high=x,size=n-1))
    values.sort()
    return [values[i+1] - values[i] for i in range(n)]
sum_to_x(10, 0.6)
Out:
[0.079058655684546,
0.04168649034779022,
0.09897491411670578,
0.065152293196646,
0.000544800901222664,
0.12329662037166766,
0.09562168167787738,
0.01641359261155284,
0.058273232428072474,
0.020977718663918954]
Answer by Orvar Korvar
I have solved this question here in a fast and efficient manner: