Python: choose a list variable given the probability of each variable
Original URL: http://stackoverflow.com/questions/4437250/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Choose list variable given probability of each variable
Asked by Roughmar
I've been trying to code a program that uses the softmax activation function in the middle.
Right now, I have a list of probabilities like this:
P = [0.10, 0.25, 0.60, 0.05]
The sum of all the variables in P is always 1.
I wanted a way to pick an index of the list according to the probability attached to it. In other words, a function that returns:
0 - 10% of the time
1 - 25% of the time
2 - 60% of the time
3 - 5% of the time
I've absolutely no idea where to start on this. Any help would be appreciated. :)
Answered by slezica
Hmm interesting, how about...
Generate a number between 0 and 1.
Walk the list, subtracting the probability of each item from your number.
Pick the item that, after subtraction, took your number down to 0 or below.
That's simple, O(n) and should work :)
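Answered by sje397
import random
probs = [0.1, 0.25, 0.6, 0.05]
r = random.random()  # uniform random number in [0.0, 1.0)
index = 0
# Subtract each probability in turn until r drops below zero.
while r >= 0 and index < len(probs):
    r -= probs[index]
    index += 1
print(index - 1)  # index of the item that pushed r below zero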
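The same subtraction walk can be wrapped in a function and checked against the percentages from the question. This is only a minimal sketch: the names `pick_index` and `empirical_frequencies`, the optional `r` argument, the sample count, and the fixed seed are all assumptions made for illustration and are not part of the original answers.

```python
import random
from collections import Counter

def pick_index(probs, r=None):
    """Return index i with probability probs[i] using the subtraction walk.

    `r` is a hypothetical hook for passing in the uniform draw explicitly.
    """
    if r is None:
        r = random.random()  # uniform in [0.0, 1.0)
    for i, p in enumerate(probs):
        r -= p
        if r < 0:
            return i
    # Rounding can leave r >= 0 after the last item; fall back to the last index.
    return len(probs) - 1

def empirical_frequencies(probs, n=100_000, seed=0):
    """Draw n indices and return the observed frequency of each index."""
    rng = random.Random(seed)  # private generator, seeded only for repeatability
    counts = Counter(pick_index(probs, rng.random()) for _ in range(n))
    return [counts[i] / n for i in range(len(probs))]

if __name__ == "__main__":
    P = [0.10, 0.25, 0.60, 0.05]
    for i, freq in enumerate(empirical_frequencies(P)):
        print(i, round(freq, 3))
```

With enough samples, the printed frequencies should land close to the entries of P.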
Answered by Justin Peel
Basically, make a cumulative distribution function (CDF) array: the value of the CDF at a given index is the sum of all values in P at indices less than or equal to that index. Then generate a random number between 0 and 1 and do a binary search (or a linear search if you prefer). Here's some simple code for it.
from bisect import bisect
from random import random
P = [0.10, 0.25, 0.60, 0.05]
# Build the cumulative sums: cdf[i] = P[0] + ... + P[i]
cdf = [P[0]]
for i in range(1, len(P)):
    cdf.append(cdf[-1] + P[i])
# Binary-search the CDF for a uniform random number in [0.0, 1.0)
random_ind = bisect(cdf, random())
Of course, you can generate a bunch of random indices with something like
rs = [bisect(cdf, random()) for i in range(20)]
yielding
[2, 2, 3, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2]
(results will, and should, vary). Of course, binary search is rather unnecessary for so few possible indices, but it is definitely recommended for distributions with many more possible indices.
Answered by animus144
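As a sketch of how the CDF approach might be packaged for reuse, the version below builds the cumulative sums with `itertools.accumulate` and clamps the search result. The helper names `build_cdf` and `sample_index` and the optional `u` argument are hypothetical. The clamp is an added safeguard, not something the answer includes: floating-point rounding could leave the last cumulative sum slightly below 1, in which case a bare `bisect` could return `len(cdf)`.

```python
import random
from bisect import bisect
from itertools import accumulate

def build_cdf(probs):
    """Cumulative sums: cdf[i] = probs[0] + ... + probs[i]."""
    return list(accumulate(probs))

def sample_index(cdf, u=None):
    """Map a uniform draw u in [0, 1) to an index via binary search."""
    if u is None:
        u = random.random()
    # Clamp so rounding in the last cumulative sum cannot yield len(cdf).
    return min(bisect(cdf, u), len(cdf) - 1)

P = [0.10, 0.25, 0.60, 0.05]
cdf = build_cdf(P)          # built once, O(n)
indices = [sample_index(cdf) for _ in range(20)]  # each draw is O(log n)
print(indices)
```

On Python 3.6 and later, the standard library's `random.choices(range(len(P)), weights=P, k=20)` should give the same kind of weighted selection without hand-written code.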
This problem is equivalent to sampling from a categorical distribution. This distribution is commonly conflated with the multinomial distribution, which models the result of multiple samples from a categorical distribution.
In numpy, it is easy to sample from the multinomial distribution using numpy.random.multinomial, but a specific categorical version of this does not exist. However, it can be accomplished by sampling from the multinomial distribution with a single trial and then returning the non-zero element in the output.
import numpy as np
pvals = [0.10,0.25,0.60,0.05]
ind = np.where(np.random.multinomial(1,pvals))[0][0]
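For illustration, here is a sketch of the single-trial multinomial idea using NumPy's newer `Generator` interface (`np.random.default_rng`, available in NumPy 1.17 and later). The seed is an assumption chosen only for repeatability. The sketch also shows `choice` with its `p` parameter, which draws weighted indices directly and may be a simpler route where it is available.

```python
import numpy as np

pvals = [0.10, 0.25, 0.60, 0.05]
rng = np.random.default_rng(0)  # seeded only so the run is repeatable

# One multinomial trial gives a one-hot vector: exactly one entry is 1.
one_hot = rng.multinomial(1, pvals)
ind = int(np.argmax(one_hot))  # position of the single non-zero entry

# Alternative: draw weighted indices directly with the p parameter.
samples = rng.choice(len(pvals), size=10, p=pvals)

print(one_hot, ind)
print(samples)
```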

