How do I calculate PDF (probability density function) in Python?
Original question: http://stackoverflow.com/questions/41974615/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: StackOverFlow
Asked by Raaj
I have the code below, which plots the PDF for a particular mean and standard deviation.
Now I need to find the actual probability of a particular value. For example, if my mean is 0 and my value is 0, my probability is 1. This is usually done by calculating the area under the curve, similar to this:
http://homepage.divms.uiowa.edu/~mbognar/applets/normal.html
http://homepage.divms.uiowa.edu/~mbognar/applet/normal.html
I am not sure how to approach this problem.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

def normal(power, mean, std, val):
    # With power=2 this is the Gaussian density evaluated at val
    a = 1/(np.sqrt(2*np.pi)*std)
    diff = np.abs(np.power(val-mean, power))
    b = np.exp(-(diff)/(2*std*std))
    return a*b

pdf_array = []
array = np.arange(-2,2,0.1)
print(array)
for i in array:
    print(i)
    pdf = normal(2, 0, 0.1, i)
    print(pdf)
    pdf_array.append(pdf)

plt.plot(array, pdf_array)
plt.ylabel('some numbers')
plt.axis([-2, 2, 0, 5])
plt.show()
print()
Answered by martinako
Unless you have a reason to implement this yourself, all these functions are available in scipy.stats.norm.
I think you are asking for the cdf; if so, use this code:
from scipy.stats import norm
print(norm.cdf(x, mean, std))
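To make the difference between the density and a probability concrete, here is a minimal sketch using the mean and standard deviation from the question's plot. The variable names and the interval bounds a and b are illustrative assumptions, not part of the original answer.

```python
from scipy.stats import norm

mean, std = 0.0, 0.1  # values taken from the question's plot

# Height of the curve at the mean: a density, not a probability (it can exceed 1).
density_at_mean = norm.pdf(0.0, mean, std)

# P(X <= 0): half of the probability mass lies at or below the mean.
p_below_mean = norm.cdf(0.0, mean, std)

# P(a <= X <= b) is the difference of two CDF values.
a, b = -0.1, 0.1  # assumed interval: one standard deviation either side
p_interval = norm.cdf(b, mean, std) - norm.cdf(a, mean, std)

print(density_at_mean)
print(p_below_mean)
print(p_interval)
```

With std = 0.1 the density at the mean is well above 1, which is why the plotted curve cannot be read directly as a probability; only areas under it can.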
Answered by martinako
The area under a curve y = f(x) from x = a to x = b is the same as the integral of f(x)dx from x = a to x = b. SciPy has a quick, easy way to do integrals. Just so you understand, the probability of finding a single point in that area cannot be one, because the total area under the curve is one (unless MAYBE it's a delta function). So you should get 0 ≤ probability of value < 1 for any particular value of interest. There may be different ways of doing it, but a conventional way is to assign confidence intervals along the x-axis. I would read up on Gaussian curves and normalization before continuing to code it.
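The answer does not say which SciPy routine it has in mind. A minimal sketch, assuming scipy.integrate.quad is a reasonable choice, integrates the question's Gaussian over an interval and cross-checks the result against scipy.stats.norm.cdf. The helper name gaussian_pdf and the interval bounds are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gaussian_pdf(x, mean, std):
    """Normal density, the same formula as the question's normal() with power=2."""
    return np.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (np.sqrt(2 * np.pi) * std)

mean, std = 0.0, 0.1
a, b = -0.1, 0.1  # assumed interval: one standard deviation either side

# quad returns (integral estimate, estimated absolute error).
area, abs_err = quad(gaussian_pdf, a, b, args=(mean, std))

# Cross-check with the closed-form CDF difference.
area_from_cdf = norm.cdf(b, mean, std) - norm.cdf(a, mean, std)
print(area, area_from_cdf)

# Shrinking the interval around a single value shrinks the probability with it.
areas = []
for half_width in (0.1, 0.01, 0.001):
    value, _ = quad(gaussian_pdf, -half_width, half_width, args=(mean, std))
    areas.append(value)
    print(half_width, value)
```

The loop illustrates why a single value of a continuous variable does not carry a probability of 1: the narrower the interval around it, the smaller the area.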
Answered by Eye Sun
If you want to write it from scratch:
import math

class PDF():
    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []

    def calculate_mean(self):
        # True division keeps the fractional part of the mean
        self.mean = sum(self.data) / len(self.data)
        return self.mean

    def calculate_stdev(self, sample=True):
        # Sample standard deviation divides by n - 1, population by n
        if sample:
            n = len(self.data) - 1
        else:
            n = len(self.data)
        mean = self.mean
        sigma = 0
        for el in self.data:
            sigma += (el - mean)**2
        sigma = math.sqrt(sigma / n)
        self.stdev = sigma
        return self.stdev

    def pdf(self, x):
        # Gaussian density evaluated at x
        return (1.0 / (self.stdev * math.sqrt(2*math.pi))) * math.exp(-0.5*((x - self.mean) / self.stdev) ** 2)
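A from-scratch density only gives the height of the curve. To get a probability you also need the cumulative distribution. The sketch below is one possible way to do that with the standard library's math.erf. The function names normal_pdf, normal_cdf and interval_probability are hypothetical and not part of the answer's class.

```python
import math

def normal_pdf(x, mean=0.0, std=1.0):
    """Gaussian density at x (height of the curve, not a probability)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def normal_cdf(x, mean=0.0, std=1.0):
    """P(X <= x) for a normal distribution, written via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def interval_probability(a, b, mean=0.0, std=1.0):
    """P(a <= X <= b): the area under the density between a and b."""
    return normal_cdf(b, mean, std) - normal_cdf(a, mean, std)

print(normal_pdf(0.0))                   # density at the mean of a standard normal
print(normal_cdf(0.0))                   # half the mass lies below the mean
print(interval_probability(-1.0, 1.0))   # mass within one standard deviation
```

Using the error function avoids numerical integration entirely, at the cost of being specific to the normal distribution.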