Python: plotting a probability density function from a sample with matplotlib

Question

Asked by Cupitor

I want to plot an approximation of a probability density function based on a sample that I have: a curve that mimics the behaviour of the histogram. I can make the sample as large as I want.

Answer 1

Accepted answer by askewchan

If you already know the distribution you want to plot, define it as a function and plot it like this:

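import numpy as np
from matplotlib import pyplot as plt

def my_dist(x):
    # Gaussian-shaped curve centred on 0
    return np.exp(-x ** 2)

# Use a fine grid over the region where the curve is non-negligible;
# integer steps from -100 to 100 would show only a single spike.
x = np.linspace(-5, 5, 500)
p = my_dist(x)
plt.plot(x, p)
plt.show()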
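
Note that exp(-x ** 2) on its own integrates to sqrt(pi) rather than 1, so it has the right shape but is not yet a probability density. A minimal sketch of one way to rescale it, assuming an evenly spaced grid that covers essentially all of the mass; the names unnormalized and normalize_on_grid are hypothetical helpers, not part of the original answer:

```python
import numpy as np
from matplotlib import pyplot as plt


def unnormalized(x):
    # Same shape as my_dist above; integrates to sqrt(pi), not 1.
    return np.exp(-x ** 2)


def normalize_on_grid(f, x):
    """Evaluate f on the evenly spaced grid x and rescale so the area is ~1.

    Hypothetical helper: uses a plain Riemann sum, which is only accurate
    when the grid is fine and covers essentially all of the mass.
    """
    p = f(x)
    dx = x[1] - x[0]
    return p / (p.sum() * dx)


x = np.linspace(-5, 5, 1001)
pdf = normalize_on_grid(unnormalized, x)

if __name__ == "__main__":
    plt.plot(x, pdf)
    plt.xlabel("x")
    plt.ylabel("density")
    plt.show()
```

If the normalizing constant is known analytically, dividing by it directly is simpler and more accurate than the numerical estimate.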

If you don't have the distribution as an analytical function, you can generate a large sample, take a histogram, and smooth the result, for example with a spline:

import numpy as np
from scipy.interpolate import UnivariateSpline
from matplotlib import pyplot as plt

N = 1000
n = N//10
s = np.random.normal(size=N)   # generate your data sample with N elements
p, x = np.histogram(s, bins=n) # bin it into n = N//10 bins
x = x[:-1] + (x[1] - x[0])/2   # convert bin edges to centers
f = UnivariateSpline(x, p, s=n)
plt.plot(x, f(x))
plt.show()

You can increase or decrease s (the smoothing factor) in the UnivariateSpline call to get more or less smoothing. For example, the two approaches above give: [image: dist to func]
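
To see what the smoothing factor does, the sketch below fits the same histogram with three values of s. The specific values (0, n, 10 * n), the fixed seed, and the names sample, counts, edges, centers, and splines are illustrative assumptions, not part of the original answer:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
N = 1000
n = N // 10
sample = rng.normal(size=N)

counts, edges = np.histogram(sample, bins=n)
centers = (edges[:-1] + edges[1:]) / 2  # bin centers

# Illustrative smoothing factors: 0 interpolates every bin, larger is smoother.
smoothing_factors = [0, n, 10 * n]
splines = {s: UnivariateSpline(centers, counts, s=s) for s in smoothing_factors}

if __name__ == "__main__":
    plt.bar(centers, counts, width=edges[1] - edges[0], alpha=0.3,
            label="histogram")
    for s, f in splines.items():
        plt.plot(centers, f(centers), label=f"s={s}")
    plt.legend()
    plt.show()
```

With s=0 the spline passes through every bin count, so it reproduces the histogram noise; larger values trade fidelity to individual bins for a smoother curve. A suitable value depends on the sample size and the number of bins.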

Answer 2

Answered by EnricoGiampieri

Use gaussian_kde, which lives in scipy.stats (older SciPy versions also exposed it through the scipy.stats.kde module).

Given your data, you can do something like this:

import numpy as np
from matplotlib import pyplot as plt
from scipy.stats import gaussian_kde

# create fake data
data = np.random.randn(1000)
# build the kernel density estimate from the sample
kde = gaussian_kde(data)
# points at which the estimated density will be evaluated
dist_space = np.linspace(data.min(), data.max(), 100)
# plot the results
plt.plot(dist_space, kde(dist_space))
plt.show()

The kernel density estimate can be configured freely (for example, its bandwidth) and handles N-dimensional data with ease. It also avoids the spline distortion visible in the plot given by askewchan.
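
As a rough illustration of those two points, the sketch below varies the bw_method argument of gaussian_kde and evaluates a two-dimensional estimate. The bandwidth values, the fixed seed, and the names grid, curves, data_2d, kde_2d, and density_at_origin are assumptions chosen for the example:

```python
import numpy as np
from scipy.stats import gaussian_kde
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
data = rng.normal(size=1000)
grid = np.linspace(data.min(), data.max(), 200)

# bw_method scales the kernel width: a smaller scalar gives a wigglier curve.
bandwidths = ["scott", "silverman", 0.1, 1.0]
curves = {bw: gaussian_kde(data, bw_method=bw)(grid) for bw in bandwidths}

# For d-dimensional data, gaussian_kde expects an array of shape (d, n_samples).
data_2d = rng.normal(size=(2, 1000))
kde_2d = gaussian_kde(data_2d)
# Evaluation points use the same layout: shape (d, n_points).
density_at_origin = kde_2d([[0.0], [0.0]])[0]

if __name__ == "__main__":
    for bw, curve in curves.items():
        plt.plot(grid, curve, label=f"bw_method={bw!r}")
    plt.legend()
    plt.show()
```

The default bandwidth rule is usually a sensible starting point; a scalar bw_method is worth trying when the default visibly over- or under-smooths the sample.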
