How to create a density plot in matplotlib?
Notice: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/4150171/
Asked by unode
In R I can create the desired output by doing:
data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))


In python (with matplotlib) the closest I got was with a simple histogram:
import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()


I also tried the normed=True parameter, but that got me no further than trying to fit a Gaussian to the histogram.
My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I've been unsuccessful so far.
Accepted answer by Justin Peel
Sven has shown how to use the class gaussian_kde from SciPy, but you will notice that it doesn't look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can adjust the bandwidth by replacing the covariance_factor function of the gaussian_kde class. First, here is what you get without changing that function:


However, if I use the following code:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0,8,200)
density.covariance_factor = lambda : .25
density._compute_covariance()
plt.plot(xs,density(xs))
plt.show()
I get


which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changeable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about .5. Lowering this value lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn't an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
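To make the relation to R's bw more concrete: gaussian_kde uses a Gaussian kernel whose standard deviation is the covariance factor times the sample standard deviation of the data (computed with ddof=1), whereas R's bw is the kernel standard deviation itself. The following is a minimal sketch under that assumption. It passes the factor through the bw_method argument of more recent SciPy versions instead of overriding covariance_factor, and the names r_bw, factor and kernel_std are only illustrative:

```python
import numpy as np
from scipy.stats import gaussian_kde

data = np.array([1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8)

r_bw = 0.5  # desired kernel standard deviation, as in R's density(data, bw=0.5)

# gaussian_kde multiplies the sample standard deviation (ddof=1) by a factor,
# so dividing by that standard deviation turns an absolute bandwidth into a factor.
factor = r_bw / data.std(ddof=1)
density = gaussian_kde(data, bw_method=factor)

# Check: the kernel standard deviation actually used by the estimator.
kernel_std = float(np.sqrt(density.covariance[0, 0]))
print(kernel_std)

xs = np.linspace(0, 8, 200)
ys = density(xs)  # draw with plt.plot(xs, ys), as in the code above
```

A scalar bw_method is used directly as the factor, so no call to the private _compute_covariance method is needed in this variant.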
Answer by Sven Marnach
Maybe try something like:
import matplotlib.pyplot as plt
import numpy
from scipy import stats
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = stats.gaussian_kde(data)  # the stats.kde submodule path is deprecated in newer SciPy
x = numpy.arange(0., 8, .1)
plt.plot(x, density(x))
plt.show()
You can easily replace gaussian_kde() by a different kernel density estimate.
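As one illustration of such a replacement, here is a minimal hand-written sketch of a kernel density estimate with a Gaussian kernel and an explicit bandwidth, where the bandwidth is the kernel's standard deviation (the way R interprets bw=0.5). The function name gaussian_density is made up for this example, and the code favors clarity over speed:

```python
import math

def gaussian_density(data, x, bw):
    """Average of Gaussian kernels with standard deviation bw, centred on each data point."""
    norm = 1.0 / (len(data) * bw * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / bw) ** 2) for xi in data)

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
xs = [i * 0.05 for i in range(161)]  # evaluation grid from 0 to 8
ys = [gaussian_density(data, x, bw=0.5) for x in xs]
```

The result can be drawn with plt.plot(xs, ys), exactly like the output of gaussian_kde.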
Answer by Xin
Five years later, when I Google "how to create a kernel density plot using python", this thread still shows up at the top!
Today, a much easier way to do this is to use seaborn, a package that provides many convenient plotting functions and good style management.
import numpy as np
import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.set_style('whitegrid')
sns.kdeplot(np.array(data), bw_method=0.5)  # seaborn 0.11+ replaced the old bw argument with bw_method
Answer by Aziz Alto
Option 1:
Use pandas DataFrame plot (built on top of matplotlib):
import pandas as pd
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
pd.DataFrame(data).plot(kind='density') # or pd.Series()
Option 2:
Use distplot of seaborn (deprecated since seaborn 0.11, where sns.kdeplot(data) is the replacement for this use):
import seaborn as sns
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.distplot(data, hist=False)
Answer by tetrisforjeff
The density plot can also be created by using matplotlib alone: the function plt.hist(data) returns the y and x values necessary for the density plot, namely the bin heights and the bin edges (see the documentation https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist.html). As a result, the following code creates a density plot by using the matplotlib library:
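import matplotlib.pyplot as plt
dat = [-1, 2, 1, 4, -5, 3, 6, 1, 2, 1, 2, 5, 6, 5, 6, 2, 2, 2]
# hist returns (bin heights, bin edges, patches); density=True normalizes the area to 1
heights, edges, _ = plt.hist(dat, density=True)
plt.close()  # discard the histogram figure itself
plt.figure()
# plot each height against the right edge of its bin
plt.plot(edges[1:], heights)
plt.show()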
This code returns the following density plot
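A variation on the same idea, offered as a sketch rather than a replacement: numpy.histogram returns the same heights and bin edges without drawing a figure that then has to be closed, and plotting against bin centers avoids shifting the curve by half a bin width. Note that this is a normalized histogram drawn as a line, not a smoothed kernel density estimate. The variable names and the output file name are only illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

dat = [-1, 2, 1, 4, -5, 3, 6, 1, 2, 1, 2, 5, 6, 5, 6, 2, 2, 2]

# density=True normalizes the heights so that the histogram area is 1
heights, edges = np.histogram(dat, bins=10, density=True)
centers = (edges[:-1] + edges[1:]) / 2  # midpoint of each bin

fig, ax = plt.subplots()
ax.plot(centers, heights)
ax.set_xlabel("value")
ax.set_ylabel("density")
fig.savefig("density.png")  # or plt.show() in an interactive session
```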

