How to plot a 3D density map in Python with matplotlib
Original question: http://stackoverflow.com/questions/25286811/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow and keep it under the same license.
Asked by nv_wu
I have a large dataset of (x,y,z) protein positions and would like to plot areas of high occupancy as a heatmap. Ideally the output should look similar to the volumetric visualisation below, but I'm not sure how to achieve this with matplotlib.


My initial idea was to display my positions as a 3D scatter plot and color their density via a KDE. I coded this up as follows with test data:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
mu, sigma = 0, 0.1
x = np.random.normal(mu, sigma, 1000)
y = np.random.normal(mu, sigma, 1000)
z = np.random.normal(mu, sigma, 1000)
xyz = np.vstack([x,y,z])
density = stats.gaussian_kde(xyz)(xyz)
idx = density.argsort()
x, y, z, density = x[idx], y[idx], z[idx], density[idx]
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z, c=density)
plt.show()
This works well! However, my real data contains many thousands of data points and calculating the kde and the scatter plot becomes extremely slow.
A small sample of my real data:


My research suggests that a better option is to evaluate the gaussian kde on a grid. I'm just not sure how to do this in 3D:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
mu, sigma = 0, 0.1
x = np.random.normal(mu, sigma, 1000)
y = np.random.normal(mu, sigma, 1000)
nbins = 50
xy = np.vstack([x,y])
density = stats.gaussian_kde(xy)
xi, yi = np.mgrid[x.min():x.max():nbins*1j, y.min():y.max():nbins*1j]
di = density(np.vstack([xi.flatten(), yi.flatten()]))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.pcolormesh(xi, yi, di.reshape(xi.shape))
plt.show()
Accepted answer by nv_wu
Thanks to mwaskon for suggesting the mayavi library.
I recreated the density scatter plot in mayavi as follows:
import numpy as np
from scipy import stats
from mayavi import mlab
mu, sigma = 0, 0.1
x = 10*np.random.normal(mu, sigma, 5000)
y = 10*np.random.normal(mu, sigma, 5000)
z = 10*np.random.normal(mu, sigma, 5000)
xyz = np.vstack([x,y,z])
kde = stats.gaussian_kde(xyz)
density = kde(xyz)
# Plot scatter with mayavi
figure = mlab.figure('DensityPlot')
pts = mlab.points3d(x, y, z, density, scale_mode='none', scale_factor=0.07)
mlab.axes()
mlab.show()


Setting the scale_mode to 'none' prevents glyphs from being scaled in proportion to the density vector. In addition, for large datasets, I disabled scene rendering while building the plot and used a mask to reduce the number of points.
# Plot scatter with mayavi
figure = mlab.figure('DensityPlot')
figure.scene.disable_render = True
pts = mlab.points3d(x, y, z, density, scale_mode='none', scale_factor=0.07)
mask = pts.glyph.mask_points
mask.maximum_number_of_points = x.size
mask.on_ratio = 1
pts.glyph.mask_input_points = True
figure.scene.disable_render = False
mlab.axes()
mlab.show()
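The point-reduction idea can also be applied before any plotting library is involved. The following is a minimal sketch, not the answer's method: `thin_points`, `max_points` and `seed` are hypothetical names introduced here, and the density values are stand-ins rather than a real KDE. It draws a random subset of the points with NumPy so that fewer glyphs have to be drawn.

```python
import numpy as np

def thin_points(x, y, z, density, max_points=2000, seed=0):
    """Return a random subset of at most max_points points (hypothetical helper)."""
    n = x.size
    if n <= max_points:
        return x, y, z, density
    rng = np.random.default_rng(seed)
    # Sample indices without replacement; sorting keeps the original point order
    keep = np.sort(rng.choice(n, size=max_points, replace=False))
    return x[keep], y[keep], z[keep], density[keep]

rng = np.random.default_rng(1)
x, y, z = rng.normal(0, 1, (3, 5000))
density = np.exp(-(x**2 + y**2 + z**2) / 2)  # stand-in values, not a real KDE
xs, ys, zs, ds = thin_points(x, y, z, density, max_points=1000)
```

The thinned arrays can then be passed to mlab.points3d in place of the full ones. A random subset is only one option; taking every k-th point with a slice such as x[::k] would be a simpler alternative.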
Next, to evaluate the gaussian kde on a grid:
import numpy as np
from scipy import stats
from mayavi import mlab
mu, sigma = 0, 0.1
x = 10*np.random.normal(mu, sigma, 5000)
y = 10*np.random.normal(mu, sigma, 5000)
z = 10*np.random.normal(mu, sigma, 5000)
xyz = np.vstack([x,y,z])
kde = stats.gaussian_kde(xyz)
# Evaluate kde on a grid
xmin, ymin, zmin = x.min(), y.min(), z.min()
xmax, ymax, zmax = x.max(), y.max(), z.max()
xi, yi, zi = np.mgrid[xmin:xmax:30j, ymin:ymax:30j, zmin:zmax:30j]
coords = np.vstack([item.ravel() for item in [xi, yi, zi]])
density = kde(coords).reshape(xi.shape)
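# Render the gridded density as a volume with mayavi
figure = mlab.figure('DensityPlot')
grid = mlab.pipeline.scalar_field(xi, yi, zi, density)
# Use dmin/dmax rather than min/max to avoid shadowing the Python builtins
dmin = density.min()
dmax = density.max()
mlab.pipeline.volume(grid, vmin=dmin, vmax=dmin + .5*(dmax-dmin))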
mlab.axes()
mlab.show()
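The grid step (flatten the mgrid arrays, evaluate, reshape) can be checked without mayavi. This is a small sketch under assumed values: the grid size, the range -3 to 3 and the random test data are chosen only for illustration. It confirms that the reshaped array stores, at index (i, j, k), the KDE value of the grid point with that index.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
points = rng.normal(0.0, 1.0, size=(3, 500))   # rows are x, y, z
kde = stats.gaussian_kde(points)

n = 8  # coarse grid keeps the example fast
xi, yi, zi = np.mgrid[-3:3:n*1j, -3:3:n*1j, -3:3:n*1j]
coords = np.vstack([a.ravel() for a in (xi, yi, zi)])   # shape (3, n**3)
density = kde(coords).reshape(xi.shape)                 # shape (n, n, n)

# Spot check: evaluate the KDE directly at one grid point
i, j, k = 2, 5, 3
direct = kde([[xi[i, j, k]], [yi[i, j, k]], [zi[i, j, k]]])[0]
print(np.isclose(density[i, j, k], direct))
```

Because ravel and reshape use the same element order, no transposition is needed between the evaluation and the call to scalar_field.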


As a final improvement I sped up the evaluation of the kernel density estimate by calling the kde function in parallel.
import multiprocessing
import numpy as np
from scipy import stats
from mayavi import mlab

def calc_kde(data):
    # data holds one chunk of grid points, one point per row
    return kde(data.T)

mu, sigma = 0, 0.1
x = 10*np.random.normal(mu, sigma, 5000)
y = 10*np.random.normal(mu, sigma, 5000)
z = 10*np.random.normal(mu, sigma, 5000)
xyz = np.vstack([x,y,z])
# Workers read this module-level kde, which relies on the 'fork' start method
kde = stats.gaussian_kde(xyz)
# Evaluate kde on a grid
xmin, ymin, zmin = x.min(), y.min(), z.min()
xmax, ymax, zmax = x.max(), y.max(), z.max()
xi, yi, zi = np.mgrid[xmin:xmax:30j, ymin:ymax:30j, zmin:zmax:30j]
coords = np.vstack([item.ravel() for item in [xi, yi, zi]])
# Multiprocessing: one chunk of grid points per worker
cores = multiprocessing.cpu_count()
with multiprocessing.Pool(processes=cores) as pool:
    results = pool.map(calc_kde, np.array_split(coords.T, cores))
density = np.concatenate(results).reshape(xi.shape)
# Render the gridded density as a volume with mayavi
figure = mlab.figure('DensityPlot')
grid = mlab.pipeline.scalar_field(xi, yi, zi, density)
dmin = density.min()
dmax = density.max()
mlab.pipeline.volume(grid, vmin=dmin, vmax=dmin + .5*(dmax-dmin))
mlab.axes()
mlab.show()
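The parallel version depends on one property: evaluating the KDE chunk by chunk and concatenating the results in order gives the same array as a single call. A minimal sketch of that check follows, run serially so it needs neither multiprocessing nor mayavi. The chunk count, grid range and test data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.normal(0.0, 1.0, size=(3, 500)))

xi, yi, zi = np.mgrid[-3:3:10j, -3:3:10j, -3:3:10j]
coords = np.vstack([a.ravel() for a in (xi, yi, zi)])   # shape (3, 1000)

n_chunks = 4  # arbitrary; the parallel code would use one chunk per worker
chunks = np.array_split(coords.T, n_chunks)       # each chunk has shape (m, 3)
pieces = [kde(chunk.T) for chunk in chunks]       # same call that calc_kde makes
chunked = np.concatenate(pieces).reshape(xi.shape)

whole = kde(coords).reshape(xi.shape)
print(np.allclose(chunked, whole))
```

pool.map returns its results in the order of the input chunks, so replacing the list comprehension with a pool leaves the concatenation step unchanged.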

