Python: How can I make a scatter plot colored by density in matplotlib?

Question

Asked by 2964502

I'd like to make a scatter plot where each point is colored by the spatial density of nearby points.

I've come across a very similar question, which shows an example of this using R:

R Scatter Plot: symbol color represents number of overlapping points

What's the best way to accomplish something similar in python using matplotlib?

Answer 1

Accepted answer by Joe Kington

In addition to hist2d or hexbin, as @askewchan suggested, you can use the same method as the accepted answer to the question you linked to: a Gaussian kernel density estimate evaluated at each point.

If you want to do that:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Generate fake data
x = np.random.normal(size=1000)
y = x * 3 + np.random.normal(size=1000)

# Calculate the point density
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)

fig, ax = plt.subplots()
ax.scatter(x, y, c=z, s=100, edgecolor='none')  # 'none' disables marker edges; an empty string is rejected by recent matplotlib
plt.show()

If you'd like the points to be plotted in order of density so that the densest points are always on top (similar to the linked example), just sort them by the z-values. I'm also going to use a smaller marker size here as it looks a bit better:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Generate fake data
x = np.random.normal(size=1000)
y = x * 3 + np.random.normal(size=1000)

# Calculate the point density
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)

# Sort the points by density, so that the densest points are plotted last
idx = z.argsort()
x, y, z = x[idx], y[idx], z[idx]

fig, ax = plt.subplots()
ax.scatter(x, y, c=z, s=50, edgecolor='none')  # 'none' disables marker edges; an empty string is rejected by recent matplotlib
plt.show()
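
To make the colors readable as numbers, a colorbar can be attached to the scatter itself. The following is a minimal sketch under the same fake-data assumptions; the helper name `kde_density_scatter` is hypothetical and not part of matplotlib or SciPy. The figure is built but not displayed, so call `plt.show()` or save the figure afterwards.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde


def kde_density_scatter(x, y, ax=None, **scatter_kwargs):
    """Scatter x, y colored by a Gaussian KDE estimate, with a colorbar.

    Hypothetical helper for illustration only.
    """
    if ax is None:
        _, ax = plt.subplots()

    # Estimate the density at each point's own location
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)

    # Draw the densest points last so they end up on top
    idx = z.argsort()
    x, y, z = x[idx], y[idx], z[idx]

    points = ax.scatter(x, y, c=z, edgecolor='none', **scatter_kwargs)
    # Passing the scatter's own collection keeps the colorbar in sync with its colors
    cbar = ax.figure.colorbar(points, ax=ax)
    cbar.set_label('Estimated density')
    return ax, z


rng = np.random.default_rng(0)  # seeded so the sketch is repeatable
x = rng.normal(size=1000)
y = x * 3 + rng.normal(size=1000)
ax, z = kde_density_scatter(x, y, s=50)
```

The function returns the sorted density values alongside the axes, which is convenient for checking the range the colorbar covers.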

Answer 2

Answer by askewchan

You could make a histogram:

import numpy as np
import matplotlib.pyplot as plt

# fake data:
a = np.random.normal(size=1000)
b = a*3 + np.random.normal(size=1000)

plt.hist2d(a, b, (50, 50), cmap=plt.cm.jet)
plt.colorbar()
plt.show()
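
The accepted answer also mentions hexbin as an alternative to hist2d. A minimal sketch of that variant on the same kind of fake data follows; the `gridsize` of 30 and the colorbar label are arbitrary choices for illustration. The figure is built but not displayed, so call `plt.show()` afterwards.

```python
import numpy as np
import matplotlib.pyplot as plt

# fake data, seeded so the sketch is repeatable
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
b = a * 3 + rng.normal(size=1000)

fig, ax = plt.subplots()
# gridsize sets the number of hexagons in the x direction;
# mincnt=1 leaves hexagons that contain no points blank
hb = ax.hexbin(a, b, gridsize=30, mincnt=1)
cbar = fig.colorbar(hb, ax=ax)
cbar.set_label('Points per hexagon')
```

Like hist2d, this colors bins rather than individual points, so it scales well to large datasets.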

Answer 3

Answer by Guillaume

Also, if the number of points makes the KDE calculation too slow, the color can be interpolated from a 2D histogram computed with np.histogram2d. [Update in response to comments: if you wish to show the colorbar, use plt.scatter() instead of ax.scatter(), followed by plt.colorbar().]

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.colors import Normalize 
from scipy.interpolate import interpn

def density_scatter( x , y, ax = None, sort = True, bins = 20, **kwargs )   :
    """
    Scatter plot colored by 2d histogram
    """
    if ax is None :
        fig , ax = plt.subplots()
    data , x_e, y_e = np.histogram2d( x, y, bins = bins, density = True )
    z = interpn( ( 0.5*(x_e[1:] + x_e[:-1]) , 0.5*(y_e[1:]+y_e[:-1]) ) , data , np.vstack([x,y]).T , method = "splinef2d", bounds_error = False)

    #To be sure to plot all data
    z[np.where(np.isnan(z))] = 0.0

    # Sort the points by density, so that the densest points are plotted last
    if sort :
        idx = z.argsort()
        x, y, z = x[idx], y[idx], z[idx]

    ax.scatter( x, y, c=z, **kwargs )

    norm = Normalize(vmin = np.min(z), vmax = np.max(z))
    # Use ax.figure rather than fig, which is undefined when the caller passes in an existing ax
    cbar = ax.figure.colorbar(cm.ScalarMappable(norm = norm), ax=ax)
    cbar.ax.set_ylabel('Density')

    return ax


if "__main__" == __name__ :

    x = np.random.normal(size=100000)
    y = x * 3 + np.random.normal(size=100000)
    density_scatter( x, y, bins = [30,30] )
    plt.show()
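
If the spline interpolation is not needed, a cruder variant is to give each point the density of the histogram bin it falls in. This is a different, simpler technique than the one in the answer, sketched here with NumPy only; the helper name `binned_density` is hypothetical.

```python
import numpy as np


def binned_density(x, y, bins=30):
    """Return, for each point, the density of the 2D histogram bin it falls in.

    Hypothetical helper for illustration; a nearest-bin lookup,
    not the spline interpolation used in the answer.
    """
    hist, x_edges, y_edges = np.histogram2d(x, y, bins=bins, density=True)

    # Index of the bin containing each point: searchsorted(side='right') - 1
    # maps edge[i] <= v < edge[i+1] to i. Clipping puts the maximum value,
    # which sits exactly on the last edge, into the last bin.
    ix = np.clip(np.searchsorted(x_edges, x, side='right') - 1, 0, hist.shape[0] - 1)
    iy = np.clip(np.searchsorted(y_edges, y, side='right') - 1, 0, hist.shape[1] - 1)
    return hist[ix, iy]


rng = np.random.default_rng(0)  # seeded so the sketch is repeatable
x = rng.normal(size=100000)
y = x * 3 + rng.normal(size=100000)
z = binned_density(x, y, bins=30)
```

The resulting `z` can be passed as `c=z` to a scatter call, optionally after the same argsort step used above. The colors will change in visible steps at bin boundaries, which is the price for skipping the interpolation.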
