Plotting a heatmap for 3 columns in Python with seaborn

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute the content to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44480226/

Tags: python, pandas, matplotlib, heatmap, seaborn

Asked by user308827

v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86

In the dataframe above, I want to plot a heatmap using v1 and v2 as the x and y axes and yy as the value. How can I do that in Python? I tried seaborn:

df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)

However, this does not work. Any other solution?

Accepted answer by ImportanceOfBeingErnest

A seaborn heatmap plots categorical data. This means that each occurring value takes up the same space in the heatmap as any other value, regardless of how far apart the values are numerically. This is usually undesirable for numerical data. Instead, one of the following techniques may be chosen.
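To make the categorical behaviour concrete, here is a minimal sketch using the question's data. It only inspects the pivoted table that a heatmap would draw, one equally sized row per distinct v1 value. The names `data`, `numeric_gaps` and `row_gaps` are illustrative, and dropping the repeated rows before pivoting is an assumption made so that the pivot succeeds.

```python
import io

import numpy as np
import pandas as pd

data = """v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# pivot() needs unique (v1, v2) pairs, so the repeated rows are dropped first
pivot = df.drop_duplicates(["v1", "v2"]).pivot(index="v1", columns="v2", values="yy")

v1_values = pivot.index.to_numpy()

# Numeric distance between neighbouring v1 values
numeric_gaps = np.diff(v1_values)

# Distance between neighbouring heatmap rows: always one cell
row_gaps = np.diff(np.arange(len(v1_values)))

print(v1_values)
print(numeric_gaps)
print(row_gaps)
```

The numeric gaps vary widely while the row gaps are all 1, which is why values that are far apart numerically end up in adjacent cells of the same size.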

Scatter

A colored scatter plot may be just as good as a heatmap. The colors of the points represent the yy value.

ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

sc = ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

fig.colorbar(sc, ax=ax)

ax.set_aspect("equal")


plt.show()

Hexbin

You may want to look into hexbin. The data is shown in hexagonal bins, and yy is aggregated as the mean inside each bin. The advantage here is that if you choose a large gridsize, it will look like a scatter plot, while a small gridsize makes it look like a heatmap, so the plot can easily be adjusted to the desired resolution.
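The aggregation step can be inspected directly. The sketch below, using the question's data, passes `reduce_C_function=np.mean` explicitly (this is matplotlib's default, spelled out only for clarity) and reads the per-bin values back with `get_array()`. The variable names and the `Agg` backend are choices made for this illustration so that it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = """v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

fig, (ax_fine, ax_coarse) = plt.subplots(nrows=2)

# Same data, two resolutions; yy is averaged inside each hexagon
fine = ax_fine.hexbin(df.v1, df.v2, C=df.yy, gridsize=100,
                      reduce_C_function=np.mean, cmap="copper")
coarse = ax_coarse.hexbin(df.v1, df.v2, C=df.yy, gridsize=10,
                          reduce_C_function=np.mean, cmap="copper")

# One aggregated value per hexagon that contains at least one point
fine_values = np.asarray(fine.get_array())
coarse_values = np.asarray(coarse.get_array())

print(len(fine_values), "non-empty bins at gridsize=100")
print(len(coarse_values), "non-empty bins at gridsize=10")

plt.close(fig)
```

Because each bin holds a mean of the yy values that fall into it, every bin value lies between the smallest and largest yy in the data. Another reducer, such as `np.max`, could be passed instead if a different summary per bin is wanted.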

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, (ax, ax2) = plt.subplots(nrows=2)

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
ax.set_aspect("equal")
ax2.set_aspect("equal")
ax.set_title("gridsize=100")
ax2.set_title("gridsize=10")
fig.subplots_adjust(hspace=0.3)
plt.show()

Tripcolor

A tripcolor plot can be used to obtain colored regions in the plot according to the data points, which are interpreted as the vertices of triangles; each triangle is colored according to the data at its vertices. Such a plot requires more data than is available here to give a meaningful representation.
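To see which triangles are involved, the triangulation can be built explicitly. This is a sketch under two assumptions: that `matplotlib.tri.Triangulation` (a Delaunay triangulation, which tripcolor computes when given only x and y) reflects what the plot draws, and that with flat shading each triangle takes the mean of its three corner values. Dropping the repeated points first is a precaution added for this example, not part of the original answer.

```python
import io

import numpy as np
import pandas as pd
from matplotlib.tri import Triangulation

data = """v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Precaution for this sketch: keep each (v1, v2) point only once
df = df.drop_duplicates(["v1", "v2"]).reset_index(drop=True)

# Delaunay triangulation of the (v1, v2) points
tri = Triangulation(df.v1.to_numpy(), df.v2.to_numpy())

# Each row of tri.triangles holds the indices of one triangle's three corners
corner_values = df.yy.to_numpy()[tri.triangles]

# Mean of the three corner values, one number per triangle
face_values = corner_values.mean(axis=1)

print(tri.triangles.shape)
print(face_values)
```

With only a dozen distinct points, the triangles are few and large, which illustrates why more data is needed for a meaningful picture.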

ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

tc = ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

fig.colorbar(tc, ax=ax)

ax.set_aspect("equal")
ax.set_title("tripcolor")

plt.show()

Note that a tricontourf plot may be equally suitable if more data points throughout the grid are available.

ax.tricontourf(df.v1, df.v2, df.yy,  cmap="copper")

Answer by Serenity

The problem is that your data has duplicate values, such as:

100.00  100.00  100.00
100.00  100.00  100.00
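The duplicates can be located programmatically. The following sketch, using the question's data, lists the rows whose (v1, v2) pair has already appeared and shows that `pivot` raises a `ValueError` on the unmodified frame. The names `repeated` and `pivot_error` are illustrative.

```python
import io

import pandas as pd

data = """v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Rows whose (v1, v2) pair already appeared earlier in the table
repeated = df[df.duplicated(["v1", "v2"])]
print(repeated)

# pivot() cannot put two values into the same cell, so it raises ValueError
try:
    df.pivot(index="v1", columns="v2", values="yy")
    pivot_error = None
except ValueError as exc:
    pivot_error = exc

print(pivot_error)
```

In this data set two rows are repeats: the second (100.00, 100.00) and the second (100.00, 68.87).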

You have to drop the duplicate values, then pivot and plot, like this:

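import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Load the data (copy the table from the question to the clipboard first)
df = pd.read_clipboard()

# Keep one row per (v1, v2) pair so that pivot() gets unique index/column pairs
df = df.drop_duplicates(['v1', 'v2'])
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot, annot=True)
plt.show()

print(pivot)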

Pivot:

v2      44.34   46.23   50.00   52.83   56.52   59.78   63.21   65.09   \
v1                                                                       
0.00       NaN     NaN     NaN     NaN   92.86     NaN     NaN     NaN   
10.17      NaN     NaN     NaN   92.86     NaN     NaN     NaN     NaN   
15.25    100.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN   
23.73      NaN   92.86     NaN     NaN     NaN     NaN     NaN     NaN   
83.05      NaN     NaN     NaN     NaN     NaN   100.0     NaN     NaN   
96.61      NaN     NaN     NaN     NaN     NaN     NaN     NaN   100.0   
100.00     NaN     NaN   100.0     NaN     NaN     NaN   100.0     NaN   

v2      68.87   75.47   79.35   100.00  
v1                                      
0.00       NaN     NaN     NaN     NaN  
10.17      NaN     NaN     NaN     NaN  
15.25      NaN     NaN     NaN     NaN  
23.73      NaN     NaN     NaN     NaN  
83.05      NaN     NaN     NaN     NaN  
96.61      NaN     NaN     NaN     NaN  
100.00   100.0   100.0   100.0   100.0