pandas: Plotting a heatmap from 3 columns in Python using seaborn

Question

Asked by user308827

v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86

In the dataframe above, I want to plot a heatmap using v1 and v2 as the x and y axes and yy as the value. How can I do that in Python? I tried seaborn:

df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)

However, this does not work. Is there another solution?

Answer 1

Accepted answer by ImportanceOfBeingErnest

A seaborn heatmap plots categorical data. This means that each occurring value takes up the same space in the heatmap as any other value, regardless of how far apart the values are numerically. This is usually undesirable for numerical data. Instead, one of the following techniques may be chosen.
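To make the categorical behaviour concrete, here is a minimal sketch using only pandas and NumPy. It pivots a few rows taken from the question's data and compares the numeric gaps between the resulting column labels with the equally spaced cell positions a heatmap would assign to them. The variable names (`data`, `labels`, `gaps`, `positions`) are chosen for illustration only.

```python
import io

import numpy as np
import pandas as pd

# A few rows taken from the question's data (no duplicate v1/v2 pairs).
data = """v1 v2 yy
0.00 56.52 92.86
10.17 52.83 92.86
15.25 44.34 100.00
100.00 100.00 100.00"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")
pivot = df.pivot(index="v1", columns="v2", values="yy")

labels = pivot.columns.to_numpy()    # the distinct v2 values, sorted
gaps = np.diff(labels)               # numeric distance between neighbours
positions = np.arange(len(labels))   # a heatmap gives each label one cell

print("column labels:    ", labels)
print("numeric gaps:     ", gaps)
print("heatmap positions:", positions)
```

The gaps between neighbouring labels differ a lot, while the heatmap positions are spaced one cell apart, which is why the numeric distances are lost in the heatmap.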

`Scatter`

A colored scatter plot may be just as good as a heatmap. The colors of the points represent the yy value.

ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

df = pd.read_csv(io.StringIO(u), sep=r"\s+")  # columns are separated by whitespace

fig, ax = plt.subplots()

sc = ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

fig.colorbar(sc, ax=ax)

ax.set_aspect("equal")


plt.show()

`Hexbin`

You may want to look into hexbin. The data is shown in hexagonal bins, and the values are aggregated as the mean inside each bin. The advantage is that a large gridsize makes the plot look like a scatter plot, while a small gridsize makes it look like a heatmap, so the plot can easily be adjusted to the desired resolution.

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

df = pd.read_csv(io.StringIO(u), sep=r"\s+")  # columns are separated by whitespace

fig, (ax, ax2) = plt.subplots(nrows=2)

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
ax.set_aspect("equal")
ax2.set_aspect("equal")
ax.set_title("gridsize=100")
ax2.set_title("gridsize=10")
fig.subplots_adjust(hspace=0.3)
plt.show()
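The mean is only the default aggregation. As a hedged sketch, the `reduce_C_function` parameter of `hexbin` can be used to aggregate the values in each bin differently, for example with `np.max`. The gridsize of 5, the output file name and the variable names below are assumptions made for illustration; the data is the table from the question.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = """v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

fig, (ax_mean, ax_max) = plt.subplots(nrows=2)

# Default: each hexagon shows the mean of the yy values that fall into it.
h_mean = ax_mean.hexbin(df.v1, df.v2, C=df.yy, gridsize=5, cmap="copper")

# Same bins, but each hexagon shows the largest yy value instead.
h_max = ax_max.hexbin(df.v1, df.v2, C=df.yy, gridsize=5, cmap="copper",
                      reduce_C_function=np.max)

fig.colorbar(h_mean, ax=ax_mean)
fig.colorbar(h_max, ax=ax_max)
ax_mean.set_title("mean per bin (default)")
ax_max.set_title("max per bin")
fig.savefig("hexbin_reduce.png")  # hypothetical output file name
```

Which aggregation is appropriate depends on what yy represents; the mean is a reasonable default when several points fall into the same bin.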

`Tripcolor`

A tripcolor plot can be used to obtain colored regions according to the data points. The points are interpreted as the vertices of triangles, and each triangle is colored according to the data at its vertices. Such a plot requires more data than is available here to give a meaningful representation.

ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

df = pd.read_csv(io.StringIO(u), sep=r"\s+")  # columns are separated by whitespace

fig, ax = plt.subplots()

tc = ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

fig.colorbar(tc, ax=ax)

ax.set_aspect("equal")
ax.set_title("tripcolor")

plt.show()

Note that a tricontourf plot may be equally suitable if more data points throughout the grid are available.

ax.tricontourf(df.v1, df.v2, df.yy,  cmap="copper")
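To illustrate what "more data points throughout the grid" could look like, here is a minimal sketch with synthetic data. The 200 random points and the smooth function used for `z` are purely hypothetical and are not derived from the question's data; they only serve to show `tricontourf` on a well-covered grid. The output file name is likewise an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, synthetic data: 200 scattered points covering the whole grid.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 200)
y = rng.uniform(0, 100, 200)
z = 95 + 5 * np.sin(x / 30) * np.cos(y / 30)  # arbitrary smooth values

fig, ax = plt.subplots()

# The points are triangulated and filled contours are drawn on the triangulation.
cs = ax.tricontourf(x, y, z, levels=10, cmap="copper")
ax.scatter(x, y, s=4, color="k")  # show where the data points are

fig.colorbar(cs, ax=ax)
ax.set_aspect("equal")
ax.set_title("tricontourf (synthetic data)")
fig.savefig("tricontourf_sketch.png")  # hypothetical output file name
```

With only the 14 points from the question, large parts of the plane would be interpolated from very few triangles, which is why the denser data matters here.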

Answer 2

Answer by Serenity

The problem is that your data has duplicate values, such as:

100.00  100.00  100.00
100.00  100.00  100.00

You have to drop the duplicate values, then pivot and plot, like this:

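import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# fill data (copy the table from the question to the clipboard first)
df = pd.read_clipboard()

# keep only the first row of each (v1, v2) pair so that pivot() can reshape
df.drop_duplicates(['v1', 'v2'], inplace=True)
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot, annot=True)
plt.show()

print(pivot)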

Pivot:

v2      44.34   46.23   50.00   52.83   56.52   59.78   63.21   65.09   \
v1                                                                       
0.00       NaN     NaN     NaN     NaN   92.86     NaN     NaN     NaN   
10.17      NaN     NaN     NaN   92.86     NaN     NaN     NaN     NaN   
15.25    100.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN   
23.73      NaN   92.86     NaN     NaN     NaN     NaN     NaN     NaN   
83.05      NaN     NaN     NaN     NaN     NaN   100.0     NaN     NaN   
96.61      NaN     NaN     NaN     NaN     NaN     NaN     NaN   100.0   
100.00     NaN     NaN   100.0     NaN     NaN     NaN   100.0     NaN   

v2      68.87   75.47   79.35   100.00  
v1                                      
0.00       NaN     NaN     NaN     NaN  
10.17      NaN     NaN     NaN     NaN  
15.25      NaN     NaN     NaN     NaN  
23.73      NaN     NaN     NaN     NaN  
83.05      NaN     NaN     NaN     NaN  
96.61      NaN     NaN     NaN     NaN  
100.00   100.0   100.0   100.0   100.0
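As an alternative to dropping rows, the duplicates can also be aggregated. The following is a minimal sketch, not part of the original answer: it first shows that `pivot` raises a `ValueError` on the question's data because of the duplicate (v1, v2) pairs, and then uses `pivot_table` with the mean as the aggregation function. The variable names are chosen for illustration.

```python
import io

import pandas as pd

data = """v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# pivot() refuses to reshape when an (index, column) pair occurs more than once.
try:
    df.pivot(index="v1", columns="v2", values="yy")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# pivot_table() aggregates the duplicates instead (here: their mean).
table = df.pivot_table(index="v1", columns="v2", values="yy", aggfunc="mean")
print(table.shape)
```

The resulting `table` has the same layout as the pivot shown above and could be passed to `sns.heatmap` in the same way.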
