Python correlation heatmap
Original question: http://stackoverflow.com/questions/39409866/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Correlation heatmap
Asked by Marko
I want to represent a correlation matrix using a heatmap. There is something called a correlogram in R, but I don't think there's such a thing in Python.
How can I do this? The values go from -1 to 1, for example:
[[ 1. 0.00279981 0.95173379 0.02486161 -0.00324926 -0.00432099]
[ 0.00279981 1. 0.17728303 0.64425774 0.30735071 0.37379443]
[ 0.95173379 0.17728303 1. 0.27072266 0.02549031 0.03324756]
[ 0.02486161 0.64425774 0.27072266 1. 0.18336236 0.18913512]
[-0.00324926 0.30735071 0.02549031 0.18336236 1. 0.77678274]
[-0.00432099 0.37379443 0.03324756 0.18913512 0.77678274 1. ]]
I was able to produce the following heatmap based on another question, but the problem is that my values get 'cut' at 0. I would like a map that goes from blue (-1) to red (1), or something similar; here, values below 0 are not presented adequately.
Here's the code for that:
plt.imshow(correlation_matrix,cmap='hot',interpolation='nearest')
Answered by mrandrewandrade
Another alternative is to use the heatmap function in seaborn to plot the correlation matrix. This example uses the Auto data set from the ISLR package in R (the same as in the example you showed).
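# Note: pandas.rpy was removed from later pandas releases,
# so this import needs an old pandas version
import pandas.rpy.common as com
import seaborn as sns
# Jupyter/IPython magic: render plots inline in the notebook
%matplotlib inline

# load the R package ISLR
infert = com.importr("ISLR")

# load the Auto dataset
auto_df = com.load_data('Auto')

# calculate the correlation matrix
corr = auto_df.corr()

# plot the heatmap
sns.heatmap(corr,
            xticklabels=corr.columns,
            yticklabels=corr.columns)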
If you want something even fancier, you can use Pandas Style, for example:
# diverging palette from red (hue 5) to blue (hue 250)
cmap = sns.diverging_palette(5, 250, as_cmap=True)

def magnify():
    # CSS rules: small cells by default, enlarged on hover
    return [dict(selector="th",
                 props=[("font-size", "7pt")]),
            dict(selector="td",
                 props=[('padding', "0em 0em")]),
            dict(selector="th:hover",
                 props=[("font-size", "12pt")]),
            dict(selector="tr:hover td:hover",
                 props=[('max-width', '200px'),
                        ('font-size', '12pt')])
            ]

# On pandas >= 1.3, .format(precision=2) replaces the deprecated .set_precision(2)
corr.style.background_gradient(cmap, axis=1)\
    .set_properties(**{'max-width': '80px', 'font-size': '10pt'})\
    .set_caption("Hover to magnify")\
    .set_precision(2)\
    .set_table_styles(magnify())
Answered by FatiHe
If your data is in a Pandas DataFrame, you can use Seaborn's heatmap function to create your desired plot.
import seaborn as sns
Var_Corr = df.corr()
# plot the heatmap and annotation on it
sns.heatmap(Var_Corr, xticklabels=Var_Corr.columns, yticklabels=Var_Corr.columns, annot=True)
From the question, it looks like the data is in a NumPy array. If that array is named numpy_data, you would first need to put it into a Pandas DataFrame before using the step above:
import pandas as pd
df = pd.DataFrame(numpy_data)
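A minimal sketch of that conversion applied to the matrix from the question. Because that array is already a correlation matrix, the sketch passes it to heatmap directly rather than calling .corr() on it again. The labels v0 to v5, the coolwarm colormap, the Agg backend, and the output file name are assumptions chosen for illustration, not part of the original answer. Setting vmin=-1 and vmax=1 pins the color scale to the full correlation range.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Correlation matrix copied from the question
numpy_data = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])

# Hypothetical variable names; the question does not name its columns
labels = ["v{}".format(i) for i in range(numpy_data.shape[0])]
corr_df = pd.DataFrame(numpy_data, index=labels, columns=labels)

# The data is already a correlation matrix, so plot it as-is.
# vmin/vmax fix the color scale to [-1, 1]; center=0 puts the neutral color at zero.
ax = sns.heatmap(corr_df, vmin=-1, vmax=1, center=0,
                 cmap="coolwarm", annot=True, fmt=".2f", square=True)
ax.figure.savefig("correlation_heatmap.png")
```

With the scale fixed this way, the small negative values in the question sit near the neutral middle color instead of being clipped at the bottom of the range.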
Answered by vestland
The code below will produce this plot:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# A list with your data slightly edited
l = [1.0,0.00279981,0.95173379,0.02486161,-0.00324926,-0.00432099,
     0.00279981,1.0,0.17728303,0.64425774,0.30735071,0.37379443,
     0.95173379,0.17728303,1.0,0.27072266,0.02549031,0.03324756,
     0.02486161,0.64425774,0.27072266,1.0,0.18336236,0.18913512,
     -0.00324926,0.30735071,0.02549031,0.18336236,1.0,0.77678274,
     -0.00432099,0.37379443,0.03324756,0.18913512,0.77678274,1.00]

# Split the flat list into rows of n values
n = 6
data = [l[i:i + n] for i in range(0, len(l), n)]

# A dataframe
df = pd.DataFrame(data)

def CorrMtx(df, dropDuplicates=True):

    # Your dataset is already a correlation matrix.
    # If you have a dataset where you need to include the calculation
    # of a correlation matrix, just uncomment the line below:
    # df = df.corr()

    # Exclude duplicate correlations by masking upper right values
    # (the builtin bool works as a dtype across NumPy versions)
    if dropDuplicates:
        mask = np.zeros_like(df, dtype=bool)
        mask[np.triu_indices_from(mask)] = True

    # Set background color / chart style
    sns.set_style(style='white')

    # Set up matplotlib figure
    f, ax = plt.subplots(figsize=(11, 9))

    # Add diverging colormap from blue (low values) to red (high values)
    cmap = sns.diverging_palette(250, 10, as_cmap=True)

    # Draw correlation plot with or without duplicates
    if dropDuplicates:
        sns.heatmap(df, mask=mask, cmap=cmap,
                    square=True,
                    linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)
    else:
        sns.heatmap(df, cmap=cmap,
                    square=True,
                    linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)

CorrMtx(df, dropDuplicates=False)
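The masking step used when dropDuplicates is True can be checked in isolation. The sketch below uses a made-up 3x3 matrix (an assumption for illustration, not data from the question) and NumPy only, to show which cells np.triu_indices_from hides.

```python
import numpy as np

# Made-up symmetric 3x3 correlation matrix, for illustration only
corr = np.array([[1.0, 0.2, -0.5],
                 [0.2, 1.0, 0.3],
                 [-0.5, 0.3, 1.0]])

# True marks a cell that the heatmap should hide
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True  # upper triangle, diagonal included

# Values that would remain visible: the strictly lower triangle
visible = np.ma.masked_array(corr, mask=mask)
print(mask)
print(visible.compressed())
```

Note that the diagonal is masked as well. If the diagonal should stay visible, np.triu_indices_from(mask, k=1) selects only the cells strictly above it.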
I put this together after it was announced that the outstanding seaborn corrplot was to be deprecated. The snippet above makes a similar correlation plot based on seaborn heatmap. You can also specify the color range and select whether or not to drop duplicate correlations. Notice that I've used the same numbers as you, but that I've put them in a pandas dataframe. Regarding the choice of colors, you can have a look at the documentation for sns.diverging_palette. You asked for blue, but that falls outside this particular range of the color scale with your sample data. For both occurrences of 0.95173379, try changing to -0.95173379 and you'll get this:
Answered by Bernhard
You can use matplotlib for this. There's a similar question which shows how you can achieve what you want: Plotting a 2D heatmap with Matplotlib
Answered by ypnos
- Use the 'jet' colormap for a transition between blue and red.
- Use pcolor() with the vmin, vmax parameters.
It is detailed in this answer: https://stackoverflow.com/a/3376734/21974
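The two points above can be sketched with matplotlib alone. This is a rough illustration rather than the linked answer's code: the variable name correlation_matrix follows the question, while the Agg backend, the inverted y axis, and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Correlation matrix copied from the question
correlation_matrix = np.array([
    [1.0, 0.00279981, 0.95173379, 0.02486161, -0.00324926, -0.00432099],
    [0.00279981, 1.0, 0.17728303, 0.64425774, 0.30735071, 0.37379443],
    [0.95173379, 0.17728303, 1.0, 0.27072266, 0.02549031, 0.03324756],
    [0.02486161, 0.64425774, 0.27072266, 1.0, 0.18336236, 0.18913512],
    [-0.00324926, 0.30735071, 0.02549031, 0.18336236, 1.0, 0.77678274],
    [-0.00432099, 0.37379443, 0.03324756, 0.18913512, 0.77678274, 1.0],
])

fig, ax = plt.subplots()
# vmin/vmax pin the color scale to the full correlation range,
# so -1 maps to the blue end of 'jet' and 1 to the red end
mesh = ax.pcolor(correlation_matrix, cmap="jet", vmin=-1, vmax=1)
ax.invert_yaxis()  # put row 0 at the top, like the printed matrix
fig.colorbar(mesh, ax=ax)
fig.savefig("correlation_pcolor.png")
```

Swapping "jet" for a diverging colormap such as "coolwarm" or "RdBu_r" gives a neutral color at 0, which may read more naturally for correlations.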