Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/49530746/

Time: 2020-09-14 05:22:43  Source: igfitidea

Python heatmap: Change colour map and make asymmetrical

python pandas matplotlib heatmap

Asked by Slowat_Kela

I want to build a heatmap of this data:

curation1       curation2       overlap
1      2      0
1      3      1098
1      4      11
1      5      137
1      6      105
1      7      338
2      3      351
2      4      0
2      5      1
2      6      0
2      7      0
3      4      132
3      5      215
3      6      91
3      7      191
4      5      6
4      6      10
4      7      19
5      6      37
5      7      95
6      7     146

I made a heatmap with this code:

import sys
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.ticker as ticker
import matplotlib.cm as cm
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib import colors

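# Read the tab-separated table given as the first command-line argument
data_raw = pd.read_csv(sys.argv[1], sep='\t')
# Keep the labels in their order of appearance
data_raw["curation1"] = pd.Categorical(data_raw["curation1"], data_raw.curation1.unique())
data_raw["curation2"] = pd.Categorical(data_raw["curation2"], data_raw.curation2.unique())
# Keyword arguments: recent pandas versions no longer accept these positionally
data_matrix = data_raw.pivot(index="curation1", columns="curation2", values="overlap")

fig, ax = plt.subplots(1, 1, figsize=(12, 12))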
heatplot = ax.imshow(data_matrix,cmap = 'BuPu')
#ax.set_xticklabels(data_matrix.columns)
#ax.set_yticklabels(data_matrix.index)
tick_spacing = 1
#ax.xaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
#ax.yaxis.set_major_locator(ticker.MultipleLocator(tick_spacing))
ax.set_title("Overlap")
fig.savefig('output.pdf')

The output looks like this: [linked image]

I have three questions:

  1. You can see the color scheme is a bit 'off': most of the data is very lightly colored, and there is a stray purple box that indicates '0'. Ideally, I would like this heatmap to use different shades of green, with the darkest green for the highest number and the lightest (but still clearly visible) green for the lowest number. I tried to play around with the 'cmap' argument, e.g. changing it to 'winter' as described in the Python tutorial, but I'm doing something wrong. Could someone please tell me where specifically I could change this?

  2. color bar: I would like to add a color bar, but I guess I need to sort out question 1 first.

  3. asymmetrical: as you can see, this plot is asymmetrical. Is it possible to plot only half of the heatmap, e.g. by getting rid of the unnecessary lines and possibly moving the axis labels to the right-hand side of the plot? If not, this isn't a big deal because I can re-jig it in PowerPoint.

Accepted answer by Aritesh

This will solve your first two problems -

fig, ax = plt.subplots(1, 1, figsize=(12, 12))
# 'Greens' is a sequential colormap: darker green means a higher overlap
heatplot = ax.imshow(data_matrix, cmap='Greens')

# Colorbar with ticks at the smallest and largest overlap values
cbar = fig.colorbar(heatplot, ticks=[data_raw.overlap.min(), data_raw.overlap.max()])
ax.set_title("Overlap")
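
The question also asks for the lightest green to stay clearly visible, while 'Greens' starts out almost white. One possible way to handle that, sketched here as an assumption and not as part of the original answer, is to build a truncated colormap from the upper part of 'Greens'. The name greens_visible, the 0.25 cut-off, and the small demo matrix are illustrative choices only.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

base = plt.get_cmap('Greens')
# Sample only the upper 75% of 'Greens' so the lowest value is not near-white
# (0.25 is an arbitrary, illustrative cut-off)
greens_visible = LinearSegmentedColormap.from_list(
    'greens_visible', base(np.linspace(0.25, 1.0, 256)))

# Hypothetical small matrix standing in for data_matrix
demo = np.array([[0, 1098, 11], [351, 0, 1], [132, 215, 91]], dtype=float)

fig, ax = plt.subplots()
heatplot = ax.imshow(demo, cmap=greens_visible)
fig.colorbar(heatplot)
```

The resulting object can be passed anywhere a colormap name is accepted, for example cmap=greens_visible in the imshow call above.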

Answer by NLindros

Begin by selecting a suitable colormap; for your purpose Greens might be good. Note that a colormap can be reversed by appending '_r' to its name.
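
As a small illustration of the '_r' suffix, the sketch below looks up both variants by name and compares their end points. The variable names are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

greens = plt.get_cmap('Greens')      # light -> dark
greens_r = plt.get_cmap('Greens_r')  # same colours in the opposite order

# The low end of the reversed map should match the high end of the original
same_ends = np.allclose(greens_r(0.0), greens(1.0), atol=0.01)
print(same_ends)
```

With 'Greens_r', the highest values would be drawn lightest, so for the heatmap in the question the unreversed 'Greens' is the one that puts the darkest green on the largest overlap.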

Since your values differ quite a lot, I would use a logarithmic color scale. You can do this with colors.LogNorm (after import matplotlib.colors as colors).
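
To see why a logarithmic scale helps here, the following sketch compares where a few overlap values from the question land on the 0..1 colour scale under a linear and a logarithmic norm. Starting the scale at vmin=1 is an assumption made because a log scale cannot include 0.

```python
import numpy as np
from matplotlib import colors

# A few overlap values taken from the question's table
overlaps = np.array([1, 11, 105, 1098], dtype=float)

linear = colors.Normalize(vmin=1, vmax=1098)
log = colors.LogNorm(vmin=1, vmax=1098)

# Position of each value on the 0..1 colour scale
linear_pos = np.asarray(linear(overlaps))
log_pos = np.asarray(log(overlaps))
print(linear_pos.round(3))
print(log_pos.round(3))
```

On the linear scale, 105 sits below 0.1 and is nearly indistinguishable from the smallest values, whereas on the log scale it sits above 0.6, so the mid-range overlaps get visibly different shades.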

To address your third question, I would move the axes to the top and right and remove the bottom and left spines.

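# Plot heatmap
# 'data' is assumed to be the pivoted overlap matrix, e.g. data = data_matrix.values
# (np.nanmax below requires: import numpy as np)
f, ax = plt.subplots()
# A log scale cannot start at 0, so begin at 1; vmin/vmax go to the norm only,
# because imshow rejects vmin/vmax when a norm is also given
lognorm = colors.LogNorm(vmin=1, vmax=np.nanmax(data))
heatplot = ax.imshow(data, norm=lognorm, cmap='Greens')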

# Move axis to top-right and remove bottom and left line
ax.xaxis.set_ticks_position('top')
ax.yaxis.set_ticks_position('right')
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)

# Use instead of ax.set_title to move title a bit higher up
f.suptitle('Overlap') 

# Add colorbar
cb = f.colorbar(heatplot)

plt.show()

Answer by vestland

I would use the seaborn heatmap function instead. The colormap Greens should do the trick with regard to your desired color scheme. If you'd like, you can check out other options in the matplotlib docs.

Just highlight the dataset in your question, copy it with Ctrl+C, and run the snippet below:

# Imports
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

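# Raw string so that \s is treated as a regex token, not a string escape
data_raw = pd.read_clipboard(sep=r'\s+')
# Keyword arguments: recent pandas versions no longer accept these positionally
data_matrix = data_raw.pivot(index="curation1", columns="curation2", values="overlap")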
data_matrix = data_matrix.fillna(0)

# A heatmap function that builds on the seaborn heatmap function
def HeatMap_function(df, title, transpose = True, colors = 'Greens', dropDuplicates = True):

    if transpose:
        df = df.T

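    if dropDuplicates:
        # Boolean mask: True below the diagonal (hidden), False on and above it (shown)
        mask = np.zeros_like(df, dtype=bool)  # np.bool was removed from NumPy; use bool
        mask = np.invert(mask)
        mask[np.triu_indices_from(mask)] = False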

    # Set background color / chart style
    sns.set_style(style = 'white')

    # Set up  matplotlib figure
    f, ax = plt.subplots(figsize=(11, 9))
    ax.set_title(title)

    # Look up the requested colormap by name; for a diverging blue/red palette
    # the commented line could be used instead
    # cmap = sns.diverging_palette(250, 10, as_cmap=True)
    cmap = plt.get_cmap(colors)

    # Draw correlation plot with or without duplicates
    if dropDuplicates:
        sns.heatmap(df, mask=mask, cmap=cmap, 
                square=True,
                linewidth=.5, cbar_kws={"shrink": .5}, ax=ax)
    else:
        sns.heatmap(df, cmap=cmap, 
                square=True,
                linewidth=.5, cbar_kws={"shrink": .5}, ax=ax)

    ax.xaxis.set_ticks_position('top')
    ax.yaxis.set_ticks_position('right')

# A testrun
HeatMap_function(df = data_matrix, title = 'Overlap', transpose = False,
                 colors = 'Greens', dropDuplicates = True)

And you'll get this: [image: the resulting heatmap]

Now you can also change the layout of your plot by using different combinations of transpose, colors and dropDuplicates.
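
The dropDuplicates option relies on a boolean mask, and seaborn does not draw cells where the mask is True. The sketch below rebuilds that mask on a small matrix to show which cells end up hidden. The size n = 4 is a hypothetical value chosen only for illustration.

```python
import numpy as np

n = 4  # hypothetical size, just for illustration

# Same idea as in HeatMap_function: start all True, then clear the upper triangle
mask = np.ones((n, n), dtype=bool)
mask[np.triu_indices_from(mask)] = False
print(mask.astype(int))

# Equivalent one-liner: True strictly below the diagonal
mask_tril = np.tril(np.ones((n, n), dtype=bool), k=-1)
```

Only the cells strictly below the diagonal are masked, which matches the pivoted table in the question: its values sit on and above the diagonal, and the cells below hold no data.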
