Pandas bar plot: specify bar color by column

Question

Asked by Ryan

Is there a simple way to specify bar colors by column name using the Pandas DataFrame.plot(kind='bar') method?

I have a script that generates multiple DataFrames from several different data files in a directory. For example, it does something like this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

df1.plot(kind='bar', ax=plt.subplot(121))
df2.plot(kind='bar', ax=plt.subplot(122))

plt.show()

This produces the following output:

Output

Unfortunately, the column colors aren't consistent for each label across the different plots. Is it possible to pass in a dictionary of (filenames:colors), so that any particular column always has the same color? For example, I could imagine creating this by zipping up the filenames with the Matplotlib color_cycle:

data_files = ['a', 'b', 'c', 'd']
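# 'axes.color_cycle' was deprecated in Matplotlib 1.5 and later removed;
# 'axes.prop_cycle' replaces it (output below assumes the default style)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
print(list(zip(data_files, colors)))

[('a', '#1f77b4'), ('b', '#ff7f0e'), ('c', '#2ca02c'), ('d', '#d62728')]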

I could figure out how to do this directly with Matplotlib; I just thought there might be a simpler, built-in solution.

Edit:

Below is a partial solution that works in pure Matplotlib. However, I'm using this in an IPython notebook that will be distributed to non-programmer colleagues, and I'd like to keep the amount of plotting code to a minimum.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
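# 'axes.color_cycle' was deprecated in Matplotlib 1.5 and later removed; use 'axes.prop_cycle'
mpl_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']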
colors = dict(zip(data_files, mpl_colors))

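def bar_plotter(df, colors, sub):
    """Draw a grouped bar chart of df in subplot `sub`, coloring each column via `colors`."""
    ncols = df.shape[1]
    width = 1./(ncols+2.)  # leave a gap between neighboring groups
    starts = df.index.values - width*ncols/2.  # left edge of the first bar in each group
    plt.subplot(120+sub)
    for n, col in enumerate(df):
        # align='edge' keeps the bars anchored at their left edges
        # (the default became 'center' in Matplotlib 2.0)
        plt.bar(starts + width*n, df[col].values, color=colors[col],
                width=width, label=col, align='edge')
    plt.xticks(df.index.values)
    plt.grid()
    plt.legend()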

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

bar_plotter(df1, colors, 1)
bar_plotter(df2, colors, 2)

plt.show()

Desired Output

Answer 1

Answered by DataSwede

You can pass a list of colors through the color argument. The list is matched to the columns by position, so unlike a dictionary it takes a little manual work to line the colors up, but it may be a less cluttered way to accomplish your goal.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

color_list = ['b', 'g', 'r', 'c']


df1.plot(kind='bar', ax=plt.subplot(121), color=color_list)
df2.plot(kind='bar', ax=plt.subplot(122), color=color_list[1:])

plt.show()
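
One way to confirm that the positional alignment worked is to read the colors back from the axes after plotting. The following is only a sketch: legend_colors is a hypothetical helper (not part of pandas or Matplotlib), and the Agg backend is assumed so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def legend_colors(ax):
    """Hypothetical helper: map each legend label to the RGBA face color of its bars."""
    handles, labels = ax.get_legend_handles_labels()
    # For bar plots each handle is a BarContainer; its patches share one color.
    return {label: tuple(h.patches[0].get_facecolor())
            for h, label in zip(handles, labels)}

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']

df1 = pd.DataFrame(np.random.rand(4, 3), columns=data_files[:-1])
df2 = pd.DataFrame(np.random.rand(4, 3), columns=data_files[1:])

fig, (ax1, ax2) = plt.subplots(1, 2)
df1.plot(kind='bar', ax=ax1, color=color_list)
df2.plot(kind='bar', ax=ax2, color=color_list[1:])

c1, c2 = legend_colors(ax1), legend_colors(ax2)
shared = sorted(set(c1) & set(c2))  # columns that appear in both plots
consistent = all(c1[name] == c2[name] for name in shared)
print(shared, consistent)
```

If consistent comes back False, the slice passed to one of the plots is out of step with that DataFrame's column order.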


EDIT: Ajean came up with a simple way to build the list of correct colors from a dictionary:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']
d2c = dict(zip(data_files, color_list))

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

# Build real lists: in Python 3, map() returns a lazy iterator rather than a list
df1.plot(kind='bar', ax=plt.subplot(121), color=[d2c[c] for c in df1.columns])
df2.plot(kind='bar', ax=plt.subplot(122), color=[d2c[c] for c in df2.columns])

plt.show()
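
For readers on a recent pandas: the documentation for DataFrame.plot.bar lists a dict of the form {column name: color} as an accepted value for color, marked as new in version 1.1.0. Assuming such a version is installed, the dictionary can be passed as-is, with no list to build. This is a sketch under that assumption, and the Agg backend is again assumed so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']
d2c = dict(zip(data_files, color_list))

df1 = pd.DataFrame(np.random.rand(4, 3), columns=data_files[:-1])
df2 = pd.DataFrame(np.random.rand(4, 3), columns=data_files[1:])

fig, (ax1, ax2) = plt.subplots(1, 2)
# Assumes pandas >= 1.1: colors are looked up by column name, not by position
df1.plot(kind='bar', ax=ax1, color=d2c)
df2.plot(kind='bar', ax=ax2, color=d2c)

# Read the colors back: column 'b' is the 2nd bar group in df1 and the 1st in df2
b_left = tuple(ax1.containers[1].patches[0].get_facecolor())
b_right = tuple(ax2.containers[0].patches[0].get_facecolor())
print(b_left == b_right)
```

On older pandas versions the list-based approach above remains the fallback.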
