How to plot a count bar chart with a Pandas DataFrame, grouping by one categorical column and colouring by another
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/48939795/
Asked by tlanigan
I have a dataframe that looks roughly like this:
   Property  Name   industry
1  123       name1  industry 1
1  144       name1  industry 1
2  456       name2  industry 1
3  789       name3  industry 2
4  367       name4  industry 2
.  ...       ...    ...
.  ...       ...    ...
n  123       name1  industry 1
I want to make a bar plot that shows how many rows there are for each Name, and colours each bar by the industry that Name belongs to. I've tried something like this:
ax = df['Name'].value_counts().plot(kind='bar',
                                    figsize=(14,8),
                                    title="Number for each Owner Name")
ax.set_xlabel("Owner Names")
ax.set_ylabel("Frequency")
I get the following:
My question is: how do I colour the bars according to the industry column in the dataframe (and add a legend)?
Thanks!
Answered by tlanigan
This is my answer:
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np

# FIG_SIZE was not defined in the original answer; (14, 8) is assumed here to match the question.
FIG_SIZE = (14, 8)

def plot_bargraph_with_groupings(df, groupby, colourby, title, xlabel, ylabel):
    """
    Plots the frequency of rows grouped by one column and coloured by another.
    df: dataframe
    groupby: the column to group by
    colourby: the column to colour by
    title: the graph title
    xlabel: the x label
    ylabel: the y label
    """
    # Map each unique value of the colourby column to a colour from the Paired colormap.
    categories = df[colourby].unique()
    ind_col_map = dict(zip(categories, plt.cm.Paired(np.arange(len(categories)))))
    # Map each groupby value to its colourby value, then look up one colour per bar,
    # in the same order as the bars produced by value_counts().
    unique_comb = df[[groupby, colourby]].drop_duplicates()
    name_ind_map = dict(zip(unique_comb[groupby], unique_comb[colourby]))
    counts = df[groupby].value_counts()
    c = [tuple(ind_col_map[name_ind_map[x]]) for x in counts.index]
    # Draw the bar graph.
    ax = counts.plot(kind='bar',
                     figsize=FIG_SIZE,
                     title=title,
                     color=c)
    # Build the legend from ind_col_map, one patch per colourby value.
    legend_list = []
    for key in ind_col_map.keys():
        legend_list.append(mpatches.Patch(color=ind_col_map[key], label=key))
    ax.legend(handles=legend_list)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
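For comparison, here is a different route that is not the one taken in this answer: a minimal sketch that lets pandas do the counting with pd.crosstab and plots the result as stacked bars, so the industry colours and the legend come for free. The toy dataframe is made up for illustration, and note that the bars come out sorted by name rather than by count.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy data standing in for the question's dataframe (values are made up).
df = pd.DataFrame({
    "Name": ["name1", "name1", "name2", "name3", "name3", "name3"],
    "industry": ["industry 1", "industry 1", "industry 1",
                 "industry 2", "industry 2", "industry 2"],
})

# Rows are names, columns are industries, cells are row counts.
counts = pd.crosstab(df["Name"], df["industry"])

# Each industry column becomes its own colour; pandas adds the legend itself.
ax = counts.plot(kind="bar", stacked=True, figsize=(14, 8),
                 title="Number for each Owner Name")
ax.set_xlabel("Owner Names")
ax.set_ylabel("Frequency")
```

If each name belongs to exactly one industry, every stacked bar has a single non-zero segment, so the picture is the same as a plain count bar coloured by industry.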
Answered by TYZ
It might be a little too complicated, but this does the job. I first define two mappings: one from name to industry and one from industry to colour (there seem to be only two industries, but you can adjust the dictionary to your case):
ind_col_map = {
    "industry 1": "red",
    "industry 2": "blue"
}
unique_comb = df[["Name", "industry"]].drop_duplicates()
name_ind_map = {x: y for x, y in zip(unique_comb["Name"], unique_comb["industry"])}
Then the color can be generated by using the above two mappings:
c = df['Name'].value_counts().index.map(lambda x: ind_col_map[name_ind_map[x]])
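To make the lookup concrete, here is a small sketch on made-up data showing that the colours come out in the same order as the bars produced by value_counts(). The toy dataframe and its values are assumptions for illustration only. It also chains Index.map with the two dictionaries, which should be equivalent to the lambda above.

```python
import pandas as pd

# Toy data standing in for the question's dataframe (values are made up).
df = pd.DataFrame({
    "Name": ["name1", "name1", "name2", "name3", "name3", "name3"],
    "industry": ["industry 1", "industry 1", "industry 1",
                 "industry 2", "industry 2", "industry 2"],
})

ind_col_map = {"industry 1": "red", "industry 2": "blue"}
unique_comb = df[["Name", "industry"]].drop_duplicates()
name_ind_map = dict(zip(unique_comb["Name"], unique_comb["industry"]))

# value_counts() sorts by count, descending: name3 (3), name1 (2), name2 (1).
counts = df["Name"].value_counts()

# Index.map accepts a dict, so the two lookups can be chained without a lambda.
c = list(counts.index.map(name_ind_map).map(ind_col_map))
print(c)  # ['blue', 'red', 'red']
```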
Finally, pass color to your plotting function:
ax = df['Name'].value_counts().plot(kind='bar',
                                    figsize=(14,8),
                                    title="Number for each Owner Name",
                                    color=c)
ax.set_xlabel("Owner Names")
ax.set_ylabel("Frequency")
plt.show()
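The question also asks for a legend, which the plot above does not draw. One way to add it, sketched here on made-up data, is to build one matplotlib Patch per industry and pass the patches to ax.legend. The toy dataframe, the colour choices and the legend title are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import pandas as pd

# Toy data standing in for the question's dataframe (values are made up).
df = pd.DataFrame({
    "Name": ["name1", "name1", "name2", "name3", "name3", "name3"],
    "industry": ["industry 1", "industry 1", "industry 1",
                 "industry 2", "industry 2", "industry 2"],
})
ind_col_map = {"industry 1": "red", "industry 2": "blue"}

# One industry per name, taken from the first row in which each name appears.
name_ind_map = df.drop_duplicates("Name").set_index("Name")["industry"]

counts = df["Name"].value_counts()
c = [ind_col_map[name_ind_map[name]] for name in counts.index]

ax = counts.plot(kind="bar", figsize=(14, 8),
                 title="Number for each Owner Name", color=c)

# Proxy patches: one legend entry per industry, in the industry's colour.
handles = [mpatches.Patch(color=colour, label=industry)
           for industry, colour in ind_col_map.items()]
ax.legend(handles=handles, title="industry")
ax.set_xlabel("Owner Names")
ax.set_ylabel("Frequency")
```

In an interactive session, call plt.show() afterwards, or use ax.figure.savefig(...) to write the chart to a file.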