pandas: count plot with stacked bars per hue
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/50319614/
count plot with stacked bars per hue
Asked by Marcello
I am looking for an efficient way of drawing a count plot with stacked bars according to "hue". The standard hue behavior is to split a count into parallel bars according to the value of a second column. What I am looking for is an efficient way to have the hue bars stacked, in order to quickly compare totals.
Let me explain with an example from the Titanic dataset:
import pandas as pd
import numpy as np
import seaborn as sns
%matplotlib inline
df = sns.load_dataset('titanic')
sns.countplot(x='survived',hue='class',data=df)
This gives:
[Figure: standard Seaborn behavior with countplot and hue]
What I am looking for is something like:
[Figure: stacked bars per hue]
To get the last image, I used the following code:
def aggregate(rows, columns, df):
    # Distinct values that become the stacked segments and the bar positions
    column_keys = df[columns].unique()
    row_keys = df[rows].unique()
    # For each column key, count the rows that match every row key
    agg = {key: [len(df[(df[rows] == value) & (df[columns] == key)]) for value in row_keys]
           for key in column_keys}
    aggdf = pd.DataFrame(agg, index=row_keys)
    aggdf.index.rename(rows, inplace=True)
    return aggdf

aggregate('survived', 'class', df).plot(kind='bar', stacked=True)
I am sure there is a more efficient way. I know seaborn is not very friendly to stacked bars, so I rearranged the dataset with my own function and plotted it with matplotlib, but I guess there is a cleverer way to do that as well.
Thank you very much!
Answered by ALollz
You were basically there with your last part, using DataFrame.plot() with bar and stacked=True.
Instead of your aggregate function, you can accomplish what you want with a groupby + pivot.
df_plot = df.groupby(['class', 'survived']).size().reset_index().pivot(columns='class', index='survived', values=0)
class     First  Second  Third
survived
0            80      97    372
1           136      87    119
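As an aside that is not part of the original answer, the same kind of count table can probably be built in other ways too. The sketch below uses a small made-up DataFrame (the name toy and its values are assumptions for illustration, not the Titanic data) and compares groupby + size + unstack with pd.crosstab:

```python
import pandas as pd

# Toy data standing in for the 'survived' and 'class' columns (values are made up)
toy = pd.DataFrame({
    "survived": [0, 0, 1, 1, 1, 0],
    "class": ["First", "Third", "First", "Second", "Third", "Third"],
})

# groupby + size counts each (survived, class) pair; unstack moves class into columns
counts_unstack = toy.groupby(["survived", "class"]).size().unstack(fill_value=0)

# pd.crosstab builds the same contingency table in a single call
counts_crosstab = pd.crosstab(toy["survived"], toy["class"])

print(counts_crosstab)
```

Either table has one row per survived value and one column per class, which is the shape that a stacked bar plot expects.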
From here you can just plot it as a bar chart with the stacked=True argument:
df_plot.plot(kind='bar', stacked=True)
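For readers without the dataset at hand, here is a minimal self-contained sketch of that last step. It rebuilds df_plot by hand from the counts shown in the table above instead of loading the data, and it selects the Agg backend only as an assumption so that it can run without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import pandas as pd

# Counts copied from the pivoted table in the answer
df_plot = pd.DataFrame(
    {"First": [80, 136], "Second": [97, 87], "Third": [372, 119]},
    index=pd.Index([0, 1], name="survived"),
)
df_plot.columns.name = "class"

# One bar per 'survived' value, with one stacked segment per class
ax = df_plot.plot(kind="bar", stacked=True)
ax.set_ylabel("count")

# The full height of each bar is the row sum, which is what makes totals easy to compare
totals = df_plot.sum(axis=1)
print(totals)
```

In a notebook with %matplotlib inline, the matplotlib.use line can be dropped and the figure is shown directly.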