Original source: http://stackoverflow.com/questions/34917727/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Stacked bar plot by grouped data with pandas
Asked by justanothercoder
Let's assume I have a pandas dataframe which has many features, and I am interested in two of them. I'll call them feature1 and feature2.
feature1 can have three possible values. feature2 can have two possible values.
I need a bar plot grouped by feature1 and stacked by the count of rows with each value of feature2 (so that there will be three stacks, each with two bars).
How to achieve this?
At the moment I have
import pandas as pd
df = pd.read_csv('data.csv')
df['feature1'][df['feature2'] == 0].value_counts().plot(kind='bar',label='0')
df['feature1'][df['feature2'] == 1].value_counts().plot(kind='bar',label='1')
but that is not what I actually want because it doesn't stack them.
Answered by justanothercoder
Also, I have found another way to do this (with pandas):
df.groupby(['feature1', 'feature2']).size().unstack().plot(kind='bar', stacked=True)
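The one-liner above can be tried on a small table to see what it actually plots. The following is only a minimal sketch: the data values and the helper name plot_stacked_counts are made up for illustration and do not come from the question. It builds the count table first, so the intermediate result can be inspected before anything is drawn.

```python
import pandas as pd

# Made-up data (an assumption): feature1 has three values, feature2 has two.
df = pd.DataFrame({
    "feature1": ["a", "a", "a", "b", "b", "c", "c", "c", "c"],
    "feature2": [0, 1, 1, 0, 0, 0, 1, 1, 1],
})

# Count rows per (feature1, feature2) pair, then pivot feature2 into columns.
# fill_value=0 keeps pairs that never occur from showing up as NaN.
counts = df.groupby(["feature1", "feature2"]).size().unstack(fill_value=0)
print(counts)


def plot_stacked_counts(table):
    """Hypothetical helper: one bar per row of `table`, one segment per column."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    ax = table.plot(kind="bar", stacked=True)
    ax.set_ylabel("row count")
    plt.show()
```

Calling plot_stacked_counts(counts) should then draw three bars (one per feature1 value), each split into two segments (one per feature2 value).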
Answered by ronrest
I'm not sure how to do this in matplotlib (pandas' default plotting library), but if you are willing to try a different plotting library, it is quite easy to do with Bokeh.
Here is an example
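import pandas as pd
# Note: bokeh.charts was removed from later Bokeh releases (it was split out
# as the separate bkcharts package), so this import needs an old Bokeh version.
from bokeh.charts import Bar, output_file, show

# One row per (class, gender) pair with its enrolment count
x = pd.DataFrame({"gender": ["m", "f", "m", "f", "m", "f"],
                  "enrolments": [500, 20, 100, 342, 54, 47],
                  "class": ["comp-sci", "comp-sci",
                            "psych", "psych",
                            "history", "history"]})

# One bar per class, stacked by gender
bar = Bar(x, values='enrolments', label='class', stack='gender',
          title="Number of students enrolled per class",
          legend='top_right', bar_width=1.0)
output_file("myPlot.html")
show(bar)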
Answered by rtphokie
size produces a column with a simple row count for each group, which supplies the values for the y axis. unstack then pivots the inner index level into columns, giving matplotlib the row and column layout it needs to draw the stacked bar graph.
Essentially it takes
>>> s
one  a    1.0
     b    2.0
two  a    3.0
     b    4.0
and produces:
>>> s.unstack(level=-1)
       a    b
one  1.0  2.0
two  3.0  4.0
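To reproduce the transformation shown above, here is a minimal sketch that rebuilds the same Series by hand. The variable names index and wide are assumptions chosen for illustration.

```python
import pandas as pd

# Rebuild the Series shown above: a two-level index with float values.
index = pd.MultiIndex.from_product([["one", "two"], ["a", "b"]])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Move the innermost index level ("a"/"b") into the columns.
wide = s.unstack(level=-1)
print(wide)
```

In the groupby case, the outer level plays the role of feature1 (one bar per row) and the inner level plays the role of feature2 (one stacked segment per column).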