Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/34251641/

Time: 2020-09-14 00:22:48  Source: igfitidea

How to make a bar plot of non-numerical data in pandas

python pandas matplotlib seaborn

Asked by Jean Nassar

Suppose I had this data:

>>> import pandas as pd
>>> df = pd.DataFrame(data={"age": [11, 12, 11, 11, 13, 11, 12, 11],
                        "response": ["Yes", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes"]})
>>> df
    age response
0   11  Yes
1   12  No
2   11  Yes
3   11  Yes
4   13  Yes
5   11  No
6   12  Yes
7   11  Yes

I would like to make a bar plot that shows the yes and no responses aggregated by age. Is that possible at all? I have tried hist and kind=bar, but neither grouped the responses by age; both plotted age and response separately.

It would look like this:

  ^
4 |   o
3 |   o
2 |   o
1 |   ox      ox      o
0 .----------------------->
      11      12      13  

where o is "Yes", and x is "No".

Also, would it be possible to make the numbers grouped? If you had a range from 11 to 50, for instance, you might be able to put it in 5-year bins. Also, would it be possible to show percentages or counts on the axis or on the individual bar?

Answered by Learner

To generate a multiple bar plot, you would first need to group by age and response and then unstack the dataframe:

# Count rows for each (age, response) pair, then move response into columns
counts = df.groupby(['age', 'response']).size().unstack(fill_value=0)
counts.plot(kind='bar')
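
The question also asks about showing percentages or counts. As a rough sketch that is not part of the original answer, pd.crosstab can build the same count table and, with normalize="index", the per-age shares. The helper plot_table is a hypothetical name introduced here for illustration, and its use of ax.bar_label assumes matplotlib 3.4 or later:

```python
import pandas as pd

df = pd.DataFrame({
    "age": [11, 12, 11, 11, 13, 11, 12, 11],
    "response": ["Yes", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes"],
})

# Counts per age (rows) and response (columns); missing pairs become 0.
counts = pd.crosstab(df["age"], df["response"])

# Row-wise shares: the responses for each age sum to 100.
percent = pd.crosstab(df["age"], df["response"], normalize="index") * 100


def plot_table(table, ylabel):
    """Hypothetical helper: grouped bar chart with a value label on each bar.

    Requires matplotlib; ax.bar_label assumes matplotlib >= 3.4.
    """
    ax = table.plot(kind="bar", rot=0)
    ax.set_ylabel(ylabel)
    for container in ax.containers:
        ax.bar_label(container)
    return ax
```

Calling plot_table(counts, "count") or plot_table(percent, "percent") should then give one group of bars per age, labelled with the respective values.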

Here is the output plot:

Bar plot

Answered by Stefan

To bin your data, take a look at pandas.cut() (see the docs). For categorical plots, I've found the seaborn package quite helpful; see its tutorial on categorical plots. Below is an example plot of the yes/no counts for the bins you mention, using a random sample:

from random import choice

import pandas as pd
from numpy.random import randint

df = pd.DataFrame(data={"age": randint(10, 50, 1000),
                    "response": [choice(['Yes', 'No']) for _ in range(1000)]})

df['age_group'] = pd.cut(df.age, bins=list(range(10, 51, 5)), include_lowest=True)
df.head()

   age response age_group
0   48      Yes  (45, 50]
1   31       No  (30, 35]
2   25      Yes  (20, 25]
3   29      Yes  (25, 30]
4   19      Yes  (15, 20]

import seaborn as sns
sns.countplot(y='response', hue='age_group', data=df, palette="Greens_d")
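
If the age groups should sit on the x-axis, as in the sketch in the question, the binning can also be combined with a plain pandas count table. The following is a minimal sketch under assumptions not in the original answer: a fixed random seed for reproducibility, and a hypothetical helper named plot_binned that needs matplotlib when called:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame({
    "age": rng.integers(10, 50, size=1000),          # ages 10..49
    "response": rng.choice(["Yes", "No"], size=1000),
})

# 5-year bins; include_lowest=True keeps age 10 in the first bin.
df["age_group"] = pd.cut(df["age"], bins=list(range(10, 51, 5)),
                         include_lowest=True)

# One row per age group, one column per response.
binned = pd.crosstab(df["age_group"], df["response"])


def plot_binned(table):
    """Hypothetical helper: grouped bars with age groups on the x-axis."""
    ax = table.plot(kind="bar", rot=45)
    ax.set_xlabel("age group")
    ax.set_ylabel("count")
    return ax
```

plot_binned(binned) should then draw a "No" and a "Yes" bar for each 5-year group.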
