Python: plotting categorical data with pandas and matplotlib

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/31029560/

Time: 2020-08-19 09:21:12  Source: igfitidea

Plotting categorical data with pandas and matplotlib

python pandas

Asked by Ivan

I have a data frame with categorical data:

     colour  direction
1    red     up
2    blue    up
3    green   down
4    red     left
5    red     right
6    yellow  down
7    blue    down

I want to generate some graphs, such as pie charts and histograms, based on the categories. Is this possible without creating dummy numeric variables? Something like:

df.plot(kind='hist')

Accepted answer by Alexander

You can simply use value_counts on the series:

df['colour'].value_counts().plot(kind='bar')
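
The question also asks about pie charts. As a minimal sketch, assuming the sample data from the question, the same value_counts() result can feed both a bar chart and a pie chart. The variable names (counts, ax_bar, ax_pie) are illustrative and do not come from the original answer.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question.
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
})

# value_counts() turns the categorical column into a Series of counts
# indexed by category, so no dummy numeric variables are needed.
counts = df["colour"].value_counts()

# Draw the same counts twice: once as bars, once as pie wedges.
fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))
counts.plot(kind="bar", ax=ax_bar, title="colour counts")
counts.plot(kind="pie", ax=ax_pie, autopct="%1.0f%%")
ax_pie.set_ylabel("")  # hide the default y-axis label on the pie
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the figure.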

Answer by steboc

Like this:

df.groupby('colour').size().plot(kind='bar')
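
For comparison, here is a small sketch (assuming the question's sample data) showing that groupby(...).size() and value_counts() produce the same counts and differ only in ordering: the former is sorted by category label, the latter by descending count. The names by_group and by_count are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
})

by_group = df.groupby("colour").size()   # ordered by category label
by_count = df["colour"].value_counts()   # ordered by descending count

print(by_group)
print(by_count)
```

Which one to plot is mostly a matter of whether the bars should appear in alphabetical order or ranked by frequency.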

Answer by Primer

You might find the mosaic plot from statsmodels useful. It can also give statistical highlighting for the variances.

import matplotlib.pyplot as plt
from statsmodels.graphics.mosaicplot import mosaic

plt.rcParams['font.size'] = 16.0
mosaic(df, ['direction', 'colour'])
plt.show()

But beware of zero-sized cells: they will cause problems with labels.

See this answer for details.
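
One way to spot zero-sized cells before drawing the mosaic is to cross-tabulate the two columns with pd.crosstab. This check is not part of the original answer; it is a sketch that assumes the question's sample data, and the names table and empty_cells are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
})

# Rows are directions, columns are colours, values are co-occurrence counts.
table = pd.crosstab(df["direction"], df["colour"])
print(table)

# Collect every (direction, colour) combination that never occurs.
empty_cells = [
    (direction, colour)
    for direction in table.index
    for colour in table.columns
    if table.loc[direction, colour] == 0
]
print(empty_cells)
```

With data this small most combinations are empty, which is the situation the warning above refers to.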

Answer by Jarno

You could also use countplot from seaborn. This package builds on matplotlib and integrates closely with pandas to provide a high-level plotting interface. It gives you good styling and correct axis labels for free.

import pandas as pd
import seaborn as sns
sns.set()

df = pd.DataFrame({'colour': ['red', 'blue', 'green', 'red', 'red', 'yellow', 'blue'],
                   'direction': ['up', 'up', 'down', 'left', 'right', 'down', 'down']})
sns.countplot(x='colour', data=df, color='gray')

It also supports coloring each bar in its matching color with a little trick:

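# Map each category name to the matplotlib colour of the same name.
# Newer seaborn releases may also expect hue='colour' alongside palette.
sns.countplot(x='colour', data=df,
              palette={color: color for color in df['colour'].unique()})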

Answer by Roman Orac

To plot multiple categorical features as bar charts in the same figure, I would suggest:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {
        "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
        "direction": ["up", "up", "down", "left", "right", "down", "down"],
    }
)

categorical_features = ["colour", "direction"]
fig, ax = plt.subplots(1, len(categorical_features))
for i, categorical_feature in enumerate(categorical_features):
    # Series.plot takes the chart type as the keyword argument `kind`.
    df[categorical_feature].value_counts().plot(kind="bar", ax=ax[i]).set_title(categorical_feature)
plt.show()
