Python: plotting categorical data with pandas and matplotlib
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/31029560/
Plotting categorical data with pandas and matplotlib
Asked by Ivan
I have a data frame with categorical data:
   colour direction
1     red        up
2    blue        up
3   green      down
4     red      left
5     red     right
6  yellow      down
7    blue      down
I want to generate some graphs, such as pie charts and histograms, based on the categories. Is it possible without creating dummy numeric variables? Something like
df.plot(kind='hist')
Accepted answer by Alexander
Answer by steboc
Like this:
df.groupby('colour').size().plot(kind='bar')
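The same counting idea also covers the pie chart mentioned in the question. What follows is only a minimal sketch, not part of the original answer: it rebuilds the example frame from the question, counts each colour with value_counts(), and draws a bar chart and a pie chart side by side. The names counts, ax_bar and ax_pie are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
})

# Count how often each category occurs; no dummy numeric column is needed
counts = df["colour"].value_counts()

# Draw the same counts as a bar chart and as a pie chart
fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(8, 4))
counts.plot(kind="bar", ax=ax_bar, title="colour (bar)")
counts.plot(kind="pie", ax=ax_pie, title="colour (pie)")
ax_pie.set_ylabel("")  # hide the default y-label on the pie chart

# Call plt.show() here when running interactively
```

value_counts() sorts by frequency, while groupby('colour').size() keeps the categories in alphabetical order, so pick whichever ordering reads better on the axis.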
Answer by Primer
You might find the mosaic plot from statsmodels useful. It can also give statistical highlighting for the variances.
import matplotlib.pyplot as plt
from statsmodels.graphics.mosaicplot import mosaic

plt.rcParams['font.size'] = 16.0
mosaic(df, ['direction', 'colour'])
But beware of zero-sized cells: they will cause problems with labels.
See this answer for details.
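One way to see in advance which cells would be zero-sized is to cross-tabulate the two columns before plotting. This is just a sketch using pandas.crosstab on the example data from the question; it is not taken from the linked answer, and the names table and empty_cells are made up for illustration.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
})

# Contingency table: one row per direction, one column per colour
table = pd.crosstab(df["direction"], df["colour"])

# Collect the (direction, colour) combinations that never occur
empty_cells = [
    (direction, colour)
    for direction in table.index
    for colour in table.columns
    if table.loc[direction, colour] == 0
]

print(table)
print(len(empty_cells), "of", table.size, "cells are empty")
```

With only seven rows spread over a 4 x 4 grid, most combinations are empty, so a mosaic plot of this particular frame is likely to run into the label issue described above.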
Answer by Jarno
You could also use countplot from seaborn. This package builds on pandas to create a high-level plotting interface. It gives you good styling and correct axis labels for free.
import pandas as pd
import seaborn as sns

sns.set()
df = pd.DataFrame({'colour': ['red', 'blue', 'green', 'red', 'red', 'yellow', 'blue'],
                   'direction': ['up', 'up', 'down', 'left', 'right', 'down', 'down']})
# Pass the column as x by keyword so it is not mistaken for the data argument
sns.countplot(x=df['colour'], color='gray')
It also supports coloring each bar in its matching color with a little trick:
sns.countplot(x=df['colour'],
              palette={color: color for color in df['colour'].unique()})
Answer by Roman Orac
To plot multiple categorical features as bar charts in the same figure, I would suggest:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {
        "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
        "direction": ["up", "up", "down", "left", "right", "down", "down"],
    }
)

categorical_features = ["colour", "direction"]
# One subplot per categorical feature, laid out in a single row
fig, ax = plt.subplots(1, len(categorical_features))
for i, categorical_feature in enumerate(categorical_features):
    # kind must be passed by keyword; Series.plot rejects positional arguments in current pandas
    df[categorical_feature].value_counts().plot(kind="bar", ax=ax[i]).set_title(categorical_feature)
plt.show()