Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors. Original question: http://stackoverflow.com/questions/29525120/

Pandas: Creating a histogram from string counts

pandas

Asked by Kyle

I need to create a histogram from a dataframe column that contains the values 'Low', 'Medium', or 'High'. When I try the usual df.column.hist(), I get the following error.

ex3.Severity.value_counts()
Out[85]: 
Low       230
Medium     21
High       16
dtype: int64

ex3.Severity.hist()


TypeError                                 Traceback (most recent call last)
<ipython-input-86-7c7023aec2e2> in <module>()
----> 1 ex3.Severity.hist()

C:\Users\C06025A\Anaconda\lib\site-packages\pandas\tools\plotting.py in hist_series(self, by, ax, grid, xlabelsize, xrot, ylabelsize, yrot, figsize, bins, **kwds)
2570         values = self.dropna().values
2571 
->2572         ax.hist(values, bins=bins, **kwds)
2573         ax.grid(grid)
2574         axes = np.array([ax])

C:\Users\C06025A\Anaconda\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
5620             for xi in x:
5621                 if len(xi) > 0:
->5622                     xmin = min(xmin, xi.min())
5623                     xmax = max(xmax, xi.max())
5624             bin_range = (xmin, xmax)

TypeError: unorderable types: str() < float()

Answered by Joe

ex3.Severity.value_counts().plot(kind='bar')

is what you actually want: one bar per label, with the count as its height.

When you do:

ex3.Severity.value_counts().hist()

it gets the axes the wrong way round: it partitions your y axis (the counts) into bins, and then plots the number of string labels that fall into each bin.
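
For reference, a minimal self-contained sketch of the bar-chart approach. The `severity` series is a hypothetical stand-in for `ex3.Severity`, built only to match the counts shown in the question, and the non-interactive `Agg` backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for ex3.Severity, matching the counts in the question
severity = pd.Series(["Low"] * 230 + ["Medium"] * 21 + ["High"] * 16, name="Severity")

# value_counts() turns the strings into numbers: one count per label,
# sorted from most to least frequent
counts = severity.value_counts()

# One bar per label; the bar height is that label's count
ax = counts.plot(kind="bar")
ax.set_ylabel("count")

bar_labels = list(counts.index)
bar_heights = [patch.get_height() for patch in ax.patches]
plt.close(ax.figure)
```

Here `bar_labels` and `bar_heights` are only collected to make the result easy to inspect; in a notebook the plot itself is usually all you need.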

Answered by Kirell

This is a matplotlib limitation: it cannot order strings along an axis. However, you can achieve the desired result by plotting against integer positions and labeling the x-ticks:

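import matplotlib.pyplot as plt
import pandas as pd

# emulate your ex3.Severity.value_counts()
data = {'Low': 2, 'Medium': 4, 'High': 5}
df = pd.Series(data)

# draw one bar per integer position, then label each position with its string
plt.bar(range(len(df)), df.values, align='center')
plt.xticks(range(len(df)), df.index.values, size='small')
plt.show()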

[Image: histogram]
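
As a side note, newer matplotlib releases (2.1 and later, as far as I know) accept string labels directly as bar positions, so the manual tick labeling may not be needed there. A small sketch under that assumption, reusing the same emulated counts; the `Agg` backend is again only an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Same emulated counts as in the answer above
counts = pd.Series({"Low": 2, "Medium": 4, "High": 5})

fig, ax = plt.subplots()
# Passing strings as x positions creates a categorical axis
# (assumes matplotlib >= 2.1)
bars = ax.bar(list(counts.index), counts.values, align="center")

heights = [bar.get_height() for bar in bars]
plt.close(fig)
```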

Answered by EdChum

You assumed that, because your data consists of strings, calling plot() on it would automatically perform the value_counts(). That is not the case, hence the error. All you needed to do was:

ex3.Severity.value_counts().plot(kind='bar')
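
To make the counting step explicit, here is a small sketch on made-up data (the `severity` values and the `order` list are assumptions, not from the question). It also shows one way to put the labels in severity order, since value_counts() sorts by frequency rather than by meaning.

```python
import pandas as pd

# Hypothetical string column; any column of labels behaves the same way
severity = pd.Series(["Low"] * 2 + ["Medium"] * 4 + ["High"] * 5, name="Severity")

# The strings themselves cannot be binned, but their counts are plain integers
counts = severity.value_counts()       # sorted by count, most frequent first

# Assumed display order; reindex() rearranges the counts to follow it
order = ["Low", "Medium", "High"]
ordered_counts = counts.reindex(order)
```

`ordered_counts` can then be plotted the same way as any other numeric series.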

Answered by Ouyang Ze

Just an updated answer, as this comes up a lot. Pandas has a nice module for styling dataframes in many ways, and it covers the case mentioned above:

ex3.Severity.value_counts().to_frame().style.bar()

...will print the dataframe with bars built in (like sparklines, to use Excel terminology). Nice for quick analysis in Jupyter notebooks.

See the pandas styling docs.