Frequency plot in Python/Pandas DataFrame

Question

Asked by SMU

I have parsed a very large dataframe with several columns and values like this:

Name Age Points ...
XYZ  42  32pts  ...
ABC  41  32pts  ...
DEF  32  35pts
GHI  52  35pts
JHK  72  35pts
MNU  43  42pts
LKT  32  32pts
LKI  42  42pts
JHI  42  35pts
JHP  42  42pts
XXX  42  42pts
XYY  42  35pts

I have imported numpy and matplotlib.

I need to plot the number of times each value in the column 'Points' occurs. I don't need any bins for the plot; it is more a plot of how many times the same score occurs across a large dataset.

So essentially the bar plot (or histogram, if you can call it that) should show that 32pts occurs 3 times, 35pts occurs 5 times and 42pts occurs 4 times. If I can plot the values in sorted order, all the better. I have tried df.hist() but it is not working for me. Any clues? Thanks.

Answer 1

Answered by Paul H

Just plot the results of the column's value_counts method directly:

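import matplotlib.pyplot as plt
import pandas

data = load_my_data()  # placeholder: returns your DataFrame
fig, ax = plt.subplots()
# count each distinct value in 'Points' and draw one bar per value
data['Points'].value_counts().plot(ax=ax, kind='bar')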
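
The question also asks for the bars in sorted order. By default value_counts orders its result by count, so one option is to chain sort_index() to order by label instead. The following is a minimal sketch, not part of the original answer: it assumes a small DataFrame rebuilt from the sample rows in the question, and the output file name is made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: the 'Points' values from the sample rows in the question.
df = pd.DataFrame({
    "Points": ["32pts", "32pts", "35pts", "35pts", "35pts", "42pts",
               "32pts", "42pts", "35pts", "42pts", "42pts", "35pts"],
})

# value_counts() orders by frequency; sort_index() reorders by label.
counts = df["Points"].value_counts().sort_index()

fig, ax = plt.subplots()
counts.plot(ax=ax, kind="bar")
ax.set_xlabel("Points")
ax.set_ylabel("Count")

# Hypothetical file name; use plt.show() instead in an interactive session.
fig.savefig("points_frequency.png")
```

Note that the labels are strings here, so the order is lexical. That matches numeric order for this sample, but for mixed-width values it may be safer to convert to integers first.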

If you want to remove the string 'pts' from all of the elements in your column, you can do something like this:

df['points_int'] = df['Points'].str.replace('pts', '').astype(int)

That assumes they all end with 'pts'. If the suffix varies from line to line, you need to look into regular expressions, as in this question: Split columns using pandas
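
As a rough illustration of the regular-expression route, the sketch below uses Series.str.extract to keep only the leading digits. The varying suffixes in the sample data are invented for the example, and it assumes every row contains at least one digit (otherwise the conversion to int would fail on the missing value).

```python
import pandas as pd

# Assumption: made-up values with inconsistent suffixes.
df = pd.DataFrame({"Points": ["32pts", "35 points", "42pnts", "7"]})

# Capture the first run of digits; expand=False returns a Series.
df["points_int"] = df["Points"].str.extract(r"(\d+)", expand=False).astype(int)

print(df)
```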

And the official docs: http://pandas.pydata.org/pandas-docs/stable/text.html#text-string-methods

Answer 2

Answered by Yogesh Kumar

The Seaborn package has a countplot function that can be used to make a frequency plot:

import seaborn as sns

ax = sns.countplot(x="Points",data=df)
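
To get the sorted order the question asks for, countplot accepts an order argument listing the categories in the sequence they should be drawn. This is a small sketch under the assumption that df holds the sample 'Points' values from the question; the output file name is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Assumption: the 'Points' values from the sample rows in the question.
df = pd.DataFrame({
    "Points": ["32pts", "32pts", "35pts", "35pts", "35pts", "42pts",
               "32pts", "42pts", "35pts", "42pts", "42pts", "35pts"],
})

# Sort the distinct labels (lexically, since they are strings).
order = sorted(df["Points"].unique())

fig, ax = plt.subplots()
sns.countplot(x="Points", data=df, order=order, ax=ax)

# Hypothetical file name; use plt.show() instead in an interactive session.
fig.savefig("points_countplot.png")
```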
