pandas: Histogram of a pandas DataFrame

Question

Asked by ca_san

I couldn't find a similar question anywhere on the site.

I have a fairly large file, with over 100000 lines, which I read using pandas:

import pandas as pd
df = pd.read_excel("somefile.xls", index_col='Offense Type')

I ended up with a DataFrame consisting of the index column, 'Offense Type', and one other column, 'Hour'.

'Offense Type' consists of a series of categories, say cat1, cat2, cat3, etc. 'Hour' consists of a series of integers between 1 and 24.

What I would like is a histogram of the occurrences of each number in the DataFrame, one per category (there aren't many categories, 10 at most).

Here's an ASCII representation of what I want to get:

(The x's represent the bars in the histogram; the real counts will surely be much higher than 1, 2 or 3.)

   x        x         # And so on
 x x  x     x x  x    #
 x x  x  x  x x  x    #
 1 2 11 20  5 8 18    #
   Cat1      Cat2     #

But I'm getting a single bar for every row in df using:

df.plot(kind='bar')

which is basically unreadable:

[image: histogram_of_dataframe]
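For context, `df.plot(kind='bar')` draws one bar per row, so a frame with over 100000 rows produces that many bars. A minimal sketch of counting first and plotting the counts afterwards, using a tiny made-up sample in which 'Offense Type' is assumed to be an ordinary column rather than the index (the data values are hypothetical, and `pd.crosstab` is one possible way to tabulate, not necessarily what the asker used):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample: one row per event, as in the question.
df = pd.DataFrame({
    "Offense Type": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2"],
    "Hour": [1, 2, 2, 5, 5, 8],
})

# Count how often each hour occurs within each category.
# Rows are categories, columns are hours, cells are counts.
counts = pd.crosstab(df["Offense Type"], df["Hour"])

# One group of bars per category, one bar per hour within the group.
ax = counts.plot(kind="bar")
```

With the counts aggregated this way, the plot has one bar per (category, hour) pair instead of one bar per row.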

I've also tried the hist() and Histogram() functions, with no luck.

Here's some sample data:

[image: sample_data]

Answer 1

Answered by ca_san

After a long night, I found the answer. Since every event occurs only once, I added an extra column to the file, filled with the number one, and then indexed the DataFrame by it:

df = pd.read_excel("somefile.xls",index_col='Numberone')

I then simply tried this:

df.hist(by=df['Offense Type'])

and finally got exactly what I wanted.
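The same `hist(by=...)` call can be tried without the spreadsheet. A minimal sketch on synthetic data, assuming 'Offense Type' and 'Hour' are both ordinary columns; the random data, the `column="Hour"` argument and the half-integer `bins` are assumptions added for illustration and are not part of the original answer:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Synthetic stand-in for the spreadsheet: one row per event.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Offense Type": rng.choice(["cat1", "cat2", "cat3"], size=1000),
    "Hour": rng.integers(1, 25, size=1000),  # integers from 1 to 24 inclusive
})

# One histogram of 'Hour' per category. Bin edges sit at half-integers
# (0.5, 1.5, ..., 24.5) so that each hour gets its own bar.
axes = df.hist(column="Hour", by="Offense Type", bins=np.arange(0.5, 25.5, 1.0))
```

Passing `column="Hour"` restricts the histograms to that column, which may make the extra column of ones unnecessary in this setup.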
