Plot number of occurrences from Pandas DataFrame
Original question: http://stackoverflow.com/questions/21331722/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by Timofey
I have a DataFrame with two columns. One contains timestamps and the other contains the id of some action. Something like this:
2000-12-29 00:10:00 action1
2000-12-29 00:20:00 action2
2000-12-29 00:30:00 action2
2000-12-29 00:40:00 action1
2000-12-29 00:50:00 action1
...
2000-12-31 00:10:00 action1
2000-12-31 00:20:00 action2
2000-12-31 00:30:00 action2
I would like to know how many actions of a certain type were performed on a given day. That is, for every day, I need to count the number of occurrences of actionX and plot this data with the date on the X axis and the number of occurrences of actionX on the Y axis.
Of course, I can count actions for each day naively just by iterating through my dataset. But what's the "right way" to do it with pandas/matplotlib?
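For reference, the naive iteration mentioned above might look like the following minimal sketch. It uses only the standard library; the `rows` list and the `count_per_day` helper are illustrative names, and the data is limited to the eight rows that appear in full on this page.

```python
from collections import Counter
from datetime import datetime

# Illustrative sample rows mirroring the data shown in the question
rows = [
    ("2000-12-29 00:10:00", "action1"),
    ("2000-12-29 00:20:00", "action2"),
    ("2000-12-29 00:30:00", "action2"),
    ("2000-12-29 00:40:00", "action1"),
    ("2000-12-29 00:50:00", "action1"),
    ("2000-12-31 00:10:00", "action1"),
    ("2000-12-31 00:20:00", "action2"),
    ("2000-12-31 00:30:00", "action2"),
]


def count_per_day(rows):
    """Count occurrences of each (date, action) pair by plain iteration."""
    counts = Counter()
    for stamp, action in rows:
        # Keep only the calendar date of the timestamp
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        counts[(day, action)] += 1
    return counts


naive_counts = count_per_day(rows)
```

This works, but it leaves the reshaping and plotting to be done by hand, which is what the pandas answers below avoid.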
Accepted answer by mkln
Starting from
mydate col_name
0 2000-12-29 00:10:00 action1
1 2000-12-29 00:20:00 action2
2 2000-12-29 00:30:00 action2
3 2000-12-29 00:40:00 action1
4 2000-12-29 00:50:00 action1
5 2000-12-31 00:10:00 action1
6 2000-12-31 00:20:00 action2
7 2000-12-31 00:30:00 action2
You can do
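import pandas as pd
df['mydate'] = pd.to_datetime(df['mydate'])  # parse the strings into timestamps
df = df.set_index('mydate')
df['day'] = df.index.date  # calendar date of each row
# size() counts the rows in each (day, action) group and returns a Series;
# agg(len) can come back empty when no columns are left besides the group keys
counts = df.groupby(['day', 'col_name']).size()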
There may be an even more straightforward way, but the above should work anyway.
If you want to use counts as a DataFrame, I'd then transform it back
counts = pd.DataFrame(counts, columns=['count'])
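As a self-contained sketch of the approach in this answer, the snippet below rebuilds the sample frame and counts the rows per day and action. It is a variation, not the answer's exact code: it groups on the `mydate` column directly instead of setting it as the index, and the `table` name and the `unstack` step are additions made here for illustration.

```python
import pandas as pd

# Sample data copied from the rows shown above
df = pd.DataFrame({
    "mydate": [
        "2000-12-29 00:10:00", "2000-12-29 00:20:00", "2000-12-29 00:30:00",
        "2000-12-29 00:40:00", "2000-12-29 00:50:00", "2000-12-31 00:10:00",
        "2000-12-31 00:20:00", "2000-12-31 00:30:00",
    ],
    "col_name": [
        "action1", "action2", "action2", "action1",
        "action1", "action1", "action2", "action2",
    ],
})

df["mydate"] = pd.to_datetime(df["mydate"])

# Group by calendar day and action; size() counts the rows in each group
counts = (
    df.groupby([df["mydate"].dt.date, "col_name"])
      .size()
      .rename("count")
)

# Wide layout: one row per day, one column per action, 0 where an action is absent
table = counts.unstack(fill_value=0)
```

The wide `table` is convenient for plotting, since each action becomes its own column with the days along the index.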
Answer by David Hagan
You can get the counts by using
# size() counts rows per (date, action) group; count() needs another column to count
df.groupby([df.index.date, 'action']).size()
or you can plot directly using this method
# size() gives one count per (date, action) pair, which plot() draws as one bar each
df.groupby([df.index.date, 'action']).size().plot(kind='bar')
You could also store the result in a variable such as count and then plot it separately. This assumes that your index is already a DatetimeIndex; otherwise, follow the directions of @mkln above.
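A minimal sketch of the "store, then plot" variant, assuming a frame with a DatetimeIndex and a single `action` column. The `per_day` name, the axis labels, and the `Agg` backend are choices made here, not part of the answer; the `unstack` step is also an addition so that each action gets its own bar per day.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample data with the timestamps already in a DatetimeIndex
df = pd.DataFrame(
    {"action": ["action1", "action2", "action2", "action1",
                "action1", "action1", "action2", "action2"]},
    index=pd.to_datetime([
        "2000-12-29 00:10:00", "2000-12-29 00:20:00", "2000-12-29 00:30:00",
        "2000-12-29 00:40:00", "2000-12-29 00:50:00", "2000-12-31 00:10:00",
        "2000-12-31 00:20:00", "2000-12-31 00:30:00",
    ]),
)

# Count rows per (date, action), then pivot the actions into columns
per_day = df.groupby([df.index.date, "action"]).size().unstack(fill_value=0)

# One group of bars per date, one bar per action
ax = per_day.plot(kind="bar")
ax.set_xlabel("date")
ax.set_ylabel("number of occurrences")
plt.tight_layout()
```

Call `plt.show()` in an interactive session, or `plt.savefig(...)` with a path of your choice, to see the chart.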

