Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/13740672/

Time: 2020-09-13 20:31:23  Source: igfitidea

In pandas, how can I group by weekday() for a datetime column?

python pandas

Asked by monkut

I'd like to filter out weekend data and only look at data for weekdays (Mon (0) to Fri (4)). I'm new to pandas; what's the best way to accomplish this?

import datetime
import pandas as pd

data = pd.read_csv("data.csv")
data.my_dt

Out[52]:
0     2012-10-01 02:00:39
1     2012-10-01 02:00:38
2     2012-10-01 02:01:05
3     2012-10-01 02:01:07
4     2012-10-01 02:02:03
5     2012-10-01 02:02:09
6     2012-10-01 02:02:03
7     2012-10-01 02:02:35
8     2012-10-01 02:02:33
9     2012-10-01 02:03:01
10    2012-10-01 02:08:53
11    2012-10-01 02:09:04
12    2012-10-01 02:09:09
13    2012-10-01 02:10:20
14    2012-10-01 02:10:45
...

I'd like to do something like:

weekdays_only = data[data.my_dt.weekday() < 5]

AttributeError: 'numpy.int64' object has no attribute 'weekday'

But this doesn't work; I haven't quite grasped how the datetime objects in a column are accessed.

The eventual goal is to arrange the data hierarchically by weekday and hour range, something like:

monday, 0-6, 7-12, 13-18, 19-23
tuesday, 0-6, 7-12, 13-18, 19-23

Answered by Maximilian

Your call to the function "weekday" does not work because it operates on the index of data.my_dt, which is an int64 array (this is where the error message comes from).

You could create a new column in data containing the weekdays, using something like:

data['weekday'] = data['my_dt'].apply(lambda x: x.weekday())

Then you can filter for weekdays with:

weekdays_only = data[data['weekday'] < 5]
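
For reference, here is a minimal end-to-end sketch of the two steps above. It assumes the column actually holds datetime values, which a plain read_csv call does not guarantee, so the sketch parses the dates explicitly with parse_dates. The inline CSV text is hypothetical sample data, not the asker's file.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv (not from the original post).
csv_text = """my_dt
2012-10-05 02:00:39
2012-10-06 02:00:38
2012-10-07 02:01:05
2012-10-08 02:01:07
"""

# parse_dates converts the column to datetime values instead of strings.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["my_dt"])

# Monday is 0 and Sunday is 6, matching datetime.weekday().
data["weekday"] = data["my_dt"].apply(lambda x: x.weekday())

# Keep Monday (0) through Friday (4); the Saturday and Sunday rows are dropped.
weekdays_only = data[data["weekday"] < 5]
print(weekdays_only)
```

Only the Friday and Monday rows should remain in weekdays_only for this sample.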

I hope this helps.

Answered by Kartik

A faster way would be to use DatetimeIndex.weekday, like so:

temp = pd.DatetimeIndex(data['my_dt'])
data['weekday'] = temp.weekday

This is much faster, especially for a large number of rows. For further info, check this answer.
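
As a rough sketch of how this vectorized approach could be extended toward the weekday and hour-range layout the question asks for, the example below bins the hour with pd.cut and counts rows per group. The sample timestamps, the hour and hour_range column names, and the bin edges are assumptions made for illustration; only the four ranges themselves come from the question.

```python
import pandas as pd

# Hypothetical sample timestamps (not from the original post).
data = pd.DataFrame({
    "my_dt": pd.to_datetime([
        "2012-10-01 02:00:39",  # Monday
        "2012-10-01 08:15:00",  # Monday
        "2012-10-02 14:30:00",  # Tuesday
        "2012-10-06 20:45:00",  # Saturday
    ])
})

# Vectorized lookups via DatetimeIndex, with no per-row Python function call.
temp = pd.DatetimeIndex(data["my_dt"])
data["weekday"] = temp.weekday
data["hour"] = temp.hour

# Drop Saturday (5) and Sunday (6); copy so a new column can be added safely.
weekdays_only = data[data["weekday"] < 5].copy()

# Bin hours into the ranges from the question: 0-6, 7-12, 13-18, 19-23.
# Bins are right-inclusive, so -1 is used as the lower edge to include hour 0.
weekdays_only["hour_range"] = pd.cut(
    weekdays_only["hour"],
    bins=[-1, 6, 12, 18, 23],
    labels=["0-6", "7-12", "13-18", "19-23"],
)

# Count rows per (weekday, hour range); observed=True skips empty combinations.
counts = weekdays_only.groupby(["weekday", "hour_range"], observed=True).size()
print(counts)
```

In recent pandas versions, data["my_dt"].dt.weekday offers the same vectorized lookup directly on the column.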
