pandas: filter a pandas DataFrame by time

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/35052691/

Date: 2020-09-14 00:34:44  Source: igfitidea

filter pandas dataframe by time

python datetime pandas

Asked by Neil

I have a pandas DataFrame that I want to subset on times earlier or later than 12 pm. First I convert my string datetime column to a datetime64[ns] column in pandas.

segments_data['time'] = pd.to_datetime(segments_data['time'])

Then I split out the time, date, month, year and day of week as shown below.

import datetime as dt

segments_data['date'] = segments_data.time.dt.date
segments_data['year'] = segments_data.time.dt.year
segments_data['month'] = segments_data.time.dt.month
segments_data['dayofweek'] = segments_data.time.dt.dayofweek
segments_data['time'] = segments_data.time.dt.time

My time column looks like the following:

segments_data['time']
Out[1906]: 
  07:43:00
  07:52:00
  08:00:00
  08:42:00
  09:18:00
  09:18:00
  09:18:00
  09:23:00
  12:32:00
  12:43:00
  12:55:00
  Name: time, dtype: object

Now I want to subset the DataFrame into rows with time later than 12 pm and rows with time earlier than 12 pm.

segments_data.time[segments_data['time'] < 12:00:00]

It doesn't work because time is a string object.
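
A note on why the attempt fails, offered as a sketch rather than a definitive diagnosis: `12:00:00` is not a valid Python literal, so the expression is rejected before any comparison runs. In addition, `.dt.time` yields `datetime.time` objects stored with dtype `object` rather than strings, and those can be compared against another `datetime.time`. The sample timestamps below are made up for illustration, and the `ts` column name is an assumption:

```python
import datetime as dt

import pandas as pd

# Hypothetical sample data standing in for the real segments_data
segments_data = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2016-01-28 07:43:00",
                "2016-01-28 09:18:00",
                "2016-01-28 12:32:00",
                "2016-01-28 12:55:00",
            ]
        )
    }
)

# .dt.time returns datetime.time objects (the column dtype is object)
segments_data["time"] = segments_data["ts"].dt.time

# Compare against a datetime.time value instead of the bare 12:00:00
noon = dt.time(12, 0)
before_noon = segments_data[segments_data["time"] < noon]
noon_or_later = segments_data[segments_data["time"] >= noon]
```

Keeping the original datetime in its own column (here `ts`) means the date information is not lost when the time part is extracted.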

Answered by Bob Baxley

Leave a column as the raw datetime, call it ts:

segments_data['ts'] = pd.to_datetime(segments_data['time'])

Next, you can cast the datetime to an H:M:S string and use between(start, end), which seems to work:

In [227]:
segments_data=pd.DataFrame(x,columns=['ts'])
segments_data.ts = pd.to_datetime(segments_data.ts)
segments_data
Out[227]:
ts
0   2016-01-28 07:43:00
1   2016-01-28 07:52:00
2   2016-01-28 08:00:00
3   2016-01-28 08:42:00
4   2016-01-28 09:18:00
5   2016-01-28 09:18:00
6   2016-01-28 09:18:00
7   2016-01-28 09:23:00
8   2016-01-28 12:32:00
9   2016-01-28 12:43:00
10  2016-01-28 12:55:00

In [228]:
segments_data[segments_data.ts.dt.strftime('%H:%M:%S').between('00:00:00', '12:00:00')]
Out[228]:
ts
0   2016-01-28 07:43:00
1   2016-01-28 07:52:00
2   2016-01-28 08:00:00
3   2016-01-28 08:42:00
4   2016-01-28 09:18:00
5   2016-01-28 09:18:00
6   2016-01-28 09:18:00
7   2016-01-28 09:23:00
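
For reference, here is the same approach as a self-contained sketch. The `x` used to build the frame is not shown in the answer, so the list below is an assumption reconstructed from the printed output. `between` is inclusive on both ends by default, so a row at exactly 12:00:00 would land in the morning group; the afternoon rows are taken as the complement. The last lines show an alternative that is not part of the answer: `DataFrame.between_time`, which requires a `DatetimeIndex`.

```python
import datetime as dt

import pandas as pd

# Assumed input: timestamps copied from the output shown above
x = [
    "2016-01-28 07:43:00", "2016-01-28 07:52:00", "2016-01-28 08:00:00",
    "2016-01-28 08:42:00", "2016-01-28 09:18:00", "2016-01-28 09:18:00",
    "2016-01-28 09:18:00", "2016-01-28 09:23:00", "2016-01-28 12:32:00",
    "2016-01-28 12:43:00", "2016-01-28 12:55:00",
]
segments_data = pd.DataFrame(x, columns=["ts"])
segments_data["ts"] = pd.to_datetime(segments_data["ts"])

# Zero-padded HH:MM:SS strings sort in chronological order,
# so a plain string range check works here
hms = segments_data["ts"].dt.strftime("%H:%M:%S")
is_morning = hms.between("00:00:00", "12:00:00")

morning = segments_data[is_morning]
afternoon = segments_data[~is_morning]

# Alternative (not from the answer): index by timestamp and use between_time
afternoon_by_index = segments_data.set_index("ts").between_time(
    dt.time(12, 0), dt.time(23, 59, 59)
)
```

The string comparison only works because `%H:%M:%S` is zero-padded; a format such as `%I:%M %p` would not sort correctly.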