pandas 按时间过滤熊猫数据框
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/35052691/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
filter pandas dataframe by time
提问by Neil
I have a pandas dataframe which I want to subset on time greater or less than 12pm. First i convert my string datetime to datetime[64]ns object in pandas.
我有一个 Pandas 数据框,我想在大于或小于下午 12 点的时间进行子集化。首先,我将字符串 datetime 转换为 pandas 中的 datetime[64]ns 对象。
segments_data['time'] = pd.to_datetime((segments_data['time']))
Then I separate time,date,month,year & dayofweek like below.
然后我将时间、日期、月份、年份和星期几分开,如下所示。
import datetime as dt
segments_data['date'] = segments_data.time.dt.date
segments_data['year'] = segments_data.time.dt.year
segments_data['month'] = segments_data.time.dt.month
segments_data['dayofweek'] = segments_data.time.dt.dayofweek
segments_data['time'] = segments_data.time.dt.time
My time column looks like following.
我的时间列如下所示。
segments_data['time']
Out[1906]:
07:43:00
07:52:00
08:00:00
08:42:00
09:18:00
09:18:00
09:18:00
09:23:00
12:32:00
12:43:00
12:55:00
Name: time, dtype: object
Now I want to subset dataframe with time greater than 12pm and time less than 12pm.
现在我想对时间大于 12 pm 和时间小于 12 pm 的数据帧进行子集。
segments_data.time[segments_data['time'] < 12:00:00]
It doesn't work because time
is a string object
.
它不起作用,因为time
是string object
.
回答by Bob Baxley
Leave a column as the raw datetime, call it ts
:
保留一列作为原始日期时间,将其命名为ts
:
segments_data['ts'] = pd.to_datetime((segments_data['time']))
Next you can cast the datetime to an H:M:S
string and use between(start,end)
seems to work:
接下来,您可以将日期时间转换为H:M:S
字符串并使用between(start,end)
似乎有效:
In [227]:
segments_data=pd.DataFrame(x,columns=['ts'])
segments_data.ts = pd.to_datetime(segments_data.ts)
segments_data
Out[227]:
ts
0 2016-01-28 07:43:00
1 2016-01-28 07:52:00
2 2016-01-28 08:00:00
3 2016-01-28 08:42:00
4 2016-01-28 09:18:00
5 2016-01-28 09:18:00
6 2016-01-28 09:18:00
7 2016-01-28 09:23:00
8 2016-01-28 12:32:00
9 2016-01-28 12:43:00
10 2016-01-28 12:55:00
In [228]:
segments_data[segments_data.ts.dt.strftime('%H:%M:%S').between('00:00:00','12:00:00')]
Out[228]:
ts
0 2016-01-28 07:43:00
1 2016-01-28 07:52:00
2 2016-01-28 08:00:00
3 2016-01-28 08:42:00
4 2016-01-28 09:18:00
5 2016-01-28 09:18:00
6 2016-01-28 09:18:00
7 2016-01-28 09:23:00