Original URL: http://stackoverflow.com/questions/19179214/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Selecting Data between Specific hours in a pandas dataframe
Asked by itsaruns
My pandas DataFrame looks something like this:
1. 2013-10-09 09:00:05
2. 2013-10-09 09:05:00
3. 2013-10-09 10:00:00
4. ............
5. ............
6. ............
7. 2013-10-10 09:00:05
8. 2013-10-10 09:05:00
9. 2013-10-10 10:00:00
I want the rows whose time of day falls between 9:00 and 10:00. If anyone has worked on something like this, any pointers would be really helpful.
Answered by Jeff
In [7]: index = date_range('20131009 08:30','20131010 10:05',freq='5T')
In [8]: df = DataFrame(randn(len(index),2),columns=list('AB'),index=index)
In [9]: df
Out[9]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 308 entries, 2013-10-09 08:30:00 to 2013-10-10 10:05:00
Freq: 5T
Data columns (total 2 columns):
A 308 non-null values
B 308 non-null values
dtypes: float64(2)
In [10]: df.between_time('9:00','10:00')
Out[10]:
A B
2013-10-09 09:00:00 -0.664639 1.597453
2013-10-09 09:05:00 1.197290 -0.500621
2013-10-09 09:10:00 1.470186 -0.963553
2013-10-09 09:15:00 0.181314 -0.242415
2013-10-09 09:20:00 0.969427 -1.156609
2013-10-09 09:25:00 0.261473 0.413926
2013-10-09 09:30:00 -0.003698 0.054953
2013-10-09 09:35:00 0.418147 -0.417291
2013-10-09 09:40:00 0.413565 -1.096234
2013-10-09 09:45:00 0.460293 1.200277
2013-10-09 09:50:00 -0.702444 -0.041597
2013-10-09 09:55:00 0.548385 -0.832382
2013-10-09 10:00:00 -0.526582 0.758378
2013-10-10 09:00:00 0.926738 0.178204
2013-10-10 09:05:00 -1.178534 0.184205
2013-10-10 09:10:00 1.408258 0.948526
2013-10-10 09:15:00 0.523318 0.327390
2013-10-10 09:20:00 -0.193174 0.863294
2013-10-10 09:25:00 1.355610 -2.160864
2013-10-10 09:30:00 1.930622 0.174683
2013-10-10 09:35:00 0.273551 0.870682
2013-10-10 09:40:00 0.974756 -0.327763
2013-10-10 09:45:00 1.808285 0.080267
2013-10-10 09:50:00 0.842119 0.368689
2013-10-10 09:55:00 1.065585 0.802003
2013-10-10 10:00:00 -0.324894 0.781885
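For reference, the session above can be reproduced as a standalone script on recent pandas versions. This is only a minimal sketch: the explicit imports, the random seed, and the `'5min'` frequency alias (a newer spelling of `'5T'`) are assumptions that do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# 5-minute index covering the same span as the session above
index = pd.date_range("2013-10-09 08:30", "2013-10-10 10:05", freq="5min")

# Two columns of random data; the seed is arbitrary and only makes runs repeatable
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((len(index), 2)),
                  columns=list("AB"), index=index)

# between_time requires a DatetimeIndex and keeps both endpoints by default,
# so rows stamped exactly 09:00 and 10:00 are included on every day
morning = df.between_time("9:00", "10:00")
print(morning.head())
```

Because `between_time` looks only at the time of day, the same window is selected on each date in the index.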
Answered by ak3191
Create new columns for the time components by splitting your original column. Use the code below to split the time into hours, minutes, and seconds:
df[['h','m','s']] = df['Time'].astype(str).str.split(':', expand=True).astype(int)
Once that is done, select the data by filtering on the hour column:
df9to10 = df[df['h'].between(9, 10, inclusive='both')]  # pandas >= 1.3; older versions take inclusive=True
This approach is also flexible: change the bounds if you want a period other than 9 to 10.
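If the `Time` column holds full timestamps rather than `HH:MM:SS` strings, splitting on `:` will not yield clean integers. The following is a minimal sketch of the same hour-based filter using the `.dt` accessor instead. It assumes the column can be parsed by `pd.to_datetime`; the first three sample rows come from the question, and the 11:30 row is made up for illustration.

```python
import pandas as pd

# Sample data: three timestamps from the question plus one hypothetical
# out-of-range row (11:30) to show that the filter drops it
df = pd.DataFrame({"Time": ["2013-10-09 09:00:05",
                            "2013-10-09 09:05:00",
                            "2013-10-09 10:00:00",
                            "2013-10-09 11:30:00"]})
df["Time"] = pd.to_datetime(df["Time"])

# Extract the hour directly instead of splitting strings
hours = df["Time"].dt.hour

# Same condition as between(9, 10) with both ends included
df9to10 = df[(hours >= 9) & (hours <= 10)]
print(df9to10)
```

Note that filtering on the hour alone keeps every row in the 10 o'clock hour (for example 10:45), not just rows up to 10:00 sharp.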
Answered by boates
Assuming your original DataFrame is called "df" and your time column is called "time", this would work (where start_time and end_time bound the time interval you want):
>>> df_new = df[(df['time'] > start_time) & (df['time'] < end_time)]
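The answer does not say what type `start_time` and `end_time` are. One possible reading, sketched below, is that they are `datetime.time` objects compared against the time-of-day part of a datetime column. The sample rows are taken from the question; the `clock` helper variable and the 9:00 and 10:00 bounds are assumptions for illustration.

```python
from datetime import time

import pandas as pd

# Timestamps from the question, stored in a datetime64 column named "time"
df = pd.DataFrame({"time": pd.to_datetime(["2013-10-09 09:00:05",
                                           "2013-10-09 09:05:00",
                                           "2013-10-09 10:00:00",
                                           "2013-10-10 09:00:05",
                                           "2013-10-10 09:05:00",
                                           "2013-10-10 10:00:00"])})

# Assumed bounds: plain time-of-day values with no date attached
start_time = time(9, 0)
end_time = time(10, 0)

# Compare only the time-of-day component so the filter applies on every date
clock = df["time"].dt.time
df_new = df[(clock > start_time) & (clock < end_time)]
print(df_new)
```

The strict `>` and `<` comparisons exclude rows stamped exactly at either bound, so the 10:00:00 rows are dropped here; use `>=` and `<=` to keep them.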

