Pandas Drop Rows Outside of Time Range
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/14539992/
Asked by Jeff
I am trying to go through every row in a DataFrame index and remove all rows that are not between a certain time.
I have been looking for solutions but none of them separate the Date from the Time, and all I want to do is drop the rows that are outside of a Time range.
Answered by Andy Hayden
You can use the between_time function directly:
ts.between_time(datetime.time(18), datetime.time(9), include_start=False, include_end=False)
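In newer pandas releases, the include_start and include_end keywords of between_time were, as far as I know, replaced by a single inclusive argument (deprecated in 1.4, removed in 2.0). The following is a minimal sketch under that assumption; the two-day sample series and the 09:00 to 18:00 window are illustrative and not part of the original answer:

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data (an assumption): 48 hourly timestamps spanning two days.
rng = pd.date_range("2000-01-01", periods=48, freq=pd.Timedelta(hours=1))
ts = pd.Series(np.arange(len(rng)), index=rng)

# Keep only rows whose time of day lies in [09:00, 18:00]; the date is ignored.
kept = ts.between_time(datetime.time(9), datetime.time(18), inclusive="both")

# The complement: rows strictly between 18:00 and 09:00.
# Because start > end, the window wraps around midnight.
dropped = ts.between_time(datetime.time(18), datetime.time(9), inclusive="neither")
```

On older pandas versions the include_start/include_end form shown above is the one to use instead.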
Original answer:
You can use the indexer_between_time Index method.
For example, to include those times between 9am and 6pm (inclusive):
ts.iloc[ts.index.indexer_between_time(datetime.time(9), datetime.time(18))]
To do the opposite and exclude those times between 6pm and 9am (exclusive):
ts.iloc[ts.index.indexer_between_time(datetime.time(18), datetime.time(9),
                                      include_start=False, include_end=False)]
Note: indexer_between_time's arguments include_start and include_end are True by default. Setting include_start to False means that datetimes whose time part is exactly start_time (the first argument), in this case 6pm, will not be included.
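Since indexer_between_time returns integer positions rather than labels, it pairs naturally with positional indexing, and the positions can be turned into a boolean mask when the rows outside the window are wanted. A small sketch of that idea; the variable names (positions, mask, inside, outside) and the sample data are assumptions made for illustration:

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data (an assumption): one day of hourly timestamps.
rng = pd.date_range("2000-01-01", periods=24, freq=pd.Timedelta(hours=1))
ts = pd.Series(np.arange(len(rng)), index=rng)

# Integer positions of rows whose time of day is within [10:00, 14:00].
positions = ts.index.indexer_between_time(datetime.time(10), datetime.time(14))

# Positional selection keeps exactly those rows.
inside = ts.iloc[positions]

# Build a boolean mask from the positions and negate it to get the rest.
mask = np.zeros(len(ts), dtype=bool)
mask[positions] = True
outside = ts[~mask]
```

Here inside holds the 10:00 to 14:00 rows and outside holds every other row of the day.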
Example:
In [1]: rng = pd.date_range('1/1/2000', periods=24, freq='H')
In [2]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [3]: ts.iloc[ts.index.indexer_between_time(datetime.time(10), datetime.time(14))]
Out[3]:
2000-01-01 10:00:00 1.312561
2000-01-01 11:00:00 -1.308502
2000-01-01 12:00:00 -0.515339
2000-01-01 13:00:00 1.536540
2000-01-01 14:00:00 0.108617
Note: the same approach works for a DataFrame:
In [4]: df = pd.DataFrame(ts)
In [5]: df.iloc[df.index.indexer_between_time(datetime.time(10), datetime.time(14))]
Out[5]:
0
2000-01-01 10:00:00 1.312561
2000-01-01 11:00:00 -1.308502
2000-01-01 12:00:00 -0.515339
2000-01-01 13:00:00 1.536540
2000-01-01 14:00:00 0.108617
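To actually drop the unwanted rows from a DataFrame, as the question asks, one option is to look up the labels of the rows outside the window and pass them to drop. This is only a sketch: the column name "value", the two-day sample index, and the 09:00 to 18:00 window are assumptions, and it relies on the index labels being unique:

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data (an assumption): two days of hourly rows.
rng = pd.date_range("2000-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": np.arange(len(rng))}, index=rng)

# Positions of rows strictly between 18:00 and 09:00 (wraps past midnight).
outside_pos = df.index.indexer_between_time(
    datetime.time(18), datetime.time(9), include_start=False, include_end=False
)

# Drop those rows by label; what remains is 09:00 to 18:00 on every date.
trimmed = df.drop(df.index[outside_pos])
```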
Answered by Ivelin
You can also do:
rng = pd.date_range('1/1/2000', periods=24, freq='H')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts.loc[datetime.time(10):datetime.time(14)]
Out[4]:
2000-01-01 10:00:00 -0.363420
2000-01-01 11:00:00 -0.979251
2000-01-01 12:00:00 -0.896648
2000-01-01 13:00:00 -0.051159
2000-01-01 14:00:00 -0.449192
Freq: H, dtype: float64
A DataFrame works the same way.
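As an illustration of the DataFrame case, the sketch below slices by two datetime.time objects with .loc and compares the result with between_time. It assumes a DataFrame with a DatetimeIndex and a pandas version in which .loc accepts time slices (to my understanding this is long-standing behaviour); the column name "value" and the sample data are made up:

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data (an assumption): one day of hourly rows.
rng = pd.date_range("2000-01-01", periods=24, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": np.arange(len(rng))}, index=rng)

# Slice by time of day; both endpoints are included.
by_slice = df.loc[datetime.time(10):datetime.time(14)]

# The equivalent call with between_time, whose endpoints are inclusive by default.
by_method = df.between_time(datetime.time(10), datetime.time(14))
```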

