Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/44547401/

Time: 2020-09-14 03:47:08  Source: igfitidea

Pandas: Selecting DataFrame rows between two dates (Datetime Index)

Tags: python, pandas, datetime, dataframe, indexing

Asked by user3142067

I have a Pandas DataFrame with a DatetimeIndex and one column, MSE Loss. The index is formatted as follows:

DatetimeIndex(['2015-07-16 07:14:41', '2015-07-16 07:14:48',
           '2015-07-16 07:14:54', '2015-07-16 07:15:01',
           '2015-07-16 07:15:07', '2015-07-16 07:15:14',...]

The index spans several days.

I want to select all the rows (all times) of a particular day without knowing the actual time intervals in advance. For example: between 2015-07-16 07:00:00 and 2015-07-16 23:00:00.

I tried the approach outlined in a linked post, but

df[date_from:date_to]

outputs:

KeyError: Timestamp('2015-07-16 07:00:00')

So it wants exact index values. Furthermore, I don't have a date column, only an index with the dates.

What is the best way to select a whole day by just providing a date such as 2015-07-16, and how could I then select a specific time range within a particular day?

Accepted answer by Andrew L

Option 1:

Sample df:

df
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1

It looks like you're trying this without .loc (it won't work without it):

df.loc['2015-07-16 07:00:00':'2015-07-16 23:00:00']
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1
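
For reference, here is a minimal self-contained sketch of the .loc approach. It uses a hypothetical frame built from the sample rows above plus one invented row on the following day; that extra row and the variable names are assumptions for illustration only. It also shows partial string indexing, where passing just the date selects every row on that day. Label slicing with bounds that are not exact index values assumes a sorted index, hence the sort_index() call.

```python
import pandas as pd

# Hypothetical data: the sample rows plus one invented row on the next day.
idx = pd.DatetimeIndex([
    '2015-07-16 07:14:41', '2015-07-16 07:14:48', '2015-07-16 07:14:54',
    '2015-07-16 07:15:01', '2015-07-16 07:15:07', '2015-07-16 07:15:14',
    '2015-07-17 08:00:00',
])
df = pd.DataFrame({'a': [12, 34, 65, 34, 23, 1, 99]}, index=idx)

# Slicing by label needs a sorted (monotonic) index when the bounds
# are not exact index values.
df = df.sort_index()

# Time range within a day; both endpoints are inclusive.
in_range = df.loc['2015-07-16 07:00:00':'2015-07-16 23:00:00']

# Whole day: a date-only string matches every timestamp on that date.
whole_day = df.loc['2015-07-16']

print(in_range)
print(whole_day)
```

If the original frame raises a KeyError on such a slice, an unsorted index is one likely cause worth checking with df.index.is_monotonic_increasing.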

Option 2:

You can use boolean indexing on the index:

df[(df.index.get_level_values(0) >= '2015-07-16 07:00:00') & (df.index.get_level_values(0) <= '2015-07-16 23:00:00')]
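
A short sketch of the boolean-mask idea, assuming a plain single-level DatetimeIndex so that df.index can be compared directly. The data below is invented for illustration and deliberately left unsorted, since a mask does not depend on index order. The whole-day variant, which compares midnight-normalized timestamps, is one possible approach rather than part of the original answer.

```python
import pandas as pd

# Hypothetical, deliberately unsorted data.
idx = pd.DatetimeIndex([
    '2015-07-17 08:00:00', '2015-07-16 07:14:48',
    '2015-07-16 07:14:41', '2015-07-16 23:30:00',
])
df = pd.DataFrame({'a': [99, 34, 12, 7]}, index=idx)

start = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

# Element-wise comparison yields a boolean array; & combines both bounds.
mask = (df.index >= start) & (df.index <= end)
in_range = df[mask]

# Whole day: normalize() sets every timestamp to midnight of its date.
day_mask = df.index.normalize() == pd.Timestamp('2015-07-16')
whole_day = df[day_mask]

print(in_range)
print(whole_day)
```

Rows come back in their original order, so sort afterwards if ordering matters.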

Answer by JrtPec

You can use truncate:

begin = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

df.truncate(before=begin, after=end)
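
A self-contained sketch of the truncate call, using invented sample data (the values and the extra next-day row are assumptions). truncate expects a sorted index, so the frame is sorted first; both bounds are inclusive, which makes the result match a .loc slice over the same bounds.

```python
import pandas as pd

# Hypothetical data; sorted because truncate requires a sorted index.
idx = pd.DatetimeIndex([
    '2015-07-16 07:14:41', '2015-07-16 07:14:48', '2015-07-16 07:14:54',
    '2015-07-16 07:15:01', '2015-07-16 07:15:07', '2015-07-16 07:15:14',
    '2015-07-17 08:00:00',
])
df = pd.DataFrame({'a': [12, 34, 65, 34, 23, 1, 99]}, index=idx).sort_index()

begin = pd.Timestamp('2015-07-16 07:00:00')
end = pd.Timestamp('2015-07-16 23:00:00')

# Drop everything before `begin` and after `end` (bounds inclusive).
truncated = df.truncate(before=begin, after=end)

# Equivalent label-based slice for comparison.
sliced = df.loc[begin:end]

print(truncated)
```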