pandas: How to slice a DataFrame by timestamp when the timestamp is not the index?

Disclaimer: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/37646501/

Date: 2020-09-14 01:20:25  Source: igfitidea

How can I slice a dataframe by timestamp, when timestamp isn't classified as index?

Tags: python, pandas, dataframe, split, timestamp

Asked by dot.Py

How can I split my pandas DataFrame using its Timestamp column?

I get the following prices when I print df30m:

               Timestamp    Open    High     Low   Close     Volume
0    2016-05-01 19:30:00  449.80  450.13  449.80  449.90    74.1760
1    2016-05-01 20:00:00  449.90  450.27  449.90  450.07    63.5840
2    2016-05-01 20:30:00  450.12  451.00  450.02  450.51    64.1080
3    2016-05-01 21:00:00  450.51  452.05  450.50  451.22    75.7390
4    2016-05-01 21:30:00  451.21  451.64  450.81  450.87    71.1190
5    2016-05-01 22:00:00  450.87  452.05  450.87  451.07    73.8430
6    2016-05-01 22:30:00  451.09  451.70  450.91  450.91    68.1490
7    2016-05-01 23:00:00  450.91  450.98  449.97  450.61    84.5430
8    2016-05-01 23:30:00  450.61  451.50  450.55  451.45   111.2370
9    2016-05-02 00:00:00  451.47  452.31  450.69  451.19   190.0750
10   2016-05-02 00:30:00  451.20  451.68  450.45  450.82   186.0930
11   2016-05-02 01:00:00  450.83  451.64  450.65  450.73   112.4630
12   2016-05-02 01:30:00  450.73  451.10  450.31  450.56   137.7530
13   2016-05-02 02:00:00  450.56  452.01  449.98  450.27   151.6140
14   2016-05-02 02:30:00  450.27  451.30  450.23  451.11    99.5490
15   2016-05-02 03:00:00  451.29  451.29  450.17  450.33   178.9860
16   2016-05-02 03:30:00  450.44  451.20  450.44  450.75    65.1480
17   2016-05-02 04:00:00  450.79  451.20  450.75  451.00    78.0430
18   2016-05-02 04:30:00  451.00  451.11  450.85  451.11    64.7250
19   2016-05-02 05:00:00  451.11  451.64  451.00  451.12    73.4840
20   2016-05-02 05:30:00  451.12  451.83  450.67  451.33    94.1950
21   2016-05-02 06:00:00  451.35  451.37  450.17  450.18   227.7480
22   2016-05-02 06:30:00  450.18  450.43  450.17  450.17    83.0270
23   2016-05-02 07:00:00  450.17  450.43  448.90  449.41   170.4950
24   2016-05-02 07:30:00  449.38  450.00  448.56  448.56   243.0420
25   2016-05-02 08:00:00  448.67  448.67  446.21  448.00   525.7090
26   2016-05-02 08:30:00  448.12  448.49  445.00  445.00   673.5810
27   2016-05-02 09:00:00  445.00  445.51  440.11  444.20  1392.9049
28   2016-05-02 09:30:00  444.24  444.36  440.11  442.00   438.6860
29   2016-05-02 10:00:00  441.91  443.20  440.05  442.24   400.5850
...                  ...     ...     ...     ...     ...        ...
1651 2016-06-05 05:00:00  578.74  579.00  577.92  578.39    93.6980
1652 2016-06-05 05:30:00  578.40  578.48  574.52  575.26    98.1580
1653 2016-06-05 06:00:00  575.24  576.02  572.47  574.06   126.8620
1654 2016-06-05 06:30:00  574.06  576.35  574.06  576.34   125.4120
1655 2016-06-05 07:00:00  576.34  576.34  574.73  575.83    34.8070
1656 2016-06-05 07:30:00  575.84  576.27  574.91  575.58    74.8180
1657 2016-06-05 08:00:00  575.58  578.57  575.58  578.36   123.2560
1658 2016-06-05 08:30:00  578.23  578.47  576.18  577.25    43.6590
1659 2016-06-05 09:00:00  577.20  578.85  576.70  577.27    95.3900
1660 2016-06-05 09:30:00  577.36  578.18  576.70  576.70    51.0250
1661 2016-06-05 10:00:00  576.70  576.70  574.55  575.39   101.0590
1662 2016-06-05 10:30:00  575.41  576.44  575.18  576.44    86.4340
1663 2016-06-05 11:00:00  576.50  577.89  576.50  577.80   113.0600
1664 2016-06-05 11:30:00  577.80  578.10  576.03  576.98    57.5050
1665 2016-06-05 12:00:00  576.98  577.55  576.59  577.54    56.1070
1666 2016-06-05 12:30:00  577.54  583.00  570.93  572.82   872.8200
1667 2016-06-05 13:00:00  572.94  573.19  569.64  572.50   310.0020
1668 2016-06-05 13:30:00  572.50  574.37  572.50  574.09    59.3410
1669 2016-06-05 14:00:00  574.09  574.19  571.51  572.98   155.4310
1670 2016-06-05 14:30:00  572.98  573.57  572.02  573.47    76.9270
1671 2016-06-05 15:00:00  573.62  575.10  572.97  573.37    59.1430
1672 2016-06-05 15:30:00  573.37  574.39  573.37  574.38    77.3270
1673 2016-06-05 16:00:00  574.39  575.59  574.38  575.59    52.0150
1674 2016-06-05 16:30:00  575.00  575.59  574.50  575.00    66.9300
1675 2016-06-05 17:00:00  575.00  576.83  574.38  576.60    50.2990
1676 2016-06-05 17:30:00  576.60  577.50  575.50  576.86   104.5200
1677 2016-06-05 18:00:00  576.86  577.21  575.44  575.80    55.3270
1678 2016-06-05 18:30:00  575.77  575.80  574.52  574.77    78.7760
1679 2016-06-05 19:00:00  574.73  575.18  572.52  574.47   126.4300
1680 2016-06-05 19:30:00  574.49  574.87  573.80  574.32    10.4930

As you can see, it contains the last 35 days of prices, grouped into 30-minute intervals.

I want to work with this price history over different time windows.

As a first example, I would like to fetch only the rows from the last day.

How can I filter this DataFrame to show only the rows from the last day?

This is what I've tried:

import datetime

d0 = datetime.datetime.today()
d1 = datetime.datetime.today() - datetime.timedelta(days=1)

print(d0)
# 2016-06-05 17:10:37.633824

print(d1)
# 2016-06-04 17:10:37.633967

df_1d = df30m['Timestamp'] > d1

print(df_1d)

This returns a pandas Series of True and False values rather than the filtered rows:

0    False
1    False
2    False
3    False
4    False
...
1676    True
1677    True
1678    True
1679    True
1680    True

I've also tried the between_time() method:

df_1d = df30m.between_time(d0, d1)

But I got the following error message:

TypeError: Index must be DatetimeIndex

Can anyone show me a Pythonic way to slice my DataFrame?

Accepted answer by Alexander

You can use loc to index your data. Do you know whether your timestamps are datetime.datetime objects or pandas Timestamps?

df30m.loc[(df30m.Timestamp <= d0) & (df30m.Timestamp >= d1)]
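
A minimal, self-contained sketch of the boolean-mask approach above. The frame is synthetic (the Close values and the date range are made up, only the Timestamp column name comes from the question), and a fixed reference time replaces datetime.today() so the result is reproducible. The pd.to_datetime call is there on the assumption that the column might hold strings; it is harmless if the column is already datetime-typed.

```python
import datetime

import pandas as pd

# Synthetic 30-minute bars standing in for df30m (values are made up).
df30m = pd.DataFrame({
    "Timestamp": pd.date_range("2016-06-03 00:00", "2016-06-05 19:30", freq="30min"),
})
df30m["Close"] = range(len(df30m))

# If the column holds strings, convert it so comparisons operate on datetimes.
df30m["Timestamp"] = pd.to_datetime(df30m["Timestamp"])

# Fixed reference time (an assumption) instead of datetime.today().
d0 = datetime.datetime(2016, 6, 5, 17, 10)
d1 = d0 - datetime.timedelta(days=1)

# Boolean mask: True for rows inside [d1, d0]; .loc keeps only those rows.
mask = (df30m["Timestamp"] >= d1) & (df30m["Timestamp"] <= d0)
df_1d = df30m.loc[mask]

print(df_1d.head())
```

The asker's first attempt stopped at the mask itself; passing that mask to loc (or to plain square brackets) is what turns it into the filtered rows.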

Alternatively, you can set the index to the Timestamp column and then slice by label as follows:

df30m.set_index('Timestamp', inplace=True)
df30m.loc[d1:d0]
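
A minimal sketch of the index-based variant, again on a synthetic frame with a fixed reference time (both are assumptions for illustration, not the asker's data). It also shows that between_time() no longer raises the TypeError from the question once the index is a DatetimeIndex, although that method filters by time of day on every date rather than by a date range.

```python
import datetime

import pandas as pd

# Synthetic 30-minute bars standing in for df30m (values are made up).
df30m = pd.DataFrame({
    "Timestamp": pd.date_range("2016-06-03 00:00", "2016-06-05 19:30", freq="30min"),
})
df30m["Close"] = range(len(df30m))

# Fixed reference time (an assumption) instead of datetime.today().
d0 = datetime.datetime(2016, 6, 5, 17, 10)
d1 = d0 - datetime.timedelta(days=1)

# Make Timestamp a sorted DatetimeIndex; set_index returns a new frame here.
indexed = df30m.set_index("Timestamp").sort_index()

# Label-based slice: both endpoints are inclusive and need not exist in the index.
last_day = indexed.loc[d1:d0]

# between_time() needs a DatetimeIndex and selects by time of day on every date.
mornings = indexed.between_time("09:00", "10:00")

print(len(last_day), len(mornings))
```

Sorting the index first is a defensive choice: slicing with endpoints that are not present in the index relies on the index being monotonic.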