Python Pandas: Split a time series per month or week
Original source: http://stackoverflow.com/questions/41625077/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Asked by Radar
I have a time series that spans a few years, in the following format:
timestamp open high low close volume
0 2009-01-02 05:00:00 900.00 906.75 898.00 904.75 15673.0
1 2009-01-02 05:30:00 904.75 907.75 903.75 905.50 4600.0
2 2009-01-02 06:00:00 905.50 907.25 904.50 904.50 3472.0
3 2009-01-02 06:30:00 904.50 905.00 903.25 904.75 6074.0
4 2009-01-02 07:00:00 904.75 905.50 897.00 898.25 12538.0
What would be the simplest way to split that dataframe into multiple dataframes, each holding 1 week or 1 month worth of data?
EDIT: as an example, a dataframe containing 1 year of data would be split into 52 dataframes, each containing one week of data, and returned as a list of 52 dataframes.
(The sample data can be reconstructed with the code below.)
import pandas as pd
from pandas import Timestamp
dikt={'close': {0: 904.75, 1: 905.5, 2: 904.5, 3: 904.75, 4: 898.25}, 'low': {0: 898.0, 1: 903.75, 2: 904.5, 3: 903.25, 4: 897.0}, 'open': {0: 900.0, 1: 904.75, 2: 905.5, 3: 904.5, 4: 904.75}, 'high': {0: 906.75, 1: 907.75, 2: 907.25, 3: 905.0, 4: 905.5}, 'volume': {0: 15673.0, 1: 4600.0, 2: 3472.0, 3: 6074.0, 4: 12538.0}, 'timestamp': {0: Timestamp('2009-01-02 05:00:00'), 1: Timestamp('2009-01-02 05:30:00'), 2: Timestamp('2009-01-02 06:00:00'), 3: Timestamp('2009-01-02 06:30:00'), 4: Timestamp('2009-01-02 07:00:00')}}
df = pd.DataFrame(dikt, columns=['timestamp', 'open', 'high', 'low', 'close', 'volume'])
Accepted answer by piRSquared
Use groupby with pd.TimeGrouper and list comprehensions:
weeks = [g for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('W'))]
months = [g for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('M'))]
You can reset the index if you need to:
weeks = [g.reset_index()
         for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('W'))]
months = [g.reset_index()
          for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('M'))]
Or collect the groups in a dict, keyed by the group label:
weeks = {n: g.reset_index()
         for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('W'))}
months = {n: g.reset_index()
          for n, g in df.set_index('timestamp').groupby(pd.TimeGrouper('M'))}
Answer by toto_tico
pd.TimeGrouper is deprecated and will be removed; you can use pd.Grouper instead.
weeks = [g for n, g in df.groupby(pd.Grouper(key='timestamp',freq='W'))]
months = [g for n, g in df.groupby(pd.Grouper(key='timestamp',freq='M'))]
This way you can also avoid setting timestamp as the index.
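A minimal, self-contained sketch of the pd.Grouper approach follows. The sample data and the helper name split_by are hypothetical and only serve the illustration. It also skips empty bins, since gaps in the data (weekends, holidays) can otherwise show up as empty frames, and it uses 'MS' (month start) for the monthly split as an assumption, to sidestep the fact that newer pandas releases prefer 'ME' over 'M'.

```python
import pandas as pd

# Hypothetical sample: three full Monday-to-Sunday weeks of 30-minute bars.
timestamps = pd.date_range("2009-01-05", periods=3 * 7 * 48, freq="30min")
df = pd.DataFrame({"timestamp": timestamps, "close": range(len(timestamps))})


def split_by(frame, freq, key="timestamp"):
    """Return a list of non-empty sub-frames, one per `freq` bin of `key`."""
    groups = frame.groupby(pd.Grouper(key=key, freq=freq))
    return [g for _, g in groups if not g.empty]


weeks = split_by(df, "W")    # 'W' bins end on Sunday
months = split_by(df, "MS")  # calendar months, labelled by month start
```

Each element of weeks and months is an ordinary DataFrame that still contains the timestamp column, so no reset_index call is needed.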
Also, if your timestamp is part of a MultiIndex, you can refer to it using the level parameter (e.g. pd.Grouper(level='timestamp', freq='W')). Thanks to @jtromans for the heads up.
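As a rough illustration of the level parameter, the sketch below builds a small frame indexed by (symbol, timestamp) and splits it by week. The symbol values and the data are made up for the example.

```python
import pandas as pd

# Hypothetical sample: two symbols, daily rows over two Monday-to-Sunday weeks.
timestamps = pd.date_range("2009-01-05", periods=14, freq="D")
index = pd.MultiIndex.from_product(
    [["ES", "NQ"], timestamps], names=["symbol", "timestamp"]
)
df = pd.DataFrame({"close": range(len(index))}, index=index)

# Group on the 'timestamp' level of the MultiIndex; skip empty bins.
weekly = [
    g
    for _, g in df.groupby(pd.Grouper(level="timestamp", freq="W"))
    if not g.empty
]
```

Every weekly frame keeps the full MultiIndex, so rows for all symbols that fall in the same week stay together.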
Answer by rtkaleta
Convert the timestamp column into a DatetimeIndex; then you can slice into it in a variety of ways.
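One way to read this suggestion, sketched on hypothetical sample data: once the timestamps form a sorted DatetimeIndex, partial date strings passed to .loc select a whole month or an inclusive range of days.

```python
import pandas as pd

# Hypothetical sample: 30-minute bars for the first quarter of 2009.
timestamps = pd.date_range("2009-01-01", "2009-03-31 23:30", freq="30min")
df = pd.DataFrame({"timestamp": timestamps, "close": range(len(timestamps))})

# set_index on a datetime column yields a DatetimeIndex.
indexed = df.set_index("timestamp")

# Partial string indexing: every row in January 2009.
january = indexed.loc["2009-01"]

# Slicing with date strings includes both endpoints (whole days).
first_full_week = indexed.loc["2009-01-05":"2009-01-11"]
```

This is convenient for pulling out one specific period; for splitting the whole series into a list of frames, the groupby approaches in the other answers are more direct.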
Answer by coredump
I would use groupby for this. Assume df stores the data:
df = df.set_index('timestamp')
df.groupby(pd.TimeGrouper(freq='D'))
The resulting groups then contain all the dataframes you are looking for. This answer is referenced here.
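The groupby call above returns a lazy groupby object rather than the frames themselves. A small sketch of how the per-day frames might be pulled out of it is shown below; it uses pd.Grouper in place of the deprecated pd.TimeGrouper, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample: three full days of 30-minute bars.
timestamps = pd.date_range("2009-01-02", periods=3 * 48, freq="30min")
df = pd.DataFrame({"timestamp": timestamps, "close": range(len(timestamps))})
df = df.set_index("timestamp")

grouped = df.groupby(pd.Grouper(freq="D"))

# Iterating yields (label, frame) pairs; keep the non-empty days in a dict.
days = {day: frame for day, frame in grouped if not frame.empty}

# Look up one day by its label (midnight of that day).
second_day = days[pd.Timestamp("2009-01-03")]
```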