Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/20374736/

Time: 2020-09-13 21:24:23  Source: igfitidea

Resample daily pandas timeseries with start at time other than midnight

python, pandas

Asked by ajt

I have a pandas timeseries of 10-min frequency data and need to find the maximum value in each 24-hour period. However, this 24-hour period needs to start each day at 5AM, not at the default midnight which pandas assumes.

I've been checking out DateOffset but so far am drawing blanks. I might have expected something akin to pandas.tseries.offsets.Week(weekday=n), e.g. pandas.tseries.offsets.Week(hour=5), but this is not supported as far as I can tell.

I can do a nasty workaround by shifting the data first, but it's unintuitive: even coming back to the same code after just a week, I have trouble wrapping my head around the shift direction!
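
For reference, a shift-based workaround of this kind might look like the sketch below. The asker's actual code is not shown, so the sample data and the names shifted_day and daily_max are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative 10-minute data spanning two days (not the asker's data).
idx = pd.date_range('2012-01-01 00:00:00', freq='10min', periods=6 * 24 * 2)
s = pd.Series(np.arange(len(idx)), index=idx)

# Shift timestamps back 5 hours so that 05:00 lands on midnight,
# then group by the calendar day of the shifted timestamps.
shifted_day = (s.index - pd.Timedelta(hours=5)).floor('D')
daily_max = s.groupby(shifted_day).max()

# Move the labels forward again so each one marks the 05:00 start of its window.
daily_max.index = daily_max.index + pd.Timedelta(hours=5)
print(daily_max)
```

The two opposite shifts (back before grouping, forward for the labels) are exactly the part that is easy to get backwards.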

Any more elegant ideas would be much appreciated.

Answered by joris

The base keyword can do the trick (see docs):

s.resample('24h', base=5)

For example:

In [35]: idx = pd.date_range('2012-01-01 00:00:00', freq='5min', periods=24*12*3)

In [36]: s = pd.Series(np.arange(len(idx)), index=idx)

In [38]: s.resample('24h', base=5)
Out[38]: 
2011-12-31 05:00:00     29.5
2012-01-01 05:00:00    203.5
2012-01-02 05:00:00    491.5
2012-01-03 05:00:00    749.5
Freq: 24H, dtype: float64
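
Note that the output above shows means, because resample aggregated with the mean by default in the pandas version used at the time. The base keyword was later deprecated (pandas 1.1) in favor of offset and origin, and it is no longer accepted in pandas 2.0. A minimal sketch of the equivalent call on a recent pandas, assuming version 1.1 or later and using an explicit .max() to get the daily maximum the question asks for:

```python
import numpy as np
import pandas as pd

# Same sample data as in the session above: 5-minute values over three days.
idx = pd.date_range('2012-01-01 00:00:00', freq='5min', periods=24 * 12 * 3)
s = pd.Series(np.arange(len(idx)), index=idx)

# offset='5h' moves the 24-hour bin edges from midnight to 05:00;
# .max() picks the largest value inside each 05:00-to-05:00 window.
daily_max = s.resample('24h', offset='5h').max()
print(daily_max)
```

The bin labels are the same 05:00 timestamps as in the output above; only the aggregation differs.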

Answered by ajt

I've just spotted an answered question which didn't come up on Google or Stack Overflow previously:

Resample hourly TimeSeries with certain starting hour

This uses the base parameter, which looks like an addition subsequent to Wes McKinney's Python for Data Analysis. I've given the parameter a go and it seems to do the trick.
