Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18955250/

Date: 2020-09-13 21:11:05  Source: igfitidea

Interpolating one time series onto another in pandas

python numpy pandas

Asked by elfnor

I have one set of values measured at regular times. Say:

import pandas as pd
import numpy as np
# 12 hourly timestamps starting at 2013-01-01 00:00
rng = pd.date_range('2013-01-01', periods=12, freq='h')
# Random sample values on the regular grid
data = pd.Series(np.random.randn(len(rng)), index=rng)

And another set of more arbitrary times. For example (in reality these times are not a regular sequence):

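# Seven timestamps 87 minutes apart, standing in for irregular sample times
ts_rng = pd.date_range('2013-01-01 01:11:21', periods=7, freq='87min')
# NaN placeholders to be filled in (explicit float values, not an untyped empty Series)
ts = pd.Series(np.nan, index=ts_rng)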

I want to know the values of data interpolated at the times in ts.
I can do this in numpy:

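# Convert the timestamps to floats so np.interp can work with them
x = np.asarray(ts_rng, dtype=np.float64)
xp = np.asarray(data.index, dtype=np.float64)
fp = np.asarray(data)
# Linearly interpolate data at the times in ts
ts[:] = np.interp(x, xp, fp)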

But I feel pandas has this functionality somewhere in resample, reindex, etc., but I can't quite get it.

Answered by Viktor Kerkez

You can concatenate the two time series and sort by index. Since the values in the second series are NaN, you can interpolate and then select just the values that correspond to the points from the second series:

 pd.concat([data, ts]).sort_index().interpolate().reindex(ts.index)

or

 pd.concat([data, ts]).sort_index().interpolate()[ts.index]
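One point to be aware of: interpolate() defaults to method='linear', which treats the values as equally spaced and ignores the index. To weight by the actual time gaps, as np.interp does in the question, method='time' can be passed. Below is a minimal sketch under that assumption; the fixed seed, the to_ns helper and the names combined, result and expected are illustrative additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

def to_ns(index):
    # Hypothetical helper: timestamps as int64 nanoseconds since the epoch
    return index.values.astype('datetime64[ns]').astype('int64')

# Same setup as in the question, with a fixed seed for repeatability
rng = pd.date_range('2013-01-01', periods=12, freq='h')
data = pd.Series(np.random.default_rng(0).standard_normal(len(rng)), index=rng)
ts_rng = pd.date_range('2013-01-01 01:11:21', periods=7, freq='87min')
ts = pd.Series(np.nan, index=ts_rng)

# Merge both series, sort by time, and interpolate using the time gaps
combined = pd.concat([data, ts]).sort_index()
result = combined.interpolate(method='time').reindex(ts.index)

# Cross-check against the numpy approach from the question
expected = np.interp(to_ns(ts_rng), to_ns(data.index), data.to_numpy())
print(np.allclose(result.to_numpy(), expected))
```

With the default method='linear', the same check would generally not hold, because the merged index is irregularly spaced.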

Answered by tschm

Assume you would like to evaluate a time series ts on a different datetime_index. This index and the index of ts may overlap. I recommend using the following groupby trick, which essentially gets rid of duplicate timestamps. I then forward interpolate (forward-fill), but feel free to apply fancier methods:

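def interpolate(ts, datetime_index):
    # Append NaN placeholders at the target timestamps
    x = pd.concat([ts, pd.Series(np.nan, index=datetime_index)])
    # Keep the first non-null value per timestamp, then forward-fill onto the targets
    return x.groupby(x.index).first().sort_index().ffill()[datetime_index]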
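As a quick illustration of how the groupby step copes with an overlapping timestamp, here is a small sketch with made-up sample values. The function is restated under the hypothetical name ffill_onto so the block runs on its own, and the names target and out are likewise illustrative.

```python
import numpy as np
import pandas as pd

def ffill_onto(ts, datetime_index):
    # NaN placeholders at the target timestamps
    placeholders = pd.Series(np.nan, index=datetime_index)
    x = pd.concat([ts, placeholders])
    # first() keeps the first non-null value per timestamp, so a real
    # observation wins over a placeholder at the same time
    return x.groupby(x.index).first().sort_index().ffill()[datetime_index]

ts = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(['2013-01-01 00:00', '2013-01-01 01:00', '2013-01-01 02:00']),
)
# 01:00 also exists in ts; the other two targets fall between or after observations
target = pd.to_datetime(['2013-01-01 00:30', '2013-01-01 01:00', '2013-01-01 02:45'])

out = ffill_onto(ts, target)
print(out)
```

Here 00:30 picks up the 00:00 value, 01:00 keeps its own observation, and 02:45 carries the 02:00 value forward, so no linear interpolation is involved.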

Answered by ashkan

Here's a clean one-liner:

# asi8 gives the timestamps as int64; data.values is the array of samples
ts = np.interp(ts_rng.asi8, data.index.asi8, data.values)
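The one-liner returns a bare numpy array. If a Series indexed by the target times is wanted, it could be wrapped along the following lines. This is only a sketch: interp_onto is a hypothetical helper name, and the sample values are made up. Since asi8 yields integers in the index's own resolution, the sketch converts both indexes to nanoseconds first, on the assumption that the two indexes might not share a resolution.

```python
import numpy as np
import pandas as pd

def interp_onto(source, target_index):
    # np.interp expects increasing x-coordinates, so sort by time first
    source = source.sort_index()
    # Express both sets of timestamps as int64 nanoseconds since the epoch
    xp = source.index.values.astype('datetime64[ns]').astype('int64')
    x = target_index.values.astype('datetime64[ns]').astype('int64')
    values = np.interp(x, xp, source.to_numpy(dtype=float))
    return pd.Series(values, index=target_index)

rng = pd.date_range('2013-01-01', periods=3, freq='h')
data = pd.Series([0.0, 10.0, 20.0], index=rng)
targets = pd.to_datetime(['2013-01-01 00:30', '2013-01-01 01:15', '2013-01-01 05:00'])

out = interp_onto(data, targets)
print(out)
```

The first two targets come out as 5.0 and 12.5. The last one lies beyond the final observation, and np.interp clamps it to the end value (20.0) rather than extrapolating or returning NaN, which may or may not be what is wanted.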