OLS with pandas: datetime index as predictor
Original question: http://stackoverflow.com/questions/14361634/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Asked by leroygr
I would like to use the pandas OLS function to fit a trend line to my data Series. Does anyone know how to use the datetime index of a pandas Series as the predictor in the OLS?
For example, let's say that I have a simple time series:
>>> ts
2001-12-31 19.828763
2002-12-31 20.112191
2003-12-31 19.509116
2004-12-31 19.913656
2005-12-31 19.701649
2006-12-31 20.022819
2007-12-31 20.103024
2008-12-31 20.132712
2009-12-31 19.850609
2010-12-31 19.290640
2011-12-31 19.936210
2012-12-31 19.664813
Freq: A-DEC
I would like to run an OLS on it using the index as the predictor:
model = pd.ols(y=ts,x=ts.index,intercept=True)
But because x is a datetime index, the function raises an error. Does anyone have an idea?
I could use linregress from scipy.stats, but I wonder whether this is possible with pandas.
Thanks, Greg
Accepted answer by Theodros Zelleke
The problem is that you cannot pass an Index to ols.
Convert it to a Series first:
In [153]: ts
Out[153]:
2011-01-01 00:00:00 19.828763
2011-01-01 01:00:00 20.112191
2011-01-01 02:00:00 19.509116
Freq: H, Name: 1
In [158]: type(ts.index)
Out[158]: pandas.tseries.index.DatetimeIndex
In [154]: df = ts.reset_index()
In [155]: df
Out[155]:
index 1
0 2011-01-01 00:00:00 19.828763
1 2011-01-01 01:00:00 20.112191
2 2011-01-01 02:00:00 19.509116
In [160]: type(df['index'])
Out[160]: pandas.core.series.Series
In [156]: model = pd.ols(y=df[1], x=df['index'], intercept=True)
In [163]: model
Out[163]:
-------------------------Summary of Regression Analysis-------------------------
Formula: Y ~ <x> + <intercept>
Number of Observations: 3
Number of Degrees of Freedom: 1
R-squared: -0.0002
Adj R-squared: -0.0002
Rmse: 0.3017
F-stat (1, 2): -inf, p-value: 1.0000
Degrees of Freedom: model 0, resid 2
-----------------------Summary of Estimated Coefficients------------------------
Variable Coef Std Err t-stat p-value CI 2.5% CI 97.5%
--------------------------------------------------------------------------------
x 0.0000 0.0000 0.00 0.9998 -0.0000 0.0000
intercept 0.0000 76683.4934 0.00 1.0000 -150299.6471 150299.6471
---------------------------------End of Summary---------------------------------
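As far as I know, pd.ols was deprecated and then removed from pandas (from version 0.20 onward), so the session above will not run on a current install. The sketch below shows one possible way to get the same kind of trend line today. It is not the answerer's method: it swaps in numpy.polyfit for the fit, and the conversion of the index to "years elapsed since the first observation" (with a year taken as 365.25 days) is my own assumption, chosen so that the slope reads as change per year. The names elapsed_days, x, slope, intercept and trend are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the yearly series from the question.
values = [19.828763, 20.112191, 19.509116, 19.913656, 19.701649, 20.022819,
          20.103024, 20.132712, 19.850609, 19.290640, 19.936210, 19.664813]
index = pd.to_datetime([f"{year}-12-31" for year in range(2001, 2013)])
ts = pd.Series(values, index=index)

# A regression needs numbers, not timestamps. Assumption: measure time as
# years elapsed since the first observation (365.25 days per year).
elapsed_days = (ts.index - ts.index[0]).days
x = np.asarray(elapsed_days, dtype=float) / 365.25

# Degree-1 least-squares fit; polyfit returns the highest power first.
slope, intercept = np.polyfit(x, ts.values, deg=1)

# Fitted trend line, aligned with the original datetime index.
trend = pd.Series(intercept + slope * x, index=ts.index)

print("slope per year:", slope)
print("intercept at first observation:", intercept)
```

Here the slope is in units of the series per year and the intercept is the fitted value at the first timestamp. If a full regression summary is needed (standard errors, p-values), the same numeric x could be passed to a dedicated statistics package instead of numpy.polyfit.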

