Python pandas has no attribute ols - Error (rolling OLS)
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/44707384/
Asked by Desta Haileselassie Hagos
For my evaluation, I wanted to run a rolling OLS regression estimation, with a window of 1000, on the dataset found at this URL:
https://drive.google.com/open?id=0B2Iv8dfU4fTUa3dPYW5tejA0bzg
using the following Python script.
# /usr/bin/python -tt
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.formula.api import ols
df = pd.read_csv('estimated.csv', names=('x','y'))
model = pd.stats.ols.MovingOLS(y=df.Y, x=df[['y']],
                               window_type='rolling', window=1000, intercept=True)
df['Y_hat'] = model.y_predict
However, when I run my Python script, I get this error: AttributeError: module 'pandas.stats' has no attribute 'ols'. Could this error be caused by the version I am using? The pandas installed on my Linux node is version 0.20.2.
Answered by Alexander
pd.stats.ols.MovingOLS was removed in Pandas version 0.20.0:
http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#whatsnew-0200-prior-deprecations
https://github.com/pandas-dev/pandas/pull/11898
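To confirm whether a given installation is affected, a quick check along these lines may help. This is only a sketch: the variable names `stats` and `has_moving_ols` are mine, and it merely tests whether the attribute path from the question exists in the installed pandas.

```python
import pandas as pd

# Report the installed version; the question mentions 0.20.2.
print(pd.__version__)

# getattr with a default avoids an AttributeError when pandas.stats
# (or pandas.stats.ols) is absent from the installed version.
stats = getattr(pd, "stats", None)
has_moving_ols = stats is not None and hasattr(stats, "ols")
print("pd.stats.ols available:", has_moving_ols)
```

If the last line prints False, pd.stats.ols.MovingOLS cannot be used and the regression has to be computed another way.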
I can't find an 'off the shelf' solution for what should be such an obvious use case as rolling regressions.
The following should do the trick without investing too much time in a more elegant solution. It uses numpy to calculate the predicted value of the regression based on the regression parameters and the X values in the rolling window.
window = 1000
a = np.array([np.nan] * len(df))
b = [np.nan] * len(df)  # If betas required.
y_ = df.y.values
x_ = df[['x']].assign(constant=1).values
for n in range(window, len(df)):
    y = y_[(n - window):n]
    X = x_[(n - window):n]
    # betas = Inverse(X'.X).X'.y
    betas = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
    y_hat = betas.dot(x_[n, :])
    a[n] = y_hat
    b[n] = betas.tolist()  # If betas required.
The code above is equivalent to the following and about 35% faster:
model = pd.stats.ols.MovingOLS(y=df.y, x=df.x, window_type='rolling', window=1000, intercept=True)
y_pandas = model.y_predict
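Since MovingOLS is no longer available, the numpy loop from this answer can also be wrapped in a reusable function. The following is a minimal sketch, not a drop-in replacement: the function name `rolling_ols_predict` is hypothetical, the data is synthetic and noise-free (not the asker's dataset), and it swaps the explicit matrix inverse for `np.linalg.lstsq`, which solves the same least-squares problem without forming Inverse(X'.X).

```python
import numpy as np


def rolling_ols_predict(x, y, window):
    """One-step-ahead rolling OLS with an intercept (hypothetical helper).

    For each row n >= window, fit y ~ x on rows [n - window, n) and
    predict y at row n from x[n]. Rows before `window` stay NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([x, np.ones_like(x)])  # columns: x, constant
    y_hat = np.full(len(y), np.nan)
    betas = np.full((len(y), 2), np.nan)
    for n in range(window, len(y)):
        # Least-squares fit on the trailing window, excluding row n itself.
        beta, *_ = np.linalg.lstsq(X[n - window:n], y[n - window:n], rcond=None)
        betas[n] = beta
        y_hat[n] = X[n] @ beta
    return y_hat, betas


# Synthetic example: y = 2x + 1 exactly, so every window recovers
# slope 2 and intercept 1 up to floating-point error.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 1.0
y_hat, betas = rolling_ols_predict(x, y, window=50)
```

With the asker's data, the call would presumably be rolling_ols_predict(df.x.values, df.y.values, window=1000), assuming the CSV columns are named x and y as in the question.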