Python pandas has no attribute ols - Error (rolling OLS)

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/44707384/

Time: 2020-09-14 03:51:48  Source: igfitidea

Tags: python, python-3.x, pandas, linear-regression, statsmodels

Asked by Desta Haileselassie Hagos

For my evaluation, I wanted to run a rolling OLS regression estimation with a window of 1000 on the dataset found at this URL: https://drive.google.com/open?id=0B2Iv8dfU4fTUa3dPYW5tejA0bzg using the following Python script.

#!/usr/bin/python -tt

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.formula.api import ols

df = pd.read_csv('estimated.csv', names=('x','y'))

# Fails on pandas >= 0.20.0, where pandas.stats.ols no longer exists.
model = pd.stats.ols.MovingOLS(y=df.y, x=df[['x']],
                               window_type='rolling', window=1000, intercept=True)
df['Y_hat'] = model.y_predict

However, when I run my Python script, I get this error: AttributeError: module 'pandas.stats' has no attribute 'ols'. Could this error be caused by the version that I am using? The pandas installed on my Linux node is version 0.20.2.

Answered by Alexander

pd.stats.ols.MovingOLS was removed in Pandas version 0.20.0:

http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#whatsnew-0200-prior-deprecations

https://github.com/pandas-dev/pandas/pull/11898

I can't find an 'off the shelf' solution for what should be such an obvious use case as rolling regressions.

The following should do the trick without investing too much time in a more elegant solution. It uses numpy to calculate the predicted value of the regression based on the regression parameters and the X values in the rolling window.

window = 1000
a = np.array([np.nan] * len(df))
b = [np.nan] * len(df)  # If betas required.
y_ = df.y.values
x_ = df[['x']].assign(constant=1).values
for n in range(window, len(df)):
    y = y_[(n - window):n]
    X = x_[(n - window):n]
    # betas = Inverse(X'.X).X'.y
    betas = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
    y_hat = betas.dot(x_[n, :])
    a[n] = y_hat
    b[n] = betas.tolist()  # If betas required.
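
As a variation on the same idea, the loop can be wrapped in a function and the explicit matrix inverse replaced with np.linalg.lstsq, which tends to behave better when X'X is close to singular. This is only a sketch: the function name rolling_ols_predict and the synthetic demo data are made up for illustration and are not part of the original answer. It keeps the same convention as the loop, fitting on the `window` rows before row n and then predicting row n.

```python
import numpy as np


def rolling_ols_predict(x, y, window):
    """Fit y = slope * x + intercept on the `window` rows before each row n,
    then predict y at row n. Rows before `window` stay NaN."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with a constant column, like .assign(constant=1) in the loop.
    X = np.column_stack([x, np.ones_like(x)])
    y_hat = np.full(len(y), np.nan)
    betas = np.full((len(y), 2), np.nan)
    for n in range(window, len(y)):
        # Least-squares solve; avoids forming the inverse of X'X explicitly.
        coef, *_ = np.linalg.lstsq(X[n - window:n], y[n - window:n], rcond=None)
        betas[n] = coef
        y_hat[n] = X[n] @ coef
    return y_hat, betas


# Synthetic data standing in for estimated.csv (illustrative only).
x_demo = np.arange(20, dtype=float)
y_demo = 2.0 * x_demo + 3.0
y_hat_demo, betas_demo = rolling_ols_predict(x_demo, y_demo, window=5)
```

With the real data this would be called as rolling_ols_predict(df.x.values, df.y.values, window=1000); each row of betas holds the slope first and the intercept second.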

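The explicit numpy loop shown earlier (the one using np.linalg.inv) is equivalent to the following MovingOLS call, which only runs on pandas versions before 0.20.0, and is about 35% faster: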

model = pd.stats.ols.MovingOLS(y=df.y, x=df.x, window_type='rolling', window=1000, intercept=True)
y_pandas = model.y_predict