pandas: I am trying to run the Dickey-Fuller test in statsmodels in Python but am getting an error

Notice: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/43100441/

Time: 2020-09-14 03:18:17  Source: igfitidea

I am trying to run the Dickey-Fuller test in statsmodels in Python but am getting an error

python-2.7 pandas jupyter-notebook

Asked by N.R

I am trying to run the Dickey-Fuller test in statsmodels in Python but am getting an error. I am running Python 2.7 with pandas version 0.19.2. The dataset is from GitHub and was imported from there.


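import pandas as pd
import statsmodels.tsa.stattools as ts  # needed for the ts.adfuller call below
from statsmodels.tsa.stattools import adfuller

def test_stationarity(timeseries):
    # Perform Dickey-Fuller test:
    print('Results of Dickey-Fuller Test:')
    dftest = ts.adfuller(timeseries, autolag='AIC')
    # The first four entries are the statistic, p-value, lags used and observation count
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
    # The fifth entry is a dict of critical values keyed by significance level
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    print(dfoutput)

# tr is the dataset imported from GitHub
test_stationarity(tr)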

This gives me the following error:

Results of Dickey-Fuller Test:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-10ab4b87e558> in <module>()
----> 1 test_stationarity(tr)

<ipython-input-14-d779e1ed35b3> in test_stationarity(timeseries)
     19     #Perform Dickey-Fuller test:
     20     print 'Results of Dickey-Fuller Test:'
---> 21     dftest = ts.adfuller(timeseries, autolag='AIC' )
     22     #dftest = adfuller(timeseries, autolag='AIC')
     23     dfoutput = pd.Series(dftest[0:4], index=['Test Statistic','p-value','#Lags Used','Number of Observations Used'])

C:\Users\SONY\Anaconda2\lib\site-packages\statsmodels\tsa\stattools.pyc in adfuller(x, maxlag, regression, autolag, store, regresults)
    209 
    210     xdiff = np.diff(x)
--> 211     xdall = lagmat(xdiff[:, None], maxlag, trim='both', original='in')
    212     nobs = xdall.shape[0]  # pylint: disable=E1103
    213 

C:\Users\SONY\Anaconda2\lib\site-packages\statsmodels\tsa\tsatools.pyc in lagmat(x, maxlag, trim, original)
    322     if x.ndim == 1:
    323         x = x[:,None]
--> 324     nobs, nvar = x.shape
    325     if original in ['ex','sep']:
    326         dropidx = nvar

ValueError: too many values to unpack

Answered by Pedro Marcelino

tr must be a 1-D array-like, as you can see here. I don't know what tr is in your case. Assuming that you defined tr as the DataFrame that contains the time series data, you should do something like this:

tr = tr.iloc[:,0].values

Then adfuller will be able to read the data.
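A minimal sketch of why the shape matters, assuming a made-up single-column DataFrame as a stand-in for tr. The helper shape_seen_by_lagmat is hypothetical: it only imitates the two array operations visible in the traceback (np.diff, then [:, None]) and assumes the input is first converted with np.asarray. It does not call statsmodels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for tr: a DataFrame with a single numeric column.
tr = pd.DataFrame({"value": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]})


def shape_seen_by_lagmat(x):
    """Imitate the steps shown in the traceback: np.diff, then adding an axis."""
    xdiff = np.diff(np.asarray(x))
    return xdiff[:, None].shape


shape_2d = shape_seen_by_lagmat(tr)                    # DataFrame input (2-D)
shape_1d = shape_seen_by_lagmat(tr.iloc[:, 0].values)  # 1-D array input

print(len(shape_2d))  # 3 values: cannot be unpacked into "nobs, nvar"
print(len(shape_1d))  # 2 values: unpacks cleanly
```

In lagmat, the line nobs, nvar = x.shape expects exactly two values, which is consistent with the 3-value shape from the DataFrame raising "too many values to unpack".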

Answered by M. Paul

Just change the line to:

dftest = adfuller(timeseries.iloc[:,0].values, autolag='AIC' )

This will work. adfuller requires a 1-D array-like. In your case you have a DataFrame, so either select the column explicitly or edit the line as shown above.

Answered by Avind

I am assuming that, since you are using the Dickey-Fuller test, you want to keep the time series column (i.e. the date-time column) as the index. To do that:

tr = tr.set_index('Month')  # I am assuming here that the time series column name is Month
ts = tr['othercolumnname']  # Just use the other column name here; it might be count or anything
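The approach above can be sketched end to end as follows. This is only an illustration under stated assumptions: the data is a made-up random walk, the column names Month and count are placeholders, and adf_summary is a hypothetical helper modelled on the function in the question. The selected column is named series rather than ts, because the question's code also uses ts as the name for the statsmodels module and reusing it would shadow that name.

```python
import numpy as np
import pandas as pd


def adf_summary(timeseries):
    """Run the augmented Dickey-Fuller test on a 1-D series and label the results."""
    # Imported lazily so the data preparation below runs without statsmodels installed.
    from statsmodels.tsa.stattools import adfuller

    dftest = adfuller(timeseries, autolag="AIC")
    dfoutput = pd.Series(
        dftest[0:4],
        index=["Test Statistic", "p-value", "#Lags Used", "Number of Observations Used"],
    )
    for key, value in dftest[4].items():
        dfoutput["Critical Value (%s)" % key] = value
    return dfoutput


# Hypothetical data: 48 monthly observations of a random walk.
rng = np.random.RandomState(0)
tr = pd.DataFrame({
    "Month": pd.date_range("2000-01-01", periods=48, freq="MS"),
    "count": rng.normal(size=48).cumsum(),
})

tr = tr.set_index("Month")  # the dates become the index
series = tr["count"]        # selecting one column yields a 1-D Series

print(series.ndim)  # 1
```

With statsmodels installed, print(adf_summary(series)) should then run without the unpacking error, since the input is one-dimensional and still carries the date index.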

I hope this helps.