pandas: "Cannot add integral value to Timestamp without freq" error from an ARIMA model despite reindexing with a frequency

Question

Asked by E. Aly

I'm trying to do a time series prediction using an ARIMA model on this series:

1960-01-01    12.7
1961-01-01    12.1
1962-01-01    12.7
1963-01-01    12.8
1964-01-01    12.3
1965-01-01    13.0
1966-01-01    12.5
1967-01-01    12.9
1968-01-01    12.9
1969-01-01    13.3
1970-01-01    13.2
1971-01-01    13.0
1972-01-01    12.6
1973-01-01    12.2
1974-01-01    12.4
1975-01-01    12.7
1976-01-01    12.6
1977-01-01    12.2
1978-01-01    12.5
1979-01-01    12.2
1980-01-01    12.2
1981-01-01    12.2
1982-01-01    12.1
1983-01-01    12.3
1984-01-01    11.7
1985-01-01    11.8
1986-01-01    11.5
1987-01-01    11.2
1988-01-01    11.0
1989-01-01    10.9
1990-01-01    10.8
1991-01-01    10.8
1992-01-01    10.6
1993-01-01    10.4
1994-01-01    10.2
1995-01-01    10.2
1996-01-01    10.2
1997-01-01    10.0
1998-01-01     9.8
1999-01-01     9.8
2000-01-01     9.6
2001-01-01     9.3
2002-01-01     9.4
2003-01-01     9.5
2004-01-01     9.1
2005-01-01     9.1
2006-01-01     9.0
2007-01-01     9.0
2008-01-01     9.0
2009-01-01     9.3
2010-01-01     9.2
2011-01-01     9.1
2012-01-01     9.4
2013-01-01     9.4
2014-01-01     9.2
2015-01-01     9.6
Name: Death rate, crude (per 1,000 people), dtype: float64

I use the following code to generate candidate (p, d, q) orders, fit a model for each one, record its AIC, and pick the order with the lowest AIC. I then use that (p, d, q) order for prediction.

import datetime
import warnings
import itertools

import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import mean_squared_error as mse

def MAPE(A, F):
    # Mean absolute percentage error between actual (A) and forecast (F) series
    Av = np.array(A.values)
    Fv = np.array(F.values)
    mape = np.mean(np.abs((Av - Fv) / Av)) * 100
    mape = np.around(mape, decimals=2)
    return mape

# Generate pdq combinations
p= d= q= range(7)
pdq = list(itertools.product(p, d, q))

# Choose min pdq corresponding to min AIC
warnings.filterwarnings('ignore')
param_aic = {}
for param in pdq:
    try:
        mod = sm.tsa.ARIMA(cmortS, order= param)
        result = mod.fit()
        param_aic[param] = result.aic
    except Exception:
        # Skip orders for which the fit fails
        continue

min_aic = min(param_aic.values())
min_param = ()
for pm, aic in param_aic.items():
    if aic == min_aic:
        min_param = pm

# Run the model with min pdq
model = sm.tsa.ARIMA(cmortS, order= min_param)
results = model.fit()

#Forecast validation
tp = ''
if min_param[1] > 0:
    tp = 'levels'
else:
    tp = 'linear'

train_sz = int(len(cmortS)*0.66)
train = cmortS[:train_sz]
tst = cmortS[train_sz:]
pred_strt = tst.index[0]
tst_pred = results.predict(start= pred_strt, typ= tp)
mserror = mse(tst, tst_pred)
mserror = np.round(mserror, decimals= 5)
mp = MAPE(tst, tst_pred)
print('Model order: {}, MAPE: {}%, mse: {}'.format(min_param, mp, mserror)) 

# Prediction
end_yr = '2050'
end_dt = pd.to_datetime(end_yr, format= '%Y')
strt_dt = pd.to_datetime('2014', format= '%Y')
Var_pred = results.predict(start= strt_dt, end= end_dt, typ = tp)

Var_pred

I get the following error when I run it:

ValueError: Cannot add integral value to Timestamp without freq.

Although I reindexed the series with a date range with freq= 'AS', I still get the same error.

How can I solve that?

Answer 1

Answered by Halee

Changing the final few lines of your code to this format should resolve the error message:

# Prediction
strt_date = pd.to_datetime('2014-01-01 01:00:00')
end_date = pd.to_datetime('2050-01-01 01:00:00')
Var_pred = results.predict(start = strt_date, end = end_date, typ = tp) 
Var_pred
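
Since the error message mentions a missing freq, it may also be worth checking that the index of the series passed to the model really carries a frequency. The following is only a minimal sketch with pandas alone, not a confirmed fix: the three-point series `s` is a hypothetical excerpt of the data in the question, and the alias 'YS' (year start) is assumed to be available in your pandas version (older versions also accept 'AS').

```python
import pandas as pd

# Hypothetical three-point excerpt of the series from the question
values = [9.4, 9.2, 9.6]
dates = pd.to_datetime(["2013-01-01", "2014-01-01", "2015-01-01"])
s = pd.Series(values, index=dates)

# An index built from plain timestamps has no frequency attached
print(s.index.freq)  # prints None

# asfreq() conforms the series to a regular year-start grid and
# attaches that frequency to the resulting index
s_freq = s.asfreq("YS")
print(s_freq.index.freq)

# With a frequency present, stepping past the last observation is well defined
next_date = s_freq.index[-1] + s_freq.index.freq
print(next_date)
```

If the index of cmortS shows no frequency in the same check, fitting the model on cmortS.asfreq('YS') instead may help, though whether that alone removes the error depends on the statsmodels version in use.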
