Python: decomposing trend, seasonal and residual time series elements

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/34457281/

Time: 2020-08-19 15:00:45  Source: igfitidea

Decomposing trend, seasonal and residual time series elements

Tags: python, pandas, machine-learning, time-series, statsmodels

Asked by abutremutante

I have a DataFrame with a few time series:

         divida    movav12       var  varmovav12
Date                                            
2004-01       0        NaN       NaN         NaN
2004-02       0        NaN       NaN         NaN
2004-03       0        NaN       NaN         NaN
2004-04      34        NaN       inf         NaN
2004-05      30        NaN -0.117647         NaN
2004-06      44        NaN  0.466667         NaN
2004-07      35        NaN -0.204545         NaN
2004-08      31        NaN -0.114286         NaN
2004-09      30        NaN -0.032258         NaN
2004-10      24        NaN -0.200000         NaN
2004-11      41        NaN  0.708333         NaN
2004-12      29  24.833333 -0.292683         NaN
2005-01      31  27.416667  0.068966    0.104027
2005-02      28  29.750000 -0.096774    0.085106
2005-03      27  32.000000 -0.035714    0.075630
2005-04      30  31.666667  0.111111   -0.010417
2005-05      31  31.750000  0.033333    0.002632
2005-06      39  31.333333  0.258065   -0.013123
2005-07      36  31.416667 -0.076923    0.002660

I want to decompose the first time series, divida, so that I can separate its trend from its seasonal and residual components.

I found an answer here, and am trying to use the following code:

import statsmodels.api as sm

s=sm.tsa.seasonal_decompose(divida.divida)

However I keep getting this error:

Traceback (most recent call last):
  File "/Users/Pred_UnBR_Mod2.py", line 78, in <module>
    s=sm.tsa.seasonal_decompose(divida.divida)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/seasonal.py", line 58, in seasonal_decompose
    _pandas_wrapper, pfreq = _maybe_get_pandas_wrapper_freq(x)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/filters/_utils.py", line 46, in _maybe_get_pandas_wrapper_freq
    freq = index.inferred_freq
AttributeError: 'Index' object has no attribute 'inferred_freq'

How can I proceed?

Accepted answer by Stefan

It works fine when you convert your index to a DatetimeIndex:

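import pandas as pd
import statsmodels.api as sm

# Turn the string index ('2004-01', ...) into a DatetimeIndex
df.reset_index(inplace=True)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
s = sm.tsa.seasonal_decompose(df.divida)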

<statsmodels.tsa.seasonal.DecomposeResult object at 0x110ec3710>

Access the components via:

s.resid
s.seasonal
s.trend

Answer by saravanan saminathan

statsmodels will decompose the series only if it knows the frequency. A time series index usually carries a frequency (for example daily, business days, or weekly); yours does not, so the error is raised. You can remove this error in two ways:

  1. What Stefan did was pass the index column to the pandas to_datetime function. pandas then uses its internal function infer_freq to find the frequency and returns the index with a frequency.
  2. Otherwise, you can set the frequency on your index column with df.index.asfreq(freq='m'). Here m represents month. You can set the frequency if you have domain knowledge or by d.
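
The two options can be sketched as follows, using the first six rows of the question's table. Two points here are assumptions of this sketch rather than something the answer states: as far as I know, `asfreq` is called on the DataFrame (or Series) rather than on a DatetimeIndex in current pandas, and timestamps that fall on the first of each month match the `'MS'` (month start) alias rather than `'m'`.

```python
import pandas as pd

# First six rows of the question's divida column; the index starts out as plain strings
df = pd.DataFrame(
    {"divida": [0, 0, 0, 34, 30, 44]},
    index=pd.Index(
        ["2004-01", "2004-02", "2004-03", "2004-04", "2004-05", "2004-06"],
        name="Date",
    ),
)
df.index = pd.to_datetime(df.index)

# Option 1: let pandas infer the frequency from the timestamps
inferred = pd.infer_freq(df.index)
print(inferred)

# Option 2: state the frequency explicitly; 'MS' means month start
df = df.asfreq("MS")
print(df.index.freq)
```

Option 2 is mostly useful when inference fails, for instance when some months are missing; `asfreq` then inserts the missing dates with NaN values, which have to be filled before decomposing.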

Answer by Reeves

Make it simple:

Follow three steps. 1 - If not already done, format the date column as yyyy-mm-dd or dd-mm-yyyy (for example in Excel). 2 - Then use pandas to convert it to a datetime type:

df['Date'] = pd.to_datetime(df['Date'])

3 - Decompose it using:

from statsmodels.tsa.seasonal import seasonal_decompose
decomposition = seasonal_decompose(ts_log)

And finally: [image not preserved]

Answer by Matt Najarian

It depends on the index format. You can have a DatetimeIndex or a PeriodIndex. Stefan presented the example for DatetimeIndex. Here is my example for PeriodIndex. My original DataFrame has a MultiIndex with the year in the first level and the month in the second level. Here is how I convert it to a PeriodIndex:

df["date"] = pd.PeriodIndex(df.index.map(lambda x: "{0}{1:02d}".format(*x)), freq="M")
df = df.set_index("date")

Now it is ready to be used by seasonal_decompose.
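
A self-contained sketch of that conversion, assuming a hypothetical frame indexed by (year, month) with made-up values. It differs from the snippet above in one small way: the labels are built as 'YYYY-MM' with a dash, a format I am confident pandas parses as a monthly period.

```python
import pandas as pd

# Hypothetical frame indexed by (year, month), mimicking the layout described above
index = pd.MultiIndex.from_product(
    [[2004, 2005], range(1, 13)], names=["year", "month"]
)
df = pd.DataFrame({"divida": range(24)}, index=index)

# Build 'YYYY-MM' labels from each (year, month) tuple and parse them as monthly periods
labels = df.index.map(lambda x: "{0}-{1:02d}".format(*x))
df["date"] = pd.PeriodIndex(labels, freq="M")
df = df.set_index("date")
print(df.index[:3])
```

If a given statsmodels version does not accept a PeriodIndex, `df.index.to_timestamp()` converts it to a DatetimeIndex, which brings you back to the accepted answer's case.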
