Python: decomposing trend, seasonal and residual time series elements
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/34457281/
Decomposing trend, seasonal and residual time series elements
Asked by abutremutante
I have a DataFrame with a few time series:
         divida    movav12        var  varmovav12
Date
2004-01       0        NaN        NaN         NaN
2004-02       0        NaN        NaN         NaN
2004-03       0        NaN        NaN         NaN
2004-04      34        NaN        inf         NaN
2004-05      30        NaN  -0.117647         NaN
2004-06      44        NaN   0.466667         NaN
2004-07      35        NaN  -0.204545         NaN
2004-08      31        NaN  -0.114286         NaN
2004-09      30        NaN  -0.032258         NaN
2004-10      24        NaN  -0.200000         NaN
2004-11      41        NaN   0.708333         NaN
2004-12      29  24.833333  -0.292683         NaN
2005-01      31  27.416667   0.068966    0.104027
2005-02      28  29.750000  -0.096774    0.085106
2005-03      27  32.000000  -0.035714    0.075630
2005-04      30  31.666667   0.111111   -0.010417
2005-05      31  31.750000   0.033333    0.002632
2005-06      39  31.333333   0.258065   -0.013123
2005-07      36  31.416667  -0.076923    0.002660
I want to decompose the first time series, divida, in a way that lets me separate its trend from its seasonal and residual components.
I found an answer here, and am trying to use the following code:
import statsmodels.api as sm
s=sm.tsa.seasonal_decompose(divida.divida)
However I keep getting this error:
Traceback (most recent call last):
  File "/Users/Pred_UnBR_Mod2.py", line 78, in <module>
    s=sm.tsa.seasonal_decompose(divida.divida)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/seasonal.py", line 58, in seasonal_decompose
    _pandas_wrapper, pfreq = _maybe_get_pandas_wrapper_freq(x)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/filters/_utils.py", line 46, in _maybe_get_pandas_wrapper_freq
    freq = index.inferred_freq
AttributeError: 'Index' object has no attribute 'inferred_freq'
How can I proceed?
Accepted answer by Stefan
It works fine when you convert your index to a DatetimeIndex:
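import pandas as pd
import statsmodels.api as sm

# Move 'Date' out of the index, parse it as datetimes, then set it as the index again
df.reset_index(inplace=True)
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
s = sm.tsa.seasonal_decompose(df.divida)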
<statsmodels.tsa.seasonal.DecomposeResult object at 0x110ec3710>
Access the components via:
s.resid
s.seasonal
s.trend
Answer by saravanan saminathan
statsmodels decomposes the series only if it can determine its frequency. A time series index usually carries a frequency (e.g. daily, business days, weekly); yours does not, which is why you get the error. You can get rid of the error in two ways:
- What Stefan did was pass the index column to the pandas to_datetime function. The resulting index lets pandas find the frequency with its internal infer_freq function and return the index with a frequency.
- Otherwise, you can set the frequency on your index column yourself with df.index.asfreq(freq='m'). Here m stands for month. You can set the frequency if you have domain knowledge, or by d.
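A minimal sketch of the two options, using the first six divida values from the question. Note one assumption: the explicit option is written with DataFrame.asfreq and the "MS" (month start) alias, which is my reading of what the answer intends rather than the exact call it shows.

```python
import pandas as pd

labels = ["2004-01", "2004-02", "2004-03", "2004-04", "2004-05", "2004-06"]
df = pd.DataFrame({"divida": [0, 0, 0, 34, 30, 44]},
                  index=pd.Index(labels, name="Date"))

# Option 1: parse the labels; pandas can then infer a month-start frequency.
df.index = pd.to_datetime(df.index)
print(df.index.inferred_freq)  # MS

# Option 2 (assumed equivalent of the answer's asfreq call):
# attach the frequency explicitly to the index.
monthly = df.asfreq("MS")
print(monthly.index.freqstr)   # MS
```

The first option relies on the dates being regularly spaced; the second states the frequency outright, which is useful when you already know the data is monthly.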
Answer by Reeves
Make it simple:
Follow three steps:
1. If not already done, put the column in yyyy-mm-dd or dd-mm-yyyy format (e.g. using Excel).
2. Then use pandas to convert it to a date type:
df['Date'] = pd.to_datetime(df['Date'])
3. Decompose it using:
from statsmodels.tsa.seasonal import seasonal_decompose
decomposition = seasonal_decompose(ts_log)
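One caveat for the date conversion step: a dd-mm-yyyy column can be misread as month-first, so it may be safer to state the format explicitly. The sketch below uses made-up dd-mm-yyyy labels with three divida values from the question; the column names are only for illustration.

```python
import pandas as pd

# Hypothetical dd-mm-yyyy labels: 1 Feb, 1 Mar and 1 Apr 2004.
df = pd.DataFrame({"Date": ["01-02-2004", "01-03-2004", "01-04-2004"],
                   "divida": [0, 0, 34]})

# An explicit format avoids reading "01-02-2004" as 2 January.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")
df = df.set_index("Date")

print(df.index.month.tolist())  # [2, 3, 4]
```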
Answer by Matt Najarian
It depends on the index format. You can have a DatetimeIndex or a PeriodIndex. Stefan showed the example for DatetimeIndex; here is mine for PeriodIndex. My original DataFrame has a MultiIndex with the year in the first level and the month in the second. Here is how I convert it to a PeriodIndex:
# Build "YYYYMM" strings from the (year, month) index tuples and read them as monthly periods
df["date"] = pd.PeriodIndex(df.index.map(lambda x: "{0}{1:02d}".format(*x)), freq="M")
df = df.set_index("date")
Now it is ready to be used by seasonal_decompose.
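A self-contained sketch of that conversion, assuming a hypothetical two-year frame indexed by (year, month). Unlike the answer's code, the labels are built with a hyphen ("2004-01"), a precaution of mine so that the strings parse unambiguously.

```python
import pandas as pd

# Hypothetical frame with a (year, month) MultiIndex covering 2004 and 2005.
idx = pd.MultiIndex.from_product([[2004, 2005], range(1, 13)],
                                 names=["year", "month"])
df = pd.DataFrame({"divida": range(24)}, index=idx)

# Turn each (year, month) tuple into a "YYYY-MM" label, then into monthly periods.
labels = df.index.map(lambda x: "{0}-{1:02d}".format(*x))
df["date"] = pd.PeriodIndex(labels, freq="M")
df = df.set_index("date")

print(df.index[0], df.index[-1])  # 2004-01 2005-12
```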