Original source: http://stackoverflow.com/questions/34646033/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python - Statsmodels.tsa.seasonal_decompose - missing values in head and tail of dataframe
Asked by abutremutante
I have the following DataFrame, which I'm calling "sales_df":
            Value
Date
2004-01-01      0
2004-02-01    173
2004-03-01    225
2004-04-01    230
2004-05-01    349
2004-06-01    258
2004-07-01    270
2004-08-01    223
...           ...
2015-06-01    218
2015-07-01    215
2015-08-01    233
2015-09-01    258
2015-10-01    252
2015-11-01    256
2015-12-01    188
2016-01-01     70
I want to separate its trend from its seasonal component and for that I use statsmodels.tsa.seasonal_decompose through the following code:
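import pandas as pd
import statsmodels.api as sm
# Decompose the series and place the trend next to the original values
decomp = sm.tsa.seasonal_decompose(sales_df.Value)
df = pd.concat([sales_df, decomp.trend], axis=1)
df.columns = ['sales', 'trend']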
This gives me the following:
            sales       trend
Date
2004-01-01      0         NaN
2004-02-01    173         NaN
2004-03-01    225         NaN
2004-04-01    230         NaN
2004-05-01    349         NaN
2004-06-01    258         NaN
2004-07-01    270  236.708333
2004-08-01    223  248.208333
2004-09-01    243  251.250000
...           ...         ...
2015-05-01    270  214.416667
2015-06-01    218  215.583333
2015-07-01    215  212.791667
2015-08-01    233         NaN
2015-09-01    258         NaN
2015-10-01    252         NaN
2015-11-01    256         NaN
2015-12-01    188         NaN
2016-01-01     70         NaN
Note that there are 6 NaNs at the start and 6 at the end of the trend series. Is that right? Why is it happening?
Accepted answer by ranlot
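This is expected: seasonal_decompose uses a symmetric (centered) moving average by default when the filt argument is not specified, as in your call. The frequency is inferred from the time series; for monthly data it is 12, so the centered window needs 6 observations on each side, and the trend is undefined (NaN) for the first 6 and last 6 points.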
https://searchcode.com/codesearch/view/86129185/