Python - Statsmodels.tsa.seasonal_decompose - missing values in head and tail of dataframe

Disclaimer: this page reproduces a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/34646033/

Time: 2020-09-14 00:28:02  Source: igfitidea

Tags: python, pandas, statsmodels

Asked by abutremutante

I have the following dataframe, which I'm calling "sales_df":

            Value
Date             
2004-01-01      0
2004-02-01    173
2004-03-01    225
2004-04-01    230
2004-05-01    349
2004-06-01    258
2004-07-01    270
2004-08-01    223
...           ...
2015-06-01    218
2015-07-01    215
2015-08-01    233
2015-09-01    258
2015-10-01    252
2015-11-01    256
2015-12-01    188
2016-01-01     70

I want to separate its trend from its seasonal component, so I use statsmodels.tsa.seasonal_decompose with the following code:

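import pandas as pd
import statsmodels.api as sm
# Decompose the monthly series and put the trend next to the original values
decomp = sm.tsa.seasonal_decompose(sales_df.Value)
df = pd.concat([sales_df, decomp.trend], axis=1)
df.columns = ['sales', 'trend']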

This gives me:

            sales       trend
Date                         
2004-01-01      0         NaN
2004-02-01    173         NaN
2004-03-01    225         NaN
2004-04-01    230         NaN
2004-05-01    349         NaN
2004-06-01    258         NaN
2004-07-01    270  236.708333
2004-08-01    223  248.208333
2004-09-01    243  251.250000
...           ...         ...
2015-05-01    270  214.416667
2015-06-01    218  215.583333
2015-07-01    215  212.791667
2015-08-01    233         NaN
2015-09-01    258         NaN
2015-10-01    252         NaN
2015-11-01    256         NaN
2015-12-01    188         NaN
2016-01-01     70         NaN

Note that there are 6 NaNs at the start and 6 at the end of the trend series. Is that right? Why is that happening?

Accepted answer by ranlot

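This is expected: seasonal_decompose uses a symmetric (centered) moving average by default when the filt argument is not specified, as in your call. The frequency is inferred from the time series; for monthly data it is 12, so each trend value needs 6 observations on either side, and the first and last 6 values cannot be computed. https://searchcode.com/codesearch/view/86129185/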
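
To see where the 6 NaNs on each side come from, here is a minimal sketch that rebuilds a centered moving average by hand with pandas. The series values are made up (only the date range matches the question), and the weights are an assumption about what the default filter looks like for an even period of 12: a 13-point window with half weight on the two end points.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series over the same date range as the question (made-up values).
idx = pd.date_range("2004-01-01", "2016-01-01", freq="MS")
rng = np.random.default_rng(0)
sales = pd.Series(rng.integers(150, 350, size=len(idx)).astype(float), index=idx)

period = 12
# Assumed default filter for an even period: 13 weights, half weight at both ends.
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period

# Each trend value needs 6 observations before and 6 after it, so the window
# is incomplete (and the result NaN) for the first and last 6 rows.
trend = sales.rolling(window=len(weights), center=True).apply(
    lambda w: np.dot(w, weights), raw=True
)

print(trend.isna().sum())  # number of missing trend values
print(trend.head(8))
```

If the edge values are needed, seasonal_decompose also accepts an extrapolate_trend argument (for example extrapolate_trend='freq') that fills the ends by linear extrapolation; those filled values are estimates rather than true moving averages.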
