Notice: this page is a Chinese/English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/27479800/

Time: 2020-09-13 22:45:54  Source: igfitidea

Pandas rolling on a shifted dataframe

Tags: python, pandas

Asked by euri10

Here's a piece of code. I don't get why, in the last column rm-5, I get NaN for the first 4 items.

I understand that for the rm column the first 4 items aren't filled because there is no data available, but if I shift the column the calculation should be made, shouldn't it?

Similarly, I don't get why 5 items, and not 4, are NaN at the end of the rm-5 column.

import pandas as pd
import numpy as np

index = pd.date_range('2000-1-1', periods=100, freq='D')
df = pd.DataFrame(data=np.random.randn(100), index=index, columns=['A'])

# pd.rolling_mean() has been removed from pandas; Series.rolling(5).mean() is the equivalent
df['rm'] = df['A'].rolling(5).mean()
df['rm-5'] = df['A'].shift(-5).rolling(5).mean()

print(df.head(n=8))
print(df.tail(n=8))

                   A        rm      rm-5
2000-01-01  0.109161       NaN       NaN
2000-01-02 -0.360286       NaN       NaN
2000-01-03 -0.092439       NaN       NaN
2000-01-04  0.169439       NaN       NaN
2000-01-05  0.185829  0.002341  0.091736
2000-01-06  0.432599  0.067028  0.295949
2000-01-07 -0.374317  0.064222  0.055903
2000-01-08  1.258054  0.334321 -0.132972
                   A        rm      rm-5
2000-04-02  0.499860 -0.422931 -0.140111
2000-04-03 -0.868718 -0.458962 -0.182373
2000-04-04  0.081059 -0.443494 -0.040646
2000-04-05  0.500275 -0.093048       NaN
2000-04-06 -0.253915 -0.008288       NaN
2000-04-07 -0.159256 -0.140111       NaN
2000-04-08 -1.080027 -0.182373       NaN
2000-04-09  0.789690 -0.040646       NaN

Accepted answer by Hennep

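You can change the order of operations. Right now you shift first and take the mean afterwards. The shift pads the end of the column with 5 NaNs, and the rolling mean over the shifted column still needs 5 values before it produces its first result, which is why the first 4 rows are NaN as well. If you take the mean first and shift afterwards, the first rows are filled.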
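
To see where each group of NaNs comes from, here is a minimal sketch on a tiny deterministic series. The toy data and the variable names are my own assumptions, not part of the original post, and it assumes a pandas version that has Series.rolling (0.18 or later).

```python
import pandas as pd

# Hypothetical toy data: 0.0 .. 11.0, so every mean is easy to check by hand.
s = pd.Series(range(12), dtype=float)

# Step 1: shift(-5) moves every value up by 5 rows and pads the END with 5 NaNs.
shifted = s.shift(-5)

# Step 2: with an integer window, min_periods defaults to the window size,
# so a 5-row window yields NaN until it holds 5 non-NaN values. That leaves
# the first 4 rows empty, and every window touching the padded tail is NaN too.
rolled = shifted.rolling(5).mean()

# The default integer index lets us read the positions directly.
leading_nans = rolled.first_valid_index()
trailing_nans = len(rolled) - 1 - rolled.last_valid_index()

print(rolled)
print(leading_nans, trailing_nans)  # 4 5
```

So the 4 leading NaNs come from the rolling window warming up, and the 5 trailing NaNs come from the shift.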

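import pandas as pd
import numpy as np

index = pd.date_range('2000-1-1', periods=100, freq='D')
df = pd.DataFrame(data=np.random.randn(100), index=index, columns=['A'])

# pd.rolling_mean() has been removed from pandas; Series.rolling(5).mean() is the equivalent
df['rm'] = df['A'].rolling(5).mean()
df['shift'] = df['A'].shift(-5)
# shift first, then take the mean: NaN in the first 4 rows and the last 5 rows
df['rm-5-shift_first'] = df['A'].shift(-5).rolling(5).mean()
# take the mean first, then shift: NaN only in the last 5 rows
df['rm-5-mean_first'] = df['A'].rolling(5).mean().shift(-5)

print(df.head(n=8))
print(df.tail(n=8))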

                   A        rm     shift  rm-5-shift_first  rm-5-mean_first
2000-01-01 -0.120808       NaN  0.830231               NaN         0.184197
2000-01-02  0.029547       NaN  0.047451               NaN         0.187778
2000-01-03  0.002652       NaN  1.040963               NaN         0.395440
2000-01-04 -1.078656       NaN -1.118723               NaN         0.387426
2000-01-05  1.137210 -0.006011  0.469557          0.253896         0.253896
2000-01-06  0.830231  0.184197 -0.390506          0.009748         0.009748
2000-01-07  0.047451  0.187778 -1.624492         -0.324640        -0.324640
2000-01-08  1.040963  0.395440 -1.259306         -0.784694        -0.784694
                   A        rm     shift  rm-5-shift_first  rm-5-mean_first
2000-04-02 -1.283123 -0.270381  0.226257          0.760370         0.760370
2000-04-03  1.369342  0.288072  2.367048          0.959912         0.959912
2000-04-04  0.003363  0.299997  1.143513          1.187941         1.187941
2000-04-05  0.694026  0.400442       NaN               NaN              NaN
2000-04-06  1.508863  0.458494       NaN               NaN              NaN
2000-04-07  0.226257  0.760370       NaN               NaN              NaN
2000-04-08  2.367048  0.959912       NaN               NaN              NaN
2000-04-09  1.143513  1.187941       NaN               NaN              NaN
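
The random data above changes on every run, so the following sketch repeats the comparison on a small fixed series where the numbers can be checked by hand. The toy series and the names `shift_first`, `mean_first`, `both` and `overlap` are assumptions for illustration only; it also assumes a pandas version with Series.rolling.

```python
import pandas as pd

# Hypothetical toy data: 0.0 .. 11.0
s = pd.Series(range(12), dtype=float)

# Order 1: shift, then rolling mean (NaN in the first 4 and the last 5 rows).
shift_first = s.shift(-5).rolling(5).mean()

# Order 2: rolling mean, then shift (NaN only in the last 5 rows).
mean_first = s.rolling(5).mean().shift(-5)

both = pd.DataFrame({"shift_first": shift_first, "mean_first": mean_first})
print(both)

# Wherever both columns hold a value, the two orders give the same result.
overlap = both.dropna()
max_diff = (overlap["shift_first"] - overlap["mean_first"]).abs().max()
print(max_diff < 1e-9)  # True
```

The two orders only differ in the first 4 rows, which mean-first fills with the rolling means of rows 5 to 8.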

For more see:

http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.shift.html
