Groupby and lag all columns of a dataframe?
Original question: http://stackoverflow.com/questions/33907537/
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by naught101
I want to lag every column in a dataframe, by group. I have a frame like this:
import numpy as np
import pandas as pd
index = pd.date_range('2015-11-20', periods=6, freq='D')
df = pd.DataFrame(dict(time=index, grp=['A']*3 + ['B']*3, col1=[1,2,3]*2,
col2=['a','b','c']*2)).set_index(['time','grp'])
which looks like
                col1 col2
time       grp
2015-11-20 A       1    a
2015-11-21 A       2    b
2015-11-22 A       3    c
2015-11-23 B       1    a
2015-11-24 B       2    b
2015-11-25 B       3    c
and I want it to look like this:
                col1 col2 col1_lag col2_lag
time       grp
2015-11-20 A       1    a        2        b
2015-11-21 A       2    b        3        c
2015-11-22 A       3    c       NA       NA
2015-11-23 B       1    a        2        b
2015-11-24 B       2    b        3        c
2015-11-25 B       3    c       NA       NA
A related question covers the single-column case, but I have an arbitrary number of columns, and I want to lag all of them. I can use groupby and apply, but apply runs the shift function over each column independently, and it doesn't seem to like receiving an [nrow, 2] shaped dataframe in return. Is there perhaps a function like apply that acts on the whole group sub-frame? Or is there a better way to do this?
Answered by DSM
IIUC, you can simply use level="grp" and then shift by -1:
>>> shifted = df.groupby(level="grp").shift(-1)
>>> df.join(shifted.rename(columns=lambda x: x+"_lag"))
                col1 col2  col1_lag col2_lag
time       grp
2015-11-20 A       1    a       2.0        b
2015-11-21 A       2    b       3.0        c
2015-11-22 A       3    c       NaN      NaN
2015-11-23 B       1    a       2.0        b
2015-11-24 B       2    b       3.0        c
2015-11-25 B       3    c       NaN      NaN
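
For reuse, the same idea can be wrapped in a small function. This is only a sketch: add_group_shifts and its parameters are hypothetical names, not part of pandas, and it assumes the grouping key is an index level, as in the question's frame. Note that shift(-1) pulls in the next row's values within each group, which matches the desired output above; a positive periods value pulls in the previous row's values instead.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.date_range('2015-11-20', periods=6, freq='D')
df = pd.DataFrame(dict(time=index, grp=['A']*3 + ['B']*3, col1=[1, 2, 3]*2,
                       col2=['a', 'b', 'c']*2)).set_index(['time', 'grp'])


def add_group_shifts(frame, level, periods, suffix):
    """Hypothetical helper: join group-wise shifted copies of every column."""
    # groupby(...).shift() shifts all columns at once, within each group.
    shifted = frame.groupby(level=level).shift(periods)
    # add_suffix renames every shifted column so the join does not collide.
    return frame.join(shifted.add_suffix(suffix))


# periods=-1 reproduces the answer above (each row gets the next row's values).
result = add_group_shifts(df, level="grp", periods=-1, suffix="_lag")

# periods=1 gives each row the previous row's values within its group.
prev = add_group_shifts(df, level="grp", periods=1, suffix="_prev")

print(result)
print(prev)
```

Because shifting introduces missing values, the shifted numeric columns are upcast to float; the original columns keep their dtypes.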