Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/26280345/

Time: 2020-08-19 00:19:50  Source: igfitidea

Pandas: Shift down values by one row within a group

Tags: python, pandas, dataframe

Asked by jeffalstott

I have a Pandas dataframe, and I want to create a new column whose values are that of another column, shifted down by one row. The last row should show NaN.

The catch is that I want to do this by group, with the last row of each group showing NaN, and without the last row of a group "stealing" a value from a group that happens to be adjacent in the dataframe.

My attempted implementation is quite shamefully broken, so I'm clearly misunderstanding something fundamental.

df['B_shifted'] = df.groupby(['A'])['B'].transform(lambda x:x.values[1:])
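
One likely reason the attempt fails, sketched under the assumption that `transform` needs a result it can line up with every row of a group: `x.values[1:]` is one element shorter than the group it was taken from. The small frame below is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical data, not from the question
df = pd.DataFrame({"A": ["x", "x", "x", "y", "y"], "B": [1, 2, 3, 4, 5]})

for key, group in df.groupby("A")["B"]:
    sliced = group.values[1:]  # drops the first value and appends nothing
    print(key, "group length:", len(group), "sliced length:", len(sliced))

# Every group comes back one element short of its original length
length_gaps = [len(g) - len(g.values[1:]) for _, g in df.groupby("A")["B"]]
print(length_gaps)
```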

Accepted answer by Mike

Shift works on the output of the groupby clause:

>>> import numpy
>>> import pandas
>>> df = pandas.DataFrame(numpy.random.randint(1,3, (10,5)), columns=['a','b','c','d','e'])
>>> df
   a  b  c  d  e
0  2  1  2  1  1
1  2  1  1  1  1
2  1  2  2  1  2
3  1  2  1  1  2
4  2  2  1  1  2
5  2  2  2  2  1
6  2  2  1  1  1
7  2  2  2  1  1
8  2  2  2  2  1
9  2  2  2  2  1


# Each v is the sub-frame holding the rows for one value of column a
for k, v in df.groupby('a'):
    print(k)
    print('normal')
    print(v)
    print('shifted')
    print(v.shift(1))

1
normal
   a  b  c  d  e
2  1  2  2  1  2
3  1  2  1  1  2
shifted
    a   b   c   d   e
2 NaN NaN NaN NaN NaN
3   1   2   2   1   2
2
normal
   a  b  c  d  e
0  2  1  2  1  1
1  2  1  1  1  1
4  2  2  1  1  2
5  2  2  2  2  1
6  2  2  1  1  1
7  2  2  2  1  1
8  2  2  2  2  1
9  2  2  2  2  1
shifted
    a   b   c   d   e
0 NaN NaN NaN NaN NaN
1   2   1   2   1   1
4   2   1   1   1   1
5   2   2   1   1   2
6   2   2   2   2   1
7   2   2   1   1   1
8   2   2   2   1   1
9   2   2   2   2   1
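
The loop above only prints each shifted group. One way to turn that into a column, sketched here with a small made-up frame and a hypothetical column name `b_shifted`, is to shift one column per group and concatenate the pieces. Column assignment then aligns the values on the original index.

```python
import pandas as pd

# Hypothetical data: the groups in column a are interleaved on purpose
df = pd.DataFrame({"a": [2, 2, 1, 1, 2], "b": [10, 20, 30, 40, 50]})

# Shift column b inside each group, keeping the original index labels
pieces = [group["b"].shift(1) for _, group in df.groupby("a")]

# pd.concat stacks the pieces; assignment lines them up by index label
df["b_shifted"] = pd.concat(pieces)
print(df)
```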

Answer by abeboparebop

@EdChum's comment is a better answer to this question, so I'm posting it here for posterity:

df['B_shifted'] = df.groupby(['A'])['B'].transform(lambda x:x.shift())

or similarly

df['B_shifted'] = df.groupby(['A'])['B'].transform('shift')

The former notation is more flexible, of course (e.g. if you want to shift by 2).
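
A minimal sketch of both notations on made-up data (the frame and the `B_shifted_2` column name are assumptions for illustration). The lambda form can pass any argument that `Series.shift` accepts, such as a period of 2.

```python
import pandas as pd

# Hypothetical data, not from the question
df = pd.DataFrame({"A": ["x", "x", "x", "y", "y"], "B": [1, 2, 3, 4, 5]})

# Lambda notation: shift by one row, then by two rows, inside each group
df["B_shifted"] = df.groupby(["A"])["B"].transform(lambda x: x.shift())
df["B_shifted_2"] = df.groupby(["A"])["B"].transform(lambda x: x.shift(2))

# String notation: same result as the one-row lambda
by_name = df.groupby(["A"])["B"].transform("shift")
print(df)
```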

Answer by chrisaycock

Newer versions of pandas can now perform a shift on a group:

df['B_shifted'] = df.groupby(['A'])['B'].shift(1)

Note that when shifting down, it's the first row that has NaN.
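
To see how this differs from an ungrouped shift, here is a small sketch on made-up data (the `B_plain` column exists only for comparison). With the plain shift, the first row of group y picks up the last value of group x. With the grouped shift it gets NaN.

```python
import pandas as pd

# Hypothetical data, not from the question
df = pd.DataFrame({"A": ["x", "x", "x", "y", "y"], "B": [1, 2, 3, 4, 5]})

df["B_plain"] = df["B"].shift(1)                   # ignores group boundaries
df["B_shifted"] = df.groupby(["A"])["B"].shift(1)  # restarts in every group
print(df)
```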

Answer by Kevin Chou

All of the above answers make a mistake:

shift(1) shifts up by one row, which is the default behavior;
shift(-1) is what really shifts down by one row.

See the shift example in the pandas documentation.
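
Since "up" and "down" are easy to mix up, it may help to look at where the NaN ends up. In this sketch on made-up data, `shift(1)` leaves NaN in the first row of each group and `shift(-1)` leaves it in the last row, which is the layout the question describes. The column names `prev_B` and `next_B` are hypothetical.

```python
import pandas as pd

# Hypothetical data, not from the question
df = pd.DataFrame({"A": ["x", "x", "x", "y", "y"], "B": [1, 2, 3, 4, 5]})

# Positive period: each row receives the value from the row before it
df["prev_B"] = df.groupby("A")["B"].shift(1)

# Negative period: each row receives the value from the row after it
df["next_B"] = df.groupby("A")["B"].shift(-1)
print(df)
```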
