Original source: http://stackoverflow.com/questions/32847800/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How can I use cumsum within a group in Pandas?
Asked by Baron Yugovich
I have:
df = pd.DataFrame.from_dict({'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'], 'val': [1,2,-3,1,5,6,-2], 'stuff':['12','23232','13','1234','3235','3236','732323']})
  id   stuff  val
0  A      12    1
1  B   23232    2
2  A      13   -3
3  C    1234    1
4  D    3235    5
5  B    3236    6
6  C  732323   -2
I'd like to get a running sum of val for each id, so the desired output looks like this:
  id   stuff  val  cumsum
0  A      12    1       1
1  B   23232    2       2
2  A      13   -3      -2
3  C    1234    1       1
4  D    3235    5       5
5  B    3236    6       8
6  C  732323   -2      -1
This is what I tried:
df['cumsum'] = df.groupby('id').cumsum(['val'])
This is the error I got:
ValueError: Wrong number of items passed 0, placement implies 1
Answered by EdChum
You can call transform and pass the cumsum function to add that column to your df:
In [156]:
df['cumsum'] = df.groupby('id')['val'].transform(pd.Series.cumsum)
df
Out[156]:
  id   stuff  val  cumsum
0  A      12    1       1
1  B   23232    2       2
2  A      13   -3      -2
3  C    1234    1       1
4  D    3235    5       5
5  B    3236    6       8
6  C  732323   -2      -1
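For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the transform approach. It rebuilds the frame from the question with a plain pd.DataFrame constructor and passes the string alias 'cumsum' rather than pd.Series.cumsum; I'm assuming the two behave the same on your pandas version.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
    'val': [1, 2, -3, 1, 5, 6, -2],
    'stuff': ['12', '23232', '13', '1234', '3235', '3236', '732323'],
})

# transform returns a result with the same index as df,
# so it can be assigned directly as a new column.
df['cumsum'] = df.groupby('id')['val'].transform('cumsum')

print(df['cumsum'].tolist())
```

The values line up with the original row order, not with the group order, which is what makes the direct assignment safe.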
With respect to your error: you called cumsum on the whole DataFrame groupby object instead of on the val column, and you passed the column name as a list, which is meaningless because cumsum does not take column names as an argument. Select the column first, then call cumsum with no arguments.
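To make the difference between the two kinds of groupby object concrete, here is a small sketch on a cut-down frame (the three-row data and the names as_frame and as_series are just for illustration). The exact error you get from assigning the wrong shape to a single column may differ between pandas versions, so this only shows the return types.

```python
import pandas as pd

# Cut-down, hypothetical data: just enough rows to form two groups.
df = pd.DataFrame({'id': ['A', 'B', 'A'], 'val': [1, 2, -3]})

# Selecting with a list keeps a DataFrame groupby, so cumsum returns a DataFrame.
as_frame = df.groupby('id')[['val']].cumsum()

# Selecting a single column gives a Series groupby, so cumsum returns a Series.
as_series = df.groupby('id')['val'].cumsum()

print(type(as_frame).__name__, type(as_series).__name__)
```

Only the Series form is a natural fit for assigning to a single new column.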
So this works:
In [159]:
df.groupby('id')['val'].cumsum()
Out[159]:
0    1
1    2
2   -2
3    1
4    5
5    8
6   -1
dtype: int64
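As a quick sanity check, the sketch below compares the direct cumsum call with the transform version on the question's data and then stores the result as a column. The variable names direct, via_transform and same are mine, not from the original posts.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
    'val': [1, 2, -3, 1, 5, 6, -2],
})

# Per-group running sum, computed two ways.
direct = df.groupby('id')['val'].cumsum()
via_transform = df.groupby('id')['val'].transform('cumsum')

# Both results share df's index, so they can be compared element by element.
same = direct.equals(via_transform)

# Store the running sum next to the original values.
df['cumsum'] = direct
print(same)
```

Since the two agree here, the shorter direct call is probably the simpler choice when all you need is the cumulative sum.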

