pandas: Group by Sum as new column name
Original question: http://stackoverflow.com/questions/45124992/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Adam
I am writing a function that groups by ID and sums the $ value associated with each ID, using this Python code:
df = df.groupby([' Id'], as_index=False, sort=False)[["Amount"]].sum();
but it doesn't rename the column. So I tried this:
df = df.groupby([' Id'], as_index=False, sort=False)[["Amount"]].sum().reset_index(name ='Total Amount')
but it raised an error: TypeError: reset_index() got an unexpected keyword argument 'name'
Finally, following the post "Python Pandas Create New Column with Groupby().Sum()", I tried this:
df = df.groupby(['Id'])[["Amount"]].transform('sum');
but it still didn't work.
What am I doing wrong?
Answered by jezrael
I think you need to remove the parameter as_index=False and use Series.reset_index. With as_index=False the groupby returns a DataFrame, and DataFrame.reset_index fails when given the parameter name:
df = df.groupby('Id', sort=False)["Amount"].sum().reset_index(name ='Total Amount')
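To make the difference concrete, here is a minimal sketch using the same three-row sample data as the answer. It assumes only that pandas is installed; the variable names (summed, totals, frame, raised) are made up for illustration. It shows that the single-bracket selection yields a Series, whose reset_index accepts name, while the DataFrame from the question's call rejects it.

```python
import pandas as pd

df = pd.DataFrame({"Id": [1, 2, 2], "Amount": [10, 30, 50]})

# Single brackets with the default as_index=True: sum() returns a Series indexed by Id
summed = df.groupby("Id", sort=False)["Amount"].sum()
is_series = isinstance(summed, pd.Series)

# Series.reset_index accepts name= and uses it as the name of the values column
totals = summed.reset_index(name="Total Amount")

# as_index=False with double brackets returns a DataFrame instead
frame = df.groupby("Id", as_index=False, sort=False)[["Amount"]].sum()
try:
    frame.reset_index(name="Total Amount")
    raised = False
except TypeError:
    # DataFrame.reset_index has no 'name' keyword
    raised = True
```

Here totals has the columns Id and Total Amount, and raised ends up True because the DataFrame call hits the same TypeError reported in the question.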
Or rename the column first:
d = {'Amount':'Total Amount'}
df = df.rename(columns=d).groupby('Id', sort=False, as_index=False)["Total Amount"].sum()
Sample:
import pandas as pd

df = pd.DataFrame({'Amount':[10, 30, 50], 'Id':[1, 2, 2]})
print (df)
   Amount  Id
0      10   1
1      30   2
2      50   2
df1 = df.groupby('Id', sort=False)["Amount"].sum().reset_index(name ='Total Amount')
print (df1)
   Id  Total Amount
0   1            10
1   2            80
d = {'Amount':'Total Amount'}
df1 = df.rename(columns=d).groupby('Id', sort=False, as_index=False)["Total Amount"].sum()
print (df1)
   Id  Total Amount
0   1            10
1   2            80
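As a further option that is not part of the original answer, recent pandas versions (0.25 and later, as far as I know) support named aggregation, which sums and names the output column in one step. The sketch below assumes the same sample data; the name totals is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Id": [1, 2, 2], "Amount": [10, 30, 50]})

# Named aggregation: the keyword is the output column name, the tuple is
# (input column, aggregation). The ** unpacking is needed only because
# "Total Amount" contains a space and cannot be written as a bare keyword.
totals = df.groupby("Id", sort=False, as_index=False).agg(
    **{"Total Amount": ("Amount", "sum")}
)
print(totals)
```

This should give the same two-row result as the approaches above, with columns Id and Total Amount.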
But if you need a new column with the sum in the original df, use transform and assign the output to a new column:
df['Total Amount'] = df.groupby('Id', sort=False)["Amount"].transform('sum')
print (df)
   Amount  Id  Total Amount
0      10   1            10
1      30   2            80
2      50   2            80
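The following sketch illustrates why transform fits this case and why the question's third attempt likely did not give the expected result. It assumes the same sample data, and the names per_group, per_row and overwritten are hypothetical. sum() collapses to one value per Id, transform('sum') returns one value per original row, and assigning the transform result back to df replaces the whole frame with the summed values.

```python
import pandas as pd

df = pd.DataFrame({"Id": [1, 2, 2], "Amount": [10, 30, 50]})

# sum() reduces: one value per distinct Id
per_group = df.groupby("Id", sort=False)["Amount"].sum()

# transform('sum') broadcasts: one value per original row, aligned to df's index
per_row = df.groupby("Id", sort=False)["Amount"].transform("sum")

# The question's version: the result holds only the transformed Amount column,
# so writing it back to df would drop Id and the original amounts
overwritten = df.groupby(["Id"])[["Amount"]].transform("sum")

# Assigning to a new column keeps the original data intact
df["Total Amount"] = per_row
```

Because per_row shares df's index, the assignment lines up row by row without any merge.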