pandas: return average of multiple columns
Original question: http://stackoverflow.com/questions/49560809/
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me).
Asked by Karma
How do you output the average of multiple columns?
Gender Age Salary Yr_exp cup_coffee_daily
Male 28 45000.0 6.0 2.0
Female 40 70000.0 15.0 10.0
Female 23 40000.0 1.0 0.0
Male 35 55000.0 12.0 6.0
I have df.groupby('Gender', as_index=False)['Age', 'Salary', 'Yr_exp'].mean(), but it still only returned the average of the first column, Age. How do you return the averages of specific columns, each in its own column? Desired output:
Gender Age Salary Yr_exp
Male 31.5 50000.0 9.0
Female 31.5 55000.0 8.0
Thanks.
Answered by Jonathan Dayton
Given this dataframe:
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12]
})
df
Age Gender Salary Yr_exp
0 28 Male 45000 6
1 40 Female 70000 15
2 23 Female 40000 1
3 35 Male 55000 12
Group by gender and use the mean() function:
df.groupby("Gender").mean()
Age Salary Yr_exp
Gender
Female 31.5 55000.0 8.0
Male 31.5 50000.0 9.0
Edit: you may need to change the way you're indexing after groupby(): df['Age', 'Salary'] raises a KeyError, but df[['Age', 'Salary']] returns the expected result:
Age Salary
0 28 45000
1 40 70000
2 23 40000
3 35 55000
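The difference comes down to what pandas receives as the key. A minimal sketch, assuming the same four-row frame as above: a comma inside single brackets passes the tuple ('Age', 'Salary') as one column label, which does not exist here, while double brackets pass a list of labels. The variable names tuple_lookup_failed and subset are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12],
})

# A comma inside single brackets builds a tuple, so pandas looks for
# one column whose label is ('Age', 'Salary') and does not find it.
try:
    df["Age", "Salary"]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# A list inside the brackets selects several columns at once.
subset = df[["Age", "Salary"]]

print(tuple_lookup_failed)
print(list(subset.columns))
```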
Try changing
df.groupby("Gender", as_index=True)['Age', 'Salary', 'Yr_exp'].mean()
to
df.groupby("Gender", as_index=True)[['Age', 'Salary', 'Yr_exp']].mean()
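Applied to the data from the question, the corrected call might look like the sketch below. It assumes a reasonably recent pandas and uses as_index=False, as in the question, so that Gender stays a regular column; the cup_coffee_daily column is included only to show that it is left out of the result.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000.0, 70000.0, 40000.0, 55000.0],
    "Yr_exp": [6.0, 15.0, 1.0, 12.0],
    "cup_coffee_daily": [2.0, 10.0, 0.0, 6.0],
})

# Double brackets pass a list of column labels to the grouped object,
# so the mean is computed for each listed column.
result = df.groupby("Gender", as_index=False)[["Age", "Salary", "Yr_exp"]].mean()
print(result)
```

The groups come back sorted by key, so Female appears before Male; passing sort=False to groupby() should keep the order of first appearance instead.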
Answered by VnC
You can also use .agg() on the grouped object:
df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : 'mean', 'Yr_exp': 'mean'})
This results in:
Age Salary Yr_exp
Gender
Female 31.5 55000 8
Male 31.5 50000 9
Using .agg() gives you the chance to apply different functions to a grouped object, something like:
df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : ['min', 'max'], 'Yr_exp': 'sum'})
Outputs:
Age Salary Yr_exp
mean min max sum
Gender
Female 31.5 40000 70000 16
Male 31.5 45000 55000 18
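Mixing a single function with a list of functions yields two-level column labels, as in the output above. If flat names are preferred, one option is named aggregation (available since pandas 0.25). This is only a sketch: the output names age_mean, salary_min, salary_max and yr_exp_sum are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12],
})

# Each keyword becomes an output column; the tuple is
# (source column, aggregation function).
summary = df.groupby("Gender").agg(
    age_mean=("Age", "mean"),
    salary_min=("Salary", "min"),
    salary_max=("Salary", "max"),
    yr_exp_sum=("Yr_exp", "sum"),
)
print(summary)
```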