pandas: return the average of multiple columns

Question

Asked by Karma

How do you output the average of multiple columns?

Gender   Age     Salary     Yr_exp   cup_coffee_daily
  Male    28    45000.0        6.0                2.0
Female    40    70000.0       15.0               10.0
Female    23    40000.0        1.0                0.0
  Male    35    55000.0       12.0                6.0

I tried df.groupby('Gender', as_index=False)['Age', 'Salary', 'Yr_exp'].mean(), but it only returned the average of the first column, Age. How do I return the averages of several specific columns, each in its own column? Desired output:

Gender   Age     Salary   Yr_exp
  Male  31.5    50000.0      9.0
Female  31.5    55000.0      8.0

Thanks.

Answer 1

Answered by Jonathan Dayton

Given this dataframe:

import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12]
})

df
   Age  Gender  Salary  Yr_exp
0   28    Male   45000       6
1   40  Female   70000      15
2   23  Female   40000       1
3   35    Male   55000      12

Group by Gender and call mean():

df.groupby("Gender").mean()
         Age   Salary  Yr_exp
Gender                       
Female  31.5  55000.0     8.0
Male    31.5  50000.0     9.0

Edit: you may need to change how you select columns after groupby(). On a plain DataFrame, df['Age', 'Salary'] raises a KeyError, but df[['Age', 'Salary']] (a list of column names inside the brackets) returns the expected result:

   Age  Salary
0   28   45000
1   40   70000
2   23   40000
3   35   55000

Try changing

df.groupby("Gender", as_index=True)['Age', 'Salary', 'Yr_exp'].mean()

to

df.groupby("Gender", as_index=True)[['Age', 'Salary', 'Yr_exp']].mean()
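
For reference, here is a minimal, self-contained sketch of the corrected call on the sample data from the question. It keeps as_index=False from the question so that Gender stays a regular column. The sort=False argument is an addition not present in the original code, used only so the groups appear in the same order as the desired output (Male first).

```python
import pandas as pd

# Sample data from the question, including the column we do not want averaged.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000.0, 70000.0, 40000.0, 55000.0],
    "Yr_exp": [6.0, 15.0, 1.0, 12.0],
    "cup_coffee_daily": [2.0, 10.0, 0.0, 6.0],
})

# A list of column names (note the double brackets) selects several columns.
# sort=False is an assumption added here to keep groups in order of appearance.
result = df.groupby("Gender", as_index=False, sort=False)[
    ["Age", "Salary", "Yr_exp"]
].mean()

print(result)
```

Without sort=False the groups are sorted by key, so Female would come before Male.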

Answer 2

Answered by VnC

You can also call .agg() on the grouped object:

df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : 'mean', 'Yr_exp': 'mean'})

This results in:

         Age    Salary  Yr_exp
Gender          
Female  31.5    55000   8
Male    31.5    50000   9

Using .agg() lets you apply different functions to different columns of a grouped object, for example:

df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : ['min', 'max'], 'Yr_exp': 'sum'})

Outputs:

         Age Salary        Yr_exp
        mean    min    max    sum
Gender
Female  31.5  40000  70000     16
Male    31.5  45000  55000     18
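
The dictionary form above produces two-level column labels. If flat column names are easier to work with, named aggregation (available in pandas 0.25 and later) is one alternative. The sketch below is not part of the original answer, and the output names (age_mean, salary_min, and so on) are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12],
})

# Each keyword is an output column name (hypothetical names chosen here);
# each value is a (source column, aggregation function) pair.
summary = df.groupby("Gender").agg(
    age_mean=("Age", "mean"),
    salary_min=("Salary", "min"),
    salary_max=("Salary", "max"),
    yr_exp_sum=("Yr_exp", "sum"),
)

print(summary)
```

The result has one flat column per aggregation, indexed by Gender.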
