pandas: setting a column name for the result of groupby apply

Question

Asked by MrT

This is a fairly trivial problem, but it's triggering my OCD and I haven't been able to find a suitable solution for the past half hour.

For background, I'm looking to calculate a value (let's call it F) for each group in a DataFrame, derived from different aggregated measures of columns in the existing DataFrame.

Here's a toy example of what I'm trying to do:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
                'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
                'C': [69, 83, 28, 25, 11, 31, 14, 37, 14,  0],
                'D': [ 0.3,  0.1,  0.1,  0.8,  0.8,  0. ,  0.8,  0.8,  0.1,  0.8],
                'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12]
                })

df_grp = df.groupby(['A','B'])
df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max())

What I'd like to do is assign a name to the result of apply (or lambda). Is there any way to do this without moving the lambda to a named function or renaming the column after running the last line?
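To make the starting point concrete, here is a minimal sketch of what the plain call returns: a Series indexed by (A, B) with no name, which is why there is no label to display. The explicit selection of columns C, D and E is an addition not present in the question; it is there only because recent pandas versions warn when apply also receives the grouping columns.

```python
import pandas as pd

df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
                   'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
                   'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
                   'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
                   'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12]})

# Assumption: select only the value columns so apply never sees A and B.
df_grp = df.groupby(['A', 'B'])[['C', 'D', 'E']]

# The lambda returns one scalar per group, so the result is a Series.
result = df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max())

print(type(result).__name__)  # Series
print(result.name)            # None: the Series carries no name
```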

Answer 1

Answered by Alexander

Have the lambda function return a new Series:

df_grp.apply(lambda x: pd.Series({'new_name':
                    x['C'].sum() * x['D'].mean() / x['E'].max()}))
# or df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max()).to_frame('new_name')

     new_name
A B          
X N  5.583333
Y M  2.975000
  N  3.845455
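As a quick check that the two variants in this answer agree, the sketch below runs both on the toy data and compares them. The helper name compute_f and the explicit column selection are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
                   'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
                   'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
                   'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
                   'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12]})

# Assumption: select only the value columns so apply never sees A and B.
df_grp = df.groupby(['A', 'B'])[['C', 'D', 'E']]


def compute_f(x):
    # Hypothetical helper, used only to avoid repeating the expression.
    return x['C'].sum() * x['D'].mean() / x['E'].max()


# Variant 1: return a one-element Series; its index label becomes the column.
via_series = df_grp.apply(lambda x: pd.Series({'new_name': compute_f(x)}))

# Variant 2: return a scalar, then name the column while converting to a frame.
via_to_frame = df_grp.apply(compute_f).to_frame('new_name')

same_columns = via_series.columns.tolist() == via_to_frame.columns.tolist()
same_values = np.allclose(via_series['new_name'], via_to_frame['new_name'])
print(same_columns, same_values)
```

Variant 1 names the column inside the lambda, which is what the question asks for. Variant 2 builds one Series object fewer per group but names the column in a chained call afterwards.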

Answer 2

Answered by Zero

You could convert your Series to a DataFrame using reset_index() and provide name='your_col_name', the name of the column corresponding to the Series values:

(df_grp.apply(lambda x: x['C'].sum() * x['D'].mean() / x['E'].max())
      .reset_index(name='your_col_name'))

   A  B  your_col_name
0  X  N       5.583333
1  Y  M       2.975000
2  Y  N       3.845455
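If dropping apply altogether is acceptable, another option is named aggregation followed by ordinary column arithmetic. This is a different technique from either answer, offered only as a sketch; the intermediate names c_sum, d_mean and e_max and the final name F are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'Y', 'X'],
                   'B': ['N', 'N', 'N', 'M', 'N', 'M', 'M', 'N', 'M', 'N'],
                   'C': [69, 83, 28, 25, 11, 31, 14, 37, 14, 0],
                   'D': [0.3, 0.1, 0.1, 0.8, 0.8, 0.0, 0.8, 0.8, 0.1, 0.8],
                   'E': [11, 11, 12, 11, 11, 12, 12, 11, 12, 12]})

# Named aggregation: each keyword becomes an output column,
# defined as a (source column, aggregation) pair.
agg = df.groupby(['A', 'B']).agg(
    c_sum=('C', 'sum'),
    d_mean=('D', 'mean'),
    e_max=('E', 'max'),
)

# Combine the aggregates column-wise, name the resulting Series,
# and move the group keys back into regular columns.
out = (agg['c_sum'] * agg['d_mean'] / agg['e_max']).rename('F').reset_index()
print(out)
```

This still names the result in a chained call rather than inside a lambda, so it may not satisfy the constraint in the question. Its advantage is that it uses the built-in aggregations instead of calling a Python function once per group.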
