Python Pandas: passing arguments to the function in agg()

Question

Asked by Tanguy

I am trying to reduce the data in a pandas DataFrame using different kinds of functions and argument values. However, I have not managed to change the default arguments of the aggregation functions. Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b']})
>>> df
     x  y
0  1.0  a
1  NaN  a
2  2.0  b
3  1.0  b

Here is an aggregation function for which I would like to test different values of b:

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

In the following code, I can use this function with the default value of b, but I would like to pass other values:

>>> df.groupby('y').agg(translate_mean)
      x
y
a   NaN
b  11.5

Any ideas?

Answer 1

Answered by Bubble Bubble Bubble Gut

You could try using apply in this case:

df.groupby('y').apply(lambda x: translate_mean(x['x'], 20))

Now the result is:

y
a     NaN
b    21.5
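
As a variation on the apply call above, the column can be selected before applying, so the function receives one Series per group and b can be passed by keyword instead of through a lambda. This is only a sketch that rebuilds the question's df and translate_mean; the name shifted is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average; a NaN in the group propagates.
    y = [elem + b for elem in x]
    return np.mean(y)

# Select column 'x' first so each group is passed as a Series.
# Extra keyword arguments given to apply() are forwarded to the function.
shifted = df.groupby('y')['x'].apply(translate_mean, b=20)
print(shifted)
```

Group a still yields NaN because np.mean does not skip missing values; np.nanmean would be one option if they should be ignored.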

Answer 2

Answered by ayhan

Just pass them as extra arguments to agg (this works with apply, too).

df.groupby('y').agg(translate_mean, b=4)
Out: 
     x
y     
a  NaN
b  5.5
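
Since the original goal was to test several values of b, the keyword-forwarding shown above can be placed in a loop. The following is a minimal sketch under that assumption; the names grouped and results and the column labels such as 'b=4' are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average; a NaN in the group propagates.
    y = [elem + b for elem in x]
    return np.mean(y)

grouped = df.groupby('y')['x']

# One aggregated Series per candidate value of b, collected column by column.
# agg() forwards the keyword argument b to translate_mean.
results = pd.DataFrame({
    f'b={b}': grouped.agg(translate_mean, b=b)
    for b in (0, 4, 10)
})
print(results)
```

Each column of results holds the per-group output for one value of b, which makes the values easy to compare side by side.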

Answer 3

Answered by Yunzhao Xing

If you have multiple columns and want to apply different functions or different parameters to each column, you can combine lambda functions with agg. For example:

>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b'],
...                    'z': [0.1,0.2,0.3,0.4]})
>>> df
     x  y    z
0  1.0  a  0.1
1  NaN  a  0.2
2  2.0  b  0.3
3  1.0  b  0.4

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

To group by column 'y' and apply translate_mean with b=10 to column 'x' and b=25 to column 'z', you can try this:

df_res = df.groupby(by='y').agg({
    'x': lambda x: translate_mean(x, 10),
    'z': lambda x: translate_mean(x, 25)})
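
An alternative to the lambdas, sketched here as one possibility rather than the answer's own method, is functools.partial from the standard library, which fixes b ahead of time for each column. The example assumes 'z' holds numeric values (strings would fail inside translate_mean) and groups by the existing column 'y'.

```python
from functools import partial

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b'],
                   'z': [0.1, 0.2, 0.3, 0.4]})

def translate_mean(x, b=10):
    # Shift every element by b, then average; a NaN in the group propagates.
    y = [elem + b for elem in x]
    return np.mean(y)

# partial() returns a callable with b already bound, one per column.
df_res = df.groupby('y').agg({
    'x': partial(translate_mean, b=10),
    'z': partial(translate_mean, b=25),
})
print(df_res)
```

Compared with a lambda, a partial keeps the bound value visible as a keyword, which can be easier to read when many columns share one function.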

Hopefully, it helps.
