Python: Using pandas groupby() + apply() with arguments

Question

Asked by beta

I would like to use df.groupby() in combination with apply() to apply a function to each row per group.

I normally use the following code, which usually works (note that this is without groupby()):

df.apply(myFunction, args=(arg1,))

With groupby(), I tried the following:

df.groupby('columnName').apply(myFunction, args=(arg1,))

However, I get the following error:

TypeError: myFunction() got an unexpected keyword argument 'args'

Hence, my question is: how can I use groupby() and apply() with a function that needs arguments?

Answer 1

Accepted answer by MaxU

pandas.core.groupby.GroupBy.apply does NOT have a named parameter args, but pandas.DataFrame.apply does. GroupBy.apply forwards any extra positional and keyword arguments straight to the function, which is why args=(arg1,) reaches myFunction as an unexpected keyword argument.

So try this:

df.groupby('columnName').apply(lambda x: myFunction(x, arg1))

or, as suggested by @Zero, pass the argument positionally:

df.groupby('columnName').apply(myFunction, arg1)

Demo:

In [82]: df = pd.DataFrame(np.random.randint(5,size=(5,3)), columns=list('abc'))

In [83]: df
Out[83]:
   a  b  c
0  0  3  1
1  0  3  4
2  3  0  4
3  4  2  3
4  3  4  1

In [84]: def f(ser, n):
    ...:     return ser.max() * n
    ...:

In [85]: df.apply(f, args=(10,))
Out[85]:
a    40
b    40
c    40
dtype: int64

When using GroupBy.apply, you can pass either named arguments:

In [86]: df.groupby('a').apply(f, n=10)
Out[86]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30

or positional arguments (note that (10) is just the integer 10, not a tuple):

In [87]: df.groupby('a').apply(f, 10)
Out[87]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30
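
The calling styles from the demo can be compared side by side in one self-contained script. This is only a sketch: the frame reuses the values printed in the demo, the [["b", "c"]] column selection is an assumption added to keep the grouping column out of the result, and functools.partial is an alternative not mentioned in the answer.

```python
from functools import partial

import pandas as pd

# Same values as the frame printed in the demo
df = pd.DataFrame({"a": [0, 0, 3, 4, 3],
                   "b": [3, 3, 0, 2, 4],
                   "c": [1, 4, 4, 3, 1]})


def f(group, n):
    # Column-wise maximum of the group, scaled by n
    return group.max() * n


# Select the value columns explicitly so only b and c are passed to f
grouped = df.groupby("a")[["b", "c"]]

via_lambda = grouped.apply(lambda g: f(g, 10))   # argument bound inside a lambda
via_keyword = grouped.apply(f, n=10)             # forwarded as a keyword argument
via_positional = grouped.apply(f, 10)            # forwarded as a positional argument
via_partial = grouped.apply(partial(f, n=10))    # argument bound with functools.partial

print(via_keyword)
```

All four calls should produce the same frame; which one to use is mostly a matter of readability.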

Answer 2

Answer by Brad Solomon

Some of the confusion over why using an args parameter throws an error may stem from the fact that pandas.DataFrame.apply has an args parameter (a tuple), while pandas.core.groupby.GroupBy.apply does not.

So, when you call .apply on a DataFrame itself, you can use this argument; when you call .apply on a groupby object, you cannot.

In @MaxU's answer, the expression lambda x: myFunction(x, arg1) is passed to func (the first parameter); there is no need to specify additional *args/**kwargs because arg1 is already specified in the lambda.

An example:

import numpy as np
import pandas as pd

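# Hypothetical sample data so the example runs
df = pd.DataFrame({'col1': [1, 1, 2], 'col2': [3, 4, 5]})

# Called on a DataFrame: `axis` is a parameter of DataFrame.apply itself
# (DataFrame.apply also accepts `args`, a tuple of extra positional arguments for the function)
df.apply(np.sum, axis=0)  # equiv to df.sum(0)
df.apply(np.sum, axis=1)  # equiv to df.sum(1)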


# Called on groupby object of the DataFrame - will throw TypeError
print(df.groupby('col1').apply(np.sum, args=(0,)))
# TypeError: sum() got an unexpected keyword argument 'args'
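
The failure and a working alternative can be reproduced together. The following is a minimal sketch, assuming a small made-up frame and a hypothetical helper scale_max (neither appears in the original answers); the exact wording of the error message may vary between Python and pandas versions.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"col1": [1, 1, 2], "col2": [3, 4, 5]})


def scale_max(frame, factor):
    # Largest col2 value in the group, multiplied by factor
    return frame["col2"].max() * factor


grouped = df.groupby("col1")[["col2"]]

error_message = None
try:
    # GroupBy.apply has no `args` parameter, so `args` is forwarded
    # to scale_max as a keyword argument it does not accept
    grouped.apply(scale_max, args=(10,))
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Forwarding the argument by its real name works
result = grouped.apply(scale_max, factor=10)
print(result)
```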

Answer 3

Answer by Hitesh Somani

This worked for me:

df2 = df.groupby('columnName').apply(lambda x: my_function(x, arg1, arg2))
