Pandas groupby and make set of items
Original question: http://stackoverflow.com/questions/37572611/
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow
Asked by cppgnlearner
I am using pandas groupby and want to apply a function that makes a set from the items in each group.
The following does not work:
df = df.groupby('col1')['col2'].agg({'size': len, 'set': set})
But the following works:
def to_set(x):
    return set(x)

df = df.groupby('col1')['col2'].agg({'size': len, 'set': to_set})
As I understand it, the two expressions are equivalent. Why does the first one not work?
Answered by Stefan
It's because set is of type type, whereas to_set is of type function:
>>> type(set)
<class 'type'>
>>> def to_set(x):
...     return set(x)
...
>>> type(to_set)
<class 'function'>
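A quick way to check this yourself is the minimal sketch below. The describe helper is made up for illustration and is not part of pandas. Both objects are callable, but only set is a class, so any code that branches on the type of its argument can treat them differently even though calling them gives the same result.

```python
import inspect


def to_set(x):
    return set(x)


def describe(obj):
    """Return (is_callable, is_class, is_plain_function) for obj."""
    return callable(obj), inspect.isclass(obj), inspect.isfunction(obj)


print(describe(set))                # (True, True, False)
print(describe(to_set))             # (True, False, True)
print(describe(lambda x: set(x)))   # (True, False, True)
```

Wrapping set in a def or a lambda turns it into a plain function, which is what both working variants in this thread do.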
According to the docs, .agg() expects:
arg : function or dict
    Function to use for aggregating groups. If a function, must either work when passed a DataFrame or when passed to DataFrame.apply. If passed a dict, the keys must be DataFrame column names.

    Accepted Combinations are:
        - string cythonized function name
        - function
        - list of functions
        - dict of columns -> functions
        - nested dict of names -> dicts of functions
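To make the first few accepted forms concrete, here is a minimal sketch on a made-up toy frame. The column names follow the question and the data is invented. It only covers the string, function, and list forms; I am not showing the dict forms here because their behavior has changed between pandas versions.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({"col1": ["a", "a", "b"], "col2": [1, 2, 2]})
grouped = df.groupby("col1")["col2"]


def to_set(x):
    return set(x)


by_name = grouped.agg("size")             # string naming a built-in aggregation
by_func = grouped.agg(to_set)             # plain Python function
by_list = grouped.agg(["size", to_set])   # list of functions: one column each

print(by_name)
print(by_func)
print(by_list)
```

With the list form the result columns take their names from the functions, so here they come out as size and to_set.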
Answered by Animesh Mishra
Try using:
df = df.groupby('col1')['col2'].agg({'size': len, 'set': lambda x: set(x)})
Works for me.
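A note for newer pandas: as far as I know, passing a dict of new names to a column's .agg(), as in the question, was deprecated in pandas 0.20 and raises SpecificationError from 1.0 onward. Below is a minimal sketch of the same aggregation using named aggregation instead, assuming pandas 0.25 or later and made-up toy data.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({"col1": ["a", "a", "b"], "col2": [1, 2, 2]})

# Each keyword becomes an output column; the value is the aggregation to apply.
result = df.groupby("col1")["col2"].agg(
    size="size",
    set=lambda x: set(x),
)

print(result)
```

The keyword names (size and set) are just the output column names and can be anything you like.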