pandas: How to get unique values from multiple columns in a pandas groupby

Question

Asked by Fabio Lamanna

Starting from this dataframe df:

import pandas as pd
df = pd.DataFrame({'c':[1,1,1,2,2,2],'l1':['a','a','b','c','c','b'],'l2':['b','d','d','f','e','f']})

   c l1 l2
0  1  a  b
1  1  a  d
2  1  b  d
3  2  c  f
4  2  c  e
5  2  b  f

I would like to perform a groupby over the c column to get the unique values of the l1 and l2 columns. For one column I can do:

g = df.groupby('c')['l1'].unique()

that correctly returns:

c
1    [a, b]
2    [c, b]
Name: l1, dtype: object

but using:

g = df.groupby('c')['l1','l2'].unique()

returns:

AttributeError: 'DataFrameGroupBy' object has no attribute 'unique'

I know I can get the unique values for the two columns with (among others):

In [12]: np.unique(df[['l1','l2']])
Out[12]: array(['a', 'b', 'c', 'd', 'e', 'f'], dtype=object)

Is there a way to apply this method to the groupby in order to get something like:

c
1    [a, b, d]
2    [c, b, e, f]
Name: l1, dtype: object

Answer 1

Answered by ayhan

You can do it with apply:

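import numpy as np
# Select the columns with a list (double brackets); recent pandas versions
# reject the tuple-style ['l1','l2'] selection on a groupby.
g = df.groupby('c')[['l1','l2']].apply(lambda x: list(np.unique(x)))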
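As a minimal, self-contained sketch of the apply approach, the snippet below rebuilds the df from the question and selects the two columns with a list. The call to to_numpy() is an addition for illustration, not part of the original answer; it only makes the flattening step explicit before np.unique is applied.

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'c':  [1, 1, 1, 2, 2, 2],
    'l1': ['a', 'a', 'b', 'c', 'c', 'b'],
    'l2': ['b', 'd', 'd', 'f', 'e', 'f'],
})

# For each group, take the values of both columns as one array and keep
# the unique entries. np.unique flattens its input and returns the
# values in sorted order.
g = df.groupby('c')[['l1', 'l2']].apply(
    lambda x: list(np.unique(x.to_numpy()))
)
print(g)
```

Because np.unique sorts its result, group 2 comes out as b, c, e, f rather than in order of first appearance as in the output sketched in the question.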

Answer 2

Answered by Yaakov Bressler

Alternatively, you can use agg, which returns the unique values of each column separately:

g = df.groupby('c')[['l1','l2']].agg(['unique'])
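The agg approach can be sketched as follows, again using the df from the question. The second step, which merges the per-column results into one list per group, is a hypothetical follow-up and not part of the original answer; the names per_column and combined are chosen here for illustration.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'c':  [1, 1, 1, 2, 2, 2],
    'l1': ['a', 'a', 'b', 'c', 'c', 'b'],
    'l2': ['b', 'd', 'd', 'f', 'e', 'f'],
})

# One collection of unique values per column and per group; the result
# has two-level columns: ('l1', 'unique') and ('l2', 'unique').
per_column = df.groupby('c')[['l1', 'l2']].agg(['unique'])

# Hypothetical follow-up: merge the per-column values of each row into
# a single list, dropping duplicates while keeping first-seen order.
combined = per_column.apply(
    lambda row: list(dict.fromkeys(v for arr in row for v in arr)),
    axis=1,
)
print(combined)
```

Unlike the np.unique variant, this keeps values in order of first appearance (l1 values first, then l2), so whichever ordering matters for the use case may guide the choice.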
