pandas - check for non unique values in dataframe groupby

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/33732106/

Tags: python, pandas

Asked by Fabio Lamanna

I have this simple dataframe df:

a,b
1,2
1,3
1,4
1,2
2,1
2,2
2,3
2,5
2,5

I would like to check whether there are duplicates in b with respect to each group in a. So far I have done the following:

g = df.groupby('a')['b'].unique()

which returns:

a
1       [2, 3, 4]
2    [1, 2, 3, 5]

But what I would like to have is, for each group in a, a list of the values in b that occur more than once. The expected output in this case would be:

a
1    [2]
2    [5]

Answered by atomh33ls

# Count each value of b within each group of a, then keep only counts above 1
g = df.groupby('a')['b'].value_counts()
g.where(g > 1).dropna()
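
The answer above returns counts indexed by (a, b) rather than the per-group lists the question asks for. A minimal sketch of one way to get from one to the other, assuming the frame is rebuilt from the question's CSV text; the names csv_text, counts, repeated and result are illustrative and not part of the original answer:

```python
import io
import pandas as pd

# Rebuild the example frame from the CSV text shown in the question.
csv_text = "a,b\n1,2\n1,3\n1,4\n1,2\n2,1\n2,2\n2,3\n2,5\n2,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Count how often each value of b occurs inside each group of a.
counts = df.groupby('a')['b'].value_counts()

# Keep only the (a, b) pairs that occur more than once.
repeated = counts[counts > 1]

# Turn the (a, b) index into columns and collect b into one list per group.
pairs = repeated.index.to_frame(index=False)
result = pairs.groupby('a')['b'].agg(list)
print(result)
```

Working from the index rather than the counts avoids depending on the name pandas gives the value_counts result, which differs between versions.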

Answered by akrun

We can use DataFrame.duplicated:

print(df[df.duplicated()].drop_duplicates())
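
Called without arguments, duplicated compares whole rows, which matches the question only because the frame has just the columns a and b. A hedged sketch that names the columns explicitly and then collects the repeated values per group; the subset argument and the names mask, dupes and per_group are additions for illustration, not part of the original answer:

```python
import io
import pandas as pd

# Rebuild the example frame from the CSV text shown in the question.
csv_text = "a,b\n1,2\n1,3\n1,4\n1,2\n2,1\n2,2\n2,3\n2,5\n2,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Flag rows whose (a, b) pair already appeared earlier in the frame.
mask = df.duplicated(subset=['a', 'b'])

# Keep one row per repeated pair, even if a pair occurs three or more times.
dupes = df[mask].drop_duplicates()

# Collect the repeated b values into one list per group of a.
per_group = dupes.groupby('a')['b'].agg(list)
print(per_group)
```

Passing subset keeps the check limited to a and b if the real frame carries additional columns.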