pandas - check for non unique values in dataframe groupby
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors: StackOverFlow
Original: http://stackoverflow.com/questions/33732106/
Asked by Fabio Lamanna
I have this simple dataframe df:
a,b
1,2
1,3
1,4
1,2
2,1
2,2
2,3
2,5
2,5
I would like to check whether there are duplicates in b with respect to each group in a. So far I did the following:
g = df.groupby('a')['b'].unique()
which returns:
a
1 [2, 3, 4]
2 [1, 2, 3, 5]
But what I would like to have is a list, for each group in a, of the values that occur more than once in b. The expected output in this case would be:
a
1 [2]
2 [5]
Answered by atomh33ls
g=df.groupby('a')['b'].value_counts()
g.where(g>1).dropna()
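For reference, here is a self-contained sketch of the same value_counts idea, rebuilt on the sample data from the question. The names counts, repeated and result are only illustrative and not part of the original answer. It uses boolean indexing instead of where(...).dropna(), which keeps the counts as integers (where introduces NaN and so casts them to float), and then collects the repeated b values per group into plain lists.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "a": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b": [2, 3, 4, 2, 1, 2, 3, 5, 5],
})

# Count how often each b value occurs inside each group of a
counts = df.groupby("a")["b"].value_counts()

# Keep only the (a, b) pairs that occur more than once
repeated = counts[counts > 1]

# The index of `repeated` is a MultiIndex of (a, b) pairs;
# collect the repeated b values into one list per group
result = {}
for a_val, b_val in repeated.index:
    result.setdefault(int(a_val), []).append(int(b_val))

print(result)
```

With the data above, each group has exactly one repeated value, so result maps every key in a to a one-element list.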
Answered by akrun
We can use duplicated:
print(df[df.duplicated()].drop_duplicates())
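The one-liner above compares whole rows, so it relies on df having only the columns a and b. A minimal sketch of a variant, assuming the frame might carry extra columns, passes subset explicitly and then groups the result to get the list-per-group shape the question asks for. The names dups and result are illustrative only.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "a": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b": [2, 3, 4, 2, 1, 2, 3, 5, 5],
})

# Rows whose (a, b) pair has already appeared earlier in the frame;
# drop_duplicates collapses pairs that repeat three or more times
dups = df[df.duplicated(subset=["a", "b"])].drop_duplicates(subset=["a", "b"])

# One list of repeated b values per group in a
result = dups.groupby("a")["b"].agg(list)

print(result)
```

Here result is a Series indexed by a, and result.to_dict() gives a plain dictionary if that is more convenient.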