Python: How to count distinct values in a column of a pandas groupby object?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/17926273/

Date: 2020-08-19 09:30:18  Source: igfitidea

How to count distinct values in a column of a pandas group by object?

python group-by pandas

Asked by Roman

I have a pandas DataFrame and group it by two columns (for example, col1 and col2). For fixed values of col1 and col2 (i.e. for one group) there can be several different values in col3. I would like to count the number of distinct values in the third column for each group.


For example, if I have this as my input:


1  1  1
1  1  1
1  1  2
1  2  3
1  2  3
1  2  3
2  1  1
2  1  2
2  1  3
2  2  3
2  2  3
2  2  3

I would like to have this table (data frame) as the output:


1  1  2
1  2  1
2  1  3
2  2  1

Accepted answer by Roman

df.groupby(['col1','col2'])['col3'].nunique().reset_index()
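The one-liner above can be tried end to end with a minimal sketch like the following. It assumes the question's input is loaded into a DataFrame whose columns are named col1, col2 and col3 (the names come from the question's wording; the variable names rows and result are only illustrative).

```python
import pandas as pd

# Sample rows copied from the question; the column names are an assumption
# based on how the question refers to them.
rows = [
    (1, 1, 1), (1, 1, 1), (1, 1, 2),
    (1, 2, 3), (1, 2, 3), (1, 2, 3),
    (2, 1, 1), (2, 1, 2), (2, 1, 3),
    (2, 2, 3), (2, 2, 3), (2, 2, 3),
]
df = pd.DataFrame(rows, columns=["col1", "col2", "col3"])

# nunique() counts the distinct non-null values of col3 within each
# (col1, col2) group; reset_index() turns the group keys back into columns.
result = df.groupby(["col1", "col2"])["col3"].nunique().reset_index()
print(result)
```

The result has one row per (col1, col2) pair, with col3 holding the distinct count, which matches the table requested in the question.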

Answer by Jeff

In [17]: df
Out[17]: 
    0  1  2
0   1  1  1
1   1  1  1
2   1  1  2
3   1  2  3
4   1  2  3
5   1  2  3
6   2  1  1
7   2  1  2
8   2  1  3
9   2  2  3
10  2  2  3
11  2  2  3

In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]: 
0  1
1  1    2
   2    1
2  1    3
   2    1
dtype: int64
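As a rough check, the session above can be reproduced as a standalone script and compared with the built-in nunique(). This is only a sketch: the name df_int is hypothetical, and the frame is built without column names so that pandas assigns the integer labels 0, 1, 2 seen in the session.

```python
import pandas as pd

# Same rows as the frame shown in In [17]; no column names are given,
# so pandas labels the columns 0, 1 and 2.
df_int = pd.DataFrame([
    [1, 1, 1], [1, 1, 1], [1, 1, 2],
    [1, 2, 3], [1, 2, 3], [1, 2, 3],
    [2, 1, 1], [2, 1, 2], [2, 1, 3],
    [2, 2, 3], [2, 2, 3], [2, 2, 3],
])

grouped = df_int.groupby([0, 1])[2]

# Approach from this answer: collect the distinct values per group and count them.
via_apply = grouped.apply(lambda x: len(x.unique()))

# Built-in alternative, as used in the accepted answer.
via_nunique = grouped.nunique()

# Prints True when both approaches agree on every group.
print(via_apply.tolist() == via_nunique.tolist())
```

On this data the two approaches agree. They can differ when the column contains missing values: unique() keeps NaN as a value, while nunique() ignores it by default (dropna=True).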