Source: Stack Overflow, http://stackoverflow.com/questions/17926273/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow and keep it under the same license.
How to count distinct values in a column of a pandas group by object?
Asked by Roman
I have a pandas data frame and group it by two columns (for example col1 and col2). For fixed values of col1 and col2 (i.e. for a group) I can have several different values in col3. I would like to count the number of distinct values in the third column.
For example, if I have this as my input:
1 1 1
1 1 1
1 1 2
1 2 3
1 2 3
1 2 3
2 1 1
2 1 2
2 1 3
2 2 3
2 2 3
2 2 3
I would like to have this table (data frame) as the output:
1 1 2
1 2 1
2 1 3
2 2 1
Accepted answer by Roman
df.groupby(['col1','col2'])['col3'].nunique().reset_index()
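A minimal, self-contained sketch of this one-liner applied to the sample input from the question. The column names col1, col2 and col3 follow the question's wording; the result column name distinct_count is a hypothetical label added here for readability and is not part of the original answer.

```python
import pandas as pd

# Sample input from the question; column names are taken from its prose.
df = pd.DataFrame(
    {
        "col1": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
        "col2": [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2],
        "col3": [1, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 3],
    }
)

# Count distinct col3 values within each (col1, col2) group,
# then turn the group keys back into ordinary columns.
result = df.groupby(["col1", "col2"])["col3"].nunique().reset_index()

# Hypothetical rename, only to make clear that the column now holds a count.
result = result.rename(columns={"col3": "distinct_count"})
print(result)
```

Without reset_index() the result is a Series indexed by (col1, col2); reset_index() is what produces the flat table the question asks for.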
Answer by Jeff
In [17]: df
Out[17]:
    0  1  2
0   1  1  1
1   1  1  1
2   1  1  2
3   1  2  3
4   1  2  3
5   1  2  3
6   2  1  1
7   2  1  2
8   2  1  3
9   2  2  3
10  2  2  3
11  2  2  3

In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]:
0  1
1  1    2
   2    1
2  1    3
   2    1
dtype: int64
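The apply/lambda approach and nunique() give the same counts on this data, but they can differ when a group contains missing values. The sketch below rebuilds the answer's df with default integer column labels and compares the two; the small Series s is a hypothetical example added only to show the NaN behaviour.

```python
import numpy as np
import pandas as pd

# Same data as the answer's df, with the default integer column labels 0, 1, 2.
df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3],
     [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 3], [2, 2, 3], [2, 2, 3]]
)

via_apply = df.groupby([0, 1])[2].apply(lambda x: len(x.unique()))
via_nunique = df.groupby([0, 1])[2].nunique()

# On data without missing values the two approaches agree.
assert via_apply.tolist() == via_nunique.tolist() == [2, 1, 3, 1]

# With a missing value they can differ: unique() keeps NaN as a value,
# while nunique() skips it unless dropna=False is passed.
s = pd.Series([1.0, np.nan, 1.0])
print(len(s.unique()))          # counts NaN as a distinct value
print(s.nunique())              # ignores NaN by default
print(s.nunique(dropna=False))  # counts NaN again
```

If missing values should count as their own category, passing dropna=False to nunique() is likely the simpler choice than a custom lambda.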