Original question: http://stackoverflow.com/questions/41682240/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
pandas aggregate count in dataframe
Asked by Mike El Hymanson
I have a DataFrame and I am using .aggregate({'col1': np.sum}), which sums the values in col1 and aggregates them together. Is it possible to perform a count instead, something like .aggregate({'col1': some count function here})?
Answered by root
You can use 'size', 'count', or 'nunique', depending on your use case. The differences between them are:
'size': the count including NaN and repeated values.
'count': the count excluding NaN but including repeated values.
'nunique': the count of unique values, excluding repeats and NaN.
For example, consider the following DataFrame:
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})

      col0  col1
    0    a   1.0
    1    a   1.0
    2    b   2.0
    3    b   NaN
    4    c   3.0
    5    c   4.0
Then using the three functions described:
    df.groupby('col0')['col1'].agg(['size', 'count', 'nunique'])

          size  count  nunique
    col0
    a        2      2        1
    b        2      1        1
    c        2      2        2
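
The question used the dict form of .aggregate, so the same string names can be sketched in that form too. This is a minimal illustration that reuses the example DataFrame above. It assumes the original call was made on a groupby object, which the question does not state, and the variable names counts and total are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})

# Dict form, as in the question: map the column name to an aggregation name.
# 'count' skips the NaN in group 'b'.
counts = df.groupby('col0').agg({'col1': 'count'})

# The same dict form on the ungrouped DataFrame gives one count per column.
total = df.agg({'col1': 'count'})

print(counts)
print(total)
```

Replacing 'count' with 'nunique' in either call gives the number of distinct non-NaN values instead.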