pandas aggregate count in DataFrame

Disclaimer: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/41682240/

Date: 2020-09-14 02:47:52

Tags: pandas, indexing, dataframe, counting

Asked by Mike El Hymanson

I have a DataFrame and I am using .aggregate({'col1': np.sum}), which sums the values in col1 and aggregates them together. Is it possible to perform a count instead, something like .aggregate({'col1': some count function here})?

Answered by root

You can use 'size', 'count', or 'nunique', depending on your use case. The differences between them are:

  • 'size': the count including NaN and repeated values.
  • 'count': the count excluding NaN but including repeated values.
  • 'nunique': the count of unique values, excluding both repeats and NaN.
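
The three definitions above can be checked on a single column. The following is a minimal sketch, assuming a small made-up Series (the variable names are illustrative and not from the original answer); it also shows the dropna=False option of nunique, which counts NaN as its own value.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one repeated value and one NaN.
s = pd.Series([1.0, 1.0, 2.0, np.nan])

n_size = s.size          # 4: every element, NaN and repeats included
n_count = s.count()      # 3: NaN excluded, repeats included
n_unique = s.nunique()   # 2: distinct non-NaN values (1.0 and 2.0)

# Optional: treat NaN as a distinct value of its own.
n_unique_with_nan = s.nunique(dropna=False)  # 3
```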

For example, consider the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})

  col0  col1
0    a   1.0
1    a   1.0
2    b   2.0
3    b   NaN
4    c   3.0
5    c   4.0

Applying the three functions described above gives:

df.groupby('col0')['col1'].agg(['size', 'count', 'nunique'])

      size  count  nunique
col0                      
a        2      2        1
b        2      1        1
c        2      2        2
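
To tie this back to the dictionary form used in the question, the same string names can be passed as dictionary values. This is a minimal sketch rather than part of the original answer: it rebuilds the same example DataFrame, and the output column names n_rows, n_non_null, and n_distinct are hypothetical. The keyword (named aggregation) form assumes pandas 0.25 or later.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})

# Dictionary form, as in the question: map the column to an aggregation name.
total = df.agg({'col1': 'count'})                    # whole column, NaN excluded
counts = df.groupby('col0').agg({'col1': 'count'})   # per group, NaN excluded

# Named aggregation: one output column per (input column, function) pair.
# The output names here are illustrative only.
summary = df.groupby('col0').agg(
    n_rows=('col1', 'size'),
    n_non_null=('col1', 'count'),
    n_distinct=('col1', 'nunique'),
)
```

The dictionary form is convenient when each column needs a single aggregation; named aggregation is easier to read when several counts of the same column are wanted side by side.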