Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/17929426/

Time: 2020-09-13 21:02:23  Source: igfitidea

groupby for pandas Series not working

python pandas

Asked by andrew

I am unable to do a groupby on a pandas Series object. DataFrames are fine, but I cannot seem to do groupby with a Series. Has anyone been able to get this to work?

>>> import pandas as pd
>>> a = pd.Series([1,2,3,4], index=[4,3,2,1])
>>> a
4    1
3    2
2    3
1    4
dtype: int64
>>> a.groupby()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 153, in groupby
    sort=sort, group_keys=group_keys)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 537, in groupby
    return klass(obj, by, **kwds)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 195, in __init__
    level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1326, in _get_grouper
    ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1203, in __init__
    self.grouper = self.index.map(self.grouper)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 878, in map
    return self._arrmap(self.values, mapper)
  File "generated.pyx", line 2200, in pandas.algos.arrmap_int64 (pandas/algos.c:61221)
TypeError: 'NoneType' object is not callable

Answered by Jeff

You need to pass a mapping of some kind (it could be a dict, a function, or an index):

In [6]: a
Out[6]: 
4    1
3    2
2    3
1    4
dtype: int64

In [7]: a.groupby(a.index).sum()
Out[7]: 
1    4
2    3
3    2
4    1
dtype: int64

In [3]: a.groupby(lambda x: x % 2 == 0).sum()
Out[3]: 
False    6
True     4
dtype: int64
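
The answer names a dict as a possible key but only shows the index and a function. A minimal sketch of the dict form follows, using the same Series; the name label_to_group is made up for illustration. It also shows level=0 as another way to group on the index, and that recent pandas versions still reject a call with no key at all.

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=[4, 3, 2, 1])

# Hypothetical dict mapping each index label to a group name.
label_to_group = {4: "even", 3: "odd", 2: "even", 1: "odd"}
by_dict = a.groupby(label_to_group).sum()

# Grouping on the index itself can also be written with level=0.
by_level = a.groupby(level=0).sum()

# Calling groupby with neither a key nor a level raises TypeError.
try:
    a.groupby()
    raised = False
except TypeError:
    raised = True
```

The dict is applied to the index labels, not to the values, so labels 4 and 2 end up in one group and labels 3 and 1 in the other.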

Answered by luca

If you need to group by the Series' values:

grouped = a.groupby(a)

or

grouped = a.groupby(lambda x: a[x])
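
Grouping by values is easier to see on data with repeats. The sketch below uses a made-up Series s (not from the question) and compares both forms; the lambda form looks each value up by index label, so it assumes the labels are unique.

```python
import pandas as pd

# Hypothetical data with repeated values, so groups have more than one member.
s = pd.Series([10, 20, 10, 30, 20, 10])

# Group the Series by its own values and count the members of each group.
sizes = s.groupby(s).size()

# The lambda receives index labels and looks up the value for each one;
# this assumes the index labels are unique.
sizes_via_lambda = s.groupby(lambda label: s[label]).size()
```

Both calls should produce the same groups here; passing the Series itself is the shorter form and avoids a Python-level lookup per label.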

Answered by braunmagrin

Don't take the answer too seriously ;) I'm not saying this is a good idea.

If you really want to do it inline, or in a "fluent" way, you could do something like this.

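import pandas as pd

# Keep a reference to the original method. Calling self.groupby inside the
# patched function would call the patch again and recurse forever.
_original_groupby = pd.Series.groupby

def smart_groupby(self, by=None, *args, **kwargs):
    # With no key and no level, group the Series by its own values.
    if by is None and kwargs.get("level") is None:
        by = self
    return _original_groupby(self, by, *args, **kwargs)

pd.Series.groupby = smart_groupby

pd.Series(['a', 'a', 'a', 'b', 'b']).groupby().count()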

and the result would be

a    3
b    2
dtype: int64

It should behave as usual, with the added benefit that if you omit the by argument, the Series is grouped by its own values.
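
If the goal is only to keep the call chain fluent, one possible alternative that leaves pd.Series.groupby untouched is Series.pipe. This is a sketch of a different technique, not the approach from the answer above.

```python
import pandas as pd

s = pd.Series(['a', 'a', 'a', 'b', 'b'])

# pipe hands the Series to the function, so the chain stays inline
# without replacing pd.Series.groupby.
counts = s.pipe(lambda ser: ser.groupby(ser)).count()

# For plain counting, value_counts gives the same numbers.
same_counts = s.value_counts()
```

Avoiding the monkeypatch means other code that relies on the standard groupby behavior is unaffected.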