pandas: groupby on a Series does not work

Question

Asked by andrew

I am unable to do a groupby on a pandas Series object. DataFrames are fine, but I cannot seem to do groupby with a Series. Has anyone been able to get this to work?

>>> import pandas as pd
>>> a = pd.Series([1,2,3,4], index=[4,3,2,1])
>>> a
4    1
3    2
2    3
1    4
dtype: int64
>>> a.groupby()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 153, in groupby
    sort=sort, group_keys=group_keys)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 537, in groupby
    return klass(obj, by, **kwds)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 195, in __init__
    level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1326, in _get_grouper
    ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1203, in __init__
    self.grouper = self.index.map(self.grouper)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 878, in map
    return self._arrmap(self.values, mapper)
  File "generated.pyx", line 2200, in pandas.algos.arrmap_int64 (pandas/algos.c:61221)
TypeError: 'NoneType' object is not callable

Answer 1

Answered by Jeff

groupby() has no default key, so you need to pass a mapping of some kind: a dict, a function, or an index.

In [6]: a
Out[6]: 
4    1
3    2
2    3
1    4
dtype: int64

In [7]: a.groupby(a.index).sum()
Out[7]: 
1    4
2    3
3    2
4    1
dtype: int64

In [3]: a.groupby(lambda x: x % 2 == 0).sum()
Out[3]: 
False    6
True     4
dtype: int64
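
The answer mentions a dict as a possible key but only shows the index and function cases. A minimal sketch of the dict case on the same Series; the group names "low" and "high" are made up for illustration and are not from the original thread:

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=[4, 3, 2, 1])

# Hypothetical mapping from index label to group name.
label_to_group = {1: "low", 2: "low", 3: "high", 4: "high"}

# Each index label is looked up in the dict; rows that map to the
# same group name are summed together.
by_dict = a.groupby(label_to_group).sum()
print(by_dict)

# Grouping by the index itself can also be written with level=0.
by_level = a.groupby(level=0).sum()
print(by_level)
```

Here labels 1 and 2 hold the values 4 and 3, so "low" sums to 7, while labels 3 and 4 hold 2 and 1, so "high" sums to 3.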

Answer 2

Answered by luca

If you need to group by the Series' values:

grouped = a.groupby(a)

or

grouped = a.groupby(lambda x: a[x])
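
Every value in a is unique, so grouping it by its own values yields one group per row. A Series with repeated values shows the effect more clearly. This is only a sketch; the Series s below is invented for illustration:

```python
import pandas as pd

# Hypothetical Series with repeated values, so the groups are non-trivial.
s = pd.Series([10, 20, 10, 30, 20, 10])

# Group the Series by its own values and count the rows in each group.
sizes = s.groupby(s).size()
print(sizes)

# The function form builds the same groups: it receives each index label
# and returns the value stored at that label.
sizes_fn = s.groupby(lambda label: s[label]).size()
print(sizes_fn)
```

If only the counts are needed, s.value_counts() reports the same numbers, ordered by frequency rather than by value.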

Answer 3

Answered by braunmagrin

Don't take the answer too seriously ;) I'm not saying this is a good idea.

If you really want to do it inline, or in a "fluent" way, you could do something like this.

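import pandas as pd

# Keep a reference to the original method. Calling self.groupby inside the
# patched function would call smart_groupby again and recurse forever.
_original_groupby = pd.Series.groupby

def smart_groupby(self, by=None, *args, **kwargs):
    if by is None:
        # No key given: group the Series by its own values.
        by = self
    return _original_groupby(self, by, *args, **kwargs)

# Monkeypatch: replace Series.groupby for the whole session.
pd.Series.groupby = smart_groupby

pd.Series(['a', 'a', 'a', 'b', 'b']).groupby().count()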

The result would be:

a    3
b    2
dtype: int64

It should behave as usual, with the added benefit that if you omit the by argument, the Series is grouped by its own values.
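
If patching pd.Series.groupby feels too invasive, Series.pipe gives a similar inline style without touching the class. A minimal sketch, assuming a helper named groupby_self (a hypothetical name, not part of pandas):

```python
import pandas as pd

def groupby_self(series, *args, **kwargs):
    """Group a Series by its own values (hypothetical helper, not a pandas API)."""
    return series.groupby(series, *args, **kwargs)

# pipe passes the Series as the first argument to groupby_self,
# so the whole expression still reads left to right.
counts = pd.Series(['a', 'a', 'a', 'b', 'b']).pipe(groupby_self).count()
print(counts)
```

This leaves the standard groupby behavior untouched for any other code in the same session.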
