Python 没有要聚合的数字类型 - groupby() 行为的变化?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/12844529/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-18 11:59:10  来源:igfitidea点击:

No numeric types to aggregate - change in groupby() behaviour?

pythonpandas

提问by andreas-h

I have a problem with some groupy code which I'm quite sure once ran (on an older pandas version). On 0.9, I get No numeric types to aggregateerrors. Any ideas?

我对一些我很确定曾经运行过的 groupy 代码有问题(在较旧的 Pandas 版本上)。在 0.9 上,我没有数字类型来聚合错误。有任何想法吗?

In [31]: data
Out[31]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2557 entries, 2004-01-01 00:00:00 to 2010-12-31 00:00:00
Freq: <1 DateOffset>
Columns: 360 entries, -89.75 to 89.75
dtypes: object(360)

In [32]: latedges = linspace(-90., 90., 73)

In [33]: lats_new = linspace(-87.5, 87.5, 72)

In [34]: def _get_gridbox_label(x, bins, labels):
   ....:             return labels[searchsorted(bins, x) - 1]
   ....: 

In [35]: lat_bucket = lambda x: _get_gridbox_label(x, latedges, lats_new)

In [36]: data.T.groupby(lat_bucket).mean()
---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-36-ed9c538ac526> in <module>()
----> 1 data.T.groupby(lat_bucket).mean()

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in mean(self)
    295         """
    296         try:
--> 297             return self._cython_agg_general('mean')
    298         except DataError:
    299             raise

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in _cython_agg_general(self, how, numeric_only)
   1415 
   1416     def _cython_agg_general(self, how, numeric_only=True):
-> 1417         new_blocks = self._cython_agg_blocks(how, numeric_only=numeric_only)
   1418         return self._wrap_agged_blocks(new_blocks)
   1419 

/usr/lib/python2.7/site-packages/pandas/core/groupby.py in _cython_agg_blocks(self, how, numeric_only)
   1455 
   1456         if len(new_blocks) == 0:
-> 1457             raise DataError('No numeric types to aggregate')
   1458 
   1459         return new_blocks

DataError: No numeric types to aggregate

采纳答案by Chang She

How are you generating your data?

你是如何生成数据的?

See how the output shows that your data is of 'object' type? the groupby operations specifically check whether each column is a numeric dtype first.

看看输出如何显示您的数据是“对象”类型?groupby 操作首先专门检查每列是否为数字 dtype。

In [31]: data
Out[31]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2557 entries, 2004-01-01 00:00:00 to 2010-12-31 00:00:00
Freq: <1 DateOffset>
Columns: 360 entries, -89.75 to 89.75
dtypes: object(360)

look ↑

看↑


Did you initialize an empty DataFrame first and then filled it? If so that's probably why it changed with the new version as before 0.9 empty DataFrames were initialized to float type but now they are of object type. If so you can change the initialization to DataFrame(dtype=float).


您是否先初始化一个空的 DataFrame 然后填充它?如果是这样,这可能就是为什么它在新版本中更改的原因,因为 0.9 空数据帧被初始化为浮点类型,但现在它们是对象类型。如果是这样,您可以将初始化更改为DataFrame(dtype=float).

You can also call frame.astype(float)

你也可以打电话 frame.astype(float)

回答by Blackbrook

I got this error generating a data frame consisting of timestamps and data:

我在生成由时间戳和数据组成的数据帧时遇到此错误:

df = pd.DataFrame({'data':value}, index=pd.DatetimeIndex(timestamp))

Adding the suggested solution works for me:

添加建议的解决方案对我有用:

df = pd.DataFrame({'data':value}, index=pd.DatetimeIndex(timestamp), dtype=float))

Thanks Chang She!

谢谢常舍!

Example:

例子:

                     data
2005-01-01 00:10:00  7.53
2005-01-01 00:20:00  7.54
2005-01-01 00:30:00  7.62
2005-01-01 00:40:00  7.68
2005-01-01 00:50:00  7.81
2005-01-01 01:00:00  7.95
2005-01-01 01:10:00  7.96
2005-01-01 01:20:00  7.95
2005-01-01 01:30:00  7.98
2005-01-01 01:40:00  8.06
2005-01-01 01:50:00  8.04
2005-01-01 02:00:00  8.06
2005-01-01 02:10:00  8.12
2005-01-01 02:20:00  8.12
2005-01-01 02:30:00  8.25
2005-01-01 02:40:00  8.27
2005-01-01 02:50:00  8.17
2005-01-01 03:00:00  8.21
2005-01-01 03:10:00  8.29
2005-01-01 03:20:00  8.31
2005-01-01 03:30:00  8.25
2005-01-01 03:40:00  8.19
2005-01-01 03:50:00  8.17
2005-01-01 04:00:00  8.18
                     data
2005-01-01 00:00:00  7.636000
2005-01-01 01:00:00  7.990000
2005-01-01 02:00:00  8.165000
2005-01-01 03:00:00  8.236667
2005-01-01 04:00:00  8.180000