Python groupby - TypeError: 'DataFrame' object is not callable

Question

Asked by user3204120

Newbie here. My first foray seemed OK, but this is only my second use of pandas. Using pandas 0.12.0 on Windows 7, I read two DataFrames from SQL. One works with groupby as expected, so I'm sure my problem isn't syntax. On the other, type(reddf) returns pandas.core.frame.DataFrame, but when I try reddf.groupby('any column') I get the following (last few lines):

c:\python27\lib\site-packages\pandas\core\groupby.pyc in __init__(self, index, grouper, name, level, sort)
   1197             # no level passed
   1198             if not isinstance(self.grouper, np.ndarray):
-> 1199                 self.grouper = self.index.map(self.grouper)
   1200                 if not (hasattr(self.grouper,"__len__") and \
   1201                    len(self.grouper) == len(self.index)):

c:\python27\lib\site-packages\pandas\algos.pyd in pandas.algos.arrmap_int64 (pandas\algos.c:62839)()

TypeError: 'DataFrame' object is not callable

I know groupby is OK and the column exists, so there must be some other constraint or condition on the DataFrame that I'm not aware of or overlooked. What could cause this error? What should I do about it, and what should I look for in the future?

Info requested:

print type(reddf.index)
<class 'pandas.core.index.Int64Index'>

print repr(reddf.index) 
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int64)

print type(reddf.index.map)
<type 'instancemethod'>

print repr(reddf.index.map)
<bound method Int64Index.map of Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], dtype=int64)>

Just in case, here is what reddf gives:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 20 entries, 0 to 19
Data columns (total 24 columns):
AssetId                  20  non-null values
DateAdded                20  non-null values
ModelId                  20  non-null values
UsageTypeId              20  non-null values
DateAdded                20  non-null values
Name                     20  non-null values
NatureId                 20  non-null values
IsContainer              20  non-null values
SparePartNumber          8  non-null values
ProductNumber            19  non-null values
SupportCategoryOid       20  non-null values
SerialNumber             20  non-null values
IpAddress                20  non-null values
Description              20  non-null values
CustomsId                15  non-null values
AssetTag                 20  non-null values
ParentId                 5  non-null values
ManagementProcessorId    7  non-null values
OperatingSystem          20  non-null values
OsVersion                20  non-null values
SystemName               20  non-null values
LocationId               10  non-null values
RomVersion               20  non-null values
MacAddress               19  non-null values
dtypes: bool(1), datetime64[ns](2), float64(3), int64(5), object(13)

In particular, I get the error when calling reddf.groupby('ModelId'). Thanks.

Thanks to everyone. The duplicate field name caused the issue; I can't believe I didn't notice it before the last comment.

Now, I don't understand how the .index output ruled out other problems. Could you elaborate? If the index were missing, shouldn't groupby still have been able to function properly? Why not? I'm just looking for a short explanation, and if you point to code, that's fine. Appreciate the help, guys.

Answer 1

Accepted answer by Ciba

This is caused by the duplicated 'DateAdded' column. Rename one of the two and you are good to go.
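
For anyone hitting the same thing, here is a minimal sketch of how the duplicate could be found and renamed. It assumes a recent pandas on Python 3 (not the 0.12.0 from the question); the small frame and the dedupe_columns helper are made up for illustration and are not part of pandas. The helper does not check whether a suffixed name such as DateAdded_1 already exists.

```python
import pandas as pd

# Hypothetical frame that mimics the question: 'DateAdded' appears twice.
reddf = pd.DataFrame(
    [
        [1, "2014-01-01", 10, "2014-01-02"],
        [2, "2014-01-03", 10, "2014-01-04"],
        [3, "2014-01-05", 11, "2014-01-06"],
    ],
    columns=["AssetId", "DateAdded", "ModelId", "DateAdded"],
)

# Column names that occur more than once (second and later occurrences).
dupes = reddf.columns[reddf.columns.duplicated()].tolist()
print(dupes)


def dedupe_columns(columns):
    """Illustrative helper: suffix repeated names with _1, _2, ..."""
    seen = {}
    renamed = []
    for name in columns:
        count = seen.get(name, 0)
        renamed.append(name if count == 0 else "%s_%d" % (name, count))
        seen[name] = count + 1
    return renamed


# Assign unique names, then group as usual.
reddf.columns = dedupe_columns(reddf.columns)
sizes = reddf.groupby("ModelId").size()
print(sizes)
```

If the frame comes from SQL, aliasing the second column in the query itself (SELECT ... AS ...) would avoid the duplicate before it ever reaches pandas.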

Answer 2

Answered by WWH

FYI, duplicate column names should no longer cause this error. If you're using the latest pandas then this error is caused by something else.

See: https://github.com/pandas-dev/pandas/pull/8210
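
A quick way to check how your own installation behaves is sketched below. The frame is a made-up stand-in for the one in the question, and the script only reports what happens; it makes no claim about which pandas release changed the behavior.

```python
import pandas as pd

# Which pandas is actually installed?
print(pd.__version__)

# Hypothetical frame with a duplicated 'DateAdded' column.
df = pd.DataFrame(
    [[1, "a", 10, "b"], [2, "c", 10, "d"]],
    columns=["AssetId", "DateAdded", "ModelId", "DateAdded"],
)

# Group on a column that is NOT duplicated and record the outcome.
try:
    sizes = df.groupby("ModelId").size()
    outcome = "groupby succeeded"
except TypeError as exc:
    outcome = "groupby failed: %s" % exc

print(outcome)
```

If this reports success but your real groupby still raises the TypeError, the duplicate column is probably not the cause and something else is going on.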
