pandas: convert Int64Index to int

Question

Asked by Christopher Jenkins

I'm iterating through a dataframe (called hdf) and applying changes row by row. hdf is sorted by group_id, and each row is assigned a rank from 1 to n within its group based on some criteria.

# Groupby function creates subset dataframes (a dataframe per distinct group_id).
grouped = hdf.groupby('group_id')

# Iterate through each subdataframe. 
for name, group in grouped:

    # This grabs the top index for each subdataframe
    index1 = group[group['group_rank']==1].index

    # If criteria1 == 0, flag all rows for removal
    if(max(group['criteria1']) == 0):    
        for x in range(rank1, rank1 + max(group['group_rank'])):
            hdf.loc[x,'remove_row'] = 1

I'm getting the following error:

TypeError: int() argument must be a string or a number, not 'Int64Index'

I get the same error when I try to cast rank1 explicitly:

rank1 = int(group[group['auction_rank']==1].index)

Can someone explain what is happening and provide an alternative?

Answer 1

Accepted answer by Evan Wright

The answer to your specific question is that index1 is an Int64Index (basically a list), even if it has only one element. To get that one element, you can use index1[0].
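
As a minimal sketch of that point, the snippet below builds a small stand-in for hdf (the data values are made up for illustration) and pulls the single label out of the index. Note that recent pandas releases report a plain Index with an int64 dtype here rather than an Int64Index, but the indexing works the same way.

```python
import pandas as pd

# Made-up stand-in for hdf; only the column names come from the question.
hdf = pd.DataFrame({
    "group_id": [10, 10, 20, 20],
    "group_rank": [1, 2, 1, 2],
})

# One sub-dataframe, as the groupby loop would produce it.
group = hdf[hdf["group_id"] == 20]

# .index is an index object even when it holds a single label.
index1 = group[group["group_rank"] == 1].index

# Take the first (and only) label, then convert it to a plain Python int.
first_label = int(index1[0])
print(first_label)
```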

But there are better ways of accomplishing your goal. If you want to remove all of the rows in the "bad" groups, you can use filter:

hdf = hdf.groupby('group_id').filter(lambda group: group['criteria1'].max() != 0)
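
To see what that filter call does, here is a small sketch on made-up data (only the column names come from the question). Every row of a group is kept or dropped together, depending on what the lambda returns for that group.

```python
import pandas as pd

# Made-up data: group 1 has criteria1 == 0 everywhere, group 2 does not.
hdf = pd.DataFrame({
    "group_id": [1, 1, 2, 2],
    "criteria1": [0, 0, 0, 3],
})

# Keep only the groups whose maximum criteria1 is non-zero.
kept = hdf.groupby("group_id").filter(
    lambda group: group["criteria1"].max() != 0
)
print(kept)
```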

If you only want to remove certain rows within matching groups, you can write a function and then use apply:

def filter_group(group):
    if group['criteria1'].max() != 0:
        return group
    else:
        return group.loc[other criteria here]

hdf = hdf.groupby('group_id').apply(filter_group)
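
A runnable sketch of the same idea follows. The condition `group_rank == 1` is a hypothetical stand-in for "other criteria here", and the data is made up. Selecting the non-key columns before apply and then looking the surviving rows up in hdf is my own choice, made because the handling of the grouping column inside apply differs between pandas versions; it is not part of the original answer.

```python
import pandas as pd

# Made-up data; only the column names come from the question.
hdf = pd.DataFrame({
    "group_id":   [1, 1, 2, 2, 3, 3],
    "criteria1":  [0, 0, 1, 0, 0, 0],
    "group_rank": [1, 2, 1, 2, 1, 2],
})

def filter_group(group):
    # Groups with a non-zero criteria1 are kept whole.
    if group["criteria1"].max() != 0:
        return group
    # Hypothetical criterion: otherwise keep only the top-ranked row.
    return group.loc[group["group_rank"] == 1]

# group_keys=False keeps the original row labels on the result.
kept = (
    hdf.groupby("group_id", group_keys=False)[["criteria1", "group_rank"]]
       .apply(filter_group)
)

# Use the surviving row labels to pull the full rows out of hdf.
result = hdf.loc[kept.index]
print(result)
```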

(If you really like your current way of doing things, you should know that loc will accept an index, not just an integer, so you could also do hdf.loc[group.index, 'remove_row'] = 1).
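
For reference, a sketch of the loop from the question rewritten around that loc call might look like this. The data is made up, and initializing remove_row to 0 before the loop is an assumption about how the flag column is set up.

```python
import pandas as pd

# Made-up data; only the column names come from the question.
hdf = pd.DataFrame({
    "group_id": [1, 1, 2, 2],
    "criteria1": [0, 0, 0, 3],
})
hdf["remove_row"] = 0  # assumed default: nothing flagged yet

for name, group in hdf.groupby("group_id"):
    # If criteria1 is 0 for the whole group, flag all of its rows at once.
    if group["criteria1"].max() == 0:
        hdf.loc[group.index, "remove_row"] = 1

print(hdf)
```

Passing group.index to loc removes the need for the inner range loop and for converting the index to an int.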

Answer 2

Answer by Hemanth Sharma

Call tolist() on the Int64Index object. The resulting list can then be iterated over as int values.
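
A short sketch of this, using a made-up frame with the question's column names:

```python
import pandas as pd

# Made-up data for illustration.
hdf = pd.DataFrame({
    "group_id": [1, 1, 2, 2],
    "group_rank": [1, 2, 1, 2],
})

# The index of the top-ranked rows, as an index object.
top_index = hdf[hdf["group_rank"] == 1].index

# tolist() converts the labels to plain Python ints.
labels = top_index.tolist()
for label in labels:
    print(label, type(label).__name__)
```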
