pandas: "IndexError: positional indexers are out-of-bounds" when they're demonstrably not
Original question: http://stackoverflow.com/questions/44123056/
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow.
Asked by Arnold
Here's an MWE of some code I'm using. I slowly whittle down an initial dataframe via slicing and some conditions until I have only the rows that I need. Each block of five rows actually represents a different object so that, as I whittle things down, if any one row in each block of five meets the criteria, I want to keep it -- this is what the loop over keep.index accomplishes. No matter what, when I'm done I can see that the final indices I want exist, but I get an error message saying "IndexError: positional indexers are out-of-bounds." What is happening here?
import pandas as pd
import numpy as np

temp = np.random.rand(100,5)
df = pd.DataFrame(temp, columns=['First', 'Second', 'Third', 'Fourth', 'Fifth'])
df_cut = df.iloc[10:]
keep = df_cut.loc[(df_cut['First'] < 0.5) & (df_cut['Second'] <= 0.6)]

new_indices_to_use = []
for item in keep.index:
    remainder = (item % 5)
    add = np.arange(0-remainder,5-remainder,1)
    inds_to_use = item + add
    new_indices_to_use.append(inds_to_use)

new_indices_to_use = [ind for sublist in new_indices_to_use for ind in sublist]
final_indices_to_use = []
for item in new_indices_to_use:
    if item not in final_indices_to_use:
        final_indices_to_use.append(item)

final = df_cut.iloc[final_indices_to_use]
Answered by TemporalWolf
From the Pandas documentation on .iloc (emphasis mine):
Pandas provides a suite of methods in order to get purely integer based indexing. The semantics follow closely python and numpy slicing. These are 0-based indexing.
You're trying to use it by label, which means you need .loc
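A minimal sketch of the difference, using a small hypothetical frame rather than the asker's data (the column name "value" and the sizes are assumptions made for illustration). After slicing, the rows keep their original labels, so a label and a position with the same number can refer to different rows, or the position may not exist at all:

```python
import numpy as np
import pandas as pd

# Hypothetical small frame (not the asker's data): 10 rows, labels 0..9.
df = pd.DataFrame({"value": np.arange(10) * 10})

# Slicing keeps the original labels: df_cut has 6 rows labelled 4..9.
df_cut = df.iloc[4:]

by_position = df_cut.iloc[2]   # third row of df_cut, whose label is 6
by_label = df_cut.loc[6]       # the row whose label is 6

print(by_position.name, by_label.name)

# Position 8 does not exist in a 6-row frame, even though label 8 does.
try:
    df_cut.iloc[8]
except IndexError as exc:
    print("iloc failed:", exc)

print(df_cut.loc[8, "value"])
```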
From your example:
>>> print(df_cut.iloc[89])
...
Name: 99, dtype: float64
>>> print(df_cut.loc[89])
...
Name: 89, dtype: float64
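Applied to the question's code, the fix is probably just the last line. df_cut has 90 rows (positions 0 to 89) but carries labels 10 to 99, so any label of 90 or above is out of bounds when passed to .iloc as a position. The sketch below selects with .loc instead. It is not the asker's exact code: the fixed seed is an addition for reproducibility, and the block labels are built in a more compact way (the name block_starts is introduced here) that should yield the same list as the two loops in the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame(rng.random((100, 5)),
                  columns=['First', 'Second', 'Third', 'Fourth', 'Fifth'])
df_cut = df.iloc[10:]  # 90 rows, labels 10..99
keep = df_cut.loc[(df_cut['First'] < 0.5) & (df_cut['Second'] <= 0.6)]

# For each kept label, find the first label of its block of five,
# then expand every block back into its five labels (no duplicates).
block_starts = sorted({item - item % 5 for item in keep.index})
final_indices_to_use = [start + offset
                        for start in block_starts
                        for offset in range(5)]

# These values are labels, so select with .loc rather than .iloc.
final = df_cut.loc[final_indices_to_use]
print(final.shape)
```

If positional access is preferred, the labels would first have to be converted to positions, for example with df_cut.index.get_indexer(final_indices_to_use), before being passed to .iloc.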