Cannot get right slice bound for non-unique label when indexing a data frame with python-pandas

Question

Asked by user5779223

I have a data frame df like this:

a         b
10        2
3         1
0         0
0         4
....
# about 50,000+ rows

I want to select df[:5, 'a']. But when I call df.loc[:5, 'a'], I get an error: KeyError: 'Cannot get right slice bound for non-unique label: 5'. When I call df.loc[5], the result contains 250 rows, while there is just one row when I use df.iloc[5]. Why does this happen, and how can I index it properly? Thank you in advance!

Answer 1

Answered by Stefan

The error message is explained here: if the index is not monotonic, then both slice bounds must be unique members of the index.
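A minimal sketch of how this error can be reproduced. The small frame and its index values are made up for illustration (they are not the asker's data); the point is only that the label 5 is duplicated and the index is not sorted.

```python
import pandas as pd

# Hypothetical data: the label 5 appears twice and the index is not monotonic.
df = pd.DataFrame(
    {"a": [10, 3, 0, 0], "b": [2, 1, 0, 4]},
    index=[5, 1, 5, 2],
)

try:
    # A label slice needs one unambiguous end point for the label 5.
    df.loc[:5, "a"]
    raised = False
    message = ""
except KeyError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```

The slice fails here because pandas cannot decide which of the two rows labelled 5 should be the right bound.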

The difference between .loc and .iloc is label-based vs. integer-position-based indexing - see docs. .loc is intended to select individual labels or slices of labels. That's why .loc[5] selects all rows where the index has the value 5 (250 of them in your case, and the error is about that non-unique index). .iloc, in contrast, selects row number 5 (0-indexed). That's why you only get a single row, and its index value may or may not be 5. Hope this helps!
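The contrast can be sketched on a small made-up frame (the values and index below are assumptions for illustration). It also assumes that what the asker wanted from df[:5, 'a'] was the first five rows by position, in which case .iloc is the natural tool.

```python
import pandas as pd

# Hypothetical data: three rows share the index label 5.
df = pd.DataFrame(
    {"a": [10, 3, 0, 0, 7, 8], "b": [2, 1, 0, 4, 5, 6]},
    index=[5, 1, 5, 2, 9, 5],
)

# Label-based: every row whose index label equals 5 (a DataFrame).
by_label = df.loc[5]

# Position-based: the single row at position 5, counting from 0 (a Series).
by_position = df.iloc[5]

# Position-based slice: the first five rows of column "a".
first_five_a = df["a"].iloc[:5]

print(len(by_label), by_position["a"], first_five_a.tolist())
```

Selecting the column first and then slicing by position avoids mixing a label ("a") with positions in a single indexer.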

Answer 2

Answered by Sujith Rao

The problem with the way you are indexing is that multiple rows have the index label 5, so .loc cannot tell where the slice should end. To check, run df.loc[5]; it returns every row that shares that index label. You can either sort the index with sort_index, or first aggregate the data by index and then retrieve what you need. Hope this helps.
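Both suggestions can be sketched as follows. The data is made up, and the choice of sum() as the aggregation is an assumption, since the answer does not say how duplicate rows should be combined.

```python
import pandas as pd

# Hypothetical data: duplicated label 5 in an unsorted index.
df = pd.DataFrame(
    {"a": [10, 3, 0, 0], "b": [2, 1, 0, 4]},
    index=[5, 1, 5, 2],
)

# Option 1: sort the index. On a monotonic index, a label slice works even
# with duplicates, and the right bound includes every row labelled 5.
sorted_df = df.sort_index()
up_to_5 = sorted_df.loc[:5, "a"]

# Option 2: aggregate rows that share a label (sum is an assumed choice),
# which leaves a unique, sorted index.
aggregated = df.groupby(level=0).sum()
up_to_5_agg = aggregated.loc[:5, "a"]

print(up_to_5.index.tolist(), up_to_5_agg.tolist())
```

Note that both options select by label value, so the result is "all rows labelled up to 5", not "the first five rows".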
