Python (Pandas) error 'the label [Algeria] is not in the [index]'

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/41428107/

Time: 2020-09-14 02:41:53  Source: igfitidea

Tags: python, pandas, indexing, assign

Asked by YohanRoth

I do not understand why this works

df[(df['Gold']>0) & (df['Gold.1']>0)].loc[((df['Gold'] - df['Gold.1'])/(df['Gold'])).abs().idxmax()]

but when I divide by (df['Gold'] + df['Gold.1'] + df['Gold.2']) instead, it stops working and gives me the error shown below.

Interestingly, the following line works

df.loc[((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()]

I do not understand what is happening since I just started to learn Python and Pandas. I need to understand the reason why this happens and how to fix it.

ERROR

KeyError: 'the label [Algeria] is not in the [index]'

DataFrame snapshot: (image not available)

Answered by jezrael

Your problem is boolean indexing:

df[(df['Gold']>0) & (df['Gold.1']>0)]

returns a filtered DataFrame that does not contain the index label of the maximum value of the Series you calculated with this:

((df['Gold'] - df['Gold.1'])/(df['Gold'] + df['Gold.1'] + df['Gold.2'])).abs().idxmax()

In your data it is Algeria.

So loc, logically, raises a KeyError.
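
To make the mismatch concrete, here is a minimal sketch that reproduces the error. The three-row table is invented for illustration (it is not the asker's data); only the column names Gold, Gold.1, Gold.2 and the label Algeria come from the question.

```python
import pandas as pd

# Hypothetical data, invented for illustration (not the asker's table).
df = pd.DataFrame(
    {"Gold": [5, 10, 4], "Gold.1": [0, 8, 3], "Gold.2": [5, 18, 7]},
    index=["Algeria", "Brazil", "Chile"],
)

# Boolean indexing drops Algeria because its Gold.1 value is 0.
filtered = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]

# The ratio is computed on the FULL frame, so idxmax can return a dropped label.
ratio = ((df["Gold"] - df["Gold.1"]) / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
label = ratio.idxmax()

# Looking that label up in the filtered frame fails.
try:
    filtered.loc[label]
    raised = False
except KeyError:
    raised = True

print(label, list(filtered.index), raised)
```

The same expression passed to df.loc works, because the unfiltered df still contains every label that idxmax can return.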

One possible solution is to assign the new filtered DataFrame to df1 and then get the index corresponding to the max value of the Series, computed from df1, by using idxmax:

df1 = df[(df['Gold']>0) & (df['Gold.1']>0)]
df2 = df1.loc[((df1['Gold']-df1['Gold.1'])/(df1['Gold']+df1['Gold.1']+df1['Gold.2'])).abs().idxmax()]
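
As a rough check of this fix, the sketch below runs it on a small made-up table (the rows are hypothetical, not the asker's data). It also shows an alternative that is not part of the original answer: compute the ratio on the full frame and restrict it with the same boolean mask before calling idxmax. The names mask, ratio_full and alt_label are introduced here for illustration only.

```python
import pandas as pd

# Hypothetical data, invented for illustration (not the asker's table).
df = pd.DataFrame(
    {"Gold": [5, 10, 4], "Gold.1": [0, 8, 3], "Gold.2": [5, 18, 7]},
    index=["Algeria", "Brazil", "Chile"],
)

# Fix from the answer: filter first, then compute the ratio from the filtered frame.
mask = (df["Gold"] > 0) & (df["Gold.1"] > 0)
df1 = df[mask]
ratio1 = ((df1["Gold"] - df1["Gold.1"]) / (df1["Gold"] + df1["Gold.1"] + df1["Gold.2"])).abs()
df2 = df1.loc[ratio1.idxmax()]  # the label is guaranteed to exist in df1

# Alternative (an assumption, not from the answer): apply the mask to the ratio itself.
ratio_full = ((df["Gold"] - df["Gold.1"]) / (df["Gold"] + df["Gold.1"] + df["Gold.2"])).abs()
alt_label = ratio_full[mask].idxmax()

print(df2.name, alt_label)
```

Both routes only ever search among rows that survive the filter, so the label handed to loc is always present in the frame being indexed.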