Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/31450672/

Time: 2020-09-13 23:37:49  Source: igfitidea

How to handle KeyError: "['blah'] not in index"

Tags: python, pandas

Asked by Mike

I'm looking at the US names dataset (SSA), as described in Python for Data Analysis by Wes McKinney.

This works:

total_births = top1000.pivot_table('births', index = 'year', columns = 'name', aggfunc = sum)
subset = total_births[['Michael', 'Mike', 'Martin']].fillna(0)
subset.plot( title = 'Number of births per year', grid = True, figsize=(28,20), xticks=range(1880, 2020, 5)).get_figure().savefig('output2.png', bbox_inches = 'tight')

But when I add an unpopular name, which nevertheless is in the data set:

subset = total_births[['Michael', 'Mike', 'Martin', 'Ammar']].fillna(0)

...I get the following error:

Traceback (most recent call last):
  File "names.py", line 44, in <module>
    subset = total_births[['Michael', 'Mike', 'Martin', 'Ammar']].fillna(0)
  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1774, in __getitem__
    return self._getitem_array(key)
  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1818, in _getitem_array
    indexer = self.ix._convert_to_indexer(key, axis=1)
  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1143, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "['Ammar'] not in index"

I tried adding fillna(0), but it doesn't help. The code is available at https://github.com/m1key/data-science-sandbox (ade55154f177410e1e269d64766a4e8b8e1ae585); the troublesome lines are commented out.

Sample data set:

name  Aaden  Aaliyah  Aanya  Aarav  Aaron  Aarush  Ab  Abagail  Abb  Abbey  \
year                                                                         
1880    NaN      NaN    NaN    NaN    102     NaN NaN      NaN  NaN    NaN   
1881    NaN      NaN    NaN    NaN     94     NaN NaN      NaN  NaN    NaN   
1882    NaN      NaN    NaN    NaN     85     NaN NaN      NaN  NaN    NaN   
1883    NaN      NaN    NaN    NaN    105     NaN NaN      NaN  NaN    NaN   
1884    NaN      NaN    NaN    NaN     97     NaN NaN      NaN  NaN    NaN   

name  ...   Zoa  Zoe  Zoey  Zoie  Zola  Zollie  Zona  Zora  Zula  Zuri  
year  ...                                                               
1880  ...     8   23   NaN   NaN     7     NaN     8    28    27   NaN  
1881  ...   NaN   22   NaN   NaN    10     NaN     9    21    27   NaN  
1882  ...     8   25   NaN   NaN     9     NaN    17    32    21   NaN  
1883  ...   NaN   23   NaN   NaN    10     NaN    11    35    25   NaN  
1884  ...    13   31   NaN   NaN    14       6     8    58    27   NaN  

Thanks for any hints.

Answered by Julien Marrec

Ammar doesn't appear to be in your dataset.

To double-check, try 'Ammar' in total_births.columns, which will return a boolean (True or False).
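The check above can be sketched as follows. This is only an illustration: the small DataFrame is a hypothetical stand-in for total_births with made-up counts, and the reindex step is one possible workaround that is not part of the original answer. It assumes that a name absent from the pivoted table should simply count as zero births.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for total_births (made-up counts, not SSA data):
# years as the index, names as the columns.
total_births = pd.DataFrame(
    {
        "Michael": [10.0, 12.0],
        "Mike": [np.nan, 5.0],
        "Martin": [7.0, 8.0],
    },
    index=pd.Index([1880, 1881], name="year"),
)

wanted = ["Michael", "Mike", "Martin", "Ammar"]

# Membership test from the answer: is the name a column at all?
has_ammar = "Ammar" in total_births.columns

# List every requested name that is not a column.
missing = [name for name in wanted if name not in total_births.columns]

# Assumed workaround: reindex creates the absent columns (filled with
# fill_value) instead of raising KeyError. NaN cells that already existed
# are left alone by reindex, so fillna(0) is still needed for them.
subset = total_births.reindex(columns=wanted, fill_value=0).fillna(0)

print(has_ammar)
print(missing)
print(subset)
```

This also shows why fillna(0) alone does not help in the question: the KeyError is raised while the columns are being selected, before fillna ever runs.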
