Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/44875397/
KeyError: False in pandas dataframe
Asked by panchester
import pandas as pd
businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]
I would like to remove the rows that do not have Restaurants in the categories column. This column holds lists. However, the code above raised 'KeyError: False', and I would like to understand why and how to fix it.
Answered by Ted Petrou
The expression 'Restaurants' in businesses['categories'] returns the boolean value False. This value is then passed to the bracket indexing operator of the DataFrame businesses, which does not contain a column called False, and so a KeyError is raised.
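To see why the expression evaluates to False, it may help to note that `in` applied to a pandas Series tests the index labels rather than the values. The following is a minimal sketch on a made-up three-row frame (the rows and the `name` column are assumptions for illustration, not the asker's data):

```python
import pandas as pd

# Toy stand-in for the question's data; these rows are invented for illustration.
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"]],
})

# `in` on a Series checks the index labels (0, 1, 2 here), not the stored values.
result = "Restaurants" in businesses["categories"]
print(result)  # False

# By contrast, an existing index label is found.
label_hit = 0 in businesses["categories"]
print(label_hit)  # True

# Indexing the DataFrame with that single bool looks up a column named False.
try:
    businesses[result]
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```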
What you are looking for is called boolean indexing, which works like this:
businesses[businesses['categories'] == 'Restaurants']
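Because the question states that the categories column holds lists, comparing each cell to the string 'Restaurants' with == would likely match no rows. A hedged sketch of a boolean mask that tests list membership row by row is shown here, on made-up data (the rows, the `name` column, and the missing value are assumptions for illustration):

```python
import pandas as pd

# Toy stand-in for the question's data; these rows are invented for illustration.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], None, ["Bars", "Restaurants"]],
})

# Build one True/False per row: does this row's list contain 'Restaurants'?
# The isinstance check treats rows with a missing categories value as False.
mask = businesses["categories"].apply(
    lambda cats: isinstance(cats, list) and "Restaurants" in cats
)

# Boolean indexing: keep only the rows where the mask is True.
restaurantes = businesses[mask]
print(restaurantes["name"].tolist())  # ['A', 'D']
```

The mask is a Series of booleans aligned with the DataFrame's index, which is what the bracket operator needs in order to filter rows rather than look up a column.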
Answered by Joe
If you find that your data contains spelling variations or alternative restaurant-related terms, the following may be of benefit. Essentially, you put your restaurant-related terms in restaurant_lst. The lambda function returns True if any of the items in restaurant_lst is contained in a given row of the business series. The .loc indexer then filters out the rows for which the lambda function returns False.
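restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
# Apply the test to the 'categories' column so that x is one row's value, not a whole DataFrame column
restaurant = businesses.loc[businesses['categories'].apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]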
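When each row holds a list, a test such as `restaurant_str in x` matches only exact category names, so 'Restaurant' would not match 'Restaurants'. One possible variant, sketched here under the assumption that case-insensitive substring matching is wanted, checks every term against every category string. The helper name `mentions_restaurant`, the lowercase terms, and the sample rows are hypothetical:

```python
import pandas as pd

# Toy stand-in for the question's data; these rows are invented for illustration.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["French Bistro"], None],
})

# Lowercase terms, so that matching can ignore case.
restaurant_lst = ["restaurant", "diner", "bistro"]

def mentions_restaurant(cats):
    """Return True if any term is a substring of any category in this row."""
    if not isinstance(cats, list):  # missing values count as no match
        return False
    return any(term in cat.lower() for cat in cats for term in restaurant_lst)

restaurant = businesses.loc[businesses["categories"].apply(mentions_restaurant)]
print(restaurant["name"].tolist())  # ['A', 'C']
```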
Answered by Rayhane Mama
I think what you meant was:
businesses = businesses.loc[businesses['categories'] == 'Restaurants']
That will keep only the rows whose category is Restaurants.