pandas KeyError: error in a pandas DataFrame

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/44875397/

Time: 2020-09-14 03:55:02  Source: igfitidea

KeyError: False in pandas dataframe

python pandas

Asked by panchester

import pandas as pd

businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]

I would like to remove the rows that do not have Restaurants in the categories column. Each cell in this column holds a list. However, the code above raised 'KeyError: False', and I would like to understand why and how to fix it.


Answered by Ted Petrou

The expression 'Restaurants' in businesses['categories'] returns the boolean value False, because the in operator on a Series tests the index labels, not the values. This False is then passed to the bracket indexing operator of the DataFrame businesses, which does not contain a column called False, so a KeyError is raised.
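
The behaviour described above can be reproduced with a small sketch. The DataFrame here is a hypothetical stand-in for the question's JSON data (the column values are invented for illustration); only the `categories` column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: each 'categories' cell is a list.
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"]],
})

# `in` on a Series tests the index labels (0, 1, 2 here), not the values.
in_index = 0 in businesses["categories"]               # 0 is an index label
in_values = "Restaurants" in businesses["categories"]  # not an index label

# The question's line therefore reduces to businesses[False], a column lookup.
try:
    businesses[in_values]
    raised = False
except KeyError:
    raised = True
```

Because `in_values` is a single boolean rather than a boolean Series, pandas treats it as a column label, which is why the error message reads `KeyError: False`.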


What you are looking for is called boolean indexing, which works like this:


businesses[businesses['categories'] == 'Restaurants']
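
The comparison above assumes each cell holds a single string. Since the question says the column holds lists, a per-row membership test may be closer to what is needed. The following is a minimal sketch under that assumption; the sample data and the `mask` name are hypothetical, and the `isinstance` guard assumes some rows may have no categories at all.

```python
import pandas as pd

# Hypothetical sample: list-valued 'categories', with one missing value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"], None],
})

# Build a boolean Series: True where the row's list contains 'Restaurants'.
# Rows whose value is not a list (e.g. None) are treated as non-matching.
mask = businesses["categories"].apply(
    lambda cats: isinstance(cats, list) and "Restaurants" in cats
)

# Boolean indexing keeps only the rows where the mask is True.
restaurantes = businesses[mask]
```

Here `mask` has one boolean per row, so the bracket operator filters rows instead of looking up a column.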

Answered by Joe

If your data contains spelling variations or alternative restaurant-related terms, the following may help. Put your restaurant-related terms in restaurant_lst. The lambda function returns True if any of the items in restaurant_lst is contained in a given row of the businesses categories series. The .loc indexer then keeps only the rows for which the lambda function returns True.


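restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
# Apply to the categories column (a Series) so that x is one row's value, not a whole column.
restaurant = businesses.loc[businesses['categories'].apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]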
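
When each cell is a list, `in` tests for an exact element match, so a term such as 'Restaurant' would not match the element 'Restaurants'. The sketch below shows one possible way to match terms as case-insensitive substrings of each list element. The helper name `mentions_restaurant`, the lowercase term list, and the sample data are all assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Terms are lowercase so they can be compared against lowercased categories.
restaurant_lst = ["restaurant", "diner", "bistro"]

def mentions_restaurant(cats):
    """Return True if any term is a substring of any category in the list."""
    if not isinstance(cats, list):  # treat missing values as non-matching
        return False
    return any(term in cat.lower() for cat in cats for term in restaurant_lst)

# Hypothetical sample data with list-valued categories.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants"], ["Diners", "Coffee"], ["Shopping"], None],
})

# .loc with a boolean Series keeps the rows where the helper returned True.
restaurant = businesses.loc[businesses["categories"].apply(mentions_restaurant)]
```

Substring matching is deliberately loose, so it can also pick up unrelated categories that happen to contain one of the terms; a stricter variant could compare whole elements against a set of accepted spellings.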

Answered by Rayhane Mama

I think what you meant was:


businesses = businesses.loc[businesses['categories'] == 'Restaurants']

That will keep only the rows whose category is Restaurants.
