pandas KeyError: error in a pandas DataFrame

Question

Asked by panchester

import pandas as pd

businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]

I would like to remove the rows that do not have Restaurants in the categories column. This column contains lists. However, the code raised the error 'KeyError: False', and I would like to understand why and how to solve it.

Answer 1

Answered by Ted Petrou

The expression 'Restaurants' in businesses['categories'] returns the boolean value False. This value is passed to the bracket indexing operator of the DataFrame businesses, which does not contain a column called False, and so a KeyError is raised.
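To see why the expression evaluates to False, it may help to reproduce the error on a tiny frame. The sketch below uses a made-up three-row DataFrame (the `name` column and its values are assumptions standing in for the real JSON file). The `in` operator on a Series tests the index labels, not the values, which is why the result is a single False rather than one boolean per row.

```python
import pandas as pd

# Hypothetical stand-in for the real data loaded with pd.read_json
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"]],
})

# `in` on a Series checks the index labels (0, 1, 2 here), not the cell values
key = "Restaurants" in businesses["categories"]
print(key)                            # a single bool, not a per-row mask
print(0 in businesses["categories"])  # an index label is "in" the Series

# Indexing with that single bool is a column lookup for a column named False
try:
    businesses[key]
except KeyError as err:
    print("KeyError:", err)
```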

What you are looking to do is called boolean indexing, which works like this:

businesses[businesses['categories'] == 'Restaurants']
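Since the question says the categories column holds lists, a list-valued cell would not be expected to compare equal to the string 'Restaurants', so the equality mask may come back all False on that data. A minimal sketch of boolean indexing with a per-row membership test follows. The sample frame, the `name` column, and the `mask` variable are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample; the last row mimics a missing categories value
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"], None],
})

# Build one True/False per row: is 'Restaurants' an element of that row's list?
mask = businesses["categories"].apply(
    lambda cats: isinstance(cats, list) and "Restaurants" in cats
)

# Boolean indexing keeps only the rows where the mask is True
restaurants = businesses[mask]
print(restaurants)
```

The isinstance check is a design choice that lets rows with a missing value fall through as False instead of raising a TypeError.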

Answer 2

Answered by Joe

If you find that your data contains spelling variations or alternative restaurant-related terms, the following may be of benefit. Essentially, you put your restaurant-related terms in restaurant_lst. The lambda function returns True if any of the items in restaurant_lst is contained within a given row of the Series it is applied to. The .loc indexer then filters out the rows for which the lambda function returns False.

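restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
# Apply to the categories column; DataFrame.apply would pass whole columns to the lambda
# Note: `in` is a substring test if x is a string, but an exact element match if x is a list
restaurant = businesses.loc[businesses['categories'].apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]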
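If the cells are lists, as in the question, one possible variation is to do the substring test against each category string and ignore case, so that 'Restaurants' and 'French Bistro' both match. This is only a sketch under that assumption. The helper name `mentions_restaurant`, the lowercase term list, and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with list-valued categories
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants"], ["Shopping"], ["French Bistro"], ["diner", "Bars"]],
})

# Search terms kept lowercase so they can be compared against lowercased categories
restaurant_lst = ["restaurant", "diner", "bistro"]

def mentions_restaurant(cats):
    """Return True if any category string contains any search term, ignoring case."""
    if not isinstance(cats, list):
        return False  # treat missing or malformed cells as non-matching
    return any(term in cat.lower() for cat in cats for term in restaurant_lst)

# .loc with a boolean Series keeps the rows where the helper returned True
restaurant = businesses.loc[businesses["categories"].apply(mentions_restaurant)]
print(restaurant)
```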

Answer 3

Answered by Rayhane Mama

I think what you meant was:

businesses = businesses.loc[businesses['categories'] == 'Restaurants']

That will keep only the rows whose category is Restaurants.
