Querying a Pandas DataFrame with a column name that contains a space, or using the drop method with a column name that contains a space

Question

Asked by iNoob

I am looking to use pandas to drop rows based on a column name (which contains a space) and the cell value. I have tried various ways to achieve this (the drop and query methods), but I seem to be failing because of the space in the name. Is there a way to query the data using a name that has a space in it, or do I need to clean out all the spaces first?

Data in the form of a CSV file:

Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7

Attempt examples:

df.drop(['Sale Item'] != 'Item1')
df.drop('Sale Item' != 'Item1')
df.drop("'Sale Item'] != 'Item1'")

df.query('Sale Item' != 'Item1')
df.query(['Sale Item'] != 'Item1')
df.query("'Sale Item'] != 'Item1'")

Error received in most cases:

ImportError: 'numexpr' not found. Cannot use engine='numexpr' for query/eval if 'numexpr' is not installed

Answer 1

Answered by Fabio Lamanna

If I understood your issue correctly, maybe you can just apply a filter like:

df = df[df['Sale Item'] != 'item1']

which returns:

         Date    price Sale Item
1  2012-06-12  1610.02     item2
2  2012-06-13  1618.07     item3
3  2012-06-14  1624.40     item4
4  2012-06-15  1626.15     item5
5  2012-06-16  1626.15     item6
6  2012-06-17  1626.15     item7
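
For completeness, here is a self-contained sketch of the same filter that can be run as-is. It assumes the CSV from the question is held in a string (the names csv_text and filtered are illustrative, not from the original post):

```python
import io

import pandas as pd

# Sample data copied from the question, held in a string for convenience.
csv_text = '''Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7
'''

df = pd.read_csv(io.StringIO(csv_text))

# Bracket indexing accepts any column name, including one with a space.
# The comparison yields a boolean Series; indexing with it keeps the True rows.
filtered = df[df['Sale Item'] != 'item1']
print(filtered)
```

Because the column is accessed with brackets rather than as an attribute, the space in 'Sale Item' needs no special handling.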

Answer 2

Answered by Anand S Kumar

As you can see from the documentation:

DataFrame.drop(labels, axis=0, level=None, inplace=False, errors='raise')
Return new object with labels in requested axis removed

DataFrame.drop() takes the index labels of the rows to drop, not a condition. Hence you would most probably need something like:

df.drop(df.loc[df['Sale Item'] != 'item1'].index)

Please note that this drops the rows that meet the condition, so the result is the rows that do not meet it. If you want the opposite, you can put the ~ operator before your condition to negate it.

But this seems like more work than necessary. It would be easier to just use Boolean indexing to get the rows you want (as indicated in the other answer).
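
As a minimal sketch of the two approaches described above, assuming a recent pandas version and a three-row subset of the question's data (the variable names are illustrative only):

```python
import pandas as pd

# A three-row subset of the question's data, built inline for brevity.
df = pd.DataFrame({
    'Date': ['2012-06-11', '2012-06-12', '2012-06-13'],
    'price': [1600.20, 1610.02, 1618.07],
    'Sale Item': ['item1', 'item2', 'item3'],
})

is_item1 = df['Sale Item'] == 'item1'

# drop() expects index labels, so select the matching rows first
# and pass their index.
without_item1 = df.drop(df[is_item1].index)

# Negating the mask with ~ drops the complementary set of rows instead.
only_item1 = df.drop(df[~is_item1].index)

# Boolean indexing reaches the same result in one step.
same_result = without_item1.equals(df[~is_item1])
print(without_item1)
print(only_item1)
print(same_result)
```

The last comparison is there only to show that dropping by index and Boolean indexing select the same rows.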

Demo:

In [20]: df
Out[20]:
         Date    price Sale Item
0  2012-06-11  1600.20     item1
1  2012-06-12  1610.02     item2
2  2012-06-13  1618.07     item3
3  2012-06-14  1624.40     item4
4  2012-06-15  1626.15     item5
5  2012-06-16  1626.15     item6
6  2012-06-17  1626.15     item7

In [21]: df.drop(df.loc[df['Sale Item'] != 'item1'].index)
Out[21]:
         Date   price Sale Item
0  2012-06-11  1600.2     item1
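
On the query part of the question: in pandas 0.25 and later, a column name containing a space can be wrapped in backticks inside the query expression. The sketch below assumes such a version and a three-row subset of the question's data. Passing engine='python' is an optional choice made here so that the example does not depend on numexpr being installed, which is what the ImportError in the question complains about:

```python
import pandas as pd

# A three-row subset of the question's data, built inline for brevity.
df = pd.DataFrame({
    'Date': ['2012-06-11', '2012-06-12', '2012-06-13'],
    'price': [1600.20, 1610.02, 1618.07],
    'Sale Item': ['item1', 'item2', 'item3'],
})

# Backticks let query() refer to a column whose name contains a space.
# engine='python' evaluates the expression without the numexpr package.
result = df.query("`Sale Item` != 'item1'", engine='python')
print(result)
```

Note that the whole expression is a single string; the attempts in the question pass the result of a Python comparison instead, so query() never receives an expression to evaluate.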
