Python: delete rows based on a condition in pandas

Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverFlow URL: http://stackoverflow.com/questions/41833624/

Time: 2020-08-20 01:42:11  Source: igfitidea

delete rows based on a condition in pandas

Tags: python, pandas

Asked by Shiva Krishna Bavandla

I have the following DataFrame:

In [62]: df
Out[62]:
            coverage   name  reports  year
Cochice           45  Jason        4  2012
Pima             214  Molly       24  2012
Santa Cruz       212   Tina       31  2013
Maricopa          72   Jake        2  2014
Yuma              85    Amy        3  2014

I can filter the rows as below:

df[df["coverage"] > 30]

and I can drop rows by their index labels as below:

df.drop(['Cochice', 'Pima'])

But I want to delete all the rows that match a condition. How can I do so?

Answered by jezrael

The best approach is boolean indexing, but the condition needs to be inverted: for example, to delete the rows where coverage is below 72, keep all rows whose coverage is greater than or equal to 72:

print (df[df["coverage"] >= 72])
            coverage   name  reports  year
Pima             214  Molly       24  2012
Santa Cruz       212   Tina       31  2013
Maricopa          72   Jake        2  2014
Yuma              85    Amy        3  2014
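
For readers who want to run this themselves, here is a minimal sketch that rebuilds the DataFrame from the question and applies the same filter. The variable name `kept` is introduced only for illustration, and the threshold of 72 is the one this answer uses.

```python
import pandas as pd

# Rebuild the DataFrame from the question (values copied from the post).
df = pd.DataFrame(
    {
        "coverage": [45, 214, 212, 72, 85],
        "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "reports": [4, 24, 31, 2, 3],
        "year": [2012, 2012, 2013, 2014, 2014],
    },
    index=["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"],
)

# "Delete rows where coverage < 72" is written as
# "keep rows where coverage >= 72".
kept = df[df["coverage"] >= 72]

# Boolean indexing returns a new object; df keeps all five rows
# unless the result is assigned back (df = df[...]).
print(kept)
```

Assigning the result back to `df` is what makes the deletion stick, since the filter itself does not modify the original object.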

This is the same as using the ge function:

print (df[df["coverage"].ge(72)])
            coverage   name  reports  year
Pima             214  Molly       24  2012
Santa Cruz       212   Tina       31  2013
Maricopa          72   Jake        2  2014
Yuma              85    Amy        3  2014
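
As a quick check of that equivalence, the sketch below compares the mask produced by the `>=` operator with the one produced by `Series.ge`. It uses a bare Series with the coverage values from the question; the names `mask_operator`, `mask_method`, and `same` are illustrative only.

```python
import pandas as pd

# Coverage values copied from the question's DataFrame.
coverage = pd.Series([45, 214, 212, 72, 85], name="coverage")

# The operator form and the method form build the same boolean mask.
mask_operator = coverage >= 72
mask_method = coverage.ge(72)

# Series.equals compares the two masks element by element.
same = mask_operator.equals(mask_method)
print(same)
```

The sibling methods `lt`, `le`, `gt`, `eq`, and `ne` work the same way, which can be handy when chaining calls.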

Another possible solution is to invert the mask with ~:

print (df["coverage"] < 72)
Cochice        True
Pima          False
Santa Cruz    False
Maricopa      False
Yuma          False
Name: coverage, dtype: bool

print (~(df["coverage"] < 72))
Cochice       False
Pima           True
Santa Cruz     True
Maricopa       True
Yuma           True
Name: coverage, dtype: bool


print (df[~(df["coverage"] < 72)])
            coverage   name  reports  year
Pima             214  Molly       24  2012
Santa Cruz       212   Tina       31  2013
Maricopa          72   Jake        2  2014
Yuma              85    Amy        3  2014
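
Since the question already uses df.drop, another option (not part of this answer, shown here only as a sketch) is to collect the index labels of the rows that match the condition and pass them to drop. The names `to_delete` and `remaining` are hypothetical.

```python
import pandas as pd

# A reduced version of the question's DataFrame (two columns are enough here).
df = pd.DataFrame(
    {
        "coverage": [45, 214, 212, 72, 85],
        "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    },
    index=["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"],
)

# Index labels of the rows to delete: those with coverage below 72.
to_delete = df.index[df["coverage"] < 72]

# drop removes rows by label and returns a new DataFrame.
remaining = df.drop(to_delete)
print(remaining)
```

Because drop works on labels, it removes every row that shares a dropped label, so with a non-unique index boolean indexing is the safer choice.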

Answered by qaiser

We can use the DataFrame.query() method as well:

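import pandas as pd

# Sample data with the same coverage values as the question's DataFrame
dict_ = {'coverage': [45, 214, 212, 72, 85], 'name': ['jason', 'Molly', 'Tina', 'Jake', 'Amy']}
df = pd.DataFrame(dict_)

# Keep only the rows whose coverage is strictly greater than 72
print(df.query('coverage > 72'))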
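
Note that 'coverage > 72' is a strict comparison, so the row with coverage equal to 72 is dropped as well, unlike the >= 72 filter in the previous answer. The sketch below shows both variants side by side and passes the threshold in through a local variable with the `@` prefix; the names `threshold`, `strict`, and `inclusive` are illustrative only.

```python
import pandas as pd

# Same sample data as in this answer.
df = pd.DataFrame(
    {
        "coverage": [45, 214, 212, 72, 85],
        "name": ["jason", "Molly", "Tina", "Jake", "Amy"],
    }
)

threshold = 72

# Inside a query string, @name refers to a Python variable in scope.
strict = df.query("coverage > @threshold")      # excludes coverage == 72
inclusive = df.query("coverage >= @threshold")  # keeps coverage == 72

print(strict["name"].tolist())
print(inclusive["name"].tolist())
```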
