Python pandas remove rows where multiple conditions are not met
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/38825087/
Asked by Simon
Let's say I have a dataframe like this:
   id  num
0   1    1
1   2    2
2   3    1
3   4    2
4   1    1
5   2    2
6   3    1
7   4    2
The above can be generated with this for testing purposes:
import numpy as np
import pandas as pd

test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})
Now, I want to keep only the rows where a certain condition is met: id is not 1 AND num is not 1. Essentially I want to remove the rows with index 0 and 4. For my actual dataset it's easier to remove the rows I don't want than to specify the rows that I do want.
I have tried this:
test = test[(test['id'] != 1) & (test['num'] != 1)]
However, that gives me this:
   id  num
1   2    2
3   4    2
5   2    2
7   4    2
It seems to have removed all rows where id is 1 OR num is 1.
I've seen a number of other questions where the answer is the one I used above, but it doesn't seem to be working in my case.
Answered by EdChum
If you change the boolean conditions to test for equality and invert the combined condition with ~, enclosing both in an additional pair of parentheses, then you get the desired behaviour:
In [14]:
test = test[~((test['id'] == 1) & (test['num'] == 1))]
test
Out[14]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2
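Since the question frames the task as removing unwanted rows rather than selecting wanted ones, the same result can also be sketched with DataFrame.drop. This is an alternative to the inverted mask above, not the method used in the answer, and the names unwanted and result are made up for illustration:

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

# Rows to remove: id is 1 AND num is 1 (hypothetical name `unwanted`).
unwanted = (test['id'] == 1) & (test['num'] == 1)

# Drop those rows by their index labels instead of inverting the mask.
result = test.drop(test[unwanted].index)
print(result)
```

Here the condition describes the rows to discard directly, which may read more naturally when the unwanted rows are the easier ones to specify.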
I also think your understanding of boolean syntax is incorrect; what you want is to or the conditions:
In [22]:
test = test[(test['id'] != 1) | (test['num'] != 1)]
test
Out[22]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2
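The two forms give the same rows because negating an "and" of two conditions is the same as taking the "or" of their negations (De Morgan's law). A minimal check on the test frame, with the variable names inverted_and and or_of_negations chosen here only for illustration:

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

# Form 1: invert "id is 1 AND num is 1".
inverted_and = ~((test['id'] == 1) & (test['num'] == 1))

# Form 2: "id is not 1 OR num is not 1".
or_of_negations = (test['id'] != 1) | (test['num'] != 1)

# Both masks select exactly the same rows.
print(inverted_and.equals(or_of_negations))
```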
If you think about what this means, the first condition excludes any row where 'id' is equal to 1, and the second does the same for the 'num' column:
In [24]:
test[test['id'] != 1]
Out[24]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2
In [25]:
test[test['num'] != 1]
Out[25]:
   id  num
1   2    2
3   4    2
5   2    2
7   4    2
So really you wanted to or (|) the above conditions.
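To see why & removes too much, it can help to lay the individual masks side by side. The sketch below builds a small helper frame; the names id_ok, num_ok and masks are assumptions made for this illustration:

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

id_ok = test['id'] != 1    # True where id is not 1
num_ok = test['num'] != 1  # True where num is not 1

# One column per mask, so each row's outcome can be compared.
masks = pd.DataFrame({'id != 1': id_ok,
                      'num != 1': num_ok,
                      'and': id_ok & num_ok,   # keeps a row only if BOTH hold
                      'or': id_ok | num_ok})   # keeps a row if EITHER holds
print(masks)
```

A row survives & only when both columns are True, so rows 2 and 6 (id 3, num 1) are lost along with rows 0 and 4. With |, only rows 0 and 4 fail both tests and are removed.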