Using boolean masks in Pandas
Original source: http://stackoverflow.com/questions/16688718/
This page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors on StackOverflow, citing the original URL.
Asked by elksie5000
This is probably a trivial query but I can't work it out.
Essentially, I want to filter noisy tweets out of the dataframe below:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 140381 entries, 0 to 140380
Data columns:
text 140381 non-null values
created_at 140381 non-null values
id 140381 non-null values
from_user 140381 non-null values
geo 5493 non-null values
dtypes: float64(1), object(4)
I can create a dataframe of the rows that match an unwanted keyword like this:
junk = df[df.text.str.contains("Swans")]
But what's the best way to use this to see what's left?
Answered by waitingkuo
df[~df.text.str.contains("Swans")]
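A minimal sketch of why this works, using a small made-up dataframe in place of the tweet data from the question (the sample rows are assumptions for illustration only). `str.contains` returns a boolean Series, and `~` inverts it element-wise, so indexing with the inverted mask keeps only the rows that do not match. The `na=False` argument is not part of the original answer; it is added here so that missing text counts as a non-match and the mask stays purely boolean.

```python
import pandas as pd

# Hypothetical sample data standing in for the tweet dataframe
df = pd.DataFrame({
    "text": ["Swans win again", "Lovely weather today", "Go Swans!", None],
    "from_user": ["a", "b", "c", "d"],
})

# Boolean mask: True where the text contains the keyword.
# na=False treats missing text as "does not contain".
mask = df.text.str.contains("Swans", na=False)

junk = df[mask]    # rows that match the keyword
clean = df[~mask]  # everything else

print(clean)
```

Storing the mask in a variable means the junk rows and the remaining rows come from the same test, so together they always cover the whole dataframe.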
Answered by Mohamed Ali JAMAOUI
You can also use the following two options:
Option 1:
df[-df.text.str.contains("Swans")]
Option 2:
import numpy as np
df[np.invert(df.text.str.contains("Swans"))]
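As a quick check, the sketch below (on made-up data, not the original tweets) suggests that `np.invert` applied to a boolean mask selects the same rows as the `~` operator. The sample dataframe and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; no missing values, so the mask is plain boolean
df = pd.DataFrame({
    "text": ["Swans win again", "Lovely weather today", "Go Swans!"],
})

mask = df.text.str.contains("Swans")

with_tilde = df[~mask]              # operator form
with_invert = df[np.invert(mask)]   # NumPy function form

# Both selections should contain exactly the same rows
print(with_tilde.equals(with_invert))
```

Both forms rely on the mask being boolean. If the text column can contain missing values, passing `na=False` to `str.contains` is one way to keep it that way.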

