Original question: http://stackoverflow.com/questions/30791265/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
How to filter a pandas dataframe by cells that DO NOT contain a substring?
Asked by bpr
I want to filter a dataframe to find rows which do not contain the string 'site'.
I know how to filter for rows which do contain 'site' but have not been able to get the reverse working. Here is what I have so far:
def rbs():  # removes blocked sites
    frame = fill_rate()
    mask = frame[frame['Media'].str.contains('Site')==True]
    frame = (frame != mask)
    return frame
But this returns an error, of course.
Answered by EdChum
Just do frame[~frame['Media'].str.contains('Site')]
The ~ operator negates the boolean condition element by element, so rows where the mask was True are dropped and the rest are kept.
So your method becomes:
def rbs():  # removes blocked sites
    frame = fill_rate()
    return frame[~frame['Media'].str.contains('Site')]
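
To see the negated mask in action, here is a minimal sketch. The question does not show fill_rate(), so a small made-up DataFrame stands in for its result. The column values and the Fills column are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data standing in for the output of fill_rate().
frame = pd.DataFrame({
    "Media": ["Site A", "Network B", "Site C", "App D"],
    "Fills": [10, 20, 30, 40],
})

# Boolean Series: True where 'Media' contains the substring 'Site'.
contains_site = frame["Media"].str.contains("Site")

# ~ flips every boolean, so indexing keeps only the rows WITHOUT 'Site'.
filtered = frame[~contains_site]
print(filtered)
```

The original row labels are preserved in the result. Calling reset_index(drop=True) afterwards would renumber them if that is preferred.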
EDIT
Judging by your errors, it looks like you have NaN values in the column, so you have to filter these out first. Your method then becomes:
def rbs():  # removes blocked sites
    frame = fill_rate()
    frame = frame[frame['Media'].notnull()]
    return frame[~frame['Media'].str.contains('Site')]
The notnull call filters out the rows with missing values before the string test is applied.
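
As a hedged alternative to filtering with notnull first, str.contains accepts an na argument that sets the result used for missing values. The sketch below compares the two approaches on made-up data (the sample values are assumptions, not taken from the question). The behavior differs: the notnull approach drops NaN rows, while na=False keeps them, because a missing value is treated as "does not contain".

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one lowercase match.
frame = pd.DataFrame({"Media": ["Site A", np.nan, "Network B", "site c"]})

# Approach from the answer: drop missing values, then negate the mask.
not_missing = frame[frame["Media"].notnull()]
dropped_nan = not_missing[~not_missing["Media"].str.contains("Site")]

# Alternative: na=False makes missing values count as "no match",
# so the NaN row survives the negated filter.
kept_nan = frame[~frame["Media"].str.contains("Site", na=False)]

# case=False additionally matches 'site' regardless of capitalization.
any_case = frame[~frame["Media"].str.contains("site", case=False, na=False)]

print(dropped_nan, kept_nan, any_case, sep="\n\n")
```

Which variant fits depends on whether rows with a missing Media value should remain in the result. The case=False option may also be relevant here, since the question mentions 'site' while the code searches for 'Site'.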

