Using str.contains on a pandas DataFrame
Source: http://stackoverflow.com/questions/31745509/ (Stack Overflow, licensed under CC BY-SA 4.0; if you reuse this content you must attribute it to the original authors)
Asked by Davtho1983
This pandas code raises the error
"TypeError: bad operand type for unary ~: 'float'"
I have no idea why, because I'm trying to manipulate a str object:
df_Anomalous_Vendor_Reasons[~df_Anomalous_Vendor_Reasons['V'].str.contains("File*|registry*")]  # filters, keeping only rows where the reason is NOT File or Registry
Anybody got any ideas?
Answered by Josh
Credit to Davtho1983's comment above; I thought I'd expand on it for clarity.
For anyone stumbling on this later with the same error (like me), it's a very simple fix. The pandas documentation shows the signature
Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)
What's happening is that contains() is not applied to missing (NaN) values in the column; they stay NaN in the result. The result is therefore a mix of Booleans and float NaN rather than a pure Boolean mask, and the invert operator ~ fails when it reaches a float. You just need to fill the missing values with a Boolean through the na argument so that ~ can be applied.
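To make the failure and the fix concrete, here is a minimal sketch. The DataFrame, its values, and the simplified pattern "File|registry" are made up for illustration; only the column name 'V' and the na argument come from the question. The column is forced to object dtype on the assumption that this matches the asker's data, since other string dtypes may handle missing values differently.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an object-dtype column containing one missing value.
df = pd.DataFrame(
    {"V": pd.Series(
        ["File changed", "registry edit", np.nan, "Network scan"],
        dtype=object,
    )}
)

# Without na=, the missing entry stays NaN in the result, so the mask is
# not purely Boolean and inverting it can fail on the float NaN.
raw_mask = df["V"].str.contains("File|registry")
error_message = None
try:
    df[~raw_mask]
except TypeError as exc:
    error_message = str(exc)
print("error without na=:", error_message)

# With na=False, missing entries count as "no match", the mask is purely
# Boolean, and ~ works as intended.
clean_mask = df["V"].str.contains("File|registry", na=False)
filtered = df[~clean_mask]
print(filtered)
```

The try/except is only there so the script runs to the end; in real code you would simply pass na= and drop the first attempt.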
With the example above, one should use
df_Anomalous_Vendor_Reasons[~df_Anomalous_Vendor_Reasons['V'].str.contains("File*|registry*", na=False)]
Of course, one should choose False or True for the na argument based on the intended logic. Whichever Boolean you choose is inverted by ~ together with the rest of the mask: with na=False the rows with missing values are kept by the filter, and with na=True they are dropped.
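The effect of that choice can be checked on a small Series. This is a sketch with invented values, again assuming an object-dtype column; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical reasons, one of them missing.
reasons = pd.Series(["File changed", np.nan, "Network scan"], dtype=object)

# na=False: the missing entry counts as "no match", so after ~ it is kept.
keep_missing = reasons[~reasons.str.contains("File", na=False)]

# na=True: the missing entry counts as "match", so after ~ it is dropped.
drop_missing = reasons[~reasons.str.contains("File", na=True)]

print(keep_missing.index.tolist())
print(drop_missing.index.tolist())
```

If rows with no reason recorded should be treated as "not File or Registry", na=False is the natural choice; if they should be excluded from the result altogether, use na=True.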

