What is the Right Syntax When Using .notnull() in Pandas?
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/38702332/
Asked by MEhsan
I want to use .notnull() on several columns of a dataframe to eliminate the rows which contain "NaN" values.
Let's say I have the following df:
A B C
0 1 1 1
1 1 NaN 1
2 1 NaN NaN
3 NaN 1 1
I tried to use this syntax, but it does not work. Do you know what I am doing wrong?
df[[df.A.notnull()],[df.B.notnull()],[df.C.notnull()]]
I get this error:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
What should I do to get the following output?
A B C
0 1 1 1
Any idea?
Answered by jezrael
You can first select a subset of columns with df[['A','B','C']], then apply notnull and use all to check whether all values in each row of the mask are True:
print (df[['A','B','C']].notnull())
A B C
0 True True True
1 True False True
2 True False False
3 False True True
print (df[['A','B','C']].notnull().all(1))
0 True
1 False
2 False
3 False
dtype: bool
print (df[df[['A','B','C']].notnull().all(1)])
A B C
0 1.0 1.0 1.0
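For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame construction is my reconstruction of the df shown in the question, and the names mask and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# One boolean per row: True only when A, B and C are all non-null.
mask = df[["A", "B", "C"]].notnull().all(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
result = df[mask]
print(result)
```

Passing axis=1 by keyword does the same thing as all(1) in the answer; it just makes the row-wise direction explicit.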
Another solution, from Ayhan's comment, uses dropna:
print (df.dropna(subset=['A', 'B', 'C']))
A B C
0 1.0 1.0 1.0
which is the same as:
print (df.dropna(subset=['A', 'B', 'C'], how='any'))
and means: drop all rows in which at least one of these columns holds a NaN value.
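To make the effect of how concrete, here is a small sketch that contrasts how='any' with how='all' on the question's data. The DataFrame construction and the variable names are assumptions for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# how="any" (the default): drop a row if ANY listed column is NaN.
any_rows = df.dropna(subset=["A", "B", "C"], how="any")

# how="all": drop a row only if ALL listed columns are NaN.
all_rows = df.dropna(subset=["B", "C"], how="all")

print(any_rows)
print(all_rows)
```

With how='all' restricted to B and C, only the row where both of those columns are NaN is dropped, so more rows survive than with how='any'.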
Answered by Jan Trienes
You can apply multiple conditions by combining them with the & operator (this works not only for the notnull() function).
df[(df.A.notnull() & df.B.notnull() & df.C.notnull())]
A B C
0 1.0 1.0 1.0
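As a sketch of the "not only for notnull()" remark, the snippet below mixes a notnull() check with an ordinary comparison. The DataFrame construction and the chosen condition on B are assumptions picked for illustration. Note that comparisons need their own parentheses, because & binds more tightly than == in Python:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# Combine a null check with a value comparison; each condition is a
# boolean Series, and & combines them element by element.
cond = df["A"].notnull() & (df["B"] == 1)

filtered = df[cond]
print(filtered)
```

A comparison against NaN evaluates to False, so rows with a missing B are excluded by the second condition as well.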
Alternatively, you can just drop all rows which contain NaN. The original DataFrame is not modified; instead, a copy is returned.
df.dropna()
Answered by Sudhin Joseph
You can simply do:
df.dropna()
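One thing to keep in mind with a bare df.dropna(): it returns a new DataFrame and leaves df as it was, so the result normally needs to be assigned to a name. A minimal sketch, assuming the df from the question (the construction and the name cleaned are illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# dropna() with no arguments checks every column and returns a new frame.
cleaned = df.dropna()

print(len(df))   # the original still has all of its rows
print(cleaned)
```

Without a subset, every column is checked, so this matches the subset version only when A, B and C are the only columns that can contain NaN.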