Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38702332/

Time: 2020-09-14 01:43:04  Source: igfitidea

What is the Right Syntax When Using .notnull() in Pandas?

Tags: python, pandas, dataframe, null

Asked by MEhsan

I want to use .notnull() on several columns of a dataframe to eliminate the rows which contain NaN values.

Let's say I have the following df:

  A   B   C
0 1   1   1
1 1   NaN 1
2 1   NaN NaN
3 NaN 1   1

I tried this syntax, but it does not work. Do you know what I am doing wrong?

df[[df.A.notnull()],[df.B.notnull()],[df.C.notnull()]]

I get this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What should I do to get the following output?

  A   B   C
0 1   1   1

Any ideas?

Answered by jezrael

You can first select a subset of columns with df[['A','B','C']], then apply notnull and use all to check whether every value in each row of the mask is True:

print (df[['A','B','C']].notnull())
       A      B      C
0   True   True   True
1   True  False   True
2   True  False  False
3  False   True   True

print (df[['A','B','C']].notnull().all(1))
0     True
1    False
2    False
3    False
dtype: bool

print (df[df[['A','B','C']].notnull().all(1)])
     A    B    C
0  1.0  1.0  1.0
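
For anyone who wants to run the steps above end to end, here is a minimal sketch. The frame is rebuilt from the table in the question, and the names mask and complete_rows are only illustrative. The point is that df[...] receives a single boolean Series, rather than several lists of Series as in the question's attempt.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (np.nan marks a missing value).
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# One boolean per row: True only if A, B and C are all non-null in that row.
mask = df[["A", "B", "C"]].notnull().all(axis=1)

# Boolean indexing keeps the rows where the mask is True.
complete_rows = df[mask]
print(complete_rows)
```

Writing axis=1 by keyword instead of all(1) makes it explicit that the check runs across the columns of each row.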

Another solution, from Ayhan's comment, uses dropna:

print (df.dropna(subset=['A', 'B', 'C']))
     A    B    C
0  1.0  1.0  1.0

which is the same as:

print (df.dropna(subset=['A', 'B', 'C'], how='any'))

and means: drop every row that has at least one NaN value in the listed columns.
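
To make the effect of how concrete, here is a small sketch that contrasts how='any' with how='all' on the frame from the question. The variable names and the choice of subset=['B', 'C'] for the second call are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# how="any" (the default): drop a row if ANY of the listed columns is NaN.
any_dropped = df.dropna(subset=["A", "B", "C"], how="any")

# how="all": drop a row only if ALL of the listed columns are NaN.
# Only row 2 has NaN in both B and C.
all_dropped = df.dropna(subset=["B", "C"], how="all")

print(any_dropped.index.tolist())
print(all_dropped.index.tolist())
```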

Answered by Jan Trienes

You can apply multiple conditions by combining them with the & operator (this works for any boolean condition, not only for the notnull() function).

df[(df.A.notnull() & df.B.notnull() & df.C.notnull())]
     A    B    C
0  1.0  1.0  1.0
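
As a sketch of how this extends to other conditions, the example below mixes a comparison with notnull() and also shows the | (or) operator. The particular conditions are made up for illustration. Note that each comparison needs its own parentheses, because & binds more tightly than ==, and that Python's and keyword cannot be used here: it raises a ValueError on a Series.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# Element-wise AND: A equals 1 and C is present.
both = df[(df["A"] == 1) & df["C"].notnull()]

# Element-wise OR: B is missing or C is missing.
either_missing = df[df["B"].isnull() | df["C"].isnull()]

print(both.index.tolist())
print(either_missing.index.tolist())
```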

Alternatively, you can just drop all rows which contain NaN. The original DataFrame is not modified; instead, a copy is returned.

df.dropna()
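
Because dropna returns a new object, the result has to be assigned to a name to be kept. A minimal sketch, with cleaned and no_nan_columns as illustrative names; the axis=1 call is included only to show the column-wise variant, which on this frame removes every column since each one contains a NaN.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({
    "A": [1, 1, 1, np.nan],
    "B": [1, np.nan, np.nan, 1],
    "C": [1, 1, np.nan, 1],
})

# Default behaviour: drop every ROW that contains at least one NaN.
cleaned = df.dropna()

# df itself is untouched; only the returned frame is shorter.
print(len(df), len(cleaned))

# axis=1 drops COLUMNS that contain NaN instead of rows.
no_nan_columns = df.dropna(axis=1)
print(no_nan_columns.shape)
```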

Answered by Sudhin Joseph

You can simply do:

df.dropna()