pandas: the difference between comparing to np.nan and isnull()

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/41342609/

Date: 2020-09-14 02:40:50  Source: igfitidea

The difference between comparison to np.nan and isnull()

python pandas numpy

Asked by sergzach

I supposed that

data[data.agefm.isnull()]

and

data[data.agefm == numpy.nan]

are equivalent. But no: the first really does return the rows where agefm is NaN, while the second returns an empty DataFrame. I thought that missing values are always equal to np.nan, but that seems to be wrong.

The agefm column has dtype float64:

(Pdb) data.agefm.describe()
count    2079.000000
mean       20.686388
std         5.002383
min        10.000000
25%        17.000000
50%        20.000000
75%        23.000000
max        46.000000
Name: agefm, dtype: float64

Could you please explain what data[data.agefm == np.nan] means exactly?

Answered by piRSquared

np.nan is not comparable to np.nan... directly.

np.nan == np.nan

False

While

np.isnan(np.nan)

True

You could also do

pd.isnull(np.nan)

True
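
To connect this back to the question, here is a minimal sketch of what the two masks look like. The small DataFrame is a made-up stand-in for the asker's data; only the column name agefm comes from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data; only the column name
# `agefm` comes from the post, the values are made up for illustration.
data = pd.DataFrame({"agefm": [17.0, np.nan, 23.0]})

# Elementwise equality: NaN never compares equal, so every entry is False.
eq_mask = data.agefm == np.nan
# Elementwise missing-value test: True exactly where the value is NaN.
null_mask = data.agefm.isnull()

print(eq_mask.tolist())       # [False, False, False]
print(null_mask.tolist())     # [False, True, False]
print(len(data[eq_mask]))     # 0 rows: the "empty DataFrame" from the question
print(len(data[null_mask]))   # 1 row
```

So data[data.agefm == np.nan] means "select the rows where an all-False mask is True", which selects nothing.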


Examples
Filters nothing, because nothing is equal to np.nan, so s != np.nan is True for every element:

s = pd.Series([1., np.nan, 2.])
s[s != np.nan]

0    1.0
1    NaN
2    2.0
dtype: float64

Filters out the nulls:

s = pd.Series([1., np.nan, 2.])
s[s.notnull()]

0    1.0
2    2.0
dtype: float64

Use the odd comparison behavior to get what we want anyway. Since np.nan != np.nan is True, s == s is False exactly where s is NaN:

s = pd.Series([1., np.nan, 2.])
s[s == s]

0    1.0
2    2.0
dtype: float64
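
As a quick check, and only for a float Series like the one above, the self-comparison mask and the notnull() mask can be compared side by side. The variable names self_eq and not_null are just illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1., np.nan, 2.])

# A NaN entry is not equal to itself, so it is the only False in this mask.
self_eq = s == s
# The explicit missing-value test gives the same mask for this float Series.
not_null = s.notnull()

print(self_eq.tolist())   # [True, False, True]
print(not_null.tolist())  # [True, False, True]
```

notnull() states the intent more directly, so it is probably the easier one to read in real code.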

Or just use dropna:

s = pd.Series([1., np.nan, 2.])
s.dropna()

0    1.0
2    2.0
dtype: float64
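
For a DataFrame like the one in the question, the same idea can be sketched with dropna and its subset parameter. The data and the id column below are hypothetical; only the agefm column name comes from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical data: `id` is an illustrative column, not from the question.
data = pd.DataFrame({"agefm": [17.0, np.nan, 23.0], "id": [1, 2, 3]})

# Rows where agefm is missing (what the asker's first expression returns).
missing = data[data.agefm.isnull()]
# Rows where agefm is present: drop rows with NaN in that column only.
present = data.dropna(subset=["agefm"])

print(missing["id"].tolist())  # [2]
print(present["id"].tolist())  # [1, 3]
```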