pandas: NaN is not recognized after an np.where clause. Why? Or is it a bug?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/34752625/
Asked by keiv.fly
NaN is not recognized in pandas after np.where clause. Why? Or is it a bug?
The last line of the output of this code should be "True":
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: a = pd.Series([1, np.nan])
In [4]: b = pd.DataFrame(["a", "b"])
In [5]: b["1"] = np.where(
   ...:     a.isnull(),
   ...:     np.nan,
   ...:     "Hello"
   ...: )
In [6]: b
Out[6]:
   0      1
0  a  Hello
1  b    nan
In [7]: b["1"].isnull()
Out[7]:
0    False
1    False
Name: 1, dtype: bool
Answered by BrenBarn
You can see why if you look at the result of the where call:
>>> np.where(a.isnull(), np.nan, "Hello")
array([u'Hello', u'nan'],
      dtype='<U32')
Because your other value is a string, where converts your NaN to a string as well and gives you a string-dtyped result. (The exact dtype you get may differ depending on your platform and/or Python version.) So you don't actually have a NaN in your result at all; you just have the string "nan".
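To make the coercion visible, here is a minimal sketch that inspects the dtype np.where returns for the inputs from the question. The second half is only one possible workaround and is not part of the original answer: it assumes that passing the string branch as an object array (the name hello_obj is made up for illustration) lets numpy keep the float NaN instead of turning it into text.

```python
import numpy as np
import pandas as pd

a = pd.Series([1, np.nan])

# Mixing a float NaN with a str forces numpy to pick a common string dtype,
# so the NaN is stored as the text "nan".
as_text = np.where(a.isnull(), np.nan, "Hello")
print(as_text.dtype.kind)        # "U" means a Unicode string dtype
print(as_text[1] == "nan")       # the "missing" entry is just a string
print(pd.isnull(as_text).any())  # pandas finds no missing values in it

# Assumed workaround (hypothetical name hello_obj): an object-dtype branch
# means no common string dtype is needed, so the float NaN survives.
hello_obj = np.array("Hello", dtype=object)
as_object = np.where(a.isnull(), np.nan, hello_obj)
print(as_object.dtype)
print(pd.isnull(as_object))
```

If the second variant behaves as assumed, assigning as_object to a DataFrame column gives a column on which isnull() reports the missing row.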
If you want to do this type of mapping (in particular, mapping that changes dtypes) in pandas, it's usually better to use pandas constructs like .map and avoid dropping into numpy, because as you saw, numpy tends to do unhelpful things when it has to resolve conflicting types. Here's an example of how to do it all in pandas:
>>> b["X"] = a.isnull().map({True: np.nan, False: "Hello"})
>>> b
   0      X
0  a  Hello
1  b    NaN
>>> b.X.isnull()
0    False
1     True
Name: X, dtype: bool
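For reference, the same idea as a self-contained script. The "X" column follows the .map approach from the answer; the "Y" column is an alternative that is not in the original answer, a sketch assuming Series.mask is an acceptable substitute (the column name "Y" and the variable hello are made up for illustration).

```python
import numpy as np
import pandas as pd

a = pd.Series([1, np.nan])
b = pd.DataFrame(["a", "b"])

# Approach from the answer: map the boolean mask to the desired values.
# pandas keeps the real NaN alongside the string.
b["X"] = a.isnull().map({True: np.nan, False: "Hello"})

# Alternative sketch (an assumption, not from the answer): start from a
# column of "Hello" and blank out the rows where `a` is missing.
hello = pd.Series("Hello", index=a.index)
b["Y"] = hello.mask(a.isnull())

print(b)
print(b["X"].isnull())
print(b["Y"].isnull())
```

Both columns should report the second row as missing, because the values never pass through a numpy string array.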

