Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/36000993/

Time: 2020-08-19 17:15:13  Source: igfitidea

Numpy isnan() fails on an array of floats (from pandas dataframe apply)

python arrays numpy pandas

Asked by tim654321

I have an array of floats (some normal numbers, some nans) that is coming out of an apply on a pandas dataframe.

For some reason, numpy.isnan fails on this array. However, as shown below, each element is a float, numpy.isnan runs correctly on each individual element, and the variable is definitely a numpy array.

What's going on?!

set([type(x) for x in tester])
Out[59]: {float}

tester
Out[60]: 
array([-0.7000000000000001, nan, nan, nan, nan, nan, nan, nan, nan, nan,
   nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
   nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
   nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
   nan, nan], dtype=object)

set([type(x) for x in tester])
Out[61]: {float}

np.isnan(tester)
Traceback (most recent call last):

File "<ipython-input-62-e3638605b43c>", line 1, in <module>
np.isnan(tester)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

set([np.isnan(x) for x in tester])
Out[65]: {False, True}

type(tester)
Out[66]: numpy.ndarray

Answered by unutbu

np.isnan can be applied to NumPy arrays of native dtype (such as np.float64):

In [99]: np.isnan(np.array([np.nan, 0], dtype=np.float64))
Out[99]: array([ True, False], dtype=bool)

but raises TypeError when applied to object arrays:

In [96]: np.isnan(np.array([np.nan, 0], dtype=object))
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
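
As a minimal sketch of the point above: an object array holds generic Python objects, so the isnan ufunc has nothing to dispatch on, while the same values cast to a native float dtype work. The array obj_arr is a made-up stand-in for the one in the question, and the cast assumes every element can be converted to float.

```python
import numpy as np

# Hypothetical object-dtype array, similar in shape to the question's `tester`
obj_arr = np.array([-0.7, np.nan, np.nan], dtype=object)

# np.isnan has no implementation for object dtype, so this raises TypeError
try:
    np.isnan(obj_arr)
    raised = False
except TypeError:
    raised = True

# Cast to a native float dtype first (assumes every element is float-convertible)
float_arr = obj_arr.astype(float)
mask = np.isnan(float_arr)
print(raised, float_arr.dtype, mask)
```

If the array can contain values that are not convertible to float, the cast itself will fail, so this route only suits arrays known to hold numbers.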


Since you have Pandas, you could use pd.isnull instead. It accepts NumPy arrays of either object or native dtypes:

In [97]: pd.isnull(np.array([np.nan, 0], dtype=float))
Out[97]: array([ True, False], dtype=bool)

In [98]: pd.isnull(np.array([np.nan, 0], dtype=object))
Out[98]: array([ True, False], dtype=bool)

Note that None is also considered a null value in object arrays.
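
A short illustration of that note, using a made-up object array that mixes a number, a NaN, and None:

```python
import numpy as np
import pandas as pd

# Hypothetical object array containing a float, NaN, and None
mixed = np.array([1.5, np.nan, None], dtype=object)

# pd.isnull flags both NaN and None as missing
null_mask = pd.isnull(mixed)
print(null_mask)
```

Here both the NaN and the None entries come back as True, which may or may not be what you want if None carries a separate meaning in your data.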

Answered by Severin Pappadeux

In addition to @unutbu's answer, you could coerce the pandas object column to a native (float64) type, along these lines:

import pandas as pd
pd.to_numeric(df['tester'], errors='coerce')

Specify errors='coerce' to force strings that can't be parsed as a numeric value to become NaN. The column's dtype will then be float64, and the isnan check should work.
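
A self-contained sketch of that conversion follows. The DataFrame contents are invented for illustration; only the column name tester is taken from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical object-dtype column holding a float, NaN, and an unparseable string
df = pd.DataFrame({"tester": pd.Series([-0.7, np.nan, "n/a"], dtype=object)})

# Unparseable entries become NaN; the result has a native float dtype
converted = pd.to_numeric(df["tester"], errors="coerce")

# np.isnan now works because the dtype is float64
nan_mask = np.isnan(converted)
print(converted.dtype, nan_mask.tolist())
```

Keep in mind that errors='coerce' silently turns bad values into NaN, so it can hide data problems that you might prefer to see as errors.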

Answered by Statham

A great substitute for np.isnan() and pd.isnull() is

for i in range(a.shape[0]):
    if a[i] != a[i]:
        # a[i] is nan; do something here
        pass

since only nan is not equal to itself.
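
The same self-inequality test can be sketched without an index loop. The arrays below are invented examples. One caveat worth stating: the trick relies on float semantics, so None is not detected (None != None is False), unlike with pd.isnull.

```python
import numpy as np

# Native float array: the comparison is vectorized, True only where the value is NaN
float_arr = np.array([1.0, np.nan, 3.0])
vector_mask = float_arr != float_arr

# Object array: compare element by element using plain Python semantics
obj_arr = np.array([1.0, np.nan, None], dtype=object)
loop_mask = [x != x for x in obj_arr]

print(vector_mask, loop_mask)
```

The element-wise comprehension is used for the object array so that the result depends only on Python's own float comparison rather than on how a given NumPy version compares object arrays.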
