Efficiently checking if an arbitrary object is NaN in Python / numpy / pandas?

Question

Asked by Dun Peal

My numpy arrays use np.nan to designate missing values. As I iterate over the data set, I need to detect such missing values and handle them in special ways.

Naively, I used numpy.isnan(val), which works well unless val isn't among the subset of types supported by numpy.isnan(). For example, missing data can occur in string fields, in which case I get:

>>> np.isnan('some_string')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Not implemented for this type

Other than writing an expensive wrapper that catches the exception and returns False, is there a way to handle this elegantly and efficiently?

Answer 1

Accepted answer by Marius

pandas.isnull() (also available as pd.isna() in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

NaN in numeric arrays, None/NaN in object arrays

Quick example:

import pandas as pd
import numpy as np
s = pd.Series(['apple', np.nan, 'banana'])
pd.isnull(s)
Out[9]: 
0    False
1     True
2    False
dtype: bool
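
Since the question is about checking individual values while iterating, it may help to see that pd.isna() also accepts plain scalars, including strings. The following is a minimal sketch; the sample values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative values only: a float, NaN, None, a string, and NaT.
values = [1.5, np.nan, None, 'some_string', pd.NaT]

# pd.isna() handles each scalar without raising TypeError,
# unlike np.isnan('some_string').
flags = [pd.isna(v) for v in values]
print(flags)  # [False, True, True, False, True]
```

Note that None and pd.NaT are treated as missing here, which goes beyond a strict "is this a float NaN" test.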

The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.

This works for datetimes too (if you use pd.NaT, you won't need to specify the dtype):

In [24]: s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')

In [25]: s
Out[25]: 
0   2013-01-01 00:00:00
1                   NaT
2   2013-01-02 09:30:00
dtype: datetime64[ns]

In [26]: pd.isnull(s)
Out[26]: 
0    False
1     True
2    False
dtype: bool
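
The pd.NaT variant mentioned above could look roughly like the sketch below, where no dtype argument is passed and pandas is left to infer a datetime dtype. The exact resolution of the inferred dtype may vary between pandas versions, so it is printed rather than assumed:

```python
import pandas as pd

# Same data as above, but with pd.NaT as the missing marker and no explicit dtype.
s = pd.Series([pd.Timestamp('20130101'), pd.NaT, pd.Timestamp('20130102 9:30')])

mask = pd.isna(s)
print(s.dtype)        # a datetime64 dtype, inferred by pandas
print(mask.tolist())  # [False, True, False]
```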

Answer 2

Answered by Hammer

Is your type really arbitrary? If you know it is just going to be an int, float, or string, you could just do:

 if val.dtype == float and np.isnan(val):

Assuming the value is wrapped in a numpy type, it will always have a dtype, and only float and complex types can hold NaN.
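
One way to turn this dtype check into a reusable test for scalars is sketched below. The helper name is_nan is hypothetical, and wrapping the value with np.asarray is an assumption added here so that plain Python values also get a dtype:

```python
import numpy as np

def is_nan(val):
    """Hypothetical helper: True only for a scalar float or complex NaN."""
    arr = np.asarray(val)  # gives plain Python scalars a dtype as well
    return (
        arr.ndim == 0                              # scalars only
        and np.issubdtype(arr.dtype, np.inexact)   # float or complex dtypes
        and bool(np.isnan(arr))                    # only evaluated for those dtypes
    )

print(is_nan(float('nan')))   # True
print(is_nan('some_string'))  # False
print(is_nan(3))              # False
```

Unlike pd.isnull(), this sketch does not treat None as missing, because np.asarray(None) has object dtype and the dtype check rejects it.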
