How to test for NaNs in an apply function in pandas?

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/35232705/

Time: 2020-09-14 00:37:54  Source: igfitidea

How to test for NaNs in an apply function in pandas?

python pandas dataframe nan

Asked by Hunle

I have a simple apply function that I execute on some of the columns. But it keeps getting tripped up by NaN values in pandas.

import random

import numpy as np
import pandas as pd

# Five rows: values for columns B, C, D plus a label; the first row's D is ''
input_data = np.array(
    [
        [random.randint(0, 9) for x in range(2)] + [''] + ['g'],
        [random.randint(0, 9) for x in range(3)] + ['g'],
        [random.randint(0, 9) for x in range(3)] + ['a'],
        [random.randint(0, 9) for x in range(3)] + ['b'],
        [random.randint(0, 9) for x in range(3)] + ['b']
    ]
)

input_df = pd.DataFrame(data=input_data, columns=['B', 'C', 'D', 'label'])

I have a simple lambda like this:

input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)

And it gets tripped up by the NaN values:

File "<pyshell#460>", line 1, in <lambda>
    input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)
TypeError: Not implemented for this type

So, I tried just testing for nan values that Pandas adds:

np.isnan(input_df['D'].values[0])
np.isnan(input_df['D'].iloc[0])

Both get the same error.

I do not know how to test for nan values other than np.isnan. Is there an easier way to do this? Thanks.

Answered by EdChum

Your code fails because your first entry is an empty string, and np.isnan doesn't understand strings, empty or otherwise:

In [55]:
input_df['D'].iloc[0]

Out[55]:
''

In [56]:
np.isnan('')

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-56-a9f139a0c5b8> in <module>()
----> 1 np.isnan('')

TypeError: Not implemented for this type
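
To make the difference concrete, here is a minimal sketch comparing np.isnan with pd.isna on a few scalars. The helper name is_missing and the sample values are made up for illustration; they are not part of the original question.

```python
import numpy as np
import pandas as pd


def is_missing(value):
    """Return True for NaN/None and False otherwise, without raising on strings."""
    return bool(pd.isna(value))


# Hypothetical sample values: two strings, an int, a float NaN and None
checks = {repr(v): is_missing(v) for v in ['', 'g', 3, np.nan, None]}
print(checks)

# np.isnan only handles numeric input, so a string raises TypeError
try:
    np.isnan('')
    isnan_failed = False
except TypeError as exc:
    isnan_failed = True
    print('np.isnan raised TypeError:', exc)
```

Note that an empty string is not treated as missing by pd.isna; only values such as NaN and None are.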

pd.notnull does work:

In [57]:
import re
input_df['D'].apply(lambda aCode: re.sub(r'\.', '', aCode) if pd.notnull(aCode) else aCode)

Out[57]:
0     
1    3
2    3
3    0
4    3
Name: D, dtype: object
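
The sample frame above contains no actual NaN, so as a rough sketch the same guard can be tried on a column that does have one. The Series codes and the function strip_dots are hypothetical names chosen for this example.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical column mixing strings, an empty string and a real NaN
codes = pd.Series(['1.5', '', np.nan, 'a.b'], dtype=object, name='D')


def strip_dots(value):
    """Remove literal dots from strings; pass missing values through untouched."""
    if pd.notnull(value):
        return re.sub(r'\.', '', value)
    return value


cleaned = codes.apply(strip_dots)
print(cleaned)
```

The guard matters because re.sub expects a string and would raise a TypeError if it were handed the float NaN.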

However, if you just want to replace something, use the vectorized .str.replace, which leaves missing values as they are:

In [58]:
input_df['D'].str.replace('.', '', regex=False)

Out[58]:
0     
1    3
2    3
3    0
4    3
Name: D, dtype: object
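
As a small sketch of why no explicit NaN test is needed with the string accessor, the example below runs .str.replace on a hypothetical Series (codes is an invented name) that contains a real NaN. Passing regex explicitly is a choice made here because the default for that parameter has differed between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a real NaN among the strings
codes = pd.Series(['1.5', '', np.nan, 'a.b'], dtype=object, name='D')

# Literal replacement: '.' is treated as a plain character
no_dots_literal = codes.str.replace('.', '', regex=False)

# Equivalent regex form: the dot must be escaped to match only a literal '.'
no_dots_regex = codes.str.replace(r'\.', '', regex=True)

print(no_dots_literal)
```

Both calls return the NaN entry unchanged, so the apply with a pd.notnull guard is only needed when the per-element logic goes beyond what the .str methods offer.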