Original question: http://stackoverflow.com/questions/35232705/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
How to test for nan's in an apply function in pandas?
Asked by Hunle
I have a simple apply function that I execute on some of the columns, but it keeps getting tripped up by NaN values in pandas.
import random
import re

import numpy as np
import pandas as pd

# np.array coerces the mixed int/str rows to strings, so every cell is a str.
input_data = np.array(
    [
        [random.randint(0, 9) for x in range(2)] + [''] + ['g'],
        [random.randint(0, 9) for x in range(3)] + ['g'],
        [random.randint(0, 9) for x in range(3)] + ['a'],
        [random.randint(0, 9) for x in range(3)] + ['b'],
        [random.randint(0, 9) for x in range(3)] + ['b'],
    ]
)
input_df = pd.DataFrame(data=input_data, columns=['B', 'C', 'D', 'label'])
I have a simple lambda like this:
input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)
And it gets tripped up by the NaN values:
File "<pyshell#460>", line 1, in <lambda>
input_df['D'].apply(lambda aCode: re.sub('\.', '', aCode) if not np.isnan(aCode) else aCode)
TypeError: Not implemented for this type
So I tried testing just for the NaN values that pandas adds:
np.isnan(input_df['D'].values[0])
np.isnan(input_df['D'].iloc[0])
Both raise the same error.
I do not know how to test for NaN values other than with np.isnan. Is there an easier way to do this? Thanks.
Answered by EdChum
Your code fails because your first entry is an empty string, and np.isnan raises a TypeError for strings:
In [55]:
input_df['D'].iloc[0]
Out[55]:
''
In [56]:
np.isnan('')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-56-a9f139a0c5b8> in <module>()
----> 1 np.isnan('')
TypeError: Not implemented for this type
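A minimal sketch of why this happens, using made-up values rather than the random data from the question. np.array coerces rows that mix integers and strings into an all-string array, so column D holds only strings and no NaN at all, and np.isnan is defined for numeric input only. The names rows and arr are illustrative.

```python
import numpy as np

# Hypothetical rows shaped like the question's data: ints mixed with strings.
rows = [[1, 2, '', 'g'], [3, 4, 5, 'g']]
arr = np.array(rows)

# NumPy picks a single dtype for the whole array, here a unicode string type,
# so the integers are stored as strings.
print(arr.dtype.kind)           # 'U' means unicode string
print(arr[1, 2] == '5')         # the integer 5 is now the string '5'

# np.isnan works on floats...
print(np.isnan(float('nan')))   # True
print(np.isnan(1.5))            # False

# ...but raises TypeError for any string, empty or not.
try:
    np.isnan('')
    raised = False
except TypeError as exc:
    raised = True
    print('TypeError:', exc)
```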
pd.notnull does work, because it accepts any object, including strings:
In [57]:
import re
input_df['D'].apply(lambda aCode: re.sub(r'\.', '', aCode) if pd.notnull(aCode) else aCode)
Out[57]:
0
1 3
2 3
3 0
4 3
Name: D, dtype: object
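The same guard can live in a named function, which is easier to test than a lambda. The sketch below assumes a hypothetical Series that contains a real NaN (the question's data has only an empty string), and the helper name strip_dots is an illustration, not part of the original answer.

```python
import re

import numpy as np
import pandas as pd


def strip_dots(value):
    """Remove literal dots from a string; pass missing values through."""
    # pd.notnull is False for NaN and None, and True for any string,
    # including the empty string.
    if pd.notnull(value):
        return re.sub(r'\.', '', value)
    return value


# Hypothetical data: strings plus one genuine missing value.
codes = pd.Series(['1.2', '', np.nan, 'a.b.c'], dtype=object)
cleaned = codes.apply(strip_dots)
print(cleaned.tolist())
```

This assumes every non-missing value is a string. A numeric value would still reach re.sub and raise a TypeError, so a mixed column would need an extra isinstance check.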
However, if you just want to replace something, then just use .str.replace:
In [58]:
input_df['D'].str.replace('.', '', regex=False)
Out[58]:
0
1 3
2 3
3 0
4 3
Name: D, dtype: object
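A short sketch of how .str.replace behaves when the column does contain a real NaN, again on hypothetical data. The .str accessor leaves missing values as they are, so no explicit guard is needed. regex=False is passed explicitly here because the default for that parameter differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: strings plus one genuine missing value.
codes = pd.Series(['1.2', '', np.nan, 'a.b.c'], dtype=object)

# Literal replacement: '.' is treated as a plain character, not a regex wildcard.
cleaned = codes.str.replace('.', '', regex=False)
print(cleaned.tolist())
```

The vectorised call avoids a Python-level function call per row, which is usually the main reason to prefer it over apply for simple string edits.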