Error: cannot convert float NaN to integer in pandas
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/44896377/
Asked by
I have this DataFrame:
             a  b    c    d
0          nan  Y  nan  nan
1  1.27838e+06  N    3   96
2          nan  N    2  nan
3       284633  Y  nan   44
I am trying to convert the non-null values to integer type to avoid exponential notation (1.27838e+06):
f=lambda x : int(x)
df['a']=np.where(df['a']==None,np.nan,df['a'].apply(f))
But I still get the error, even though I only want to change the dtype of the non-null values. Can anyone point out my mistake? Thanks.
Answered by Ken Wei
Pandas cannot store NaN values in a default (NumPy-backed) integer column. Strictly speaking, you can have a column with mixed data types, but this can be computationally inefficient. So if you insist, you can do:
df['a'] = df['a'].astype('O')
df.loc[df['a'].notnull(), 'a'] = df.loc[df['a'].notnull(), 'a'].astype(int)
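A minimal, self-contained sketch of the approach above, for readers who want to run it. The frame is rebuilt from the printed output in the question, so the exact value 1278380 is an assumption (the question only shows 1.27838e+06), and the `mask` and `failed` names are introduced here for illustration. It also shows where the original error likely comes from: `int()` is applied to every row, including the NaN ones, before `np.where` gets a chance to choose between its branches.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; 1278380.0 is an approximation of
# the value printed as 1.27838e+06 (assumption, not from the original).
df = pd.DataFrame({
    "a": [np.nan, 1278380.0, np.nan, 284633.0],
    "b": ["Y", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
    "d": [np.nan, 96, np.nan, 44],
})

# int() cannot represent NaN, so applying it to every row raises ValueError.
try:
    df["a"].apply(int)
    failed = False
except ValueError:
    failed = True

# Object-dtype approach from the answer, with the mask computed once.
mask = df["a"].notnull()
df["a"] = df["a"].astype("O")
df.loc[mask, "a"] = df.loc[mask, "a"].astype(int)

print(failed)
print(df["a"].tolist())
```

The column ends up with dtype object, which is the trade-off the answer mentions: integers and NaN can sit side by side, but vectorized numeric operations on that column become slower.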
Answered by lmiguelvargasf
As far as I have read in the pandas documentation, it is not possible to represent an integer NaN:
"In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays."
As is explained later in the documentation, this is for memory and performance reasons, and also so that the resulting Series continues to be "numeric". One possibility is to use dtype=object arrays instead.
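The behaviour described in this answer can be illustrated with a short sketch. The sample values are taken from column a of the question (1278380 is an approximation of the printed 1.27838e+06). The last step uses the nullable `Int64` extension dtype, which is not mentioned in either answer; it is an addition here and assumes pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

values = [np.nan, 1278380, np.nan, 284633]

# Default inference: a single NaN upcasts the whole Series to float64.
as_float = pd.Series(values)

# dtype=object keeps each element as the Python object it was given,
# so the integers stay integers and NaN stays a float.
as_object = pd.Series(values, dtype=object)

# Assumption: pandas >= 0.24, which provides the nullable "Int64" dtype
# (capital "I"); missing entries are stored as a missing-value marker.
as_nullable = as_float.astype("Int64")

print(as_float.dtype, as_object.dtype, as_nullable.dtype)
```

If the installed pandas version supports it, the nullable dtype may be the more convenient option, since the column stays numeric while still allowing missing values.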