
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38918653/

Time: 2020-08-19 21:37:37  Source: igfitidea

pandas invalid literal for long() with base 10 error

python pandas dataframe casting int

Asked by Night Walker

I am trying to do: df['Num_Detections'] = df['Num_Detections'].astype(int)

And I get the following error:

ValueError: invalid literal for long() with base 10: '12.0'

My data looks like this:

>>> df['Num_Detections'].head()
Out[6]: 
sku_name
DOBRIY MORS GRAPE-CRANBERRY-RASBERRY 1L     12.0
AQUAMINERALE 5.0L                            9.0
DOBRIY PINEAPPLE 1.5L                        2.0
FRUKT.SAD APPLE 0.95L                      154.0
DOBRIY PEACH-APPLE 0.33L                    71.0
Name: Num_Detections, dtype: object

Any idea how to do the conversion correctly?

Thanks for the help.

Answered by jezrael

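At least one value in the column cannot be converted to int directly. The column has dtype object, and the error message shows the string '12.0', which the integer parser rejects because of the decimal point.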
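As a minimal sketch of why the conversion fails, the snippet below parses the string from the error message in plain Python, without pandas. The helper name to_int is hypothetical and only for illustration. On Python 3 the message names int() rather than long(), but the cause is the same.

```python
def to_int(text):
    """Parse a numeric string such as '12.0' by going through float first.

    Hypothetical helper for illustration; note that it truncates
    any fractional part (for example '12.7' becomes 12).
    """
    return int(float(text))


# int() only accepts strings that look like integers, so '12.0' is rejected.
try:
    int("12.0")
    direct_parse_failed = False
except ValueError:
    direct_parse_failed = True

print(direct_parse_failed)
print(to_int("12.0"))
```

Going through float first is what a numeric conversion of the whole column effectively does for such strings.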

You can use to_numeric with errors='coerce' to get NaN wherever a value is problematic:

df['Num_Detections'] = pd.to_numeric(df['Num_Detections'], errors='coerce')

If you need to check the rows with problematic values, use boolean indexing with a mask built from isnull:

print (df[ pd.to_numeric(df['Num_Detections'], errors='coerce').isnull()])

Sample:

df = pd.DataFrame({'Num_Detections':[1,2,'a1']})

print (df)
  Num_Detections
0              1
1              2
2             a1

print (df[ pd.to_numeric(df['Num_Detections'], errors='coerce').isnull()])
  Num_Detections
2             a1

df['Num_Detections'] = pd.to_numeric(df['Num_Detections'], errors='coerce')
print (df)
   Num_Detections
0             1.0
1             2.0
2             NaN
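
The coerced column above is float, because NaN cannot be stored in a plain integer column. If an integer dtype is still wanted, as in the question, one possible follow-up is sketched below. The sample values are made up for illustration, and the nullable "Int64" dtype assumes pandas 0.24 or later.

```python
import pandas as pd

# Made-up sample: two numeric strings like those in the question, one bad value.
df = pd.DataFrame({"Num_Detections": ["12.0", "9.0", "a1"]})

# Step 1: parse strings to floats; unparseable values become NaN.
numeric = pd.to_numeric(df["Num_Detections"], errors="coerce")

# Option A: nullable integer dtype, which keeps the missing value as <NA>.
as_nullable_int = numeric.astype("Int64")

# Option B: drop the rows that failed to parse, then cast to plain int.
as_plain_int = numeric.dropna().astype(int)

print(as_nullable_int)
print(as_plain_int)
```

Option A keeps every row and marks the bad one as missing, while option B shortens the column. Which one fits depends on whether the problematic rows should survive the cleanup.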