Disclaimer: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/43858595/

Time: 2020-09-14 03:33:52  Source: igfitidea

How do I fix invalid literal for int() with base 10 error in pandas

Tags: python-2.7, pandas, int, jupyter-notebook, valueerror

Asked by Caribgirl

This is the error that shows up whenever I try to convert the DataFrame to int.

("invalid literal for int() with base 10: '260,327,021'", 'occurred at index Population1'

Everything in the df is a number. I assume the error is due to the extra quote at the end, but how do I fix it?

Answered by piRSquared

I run this:

int('260,327,021')

and get this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-448-a3ba7c4bd4fe> in <module>()
----> 1 int('260,327,021')

ValueError: invalid literal for int() with base 10: '260,327,021'

I assure you that not everything in your DataFrame is a number. It may look like a number, but it is a string with commas in it.

You'll want to remove the commas and then convert to an int:

pd.Series(['260,327,021']).str.replace(',', '').astype(int)

0    260327021
dtype: int64
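
To apply the same idea to a whole column, here is a minimal sketch. The sample values and the semicolon-separated CSV text are made up for illustration; only the column name Population1 comes from the question. The second half assumes the data is loaded with read_csv, in which case its thousands parameter can strip the separators while parsing.

```python
import io

import pandas as pd

# Hypothetical data; only the column name Population1 comes from the question.
df = pd.DataFrame({"Population1": ["260,327,021", "1,234", "56"]})

# Remove the thousands separators as plain text, then cast to int.
df["Population1"] = (
    df["Population1"].str.replace(",", "", regex=False).astype(int)
)

# Alternative when loading from a file: let read_csv handle the separators.
# The CSV text and the ';' delimiter are assumptions for this example.
csv_text = "Country;Population1\nA;260,327,021\nB;1,234\n"
df_csv = pd.read_csv(io.StringIO(csv_text), sep=";", thousands=",")
```

Passing regex=False makes the comma a literal string rather than a pattern, which avoids surprises across pandas versions.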

Answered by kristian

Others might hit the same error when the string represents a float:

>>> int("34.54545")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '34.54545'

The workaround is to convert to a float first and then to an int:

>>> int(float("34.54545"))
34

Or, specific to pandas:

df.astype(float).astype(int)
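
One thing worth knowing about this approach: casting a float to int truncates toward zero rather than rounding. A small sketch with made-up values, in case rounding is what you actually want:

```python
import pandas as pd

# Hypothetical float-like strings for illustration.
s = pd.Series(["34.54545", "12.0", "-7.9"])

# float -> int drops the fractional part (truncation toward zero).
truncated = s.astype(float).astype(int)

# Round to the nearest whole number first if truncation is not desired.
rounded = s.astype(float).round().astype(int)
```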

Answered by Abhishek Sinha

I solved the error using pandas.to_numeric.

In your case:

data.Population1 = pd.to_numeric(data.Population1, errors="coerce")

Here, 'data' is the parent object.

After that, you can convert the float column to int as well:

data.Population1.astype(int)
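
A caveat with errors="coerce": any value that cannot be parsed becomes NaN, and that includes strings with commas such as '260,327,021'. A plain astype(int) then fails on the NaN values. A hedged sketch with made-up sample values, stripping the commas first and using the nullable Int64 dtype (an assumption on my part, it is not in the original answer) so missing values survive the cast:

```python
import pandas as pd

# Hypothetical values; "n/a" stands in for a genuinely unparseable entry.
s = pd.Series(["260,327,021", "1,000", "n/a"])

# Coercing directly turns every comma-formatted string into NaN.
naive = pd.to_numeric(s, errors="coerce")

# Strip the commas first so only truly bad values are coerced to NaN.
cleaned = s.str.replace(",", "", regex=False)
nums = pd.to_numeric(cleaned, errors="coerce")

# The nullable integer dtype keeps missing values as <NA> instead of raising.
as_int = nums.astype("Int64")
```

If you would rather not keep missing values, dropping or filling them before astype(int) is another option.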