Original question: http://stackoverflow.com/questions/45177209/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Error: Unable to parse string "*" at position 6116 - Convert Object Type to Int - Pandas
Asked by i.n.n.m
This question has been asked in many threads, and the answers worked for others but not for me. I am trying to convert a column of object dtype to int so that I can perform a group-by aggregation.
Below is what I have tried and the errors I got so far (I am using Python 3).
According to this link, I tried these two:
df['my_var'] = df['my_var'].astype(str).astype(int)
df['my_var'] = df['my_var'].astype(int)
Same error for both:
ValueError: invalid literal for int() with base 10: '*'
Then I tried:
df['my_var'] = pd.to_numeric(df['my_var'])
I got an error:
ValueError: Unable to parse string "*" at position 6116
This is what df.dtypes looks like:
print(df.dtypes)
my_var object
dtype: object
I know some similar questions have been downvoted; however, I did not succeed using their answers. Is it a version issue? I am finding it difficult to understand this error. Any help or suggestions would be appreciated.
Accepted answer by i.n.n.m
After getting suggestions from #DYZ and #MaxU, I found that the error was caused by the special character * in one row of my DataFrame. (In hindsight, the error message said exactly that: position 6116 is the row where the value could not be parsed.)
As suggested, using
df[df['my_var']=='*']
and
df.loc[pd.to_numeric(df['my_var'], errors='coerce').isnull()]
I found exactly where the special character was. I then used a regular expression to strip the special characters, following this thread.
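The locate-then-strip steps above can be sketched as follows. This is only an illustration under assumptions: the sample values are made up (the real DataFrame is not shown), and the pattern \D (remove every non-digit) is one possible regular expression, not necessarily the one from the linked thread. It also assumes the column holds non-negative whole numbers, since \D would strip minus signs and decimal points too.

```python
import pandas as pd

# Hypothetical sample data standing in for the real DataFrame.
df = pd.DataFrame({"my_var": ["1", "22", "*", "3*", "40"]})

# Locate rows that cannot be parsed: errors="coerce" turns them into NaN.
bad_mask = pd.to_numeric(df["my_var"], errors="coerce").isnull()
bad_rows = df.loc[bad_mask]

# Strip every non-digit character (assumption: non-negative integers only).
cleaned = df["my_var"].astype(str).str.replace(r"\D", "", regex=True)

# A value that was only "*" is now an empty string; coerce it to NaN,
# drop those rows, and convert what remains to int.
df_clean = (
    df.assign(my_var=pd.to_numeric(cleaned, errors="coerce"))
      .dropna(subset=["my_var"])
      .astype({"my_var": int})
)

print(bad_rows)
print(df_clean)
```

Whether to drop a row that contained nothing but * or to keep it with a placeholder depends on what the aggregation should mean, so treat the dropna step as a choice rather than a requirement.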
Answer by A.Kot
I used 0 to replace anything that isn't a number, but you can use any other value that makes sense for your data, e.g. -999999 (obviously not a recommended practice, just an example):
pd.to_numeric(df.my_var, errors='coerce').fillna(0).astype(int)
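For illustration, here is a small sketch of how that one-liner behaves, together with a possible alternative that is not part of the original answer: pandas' nullable Int64 dtype, which keeps unparseable values as missing instead of replacing them with 0. The sample data and the group column are hypothetical, added only so the group-by aggregation mentioned in the question can be shown.

```python
import pandas as pd

# Hypothetical sample data; "group" is an assumed column for the aggregation.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "my_var": ["1", "*", "3", "4"],
})

# The approach from this answer: unparseable values become NaN, then 0.
as_int = pd.to_numeric(df["my_var"], errors="coerce").fillna(0).astype(int)

# Alternative (assumption, not from the answer): keep missing values as <NA>
# using the nullable integer dtype, so they are not counted as real zeros.
as_nullable = pd.to_numeric(df["my_var"], errors="coerce").astype("Int64")

# Group-by aggregation on the converted column.
totals = as_int.groupby(df["group"]).sum()

print(as_int.tolist())
print(as_nullable.isna().tolist())
print(totals)
```

Filling with 0 leaves sums unchanged but pulls averages and minimums toward 0, which is worth keeping in mind before choosing the fill value.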