Cannot convert string to float in pandas (ValueError)
Original question: http://stackoverflow.com/questions/39125665/
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original address, and attribute it to the original authors on Stack Overflow.
Asked by John_Mtl
I have a dataframe created from JSON output that looks like this:
        Total Revenue  Average Revenue  Purchase count    Rate
Date
Monday   1,304.40 CA$        20.07 CA$           2,345  1.54 %
The values are received from the JSON as strings. I am trying to:
1) Remove the non-numeric characters from each entry (e.g. CA$ or %)
2) Convert the rate and revenue columns to float
3) Convert the count column to int
I tried to do the following:
df[column] = (df[column].str.split()).apply(lambda x: float(x[0]))
It works fine except when a value contains a comma (e.g. 1,465 fails whereas 143 works).
I tried several functions to replace "," with "", but nothing has worked so far. I always receive the following error:
ValueError: could not convert string to float: '1,304.40'
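The error can be reproduced without pandas, because Python's built-in float() does not accept thousands separators. A minimal sketch, assuming the raw strings look like the ones in the table above (the variable names are illustrative only):

```python
raw = "1,304.40 CA$"

# Splitting on whitespace drops the currency suffix but keeps the comma
token = raw.split()[0]

# float() rejects the comma, which is what raises the ValueError
try:
    float(token)
    message = None
except ValueError as exc:
    message = str(exc)

# A value without a comma converts without any problem
plain = float("143 CA$".split()[0])
```

Here message holds the same text as the traceback above, while plain is converted successfully.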
Answered by DeepSpace
These strings use commas as thousands separators, so you will have to remove them before the call to float:
df[column] = (df[column].str.split()).apply(lambda x: float(x[0].replace(',', '')))
This can be simplified a bit by moving split inside the lambda:
df[column] = df[column].apply(lambda x: float(x.split()[0].replace(',', '')))
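To try this end to end, the following sketch rebuilds the sample row from the question and applies the same conversion to every column. The DataFrame construction and the helper name to_float are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question; every value arrives as a string
df = pd.DataFrame(
    {
        "Total Revenue": ["1,304.40 CA$"],
        "Average Revenue": ["20.07 CA$"],
        "Purchase count": ["2,345"],
        "Rate": ["1.54 %"],
    },
    index=pd.Index(["Monday"], name="Date"),
)

def to_float(text):
    # Keep the leading token, strip thousands separators, then convert
    return float(text.split()[0].replace(",", ""))

for column in df.columns:
    df[column] = df[column].apply(to_float)

# Counts are whole numbers, so cast that column to int afterwards
df["Purchase count"] = df["Purchase count"].astype(int)
```

A named function is used instead of the lambda only to make the three steps (split, replace, convert) easier to read; the behavior is the same.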
Answered by jezrael
Another solution uses a list comprehension, which is needed when applying string functions that work only on a Series (a column of a DataFrame), such as str.split and str.replace:
df = pd.concat([df[col].str.split()
                       .str[0]
                       .str.replace(',', '').astype(float) for col in df], axis=1)
# if needed, convert the Purchase count column to int
df['Purchase count'] = df['Purchase count'].astype(int)
print(df)
        Total Revenue  Average Revenue  Purchase count  Rate
Date
Monday         1304.4            20.07            2345  1.54
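As a variation on the vectorized approach (not taken from the original answer), the same string methods can be wrapped in a small helper and combined with pd.to_numeric, which picks an integer or float dtype per column. The helper name clean_column and the sample DataFrame are assumptions for illustration.

```python
import pandas as pd

# Sample data mirroring the question; every value arrives as a string
df = pd.DataFrame(
    {
        "Total Revenue": ["1,304.40 CA$"],
        "Average Revenue": ["20.07 CA$"],
        "Purchase count": ["2,345"],
        "Rate": ["1.54 %"],
    },
    index=pd.Index(["Monday"], name="Date"),
)

def clean_column(series):
    # Take the first whitespace-separated token of every entry
    first_token = series.str.split().str[0]
    # Remove thousands separators as a literal (non-regex) replacement
    no_commas = first_token.str.replace(",", "", regex=False)
    # Let pandas choose int or float based on the cleaned strings
    return pd.to_numeric(no_commas)

for col in df.columns:
    df[col] = clean_column(df[col])
```

With this version the Purchase count column should come out as an integer column without a separate astype(int) call, because its cleaned strings contain no decimal point.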