Python pandas: convert strings to float for multiple columns in a DataFrame

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/16643695/

Time: 2020-08-18 23:15:21  Source: igfitidea

pandas convert strings to float for multiple columns in dataframe

python pandas

Asked by user1507844

I'm new to pandas and trying to figure out how to convert multiple columns that are formatted as strings to float64. Currently I'm doing the below, but it seems like apply() or applymap() should be able to accomplish this task more efficiently. Unfortunately, I'm a bit too much of a rookie to figure out how. The values are percentages formatted as strings like '15.5%'.

for column in ['field1', 'field2', 'field3']:
    data[column] = data[column].str.rstrip('%').astype('float64') / 100

Accepted answer by Jeff

Starting in pandas 0.11.1 (coming out this week), replace() has a new option to replace with a regex, so this becomes possible:

In [14]: df = DataFrame('10.0%',index=range(100),columns=range(10))

In [15]: df.replace('%','',regex=True).astype('float')/100
Out[15]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 10 columns):
0    100  non-null values
1    100  non-null values
2    100  non-null values
3    100  non-null values
4    100  non-null values
5    100  non-null values
6    100  non-null values
7    100  non-null values
8    100  non-null values
9    100  non-null values
dtypes: float64(10)

And it is a bit faster than applymap():

In [16]: %timeit df.replace('%','',regex=True).astype('float')/100
1000 loops, best of 3: 1.16 ms per loop

In [18]: %timeit df.applymap(lambda x: float(x[:-1]))/100
1000 loops, best of 3: 1.67 ms per loop
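
For anyone who wants to try the regex approach outside an IPython session, here is a minimal self-contained sketch. It assumes pandas is imported as pd; the sample values and the name result are made up for illustration, and the column names are simply borrowed from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names are borrowed from the question.
df = pd.DataFrame({
    'field1': ['15.5%', '3%'],
    'field2': ['10.0%', '100%'],
    'field3': ['0.5%', '42%'],
})

# Drop the '%' via a regex replace, cast to float, then scale to a fraction.
result = df.replace('%', '', regex=True).astype('float') / 100

print(result.dtypes)  # every column should now be float64
```

Note that regex=True is what makes replace() match '%' inside each string rather than only cells that are exactly '%'.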

Answer by waitingkuo

df.applymap(lambda x:float(x.rstrip('%'))/100)
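
The one-liner above can be tried on a small frame. This is only a sketch: the sample data and the names to_fraction and elementwise are illustrative, not part of the answer. As far as I know, pandas 2.1 and later deprecate applymap in favour of DataFrame.map, so the sketch uses whichever one is available.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'field1': ['15.5%', '3%'], 'field2': ['10.0%', '100%']})

def to_fraction(value):
    # Strip a trailing '%' and scale to a fraction, e.g. '15.5%' -> 0.155.
    return float(value.rstrip('%')) / 100

# Prefer DataFrame.map when it exists; otherwise fall back to applymap.
elementwise = df.map if hasattr(df, 'map') else df.applymap
converted = elementwise(to_fraction)

print(converted)
```

Because the function runs once per cell in Python, this is usually the slower option on large frames, which matches the timings in the accepted answer.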

Answer by nigel76

Answering a comment on the accepted answer: for specific columns, make sure you don't do it in place; assign the result back to the column instead.

df['Column1'] = df['Column1'].replace('%','',regex=True).astype('float')/100
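
The same idea seems to extend to several columns at once, which is what the question asked for. A minimal sketch, assuming a frame with a mix of percentage and non-percentage columns; the sample data, the label column, and the list name cols are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: two percentage columns plus one to leave alone.
data = pd.DataFrame({
    'field1': ['15.5%', '20%'],
    'field2': ['1%', '2.5%'],
    'label': ['a', 'b'],
})

# Only these columns hold percentage strings.
cols = ['field1', 'field2']

# Convert the selected columns and assign the result back (not in place).
data[cols] = data[cols].replace('%', '', regex=True).astype('float') / 100

print(data.dtypes)
```

Selecting the columns first keeps the regex replace away from text columns such as label, where stripping '%' might not be wanted.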