Original source: http://stackoverflow.com/questions/16643695/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
pandas convert strings to float for multiple columns in dataframe
Asked by user1507844
I'm new to pandas and trying to figure out how to convert multiple columns that are formatted as strings to float64. Currently I'm doing the below, but it seems like apply() or applymap() should be able to accomplish this task more efficiently. Unfortunately, I'm too much of a rookie to figure out how. The values are percentages formatted as strings like '15.5%'.
for column in ['field1', 'field2', 'field3']:
    data[column] = data[column].str.rstrip('%').astype('float64') / 100
Accepted answer by Jeff
Starting in pandas 0.11.1 (coming out this week), replace has a new option to treat the pattern as a regex, so this becomes possible:
In [14]: df = DataFrame('10.0%',index=range(100),columns=range(10))
In [15]: df.replace('%','',regex=True).astype('float')/100
Out[15]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 10 columns):
0    100  non-null values
1    100  non-null values
2    100  non-null values
3    100  non-null values
4    100  non-null values
5    100  non-null values
6    100  non-null values
7    100  non-null values
8    100  non-null values
9    100  non-null values
dtypes: float64(10)
It is also a bit faster than applymap:
In [16]: %timeit df.replace('%','',regex=True).astype('float')/100
1000 loops, best of 3: 1.16 ms per loop
In [18]: %timeit df.applymap(lambda x: float(x[:-1]))/100
1000 loops, best of 3: 1.67 ms per loop
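For readers who want to try this outside an IPython session, the approach above can be sketched as a standalone script. The column names and sample values are made up for illustration; only the replace(..., regex=True) and astype calls come from the answer.

```python
import pandas as pd

# Hypothetical sample data: percentages stored as strings.
df = pd.DataFrame({
    "field1": ["15.5%", "20%", "3.25%"],
    "field2": ["100%", "0%", "50%"],
})

# Remove the '%' sign via a regex replace, cast every column to float,
# then divide by 100 to turn percentages into fractions.
converted = df.replace("%", "", regex=True).astype("float") / 100

print(converted.dtypes)
print(converted)
```

The original frame is left unchanged here; the converted values live in a new DataFrame.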
Answer by waitingkuo
df.applymap(lambda x:float(x.rstrip('%'))/100)
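A self-contained sketch of the same element-wise idea follows. Note that DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, so the sketch picks whichever method the installed version provides. The sample data and the helper name percent_to_fraction are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"a": ["10.0%", "2.5%"], "b": ["75%", "0.5%"]})

def percent_to_fraction(text):
    """Convert a string such as '15.5%' to the float 0.155."""
    return float(text.rstrip("%")) / 100

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
converted = elementwise(percent_to_fraction)

print(converted)
```

Because the function is called once per cell in Python, this tends to be slower than a vectorized approach on large frames, which is consistent with the timings in the accepted answer.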
Answer by nigel76
Answering a comment on the accepted answer: to convert only specific columns, don't do the replacement in place; assign the result back to the column instead.
df['Column1'] = df['Column1'].replace('%','',regex=True).astype('float')/100
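Extending this to several columns at once might look like the sketch below. The frame, the column names, and the percent_columns list are hypothetical; the per-column conversion is the one shown above.

```python
import pandas as pd

# Hypothetical data: two percentage columns plus one column to leave alone.
data = pd.DataFrame({
    "field1": ["15.5%", "20%"],
    "field2": ["1%", "99.9%"],
    "label": ["x", "y"],
})

percent_columns = ["field1", "field2"]

# Convert each listed column and assign the result back by name,
# so the other columns are never touched.
for name in percent_columns:
    data[name] = data[name].replace("%", "", regex=True).astype("float") / 100

print(data.dtypes)
```

Assigning each converted Series back by column name replaces the old string column with a float64 one, which is why no inplace argument is needed.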

