Python: Converting strings to floats in a DataFrame

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/16729483/

Time: 2020-08-18 23:27:23  Source: igfitidea

Converting strings to floats in a DataFrame

python, pandas

Asked by Neer

How can I convert a DataFrame column containing strings and NaN values to floats? There is also another column whose values are a mix of strings and floats; how can I convert that entire column to floats?

Answered by root

You can try df.column_name = df.column_name.astype(float). As for the NaN values, you need to specify how they should be converted; you can use the .fillna method to do that.

Example:

In [12]: df
Out[12]: 
     a    b
0  0.1  0.2
1  NaN  0.3
2  0.4  0.5

In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)

In [14]: df.a = df.a.astype(float).fillna(0.0)

In [15]: df
Out[15]: 
     a    b
0  0.1  0.2
1  0.0  0.3
2  0.4  0.5

In [16]: df.a.values
Out[16]: array([ 0.1,  0. ,  0.4])
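
The session above can be reproduced with a minimal, self-contained sketch. The frame below is a hypothetical reconstruction of the example data (strings plus a NaN in column a), not the answerer's original object:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the example: column "a" holds strings and NaN.
df = pd.DataFrame({"a": ["0.1", np.nan, "0.4"], "b": [0.2, 0.3, 0.5]})

# astype(float) parses the numeric strings and keeps NaN as NaN;
# fillna(0.0) then replaces the missing value with a chosen default.
df["a"] = df["a"].astype(float).fillna(0.0)

print(df["a"].dtype)
print(df["a"].tolist())
```

The 0.0 fill value is only the choice made in this answer; if missing values should stay missing, the fillna call can simply be dropped.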

Answered by Jeff

NOTE: convert_objects has now been deprecated (it was removed in pandas 1.0). You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.

convert_objects is available from pandas 0.11. It forces conversion, setting values that cannot be converted to NaN, so it works even where astype would fail. It also operates series by series, so it will not convert, say, a column made up entirely of strings.

In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))

In [11]: df
Out[11]: 
     A    B
0  1.0  1.0
1    1  foo

In [12]: df.dtypes
Out[12]: 
A    object
B    object
dtype: object

In [13]: df.convert_objects(convert_numeric=True)
Out[13]: 
   A   B
0  1   1
1  1 NaN

In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]: 
A    float64
B    float64
dtype: object
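
Since convert_objects is deprecated, a rough modern counterpart of the session above is to apply pd.to_numeric column by column. This is a sketch under that assumption, not a drop-in replacement:

```python
import pandas as pd

# Same data as in the session above.
df = pd.DataFrame({"A": ["1.0", "1"], "B": ["1.0", "foo"]})

# Apply pd.to_numeric to each column; errors="coerce" turns values that
# cannot be parsed (here "foo") into NaN instead of raising.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted)
print(converted.dtypes)
```

One difference to keep in mind: with errors="coerce", a column made up entirely of non-numeric strings becomes all NaN rather than being left untouched, so it may be safer to pass only the columns that are expected to be numeric.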

Answered by Claude COULOMBE

df['MyColumnName'] = df['MyColumnName'].astype('float64') 

Answered by Salvador Dali

In newer versions of pandas (0.17 and up), you can use the to_numeric function. It converts individual columns, and you can convert a whole DataFrame by applying it to every column, for example with df.apply(pd.to_numeric). It also lets you choose how to treat values that can't be converted to numbers:

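import pandas as pd

# Every value can be parsed, so this returns a float64 Series
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)

# 'apple' cannot be parsed; errors= selects how that is handled
s = pd.Series(['apple', '1.0', '2', -3])
pd.to_numeric(s, errors='ignore')  # returns the input unchanged (deprecated since pandas 2.2)
pd.to_numeric(s, errors='coerce')  # 'apple' becomes NaN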
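
To make the difference between the error modes concrete, here is a small sketch that contrasts the default behaviour (raising) with errors='coerce'. The variable names raised and coerced are illustrative only:

```python
import pandas as pd

s = pd.Series(["apple", "1.0", "2", -3])

# Default errors="raise": an unparseable value raises ValueError.
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# errors="coerce": unparseable values become NaN, the rest become floats.
coerced = pd.to_numeric(s, errors="coerce")

print(raised)
print(coerced)
```

Catching the exception explicitly, as above, is one way to get the effect of errors='ignore' without relying on that option.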

Answered by ArmandduPlessis

Here is an example:

                            GHI             Temp  Power Day_Type
2016-03-15 06:00:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:01:00 -7.99999952505459e-7    18.2    0   NaN
2016-03-15 06:02:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:03:00 -7.99999952505459e-7    18.3    0   NaN
2016-03-15 06:04:00 -7.99999952505459e-7    18.3    0   NaN

But if these are all string values, as they were in my case, convert the desired columns to floats:

df_inv_29['GHI'] = df_inv_29.GHI.astype(float)
df_inv_29['Temp'] = df_inv_29.Temp.astype(float)
df_inv_29['Power'] = df_inv_29.Power.astype(float)

Your DataFrame will now have float values :-)
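
The three assignments can also be written as a single step by selecting the columns as a list. The frame below is a hypothetical stand-in for df_inv_29 with every value stored as a string, and cols is an illustrative name:

```python
import pandas as pd

# Hypothetical frame with the same column names, all values stored as strings.
df_inv_29 = pd.DataFrame(
    {
        "GHI": ["-7.99999952505459e-7", "-7.99999952505459e-7"],
        "Temp": ["18.3", "18.2"],
        "Power": ["0", "0"],
    }
)

# Convert several columns at once; scientific notation is parsed as well.
cols = ["GHI", "Temp", "Power"]
df_inv_29[cols] = df_inv_29[cols].astype(float)

print(df_inv_29.dtypes)
```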

Answered by Paul Mwaniki

You have to replace empty strings ('') with np.nan before converting to float, i.e.:

df['a'] = df.a.replace('', np.nan).astype(float)
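
A self-contained sketch of this step, using a hypothetical column a that contains an empty string, might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical column with an empty string among numeric strings.
df = pd.DataFrame({"a": ["0.1", "", "0.4"]})

# astype(float) cannot parse "", so map empty strings to NaN first.
df["a"] = df["a"].replace("", np.nan).astype(float)

print(df["a"])
```

Note that this only handles exactly empty strings; values such as whitespace-only strings would need their own replacement rule.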