Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/28277137/

Time: 2020-08-19 03:02:17  Source: igfitidea

How to convert datatype:object to float64 in python?

Tags: python, pandas

Asked by Ning Chen

I am going around in circles and tried so many different ways so I guess my core understanding is wrong. I would be grateful for help in understanding my encoding/decoding issues.

I import the dataframe from SQL and it seems that some datatypes:float64 are converted to Object. Thus, I cannot do any calculation. I fail to convert the Object back to float64.

df.head()

Date       WD  Manpower  2nd    CTR    2ndU  T1   T2   T3   T4
2013/4/6   6   NaN       2,645  5.27%  0.29  407  533  454  368
2013/4/7   7   NaN       2,118  5.89%  0.31  257  659  583  369
2013/4/13  6   NaN       2,470  5.38%  0.29  354  531  473  383
2013/4/14  7   NaN       2,033  6.77%  0.37  396  748  681  458
2013/4/20  6   NaN       2,690  5.38%  0.29  361  528  541  381

df.dtypes

WD             float64
Manpower       float64
2nd             object
CTR             object
2ndU           float64
T1              object
T2              object
T3              object
T4              object
T5              object

dtype: object
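
Before converting, it can help to see which values keep a column from being numeric. The following is a minimal sketch, assuming the column holds strings such as those in the question; the sample values and the variable names `col`, `converted`, and `bad` are illustrative and not part of the original post.

```python
import pandas as pd

# Illustrative sample: two values with a thousands separator, one clean, one missing.
col = pd.Series(["2,645", "2,118", "407", None], name="2nd")

# errors="coerce" turns anything that cannot be parsed into NaN.
converted = pd.to_numeric(col, errors="coerce")

# Values that were present in the input but failed to parse.
bad = col[converted.isna() & col.notna()]
print(bad.tolist())  # ['2,645', '2,118']
```

Here the thousands separator is what blocks the conversion, which matches the '2nd' column in the question.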

SQL table:

[Image of the SQL table not included]

Accepted answer by EdChum

You can convert most of the columns by just calling convert_objects:

In [36]:

df = df.convert_objects(convert_numeric=True)
df.dtypes
Out[36]:
Date         object
WD            int64
Manpower    float64
2nd          object
CTR          object
2ndU        float64
T1            int64
T2            int64
T3            int64
T4          float64
dtype: object

For columns '2nd' and 'CTR' we can call the vectorised str methods to replace the thousands separator and remove the '%' sign, and then call astype to convert:

In [39]:

df['2nd'] = df['2nd'].str.replace(',','').astype(int)
df['CTR'] = df['CTR'].str.replace('%','').astype(np.float64)
df.dtypes
Out[39]:
Date         object
WD            int64
Manpower    float64
2nd           int32
CTR         float64
2ndU        float64
T1            int64
T2            int64
T3            int64
T4           object
dtype: object
In [40]:

df.head()
Out[40]:
        Date  WD  Manpower   2nd   CTR  2ndU   T1    T2   T3     T4
0   2013/4/6   6       NaN  2645  5.27  0.29  407   533  454    368
1   2013/4/7   7       NaN  2118  5.89  0.31  257   659  583    369
2  2013/4/13   6       NaN  2470  5.38  0.29  354   531  473    383
3  2013/4/14   7       NaN  2033  6.77  0.37  396   748  681    458
4  2013/4/20   6       NaN  2690  5.38  0.29  361   528  541    381
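
The same cleanup can be reproduced without the SQL source. This is a minimal sketch, assuming the values arrive as strings like the rows shown in the question; the small DataFrame is rebuilt by hand for illustration, and `regex=False` is passed explicitly so that ',' and '%' are treated as literal characters regardless of the pandas version.

```python
import pandas as pd

# Three rows copied from the question; numeric columns arrive as strings.
df = pd.DataFrame({
    "Date": ["2013/4/6", "2013/4/7", "2013/4/13"],
    "2nd": ["2,645", "2,118", "2,470"],
    "CTR": ["5.27%", "5.89%", "5.38%"],
})

# Strip the thousands separator and the percent sign, then cast.
df["2nd"] = df["2nd"].str.replace(",", "", regex=False).astype("int64")
df["CTR"] = df["CTR"].str.replace("%", "", regex=False).astype("float64")

print(df.dtypes)
```

Note that 'CTR' still holds percentages (5.27 rather than 0.0527) after the conversion, as in the answer's output above.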

Or you can do the string handling operations above without the call to astype, and then call convert_objects to convert everything in one go.

UPDATE

Since version 0.17.0, convert_objects is deprecated and there isn't a top-level function to do this, so you need to do:

df.apply(lambda col:pd.to_numeric(col, errors='coerce'))

See the docs and this related question: pandas: to_numeric for multiple columns
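
One thing to watch with errors='coerce' is that it replaces every unparseable value with NaN, so applying it to the whole frame would also blank out a text column such as 'Date'. The sketch below restricts the conversion to chosen columns; the sample data, the "n/a" value, and the `numeric_cols` list are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2013/4/6", "2013/4/7"],
    "T1": ["407", "257"],
    "T4": ["368", "n/a"],  # "n/a" is a made-up bad value for illustration
})

# Hypothetical list of columns that are expected to be numeric.
numeric_cols = ["T1", "T4"]

for col in numeric_cols:
    # Unparseable entries become NaN instead of raising an error.
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

A column that contains NaN after coercion ends up as float64, while a fully parseable integer column becomes int64.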

Answer by Nirali Khoda

You can try this:

df['2nd'] = pd.to_numeric(df['2nd'].str.replace(',', ''))
df['CTR'] = pd.to_numeric(df['CTR'].str.replace('%', ''))

Answer by Amir

Or you can use a regular expression to handle several characters at once, as a more general solution to this issue:

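# regex=True is required on pandas >= 2.0, where str.replace matches literally by default
df['2nd'] = pd.to_numeric(df['2nd'].str.replace(r'[,.%]', '', regex=True))
# keep only digits and the decimal point, so '5.27%' becomes '5.27'
df['CTR'] = pd.to_numeric(df['CTR'].str.replace(r'[^\d.]', '', regex=True))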
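
The regular-expression idea can be wrapped in a small reusable function. This is only a sketch: `clean_numeric` is a hypothetical helper name, and the choice to keep digits, the decimal point, and a minus sign is an assumption about what counts as part of a number in the data.

```python
import pandas as pd

def clean_numeric(series: pd.Series) -> pd.Series:
    """Hypothetical helper: drop everything except digits, '.', and '-', then convert."""
    stripped = series.astype(str).str.replace(r"[^\d.-]", "", regex=True)
    # Anything still unparseable (for example an empty string) becomes NaN.
    return pd.to_numeric(stripped, errors="coerce")

second = clean_numeric(pd.Series(["2,645", "2,118"]))
ctr = clean_numeric(pd.Series(["5.27%", "6.77%"]))
print(second.tolist(), ctr.tolist())
```

Because the helper keeps '.', it is not suitable for data that uses '.' as a thousands separator.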

Answer by Sesquipedalism

convert_objects is deprecated.

For pandas >= 0.17.0, use pd.to_numeric

df["2nd"] = pd.to_numeric(df["2nd"])

Answer by S. Jessen

I had this problem in a DataFrame (df) created from an Excel-sheet with several internal header rows.

After cleaning out the internal header rows from df, the columns' values were of "non-null object" type (DataFrame.info()).

This code converted all numerical values of multiple columns to int64 and float64 in one go:

for i in range(0, len(df.columns)):
    df.iloc[:,i] = pd.to_numeric(df.iloc[:,i], errors='ignore')
    # errors='ignore' lets strings remain as 'non-null objects'
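
As far as I know, errors='ignore' is deprecated in recent pandas releases (2.2 and later), and assigning through iloc may keep the existing object dtype on newer versions. A possible alternative, sketched here under those assumptions, is to assign by column name and catch the conversion error explicitly; the sample DataFrame is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b"],
    "count": ["1", "2"],
    "ratio": ["0.5", "1.5"],
})

for col in df.columns:
    try:
        # Replaces the whole column, so the new dtype takes effect.
        df[col] = pd.to_numeric(df[col])
    except (ValueError, TypeError):
        # Column holds non-numeric values; leave it unchanged.
        pass

print(df.dtypes)
```

With this loop a column is either converted in full or left untouched, which mirrors the behaviour the answer relies on.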