pandas Python - convert object data type to integer, string or float based on data in dataframe column

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/47785101/

Time: 2020-09-14 04:53:52  Source: igfitidea

Python - convert object data type to integer, string or float based on data in dataframe column

Tags: python, pandas, numpy

Asked by vbala2014

I have the following issue. I have a dataframe with columns of various types (int, float, string, etc.), but because the data was imported into Python from a .csv file, every column shows up with the object data type. An example is below:

print(df_centers)

Output:

center name                     ID     state  activity type  cost  usage
Bay area recreational facility  10019  LA     swimming             0.5%
Ith area recreational facility  10020  NY     basketball     0     100%

All of these columns have data type = object. I am trying to convert the object columns to more relevant and meaningful data types. Example below:

df_centers['cost'] = df_centers['cost'].astype('int')

I am trying to convert the cost field to int because I have to do some analysis on it later, but Python raises the following error:

ValueError: invalid literal for long() with base 10: ''

I have also tried converting usage (values such as 0.5%) to float with the line below, and it also raises an error:

df_centers['usage'] = df_centers['usage'].astype('float')

The output I get is:

invalid literal for float(): 100%
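
For readers who want to reproduce the two failures, here is a minimal sketch. The two-row series are hypothetical stand-ins for the `cost` and `usage` columns, and the exact wording of the messages depends on the Python and pandas versions (the messages quoted above come from Python 2).

```python
import pandas as pd

# Hypothetical two-row stand-ins for the 'cost' and 'usage' columns.
cost = pd.Series(["", "0"], dtype="object")
usage = pd.Series(["0.5%", "100%"], dtype="object")

errors = {}
for name, series, target in [("cost", cost, "int"), ("usage", usage, "float")]:
    try:
        series.astype(target)
    except ValueError as exc:
        # An empty string and a string ending in '%' are not valid
        # numeric literals, so the cast fails for the whole column.
        errors[name] = str(exc)

print(errors)
```

Both casts fail because `astype` parses every value as-is: a single blank cell or a stray `%` is enough to abort the conversion.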

Any suggestions on how I can convert these columns from object to a more relevant data type?

Answered by Gary02127

Usually, if the data comes from something richer than a plain comma-delimited file, such as an Excel file, each "object" has a type and value which may help you decipher what's what.

In the interim, to convert money values to numbers, strip off the leading '$' and convert to float. For the percentages, strip off the %, convert the number to float, then divide it by 100.

So, this:

df_centers['cost']  = df_centers['cost'].astype('int')
df_centers['usage'] = df_centers['usage'].astype('float')

should be:

df_centers['cost']  = df_centers['cost'].str.lstrip('$').astype('int')
#                                       ^^^^^^^^^^^^^^^^
df_centers['usage'] = df_centers['usage'].str.rstrip('%').astype('float') / 100.0
#                                        ^^^^^^^^^^^^^^^^
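
The question's sample data also has a blank `cost` cell, which an integer cast cannot parse even after stripping the `$`. The sketch below is one possible way to handle that, not part of the original answer: it swaps `astype` for `pd.to_numeric(..., errors="coerce")`, which turns unparseable values into NaN. The sample frame is a hypothetical reconstruction of the table in the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's table; the blank cost
# in the first row is kept as an empty string.
df_centers = pd.DataFrame({
    "center name": ["Bay area recreational facility",
                    "Ith area recreational facility"],
    "ID": ["10019", "10020"],
    "state": ["LA", "NY"],
    "activity type": ["swimming", "basketball"],
    "cost": ["", "0"],
    "usage": ["0.5%", "100%"],
}, dtype="object")

# Strip a leading '$' if present, then coerce blanks to NaN instead of
# raising. The column becomes float because NaN is a float value.
df_centers["cost"] = pd.to_numeric(
    df_centers["cost"].str.lstrip("$"), errors="coerce"
)

# Strip the trailing '%', convert to float, and scale to a fraction.
df_centers["usage"] = pd.to_numeric(
    df_centers["usage"].str.rstrip("%"), errors="coerce"
) / 100.0

# The IDs contain only digits here, so a plain integer cast is safe.
df_centers["ID"] = df_centers["ID"].astype("int64")

print(df_centers.dtypes)
```

With this approach `cost` ends up as a float column rather than int. If integers are required, the NaN values would first need to be filled or dropped.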