Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/25669588/

Date: 2020-08-18 23:28:09  Source: igfitidea

Convert percent string to float in pandas read_csv

Tags: python, pandas

Asked by KieranPC

Is there a way to convert values like '34%' directly to int or float when using read_csv in pandas? I would like the value to be read directly as 0.34.

Passing this to read_csv did not work:

read_csv(..., dtype={'col':np.float})

After loading the CSV as 'df', this did not work either; it failed with the error "invalid literal for float(): 34%":

df['col'] = df['col'].astype(float)

I ended up using this, which works but is long-winded:

df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100

Accepted answer by EdChum

You can define a custom function that converts your percentages to floats and pass it to read_csv through the converters parameter:

In [149]:
import io
import pandas as pd

# dummy data
temp1 = """index col
113 34%
122 50%
123 32%
301 12%"""
# custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    # strip the percent sign, then scale the number to a fraction
    return float(x.strip('%'))/100
# pass to the converters param as a dict mapping column name to function
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col':p2f})
df
Out[149]:
        col
index      
113    0.34
122    0.50
123    0.32
301    0.12
In [150]:
# check that dtypes really are floats
df.dtypes
Out[150]:
col    float64
dtype: object

My percent-to-float code is courtesy of ashwini's answer to "What is a clean way to convert a string percent to a float?"
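The question also mentions cells that contain '-' instead of a number, which p2f as written would reject. A minimal sketch of a converter that tolerates them is below; the name percent_to_float, the sample CSV text, and the choice to map '-' and empty fields to NaN are assumptions for illustration, not part of the original answer.

```python
import io
import math

import pandas as pd


def percent_to_float(text):
    """Convert '34%' to 0.34; treat '-' or an empty field as missing (assumption)."""
    text = text.strip()
    if text in ('', '-'):
        return float('nan')
    return float(text.rstrip('%')) / 100


# Hypothetical sample data with one '-' placeholder.
csv_text = "index,col\n113,34%\n122,-\n123,32%\n"
df = pd.read_csv(io.StringIO(csv_text), index_col=0,
                 converters={'col': percent_to_float})
print(df)
```

Because the converter receives each raw field as a string, the placeholder check happens before any float conversion is attempted.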

Answer by Gary02127

You were very close with your df attempt. Try changing:

df['col'] = df['col'].astype(float)

to:

df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
#                     ^ use str funcs to elim '%'     ^ divide by 100
# could also be:     .str[:-1].astype(...

Pandas exposes Python's string methods on a column through the .str accessor. Just precede the string function you want with .str and see if it does what you need. (This includes string slicing, too, of course.)

Above we use .str.rstrip() to get rid of the trailing percent sign, then divide the whole column by 100.0 to convert from a percentage to the actual value. For example, 45% is equivalent to 0.45.
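As a self-contained sketch of that step: the sample values below are made up, and swapping astype('float') for pd.to_numeric with errors='coerce' is my own substitution so that the '-' placeholders mentioned in the question become NaN instead of raising.

```python
import pandas as pd

# Hypothetical sample column; '-' stands in for a missing value, as in the question.
raw = pd.Series(['34%', '-', '12%'], name='col')

# Remove the trailing percent sign from every element.
stripped = raw.str.rstrip('%')

# Parse to numbers (unparseable entries such as '-' become NaN), then scale.
fractions = pd.to_numeric(stripped, errors='coerce') / 100.0
print(fractions)
```

If the column is known to contain only well-formed percentages, the shorter .astype('float') form from the answer should be enough.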

Although .str.rstrip('%') could also just be .str[:-1], I prefer to explicitly remove the '%' rather than blindly removing the last character, just in case a value has no trailing '%'.
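To make the "just in case" concrete, here is a small comparison on made-up data where one value has no percent sign. The series contents are an assumption for illustration.

```python
import pandas as pd

# Hypothetical data: the middle value is missing its percent sign.
raw = pd.Series(['34%', '50', '7%'])

stripped = raw.str.rstrip('%').tolist()  # only removes trailing '%' characters
sliced = raw.str[:-1].tolist()           # always drops the last character

print(stripped)  # ['34', '50', '7']
print(sliced)    # ['34', '5', '7']
```

Slicing silently turns '50' into '5', while rstrip leaves it alone.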