Python: convert percent string to float in pandas read_csv
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/25669588/
Convert percent string to float in pandas read_csv
Asked by KieranPC
Is there a way to convert values like '34%' directly to int or float when using read_csv in pandas? I would like it to be read directly as 0.34.
Using this in read_csv did not work:
read_csv(..., dtype={'col':np.float})
After loading the CSV as 'df', this also did not work, failing with the error "invalid literal for float(): 34%":
df['col'] = df['col'].astype(float)
I ended up using this, which works but is long-winded:
df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100
Accepted answer by EdChum
You can define a custom function to convert your percents to floats:
In [149]:
import io
import pandas as pd

# dummy data
temp1 = """index col
113 34%
122 50%
123 32%
301 12%"""

# custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    # strip the percent sign, then scale from percent to fraction
    return float(x.strip('%')) / 100

# pass to the converters param as a dict (raw string avoids an invalid-escape warning)
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col': p2f})
df
Out[149]:
        col
index
113    0.34
122    0.50
123    0.32
301    0.12
In [150]:
# check that dtypes really are floats
df.dtypes
Out[150]:
col float64
dtype: object
My percent-to-float code is courtesy of ashwini's answer to "What is a clean way to convert a string percent to a float?"
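As a hedged sketch building on the converter idea: the question mentions cells holding '-' for missing values, which a plain float() call would reject. One possible variant maps those cells to NaN. The name p2f_safe and the sample data below are made up for illustration and are not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical sample data; '-' marks a missing value, as in the question.
sample = """index col
113 34%
122 -
123 32%"""

def p2f_safe(x):
    """Convert '34%' to 0.34; treat '-' or an empty cell as NaN."""
    x = x.strip()
    if x in ('', '-'):
        return float('nan')
    return float(x.rstrip('%')) / 100

# Converters receive each cell of the named column as a string.
df = pd.read_csv(io.StringIO(sample), sep=r'\s+', index_col=0,
                 converters={'col': p2f_safe})
print(df)
```

This keeps the cleanup inside read_csv, at the cost of a Python-level function call per cell.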
Answer by Gary02127
You were very close with your df attempt. Try changing:
df['col'] = df['col'].astype(float)
to:
df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
#                    ^ use str funcs to elim '%'      ^ divide by 100
# could also be:     .str[:-1].astype(...
Pandas supports Python's string processing ability. Just precede the string function you want with .str and see if it does what you need. (This includes string slicing, too, of course.)
Above we use .str.rstrip() to get rid of the trailing percent sign, then we divide the whole array by 100.0 to convert from percentage to actual value. For example, 45% is equivalent to 0.45.
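A minimal, self-contained sketch of those steps, assuming a column of percent strings as read_csv would load them (the sample values are invented for illustration):

```python
import pandas as pd

# Hypothetical column of percent strings.
df = pd.DataFrame({'col': ['34%', '50%', '45%']})

# Strip the trailing '%', cast to float, then scale the whole column at once.
df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
print(df['col'].tolist())
```

The division applies to every element of the column in one vectorized operation, so no explicit loop or apply is needed.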
Although .str.rstrip('%') could also just be .str[:-1], I prefer to explicitly remove the '%' rather than blindly removing the last char, just in case...
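To illustrate the "just in case" scenario, here is a small sketch with made-up data in which one value is missing its percent sign. Slicing silently drops a digit, while rstrip leaves the value intact:

```python
import pandas as pd

# Hypothetical data: the second value has no trailing '%'.
s = pd.Series(['34%', '50'])

stripped = s.str.rstrip('%')  # removes only trailing '%' characters
sliced = s.str[:-1]           # removes the last character, whatever it is

print(stripped.tolist())  # ['34', '50']
print(sliced.tolist())    # ['34', '5']
```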

