Original source: http://stackoverflow.com/questions/36332147/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Calculating and creating percentage column from two columns
Asked by djhc
I have a df (Apple_farm) and need to calculate a percentage based on the values found in two of its columns (Good_apples and Total_apples), then add the resulting values to a new column within Apple_farm called 'Perc_Good'.
I have tried:
Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) *100
However this results in this error:
TypeError: unsupported operand type(s) for /: 'str' and 'str'
Running print(Apple_farm['Good_apples']) and print(Apple_farm['Total_apples']) yields a list of numerical values; however, dividing them seems to result in them being converted to strings?
I have also tried to define a new function:
def percentage(amount, total):
    percent = amount / total * 100
    return percent
but I am unsure how to use it.
Any help would be appreciated, as I am fairly new to Python and pandas!
Answered by jezrael
I think you need to convert the string columns to float or int, because their type is string (the values only look like numbers):
# Either convert both columns to float:
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(float)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(float)
# or convert both columns to int:
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)
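If some entries are not clean numeric strings (empty cells or stray text, for example), astype raises a ValueError. One possible alternative, sketched here with made-up sample data that is not from the original question, is pd.to_numeric with errors="coerce", which turns unparseable entries into NaN instead of failing:

```python
import pandas as pd

# Hypothetical sample data: "n/a" is not a valid number.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "n/a"],
    "Total_apples": ["20", "80", "30"],
})

# errors="coerce" replaces values that cannot be parsed with NaN
# instead of raising an exception.
for col in ["Good_apples", "Total_apples"]:
    Apple_farm[col] = pd.to_numeric(Apple_farm[col], errors="coerce")

Apple_farm["Perc_Good"] = Apple_farm["Good_apples"] / Apple_farm["Total_apples"] * 100
print(Apple_farm)
```

Rows with a NaN in either column end up with NaN in Perc_Good, so they are easy to find and inspect afterwards.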
Sample:
import pandas as pd
Good_apples = ["10", "20", "3", "7", "9"]
Total_apples = ["20", "80", "30", "70", "90"]
d = {"Good_apples": Good_apples, "Total_apples": Total_apples}
Apple_farm = pd.DataFrame(d)
print(Apple_farm)
   Good_apples  Total_apples
0           10            20
1           20            80
2            3            30
3            7            70
4            9            90

print(Apple_farm.dtypes)
Good_apples     object
Total_apples    object
dtype: object

print(Apple_farm.at[0, 'Good_apples'])
10

print(type(Apple_farm.at[0, 'Good_apples']))
<class 'str'>

Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)

print(Apple_farm.dtypes)
Good_apples     int32
Total_apples    int32
dtype: object

print(Apple_farm.at[0, 'Good_apples'])
10

print(type(Apple_farm.at[0, 'Good_apples']))
<class 'numpy.int32'>

Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) * 100

print(Apple_farm)
   Good_apples  Total_apples  Perc_Good
0           10            20       50.0
1           20            80       25.0
2            3            30       10.0
3            7            70       10.0
4            9            90       10.0
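Once both columns are numeric, the percentage function from the question can be used as well, because arithmetic operators on pandas Series are applied element-wise. A minimal sketch reusing the sample data from this answer; the round(1) call is an optional addition for tidier output and is not part of the original answer:

```python
import pandas as pd

def percentage(amount, total):
    """Return amount as a percentage of total (works on scalars and Series)."""
    return amount / total * 100

# Same sample data as above, converted from strings to integers in one step.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "3", "7", "9"],
    "Total_apples": ["20", "80", "30", "70", "90"],
}).astype(int)

# Passing whole columns computes the percentage for every row at once.
Apple_farm["Perc_Good"] = percentage(
    Apple_farm["Good_apples"], Apple_farm["Total_apples"]
).round(1)
print(Apple_farm)
```

Note that a zero in Total_apples does not raise an error here; pandas returns inf (or NaN for 0/0) for that row, so such rows may need separate handling.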