pandas: How to change values in a column into binary?
Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/51016230/
Asked by Akshat Bhardwaj
I'm new to Python and I am stuck on this. My CSV file contains this:
Sr,Gender
1,Male
2,Male
3,Female
Now I want to convert the Gender values into binary so that the file will look something like:
Sr,Gender
1,1
2,1
3,0
So, I imported the CSV file as data and ran this code:
data["Gender_new"]=1
data["Gender_new"][data["Gender"]=="Male"]=0
data["Gender_new"]=data["Gender_new"].astype(float)
But I got the error ValueError: could not convert string 'Male' to float:
What am I doing wrong and how can I make this work?
Thanks
Answered by Nidhin Sajeev
Try this:
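import pandas as pd
# read_csv accepts a path directly, so the file does not need to be opened by hand
data = pd.read_csv("your.csv", sep=",")
# Keys must match the capitalization used in the CSV ('Male', 'Female')
gender = {'Male': 1, 'Female': 0}
data['Gender'] = [gender[item] for item in data['Gender']]
print(data)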
Or:
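# Use .loc with a boolean mask to avoid chained assignment; labels match the CSV's capitalization
data.loc[data['Gender'] == 'Male', 'Gender'] = 1
data.loc[data['Gender'] == 'Female', 'Gender'] = 0
print(data)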
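The dictionary lookup in this answer can also be written with Series.map, which applies the mapping to the whole column at once. The following is only a sketch: the sample rows are copied from the question, io.StringIO stands in for the real CSV file, and the name gender_codes is made up for illustration.

```python
import io
import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the CSV file
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"
data = pd.read_csv(io.StringIO(csv_text))

# Hypothetical name; keys match the capitalization used in the file
gender_codes = {"Male": 1, "Female": 0}

# Series.map looks each value up in the dict.
# A value that is not a key in the dict would become NaN instead of raising KeyError.
data["Gender"] = data["Gender"].map(gender_codes)
print(data)
```

Unlike the list comprehension, map does not fail on an unexpected label, so it may be worth checking data["Gender"].isna().any() afterwards.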
Answered by Burhan Khalid
You can do the conversion as you load the file:
import pandas
d = pandas.read_csv('yourfile.csv', converters={'Gender': lambda x: int(x == 'Male')})
The converters argument takes a dictionary whose keys are column names (or indices) and whose values are functions to call on each item in that column. Each function must return the converted value.
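To make the converters behaviour concrete, here is a small self-contained sketch. It assumes the sample rows from the question and uses io.StringIO in place of a file on disk; gender_to_binary is a hypothetical helper name, not part of pandas.

```python
import io
import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the CSV file
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"

def gender_to_binary(value):
    # Hypothetical helper: receives the raw cell text and returns 1 for 'Male', 0 otherwise
    return int(value.strip() == "Male")

# Key given as a column name
d = pd.read_csv(io.StringIO(csv_text), converters={"Gender": gender_to_binary})

# Key given as a column index (Gender is the second column, index 1)
d_by_index = pd.read_csv(io.StringIO(csv_text), converters={1: gender_to_binary})

print(d)
```

Both calls should give the same frame; the converter runs once per cell in the Gender column while the file is being parsed.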
The other way to do it is to convert the column once you have the dataframe, as @DJK pointed out in their comment:
data['Gender'] = (data['Gender'] == 'Male').astype(int)
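One thing to keep in mind with the comparison approach is that every value other than the exact string 'Male' becomes 0, including differently capitalized spellings. The sketch below illustrates this under stated assumptions: the fourth row ('male') is invented for the example, and the column names Gender_bin and Gender_bin_normalized are hypothetical.

```python
import pandas as pd

# First three rows follow the question; the fourth row is a made-up edge case
data = pd.DataFrame({
    "Sr": [1, 2, 3, 4],
    "Gender": ["Male", "Male", "Female", "male"],
})

# The comparison yields a boolean Series; astype(int) turns True/False into 1/0
data["Gender_bin"] = (data["Gender"] == "Male").astype(int)

# Optional normalization so that spacing and capitalization differences still match
normalized = data["Gender"].str.strip().str.lower()
data["Gender_bin_normalized"] = (normalized == "male").astype(int)

print(data)
```

Whether normalizing is appropriate depends on the data; if unexpected labels should be flagged rather than folded into 0, an explicit mapping is the safer choice.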