Original question: http://stackoverflow.com/questions/51016230/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) on StackOverflow.
How to change values in a column into binary?
Asked by Akshat Bhardwaj
I'm new to Python and I'm stuck on this. My CSV file contains this:
Sr,Gender
1,Male
2,Male
3,Female
Now I want to convert the Gender values into binary so that the file looks something like this:
Sr,Gender
1,1
2,1
3,0
So I imported the CSV file as data and ran this code:
data["Gender_new"]=1
data["Gender_new"][data["Gender"]=="Male"]=0
data["Gender_new"]=data["Gender_new"].astype(float)
But I got the error ValueError: could not convert string 'Male' to float:
What am I doing wrong and how can I make this work?
Thanks
Answered by Nidhin Sajeev
Try this:
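import pandas as pd
# read_csv accepts a file path directly, so there is no need to open the file first
data = pd.read_csv("your.csv", sep=",")
# the keys must match the values in the CSV exactly, including capitalization
gender = {'Male': 1, 'Female': 0}
data['Gender'] = [gender[item] for item in data['Gender']]
print(data)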
Or
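# use .loc with a boolean mask instead of chained indexing, and match the capitalization in the file
data.loc[data['Gender'] == 'Male', 'Gender'] = 1
data.loc[data['Gender'] == 'Female', 'Gender'] = 0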
print(data)
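A related option, not from the answer itself, is to pass the dictionary to Series.map instead of looping. The sketch below assumes the CSV looks exactly like the sample in the question and builds it in memory with io.StringIO so it runs without a file; the names csv_text and gender_codes are made up for the illustration.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV in the question (assumed: same columns and values).
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"
data = pd.read_csv(io.StringIO(csv_text))

# Keys must match the strings in the file exactly, including capitalization.
gender_codes = {"Male": 1, "Female": 0}

# map looks each value up in the dictionary; values without a key become NaN.
data["Gender"] = data["Gender"].map(gender_codes)

print(data)
```

Because unmatched values turn into NaN rather than raising a KeyError, it may be worth checking data["Gender"].isna().any() afterwards if the real file could contain other spellings.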
Answered by Burhan Khalid
You can do the conversion as you load the file:
import pandas
d = pandas.read_csv('yourfile.csv', converters={'Gender': lambda x: int(x == 'Male')})
The converters argument takes a dictionary whose keys are column names (or indices) and whose values are functions called on each item in that column. Each function must return the converted value.
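To make the converters behaviour concrete, here is a small self-contained sketch. It assumes the same three-row CSV as in the question, held in memory via io.StringIO; gender_to_binary is a hypothetical helper name, and treating every value other than "Male" as 0 is an assumption that may not suit a real dataset.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV in the question (assumed: same columns and values).
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"

def gender_to_binary(value):
    # Each raw cell is passed in as a string; return 1 for "Male" and 0 for anything else.
    return int(value.strip() == "Male")

# The converter runs on every cell of the Gender column while the file is parsed.
d = pd.read_csv(io.StringIO(csv_text), converters={"Gender": gender_to_binary})

print(d)
```

Using a named function instead of a lambda makes it easier to add handling for blanks or alternate spellings later.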
The other way is to convert the column once you have the dataframe, as @DJK pointed out in their comment:
data['Gender'] = (data['Gender'] == 'Male').astype(int)
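As a rough walk-through of why that one-liner works: the comparison produces a boolean Series, and astype(int) turns True/False into 1/0. The sketch below again assumes the sample CSV from the question, held in memory; is_male and out are illustrative names, and the to_csv call is only there to show the result in the file layout the question asked for.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV in the question (assumed: same columns and values).
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"
data = pd.read_csv(io.StringIO(csv_text))

# Step 1: element-wise comparison gives a boolean Series.
is_male = data["Gender"] == "Male"

# Step 2: casting booleans to int yields 1 for True and 0 for False.
data["Gender"] = is_male.astype(int)

# With no path given, to_csv returns the CSV text; index=False omits the row index.
out = data.to_csv(index=False)
print(out)
```

Passing a file name as the first argument to to_csv would write the converted data to disk instead of returning a string.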

