How to change values in a dataframe in Python
Original question: http://stackoverflow.com/questions/45070896/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by handavidbang
I've searched for an answer for the past 30 min, but the only solutions are either for a single column or in R. I have a dataset in which I want to change the ('Y/N') values to 1 and 0 respectively. I feel like copying and pasting the code below 17 times is very inefficient.
df.loc[df.infants == 'n', 'infants'] = 0
df.loc[df.infants == 'y', 'infants'] = 1
df.loc[df.infants == '?', 'infants'] = 1
My solution is the following. It doesn't raise an error, but the values in the dataframe don't change. I'm assuming I need to do something like df = df_new, but how do I do this?
for coln in df:
    for value in coln:
        if value == 'y':
            value = '1'
        elif value == 'n':
            value = '0'
        else:
            value = '1'
EDIT: There are 17 columns in this dataset, but there is another dataset I'm hoping to tackle which contains 56 columns.
republican n y n.1 y.1 y.2 y.3 n.2 n.3 n.4 y.4 ? y.5 y.6 y.7 n.5 y.8
0 republican n y n y y y n n n n n y y y n ?
1 democrat ? y y ? y y n n n n y n y y n n
2 democrat n y y n ? y n n n n y n y n n y
3 democrat y y y n y y n n n n y ? y y y y
4 democrat n y y n y y n n n n n n y y y y
Accepted answer by Luis Miguel
This should work:
for col in df.columns:  # columns is an attribute, not a method
    df.loc[df[col] == 'n', col] = 0
    df.loc[df[col] == 'y', col] = 1
    df.loc[df[col] == '?', col] = 1
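For context on why the loop in the question has no effect: iterating over a DataFrame yields its column labels, and rebinding a loop variable never writes back into the frame. The sketch below shows this on a small, hypothetical two-column sample (the column names and values are assumptions, not the asker's data), then applies the per-column .loc assignments.

```python
import pandas as pd

# Hypothetical sample standing in for the question's data (assumption).
df = pd.DataFrame(
    {"infants": ["n", "y", "?"], "water": ["y", "?", "n"]},
    dtype=object,
)

# Iterating over a DataFrame yields column labels, not cell values.
labels = [coln for coln in df]

# Writing through .loc modifies the frame in place, one column at a time.
for col in df.columns:
    df.loc[df[col] == "n", col] = 0
    df.loc[df[col] == "y", col] = 1
    df.loc[df[col] == "?", col] = 1
```

The same loop works unchanged for 17 or 56 columns, since it never names a column explicitly.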
Answer by forayer
You can change the values using the map function.
Example:
x = {'y': 1, 'n': 0}
for col in df.columns:  # columns is an attribute, not a method
    # values without a key in x (such as '?') become NaN
    df[col] = df[col].map(x)
This way you map each column of your dataframe.
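One thing to watch with map is that any value missing from the dictionary turns into NaN. A minimal sketch, assuming a hypothetical single column with the question's three symbols, compares the two-key mapping with one that also covers '?':

```python
import pandas as pd

# Hypothetical column containing all three symbols from the question (assumption).
infants = pd.Series(["n", "y", "?"], dtype=object)

mapping = {"y": 1, "n": 0}

# '?' has no key in the mapping, so it becomes NaN.
mapped = infants.map(mapping)

# Adding '?' to the mapping gives the 0/1 result the question asks for.
full = infants.map({**mapping, "?": 1})
```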
Answer by Dondon Jie
Maybe you can try apply:
import pandas as pd

# create dataframe
number = [1, 2, 3, 4, 5]
sex = ['male', 'female', 'female', 'female', 'male']
df_new = pd.DataFrame()
df_new['number'] = number
df_new['sex'] = sex
df_new.head()

# create def for category to number 0/1
def tran_cat_to_num(df):
    if df['sex'] == 'male':
        return 1
    elif df['sex'] == 'female':
        return 0

# create sex_new
df_new['sex_new'] = df_new.apply(tran_cat_to_num, axis=1)
df_new
Raw data:
number sex
0 1 male
1 2 female
2 3 female
3 4 female
4 5 male
After using apply:
number sex sex_new
0 1 male 1
1 2 female 0
2 3 female 0
3 4 female 0
4 5 male 1
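For a two-category column like this one, a vectorized comparison is a possible alternative to a row-wise apply. The sketch below rebuilds the same sample frame and derives sex_new from a boolean mask; it assumes every value is either 'male' or 'female', so anything else would silently become 0.

```python
import pandas as pd

df_new = pd.DataFrame(
    {
        "number": [1, 2, 3, 4, 5],
        "sex": ["male", "female", "female", "female", "male"],
    }
)

# True where sex is 'male', then cast the booleans to 1/0.
df_new["sex_new"] = (df_new["sex"] == "male").astype(int)
```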
Answer by Saikat Kumar Dey
This should do:
df.infants = df.infants.map({ 'Y' : 1, 'N' : 0})
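Note that the dictionary keys are case-sensitive: the sample data in the question uses lowercase 'y' and 'n', which an uppercase mapping would not match. A minimal sketch, using a hypothetical mixed-case column (an assumption for illustration), normalizes the case before mapping:

```python
import pandas as pd

# Hypothetical column with mixed-case answers (assumption).
infants = pd.Series(["y", "N", "n", "Y"], dtype=object)

# Lowercase values have no uppercase key, so they map to NaN.
as_is = infants.map({"Y": 1, "N": 0})

# Upper-casing first lets one mapping cover both spellings.
normalized = infants.str.upper().map({"Y": 1, "N": 0})
```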
Answer by jezrael
I think the simplest is to use replace with a dict:
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.choice(['n','y','?'], size=(5,5)),
                  columns=list('ABCDE'))
print (df)
A B C D E
0 n n n ? ?
1 n ? y ? ?
2 ? ? y n n
3 n n ? n y
4 y ? ? n n
d = {'n':0,'y':1,'?':1}
df = df.replace(d)
print (df)
A B C D E
0 0 0 0 1 1
1 0 1 1 1 1
2 1 1 1 0 0
3 0 0 1 0 1
4 1 1 1 0 0
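If the frame also holds a column that should stay as text, such as the party label in the question's data, the replacement can be limited to the vote columns. This is a sketch under assumptions: the frame, the column names party, A and B, and the list vote_cols are all hypothetical. The final astype(int) is added so the result is integer-typed rather than relying on how replace infers the dtype.

```python
import pandas as pd

# Hypothetical frame: one label column plus two vote columns (assumption).
df = pd.DataFrame(
    {"party": ["republican", "democrat"], "A": ["n", "?"], "B": ["y", "n"]},
    dtype=object,
)

d = {"n": 0, "y": 1, "?": 1}
vote_cols = ["A", "B"]  # hypothetical subset of columns to convert

# Replace only within the selected columns, then cast them to integers.
df[vote_cols] = df[vote_cols].replace(d).astype(int)
```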
Answer by May Pilijay El
All the solutions above are correct, but what you could also do is:
df["infants"] = df["infants"].replace("Y", 1).replace("N", 0).replace("?", 1)
Now that I read more carefully, this is very similar to using replace with a dict!
Answer by Saurabh
Use dataframe.replace():
df.replace({'infants':{'y':1,'?':1,'n':0}},inplace=True)