Original question: http://stackoverflow.com/questions/36589619/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
update a column value in a dataframe from another matching column in different dataframe in Pandas
Asked by Satya
I have two dataframes:
df
city mail
a satya
b def
c akash
d satya
e abc
f xyz
# Another dataframe, d:
city mail
x satya
y def
z akash
u ash
So now I need to update city in df with the updated values from d by comparing the mails. If a mail ID is not found, the city should remain as it was. The result should look like this:
df  # output should look like this
city mail
x satya
y def
z akash
x satya  # repeated mail, so the same value should be placed here
e abc  # not found, so left as it was
f xyz
I have tried:
import pandas as pd

s = {'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'], 'city': ['a', 'b', 'c', 'd', 'e', 'f']}
s1 = {'mail': ['satya', 'def', 'akash', 'ash'], 'city': ['x', 'y', 'z', 'u']}
df = pd.DataFrame(s)
d = pd.DataFrame(s1)
# From a Google search, I tried:
df.loc[df.mail.isin(d.mail), ['city']] = d['city']
# This gives an erroneous result:
city mail
x satya
y def
z akash
u satya  # this value should be 'x'
e abc
f xyz
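A note on why the fourth row ends up as `u`: in an assignment like this, pandas pairs the right-hand Series with the selected rows by index label, not by `mail`. The following is a minimal sketch of that behaviour, reusing the question's data. The names `mask` and `matched_labels` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
                   'city': ['a', 'b', 'c', 'd', 'e', 'f']})
d = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'ash'],
                  'city': ['x', 'y', 'z', 'u']})

# Rows of df whose mail also appears in d: index labels 0, 1, 2 and 3.
mask = df['mail'].isin(d['mail'])
matched_labels = df.index[mask].tolist()

# d['city'] carries index labels 0..3, so rows are paired by label:
# df row 3 (mail 'satya') receives d row 3 (city 'u', which belongs to 'ash').
df.loc[mask, 'city'] = d['city']
```

Because the two frames happen to share the labels 0 to 3, the assignment succeeds silently instead of failing, which is what makes the wrong value easy to miss.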
I can't do a merge here with on='mail', how='left', because one dataframe has fewer customers. After merging, how can I map the city values for the non-matching mails in the merged result?
Please suggest.
Answered by Alexander
It looks like you want to update the city value in df from the city value in d. The update function is based on the index, so this first needs to be set.
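# Add extra columns to the dataframe to show that they are left untouched.
df['mobile_no'] = ['212-555-1111'] * len(df)
df['age'] = [20] * len(df)
# Update city values keyed on `mail`: index both frames by mail so that
# update() can align them, then write the result back positionally.
new_city = df[['mail', 'city']].set_index('mail')
new_city.update(d.set_index('mail'))
df['city'] = new_city['city'].values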
>>> df
city mail mobile_no age
0 x satya 212-555-1111 20
1 y def 212-555-1111 20
2 z akash 212-555-1111 20
3 x satya 212-555-1111 20
4 e abc 212-555-1111 20
5 f xyz 212-555-1111 20
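As a possible alternative to the index-based update above, the same result can be sketched with a lookup via `Series.map`, or with the left merge the question was unsure about. This is not part of the original answer. It assumes each mail appears at most once in d, and the names `city_by_mail`, `updated` and `merged` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
                   'city': ['a', 'b', 'c', 'd', 'e', 'f']})
d = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'ash'],
                  'city': ['x', 'y', 'z', 'u']})

# Option 1: lookup table mail -> new city (assumes mails in d are unique).
city_by_mail = d.set_index('mail')['city']
# map() yields NaN for mails missing from d; fillna() keeps the old city there.
updated = df['mail'].map(city_by_mail).fillna(df['city'])

# Option 2: a left merge keeps every row of df, matched or not.
merged = df.merge(d, on='mail', how='left', suffixes=('', '_new'))
# Prefer the new city where one was found, otherwise fall back to the old one.
merged['city'] = merged['city_new'].fillna(merged['city'])
merged = merged.drop(columns='city_new')
```

Both variants leave unmatched rows (`abc`, `xyz`) unchanged and ignore mails that exist only in d (`ash`). Assigning `updated` back with `df['city'] = updated` would modify df in place, while the merge returns a new dataframe.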