Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36589619/

Retrieved: 2020-09-14 01:02:34. Source: igfitidea

Update a column value in a dataframe from a matching column in a different dataframe in pandas

python pandas

Asked by Satya

I have two dataframes:

 df
 city   mail
  a    satya
  b    def
  c    akash
  d    satya
  e    abc
  f    xyz
# Another dataframe, d:
 city   mail
 x      satya
 y      def
 z      akash
 u      ash

Now I need to update city in df with the updated values from d, matching rows on mail. If a mail ID is not found in d, the city should remain as it was. The result should look like this:

 df ### expected output
 city   mail
  x    satya
  y    def
  z    akash
  x    satya  # repeated, so the same value should be placed here
  e    abc    # not found, so left as it was
  f    xyz

I have tried:

import pandas as pd

s = {'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],'city': ['a', 'b', 'c', 'd', 'e', 'f']}
s1 = {'mail': ['satya', 'def', 'akash', 'ash'],'city': ['x', 'y', 'z', 'u']}
df = pd.DataFrame(s)
d = pd.DataFrame(s1)
# From Google, I tried:
df.loc[df.mail.isin(d.mail),['city']] = d['city']

# This gives an erroneous result:

 city   mail
 x  satya
 y  def
 z  akash
 u  satya  ###this value should be for city 'x'
 e    abc
 f    xyz
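
The wrong value in the fourth row is most likely a consequence of index alignment: when a Series is assigned through .loc, pandas matches it to the target rows by index label, not by the mail value. The sketch below reproduces this on the sample data, using a copy named attempt (a name introduced here only for illustration) and a single column label instead of the list ['city'].

```python
import pandas as pd

df = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
                   'city': ['a', 'b', 'c', 'd', 'e', 'f']})
d = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'ash'],
                  'city': ['x', 'y', 'z', 'u']})

# Rows of df whose mail appears anywhere in d: index labels 0, 1, 2 and 3.
mask = df['mail'].isin(d['mail'])

# The right-hand Series is aligned on index labels, so row 3 of df
# receives d.loc[3, 'city'] ('u'), whatever mail that row holds.
attempt = df.copy()
attempt.loc[mask, 'city'] = d['city']

print(attempt)
```

Row 3 of df holds 'satya', but the row with label 3 in d belongs to 'ash', which is where the 'u' comes from.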

I can't simply do a merge with on='mail', how='left', because one dataframe has fewer customers. After merging, how can I fill in the city for the mails that have no match in the merged result?

Please suggest.

Answered by Alexander

It looks like you want to update the city value in df from the city value in d. The update function is based on the index, so this first needs to be set.

# Add extra columns to dataframe.
df['mobile_no'] = ['212-555-1111'] * len(df)
df['age'] = [20] * len(df)

# Update city values keyed on `mail`.
new_city = df[['mail', 'city']].set_index('mail')
new_city.update(d.set_index('mail'))
df['city'] = new_city['city'].values  # 1-D array, in the same row order as df

>>> df
  city   mail     mobile_no  age
0    x  satya  212-555-1111   20
1    y    def  212-555-1111   20
2    z  akash  212-555-1111   20
3    x  satya  212-555-1111   20
4    e    abc  212-555-1111   20
5    f    xyz  212-555-1111   20
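
As a possible alternative to the answer above (not part of it), the same result can be sketched with a lookup via Series.map, or with the left merge the question mentions. In both cases unmatched mails produce NaN, which is then filled with the original city. The names lookup, via_map, merged, via_merge and result, and the '_new' suffix, are introduced here for illustration only. The sketch assumes each mail appears at most once in d.

```python
import pandas as pd

df = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'satya', 'abc', 'xyz'],
                   'city': ['a', 'b', 'c', 'd', 'e', 'f']})
d = pd.DataFrame({'mail': ['satya', 'def', 'akash', 'ash'],
                  'city': ['x', 'y', 'z', 'u']})

# Option 1: look each mail up in d. Mails missing from d map to NaN,
# so fall back to the existing city for those rows.
lookup = d.set_index('mail')['city']   # assumes mails in d are unique
via_map = df['mail'].map(lookup).fillna(df['city'])

# Option 2: a left merge keeps every row of df. The city from d arrives
# in 'city_new' and is NaN where the mail has no match in d.
merged = df.merge(d, on='mail', how='left', suffixes=('', '_new'))
via_merge = merged['city_new'].fillna(merged['city'])

# Build the updated frame without modifying df in place.
result = df.assign(city=via_map)
print(result)
```

The map variant leaves every other column of df alone, which suits the case where df carries extra columns such as mobile_no and age.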