如何比较 Pandas 中两个数据框的值?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/38650273/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-14 01:41:51  来源:igfitidea点击:

How to Compare Values of two Dataframes in Pandas?

pythonpandasdictionarydataframe

提问by MEhsan

I have two dataframes dfand df2like this

我有两个数据框dfdf2就像这样

    id  initials
0   100 J
1   200 S
2   300 Y

    name  initials
0   John   J
1   Smith  S
2   Nathan N

I want to compare the values in the initialscolumns found in (dfand df2) and copy the name (in df2) which its initial is matching to the initial in the first dataframe (df)

我想比较initials(dfdf2) 中找到的列中的值,并复制df2其首字母与第一个数据帧 ( df) 中首字母匹配的名称 (in )

import pandas as pd

for i in df.initials:
    for j in df2.initials:
        if i == j:
        # copy the name value of this particular initial to df

The output should be like this:

输出应该是这样的:

     id name
 0   100 Johon
 1   200 Smith
 2   300   

Any idea how to solve this problem?

知道如何解决这个问题吗?

回答by mechanical_meat

How about?:

怎么样?:

df3 = df.merge(df2,on='initials',
                   how='outer').drop(['initials'],axis=1).dropna(subset=['id'])
>>> df3
      id    name
0  100.0    John
1  200.0   Smith
2  300.0     NaN

So the 'initials' column is dropped and so is anything with np.nanin the 'id' column.

所以'initials'列被删除,np.nan'id'列中的任何内容也被删除。

If you don't want the np.nanin there tack on a .fillna():

如果你不想np.nan在那里大头钉.fillna()

df3 = df.merge(df2,on='initials',
                   how='outer').drop(['initials'],axis=1).dropna(subset=['id']).fillna('')
>>> df3
      id   name
0  100.0   John
1  200.0  Smith
2  300.0

回答by Vidhya G

df1 
    id initials                                                                                                                                                               
0  100        J                                                                                                                                                               
1  200        S                                                                                                                                                               
2  300        Y 

df2
     name initials                                                                                                                                                            
0    John        J                                                                                                                                                            
1   Smith        S                                                                                                                                                            
2  Nathan        N

Use Boolean masks: df2.initials==df1.initialswill tell you which values in the two initialscolumns are the same.

使用布尔掩码:df2.initials==df1.initials将告诉您两initials列中哪些值相同。

0     True                                                                                                                                                                    
1     True                                                                                                                                                                    
2    False

Use this mask to create a new column:

使用此掩码创建新列:

df1['name'] = df2.name[df2.initials==df1.initials]

Remove the initialscolumn in df1:

删除中的initialsdf1

df1.drop('initials', axis=1)

Replace the NaNusing fillna(' ')

替换NaN使用fillna(' ')

df1.fillna('', inplace=True) #inplace to avoid creating a copy

    id   name                                                                                                                                                                 
0  100   John                                                                                                                                                                 
1  200  Smith                                                                                                                                                                 
2  300