Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original Stack Overflow question: http://stackoverflow.com/questions/32400893/


Pandas: compare two columns and return matched rows

Tags: pandas, match, dataframe, vlookup

Asked by Boosted_d16

I have two dataframes with multiple columns.

I would like to compare df1['postcode'] and df2['pcd'] and build a new df based on the matched values of these two columns.

Note: the two columns I want to match are not the same length.

df1
  postcode brand
1 znuee    soony 
2 eusjk    nike
3 zieum    addidas
4 psosk    ferrari

df2
  pcd      brand
1 dodkm    soony 
2 eusjk    nike
3 sjksj    addidas
4 psosk    ferrari

Output:

newdf
  pcd      brand
1 eusjk    nike
2 psosk    ferrari

My attempt, but I get a length mismatch on the columns:

newdf = (df2['postcode'] == df1).all(axis=1).astype(int)

Do I need to use some kind of lookup function?

Answered by EdChum

You can perform an inner merge:

In [134]:
df1.merge(df2, left_on=['postcode', 'brand'], right_on=['pcd', 'brand'])

Out[134]:
  postcode    brand    pcd
0    eusjk     nike  eusjk
1    psosk  ferrari  psosk
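
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same merge. The frames are rebuilt from the question's sample data; building them with `pd.DataFrame` and the name `merged` are assumptions made for illustration, and the original index labels are not preserved. `how="inner"` is already the default for `merge` and is spelled out only to make the join type visible.

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative reconstruction).
df1 = pd.DataFrame({
    "postcode": ["znuee", "eusjk", "zieum", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})
df2 = pd.DataFrame({
    "pcd": ["dodkm", "eusjk", "sjksj", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})

# Inner merge: keep only rows whose (postcode, brand) pair in df1
# equals a (pcd, brand) pair in df2.
merged = df1.merge(
    df2,
    how="inner",
    left_on=["postcode", "brand"],
    right_on=["pcd", "brand"],
)
print(merged)
```

Because the merge compares values rather than positions, the two frames do not need to have the same number of rows.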

You can then drop the 'postcode' column or rename it first:

In [136]:
df1.rename(columns={'postcode':'pcd'}).merge(df2)

Out[136]:
     pcd    brand
0  eusjk     nike
1  psosk  ferrari
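
The drop-the-column route mentioned above is not shown in the answer, so here is a rough sketch of it, again on data rebuilt from the question. The second part uses `Series.isin`, which is a swapped-in alternative and not part of the original answer: it compares only the postcode column and ignores `brand`, which may or may not be what you want. The names `newdf` and `by_code` are chosen for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative reconstruction).
df1 = pd.DataFrame({
    "postcode": ["znuee", "eusjk", "zieum", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})
df2 = pd.DataFrame({
    "pcd": ["dodkm", "eusjk", "sjksj", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})

# Option 1: merge on both columns, drop the redundant key column,
# then put the columns in the order of the desired output.
newdf = (
    df1.merge(df2, left_on=["postcode", "brand"], right_on=["pcd", "brand"])
    .drop(columns="postcode")[["pcd", "brand"]]
)

# Option 2 (alternative, not from the answer): keep the rows of df2 whose
# 'pcd' value appears anywhere in df1['postcode'], regardless of brand.
by_code = df2[df2["pcd"].isin(df1["postcode"])].reset_index(drop=True)

print(newdf)
print(by_code)
```

On this sample data both options select the same two rows. They would differ if a postcode matched but the brand did not: the merge would drop that row, while the `isin` filter would keep it.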