Original question: http://stackoverflow.com/questions/32400893/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Pandas: compare two columns and return matched rows
Asked by Boosted_d16
I have two dataframes with multiple columns.
I would like to compare df1['postcode'] and df2['pcd'] and build a new df based on the matched values of these two columns.
Note: the two columns I want to match are not the same length.
df1
  postcode    brand
1    znuee    soony
2    eusjk     nike
3    zieum  addidas
4    psosk  ferrari
df2
     pcd    brand
1  dodkm    soony
2  eusjk     nike
3  sjksj  addidas
4  psosk  ferrari
Desired output:
newdf
     pcd    brand
1  eusjk     nike
2  psosk  ferrari
My attempt, but I get a length mismatch on the columns:
newdf = (df2['postcode'] == df1).all(axis=1).astype(int)
Do I need to use some kind of lookup function?
Answered by EdChum
You can perform an inner merge:
In [134]:
df1.merge(df2, left_on=['postcode', 'brand'], right_on=['pcd', 'brand'])
Out[134]:
  postcode    brand    pcd
0    eusjk     nike  eusjk
1    psosk  ferrari  psosk
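To reproduce the merge end to end, here is a minimal sketch that rebuilds the two sample frames from the question and runs the same call. The values are copied as posted; the default 0-based index is an assumption (the question shows labels 1 to 4), and the name `merged` is only for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the question (values copied as posted).
# Assumption: default 0-based index instead of the 1-based labels shown.
df1 = pd.DataFrame({
    "postcode": ["znuee", "eusjk", "zieum", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})
df2 = pd.DataFrame({
    "pcd": ["dodkm", "eusjk", "sjksj", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})

# merge defaults to how="inner": only rows whose (postcode, brand) pair
# equals some (pcd, brand) pair in df2 survive.
merged = df1.merge(df2, left_on=["postcode", "brand"], right_on=["pcd", "brand"])
print(merged)
```

Because the merge matches on key values rather than on row position, the two frames do not need to have the same number of rows.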
You can then drop the 'postcode' column or rename it first:
In [136]:
df1.rename(columns={'postcode':'pcd'}).merge(df2)
Out[136]:
     pcd    brand
0  eusjk     nike
1  psosk  ferrari
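The rename route is shown above. As a rough sketch, the drop route the answer mentions could look like the first half of the block below. The second half is not from the answer: it swaps in a different technique, `Series.isin`, which compares only `postcode` against `pcd` and ignores `brand`, so it may suit the case where the other columns are not expected to agree. The names `dropped` and `filtered` are illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the question (values copied as posted).
df1 = pd.DataFrame({
    "postcode": ["znuee", "eusjk", "zieum", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})
df2 = pd.DataFrame({
    "pcd": ["dodkm", "eusjk", "sjksj", "psosk"],
    "brand": ["soony", "nike", "addidas", "ferrari"],
})

# Drop route: merge first, then discard the redundant key column.
merged = df1.merge(df2, left_on=["postcode", "brand"], right_on=["pcd", "brand"])
dropped = merged.drop(columns="postcode")[["pcd", "brand"]]

# Alternative (not from the answer): keep the df2 rows whose pcd appears
# anywhere in df1's postcode column; brand is not compared here.
mask = df2["pcd"].isin(df1["postcode"])
filtered = df2[mask].reset_index(drop=True)

print(dropped)
print(filtered)
```

On this sample both results contain the same two rows. They can differ on other data: `isin` keeps a `df2` row even when its `brand` disagrees with `df1`, and it never duplicates rows when a postcode occurs more than once in `df1`.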

