pandas: Joining two pandas dataframes based on multiple conditions

Question

Asked by iprof0214

df_a and df_b are two dataframes that look like the following:

df_a
A   B       C      D     E
x1  Apple   0.3   0.9    0.6
x1  Orange  0.1   0.5    0.2
x2  Apple   0.2   0.2    0.1
x2  Orange  0.3   0.4    0.9
x2  Mango   0.1   0.2    0.3
x3  Orange  0.3   0.1    0.2


df_b
A   B_new   F    
x1  Apple   0.3  
x1  Mango   0.2  
x1  Orange  0.1   
x2  Apple   0.2   
x2  Orange  0.3     
x2  Mango   0.1  
x3  Orange  0.3  
x3  Mango   0.2  
x3  Apple   0.1

I want final_df to contain all the rows of df_a that match a unique combination in df_b, i.e. rows where df_a['A'] == df_b['A'] and df_a['B'] == df_b['B_new'].

I've tried doing an outer join and then dropping duplicates with respect to columns A and B in final_df, but the value of B_new is not retained.

The following is how I want result_df to look:

result_df

A   B       C     D     E     B_new   F
x1  Apple   0.3   0.9   0.6   Apple   0.3
x1  Orange  0.1   0.5   0.2   Orange  0.1
x2  Apple   0.2   0.2   0.1   Apple   0.2
x2  Orange  0.3   0.4   0.9   Orange  0.3
x2  Mango   0.1   0.2   0.3   Mango   0.1
x3  Orange  0.3   0.1   0.2   Orange  0.3

I also tried left outer join:

final_df = pd.merge(df_a, df_b, how="left", on=['A'])

The size of this dataframe is a union of df_a and df_b, which is not what I want.

Appreciate any suggestions.

Answer 1

Accepted answer by jpp

You need an inner merge, specifying both merge columns in each case:

res = df_a.merge(df_b, how='inner', left_on=['A', 'B'], right_on=['A', 'B_new'])

print(res)

    A       B    C    D    E   B_new    F
0  x1   Apple  0.3  0.9  0.6   Apple  0.3
1  x1  Orange  0.1  0.5  0.2  Orange  0.1
2  x2   Apple  0.2  0.2  0.1   Apple  0.2
3  x2  Orange  0.3  0.4  0.9  Orange  0.3
4  x2   Mango  0.1  0.2  0.3   Mango  0.1
5  x3  Orange  0.3  0.1  0.2  Orange  0.3
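
For anyone who wants to run the merge above end to end, here is a minimal self-contained sketch. The DataFrame construction is a reconstruction from the tables in the question, not the asker's actual code, so treat the literal values as an assumption.

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question (assumed values).
df_a = pd.DataFrame({
    "A": ["x1", "x1", "x2", "x2", "x2", "x3"],
    "B": ["Apple", "Orange", "Apple", "Orange", "Mango", "Orange"],
    "C": [0.3, 0.1, 0.2, 0.3, 0.1, 0.3],
    "D": [0.9, 0.5, 0.2, 0.4, 0.2, 0.1],
    "E": [0.6, 0.2, 0.1, 0.9, 0.3, 0.2],
})
df_b = pd.DataFrame({
    "A": ["x1", "x1", "x1", "x2", "x2", "x2", "x3", "x3", "x3"],
    "B_new": ["Apple", "Mango", "Orange", "Apple", "Orange", "Mango",
              "Orange", "Mango", "Apple"],
    "F": [0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.3, 0.2, 0.1],
})

# Match on the pair (A, B) from df_a against the pair (A, B_new) from df_b.
# Because the key columns are named differently, both B and B_new are kept.
res = df_a.merge(df_b, how="inner",
                 left_on=["A", "B"], right_on=["A", "B_new"])

print(res.shape)
print(res)
```

Since B and B_new carry the same values after the merge, one of them can be dropped afterwards with res.drop(columns="B_new") if only a single fruit column is wanted.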

Answer 2

Answer by Daniel

You can still achieve this with a left join, which is ideal here.
See below:

final_df = pd.merge(df_a, df_b[['A', 'B_new', 'F']], how="left", left_on=['A', 'B'], right_on=['A', 'B_new'])
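
On the sample data the left and inner merges return the same six rows, because every (A, B) pair in df_a has a partner in df_b. The sketch below shows where they would differ. It assumes a trimmed df_a (columns D and E left out) plus one hypothetical row ("x4", "Kiwi") that is not in the question, and it uses the indicator and validate options of pd.merge to make the behaviour visible. It also shows why merging on A alone, as tried in the question, inflates the row count.

```python
import pandas as pd

# df_a from the question with D and E omitted for brevity, plus one
# HYPOTHETICAL row ("x4", "Kiwi") that has no partner in df_b.
df_a = pd.DataFrame({
    "A": ["x1", "x1", "x2", "x2", "x2", "x3", "x4"],
    "B": ["Apple", "Orange", "Apple", "Orange", "Mango", "Orange", "Kiwi"],
    "C": [0.3, 0.1, 0.2, 0.3, 0.1, 0.3, 0.5],
})
df_b = pd.DataFrame({
    "A": ["x1", "x1", "x1", "x2", "x2", "x2", "x3", "x3", "x3"],
    "B_new": ["Apple", "Mango", "Orange", "Apple", "Orange", "Mango",
              "Orange", "Mango", "Apple"],
    "F": [0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.3, 0.2, 0.1],
})

keys = dict(left_on=["A", "B"], right_on=["A", "B_new"])

# Left merge: keeps every row of df_a; unmatched rows get NaN in B_new and F.
# indicator=True adds a "_merge" column; validate raises if keys are duplicated.
left = pd.merge(df_a, df_b, how="left", indicator=True,
                validate="one_to_one", **keys)

# Inner merge: keeps only the rows whose key pair exists in both frames.
inner = pd.merge(df_a, df_b, how="inner", **keys)

# Rows of df_a that found no partner in df_b.
unmatched = left[left["_merge"] == "left_only"]

# Merging on A alone pairs each df_a row with every df_b row sharing that A.
on_a_only = pd.merge(df_a, df_b, how="left", on="A")

print(len(left), len(inner), len(on_a_only))  # 7 6 19
print(unmatched)
```

So the choice between the two answers comes down to whether unmatched rows of df_a should be kept (left) or dropped (inner).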
