Pandas: merging two dataframes on multiple columns and multiplying the result

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/54657907/

Date: 2020-09-14 06:19:23  Source: igfitidea


Tags: python, pandas, dataframe

Asked by remh

I have a dataframe, df1, that looks something like this:

Name Event Factor1
John A     2
John B     3
Ken  A     1.5
....

and an additional dataframe, df2, like this:

Name Event Factor2
John A     1.2
John B     .5
Ken  A     2

I would like to join these two dataframes on the columns Name and Event, and multiply Factor1 by Factor2 in the result:

Name Event FactorResult
John A     2.4
John B     1.5
Ken  A     3

What would be the best way to do this? I am unsure how to join on two columns. I know I can join and then multiply the two columns, but is there a better way than merging first, then multiplying and dropping the unneeded columns?

Answered by Vaishali

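If your dataframes are identically labelled (the same Name and Event pairs in both), you don't need to merge. Set those two columns as the index and multiply; pandas aligns the values on the index: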

(df1.set_index(['Name', 'Event'])['Factor1'] * df2.set_index(['Name', 'Event'])['Factor2']).reset_index(name = 'FactorResult')

    Name    Event   FactorResult
0   John    A       2.4
1   John    B       1.5
2   Ken     A       3.0
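
A minimal, self-contained sketch of why this works: the multiplication aligns the two Series on their (Name, Event) index rather than on row position. The sample data is taken from the question; the `partial` case, where df2 lacks the (Ken, A) row, is an assumption added for illustration to show that a missing key yields NaN instead of dropping the row.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor1": [2, 3, 1.5],
})
df2 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor2": [1.2, 0.5, 2],
})

keys = ["Name", "Event"]
s1 = df1.set_index(keys)["Factor1"]
s2 = df2.set_index(keys)["Factor2"]

# The product is computed per (Name, Event) label, not per row position
result = (s1 * s2).reset_index(name="FactorResult")
print(result)

# Hypothetical case: df2 has no (Ken, A) row.
# The label is kept in the result, but its product is NaN.
partial = (s1 * s2.iloc[:2]).reset_index(name="FactorResult")
print(partial)
```

If the two dataframes may not share exactly the same keys, a merge with an explicit `how` is probably the safer choice, since it makes the treatment of unmatched rows visible.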

Answered by Dani Mesejo

You could merge and then multiply:

merged = df1.merge(df2, on=['Name', 'Event'])
merged['ResultFactor'] = merged.Factor1 * merged.Factor2
result = merged.drop(['Factor1', 'Factor2'], axis=1)

print(result)

Output


   Name Event  ResultFactor
0  John     A           2.4
1  John     B           1.5
2   Ken     A           3.0
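
The same merge, multiply, and drop steps can also be written as one method chain, which avoids the intermediate `merged` variable. This is only a sketch of an equivalent style, not a different technique; the sample data is from the question.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor1": [2, 3, 1.5],
})
df2 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor2": [1.2, 0.5, 2],
})

result = (
    df1.merge(df2, on=["Name", "Event"])  # inner join on both key columns
       .assign(ResultFactor=lambda d: d["Factor1"] * d["Factor2"])  # add the product column
       .drop(columns=["Factor1", "Factor2"])  # keep only the keys and the product
)
print(result)
```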

Answered by Vinoth

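import pandas as pd

# Inner-join on both key columns, multiply, then keep the keys and the product
df = pd.merge(left=df1, right=df2, on=['Name', 'Event'], how='inner')
df['FactorResult'] = df['Factor1'] * df['Factor2']
df = df[['Name', 'Event', 'FactorResult']]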
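
A short sketch of what `how='inner'` implies here: rows whose (Name, Event) pair appears in only one dataframe are dropped from the result. The extra (Ken, B) row in df1 is an assumption added for illustration and is not in the question. The `validate='one_to_one'` argument is optional; it asks pandas to raise a `MergeError` if either side has duplicate keys, which would otherwise multiply the number of rows.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical (Ken, B) row with no match in df2
df1 = pd.DataFrame({
    "Name": ["John", "John", "Ken", "Ken"],
    "Event": ["A", "B", "A", "B"],
    "Factor1": [2, 3, 1.5, 4],
})
df2 = pd.DataFrame({
    "Name": ["John", "John", "Ken"],
    "Event": ["A", "B", "A"],
    "Factor2": [1.2, 0.5, 2],
})

merged = pd.merge(
    df1, df2,
    on=["Name", "Event"],
    how="inner",            # keep only keys present in both dataframes
    validate="one_to_one",  # fail loudly if a key is duplicated on either side
)
merged["FactorResult"] = merged["Factor1"] * merged["Factor2"]
result = merged[["Name", "Event", "FactorResult"]]
print(result)  # the unmatched (Ken, B) row does not appear
```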