pandas: merge multiple DataFrames based on a common column
Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/52223045/
Merge multiple dataframes based on a common column
Asked by FunnyCoder
I have three DataFrames. All of them share a common column, and I need to merge them on that column without losing any data.
Input
>>>df1
0  Col1   Col2  Col3
1  data1  3     4
2  data2  4     3
3  data3  2     3
4  data4  2     4
5  data5  1     4
>>>df2
0  Col1   Col4  Col5
1  data1  7     4
2  data2  6     9
3  data3  1     4
>>>df3
0  Col1   Col6  Col7
1  data2  5     8
2  data3  2     7
3  data5  5     3
Expected Output
>>>df
0  Col1   Col2  Col3  Col4  Col5  Col6  Col7
1  data1  3     4     7     4
2  data2  4     3     6     9     5     8
3  data3  2     3     1     4     2     7
4  data4  2     4
5  data5  1     4                 5     3
Answered by Zero
Use merge and reduce:
In [86]: from functools import reduce
In [87]: reduce(lambda x,y: pd.merge(x,y, on='Col1', how='outer'), [df1, df2, df3])
Out[87]:
Col1 Col2 Col3 Col4 Col5 Col6 Col7
0 data1 3 4 7.0 4.0 NaN NaN
1 data2 4 3 6.0 9.0 5.0 8.0
2 data3 2 3 1.0 4.0 2.0 7.0
3 data4 2 4 NaN NaN NaN NaN
4 data5 1 4 NaN NaN 5.0 3.0
Details
In [88]: df1
Out[88]:
Col1 Col2 Col3
0 data1 3 4
1 data2 4 3
2 data3 2 3
3 data4 2 4
4 data5 1 4
In [89]: df2
Out[89]:
Col1 Col4 Col5
0 data1 7 4
1 data2 6 9
2 data3 1 4
In [90]: df3
Out[90]:
Col1 Col6 Col7
0 data2 5 8
1 data3 2 7
2 data5 5 3
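For anyone who wants to run the merge-and-reduce approach outside an IPython session, here is a minimal self-contained sketch. The frames are rebuilt from the question's sample data, and `merge_all` is a hypothetical helper name introduced only for this illustration.

```python
from functools import reduce

import pandas as pd

# Sample frames rebuilt from the question's input
df1 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3", "data4", "data5"],
    "Col2": [3, 4, 2, 2, 1],
    "Col3": [4, 3, 3, 4, 4],
})
df2 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3"],
    "Col4": [7, 6, 1],
    "Col5": [4, 9, 4],
})
df3 = pd.DataFrame({
    "Col1": ["data2", "data3", "data5"],
    "Col6": [5, 2, 5],
    "Col7": [8, 7, 3],
})


def merge_all(frames, key):
    """Outer-merge every frame in `frames` on column `key`, left to right.

    `merge_all` is a hypothetical helper, not a pandas function.
    """
    return reduce(
        lambda left, right: pd.merge(left, right, on=key, how="outer"),
        frames,
    )


merged = merge_all([df1, df2, df3], key="Col1")
print(merged)
```

Because `how="outer"` keeps the union of keys at every step, the same helper should work for any number of frames that share the key column; cells with no match come back as NaN.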
Answered by Sandeep Kadapa
Using pd.concat:
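# Index each frame by the shared key so that concat aligns rows on it
# (inplace=True modifies df1, df2 and df3 themselves)
df1.set_index('Col1', inplace=True)
df2.set_index('Col1', inplace=True)
df3.set_index('Col1', inplace=True)
df = pd.concat([df1, df2, df3], axis=1, sort=False).reset_index()
# Depending on the pandas version, the key column may come back named 'index'
df.rename(columns={'index': 'Col1'})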
Col1 Col2 Col3 Col4 Col5 Col6 Col7
0 data1 3 4 7.0 4.0 NaN NaN
1 data2 4 3 6.0 9.0 5.0 8.0
2 data3 2 3 1.0 4.0 2.0 7.0
3 data4 2 4 NaN NaN NaN NaN
4 data5 1 4 NaN NaN 5.0 3.0
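A variant of the pd.concat approach that leaves the input frames untouched might look like the sketch below. It assumes the Col1 values are unique within each frame, since concat along axis=1 aligns rows on the index. The sample data is rebuilt from the question, and `rename_axis` is used here (it is not in the original answer) so the key column is named Col1 whether or not concat preserves the index name.

```python
import pandas as pd

# Sample frames rebuilt from the question's input
df1 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3", "data4", "data5"],
    "Col2": [3, 4, 2, 2, 1],
    "Col3": [4, 3, 3, 4, 4],
})
df2 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3"],
    "Col4": [7, 6, 1],
    "Col5": [4, 9, 4],
})
df3 = pd.DataFrame({
    "Col1": ["data2", "data3", "data5"],
    "Col6": [5, 2, 5],
    "Col7": [8, 7, 3],
})

# set_index without inplace=True returns new frames, so df1..df3 are unchanged
indexed = [frame.set_index("Col1") for frame in (df1, df2, df3)]

combined = (
    pd.concat(indexed, axis=1, sort=False)  # align rows on the Col1 index
    .rename_axis("Col1")                    # make sure the index is named Col1
    .reset_index()                          # turn the index back into a column
)
print(combined)
```

Building the indexed copies in a list comprehension also means the same lines can be reused for a longer list of frames.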
Answered by marco_gorelli
You can do:
df1.merge(df2, how='left', on='Col1').merge(df3, how='left', on='Col1')
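One thing worth checking before relying on how='left': a left merge keeps only the keys found in the left frame. That is enough for the question's data, where df1 already contains every key, but it would drop rows in the general case. The sketch below uses small hypothetical frames (the key 'data6' is invented for illustration) to show the difference from how='outer'.

```python
import pandas as pd

df1 = pd.DataFrame({"Col1": ["data1", "data2"], "Col2": [3, 4]})
# Hypothetical frame: 'data6' does not appear in df1
df3 = pd.DataFrame({"Col1": ["data2", "data6"], "Col6": [5, 9]})

left = df1.merge(df3, how="left", on="Col1")
outer = df1.merge(df3, how="outer", on="Col1")

print(sorted(left["Col1"]))   # ['data1', 'data2']
print(sorted(outer["Col1"]))  # ['data1', 'data2', 'data6']
```

If no frame is guaranteed to hold every key, how='outer' is the safer choice for "without losing any data".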
Answered by Julian Silvestri
Try this line of code:
df.set_index('key').join(df2.set_index('key'))
Here 'key' is the column name used in the documentation example; check the documentation to adapt it to your code: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.join.html
Set 'key' to the column you want to merge the other DataFrames on.
Hope this helps.
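Applied to the question's three frames, the join approach could be sketched as follows. This is an adaptation rather than the answer's exact code: Col1 takes the place of 'key', the joins are chained to cover all three frames, and how='outer' is an added assumption so that keys missing from the left frame are kept (join defaults to a left join).

```python
import pandas as pd

# Sample frames rebuilt from the question's input
df1 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3", "data4", "data5"],
    "Col2": [3, 4, 2, 2, 1],
    "Col3": [4, 3, 3, 4, 4],
})
df2 = pd.DataFrame({
    "Col1": ["data1", "data2", "data3"],
    "Col4": [7, 6, 1],
    "Col5": [4, 9, 4],
})
df3 = pd.DataFrame({
    "Col1": ["data2", "data3", "data5"],
    "Col6": [5, 2, 5],
    "Col7": [8, 7, 3],
})

joined = (
    df1.set_index("Col1")
    .join(df2.set_index("Col1"), how="outer")  # join aligns on the index
    .join(df3.set_index("Col1"), how="outer")
    .rename_axis("Col1")                       # make sure the index is named Col1
    .reset_index()                             # restore Col1 as a regular column
)
print(joined)
```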