pandas: Joining tables/DataFrames on a common column in Python

Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/13793321/

Time: 2020-09-13 20:31:40  Source: igfitidea

Joining Table/DataFrames with common Column in Python

Tags: python, pandas

Asked by Rahul Bhatia

I have two DataFrames:

df1 = ['Date_Time',
    'Temp_1',
    'Latitude',
    'N_S',
    'Longitude',
    'E_W']

df2 = ['Date_Time',
    'Year',
    'Month',
    'Day',
    'Hour',
    'Minute',
    'Seconds']

As you can see, both DataFrames have Date_Time as a common column. I want to join these two DataFrames by matching on Date_Time.

My current code is: df.join(df2, on='Date_Time'), but this is giving an error.

Answered by Andy Hayden

You are looking for a merge:

df1.merge(df2, on='Date_Time')
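
As a minimal sketch of that call on frames shaped like the ones in the question: only the column names Date_Time, Temp_1 and Year come from the question, while the timestamps and values below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column names are taken from the question.
df1 = pd.DataFrame({
    "Date_Time": ["2012-12-10 00:00:00", "2012-12-10 00:01:00"],
    "Temp_1": [21.5, 21.7],
})
df2 = pd.DataFrame({
    "Date_Time": ["2012-12-10 00:00:00", "2012-12-10 00:02:00"],
    "Year": [2012, 2012],
})

# merge defaults to an inner join: only Date_Time values found in both
# frames survive, so this keeps the single shared timestamp.
merged = df1.merge(df2, on="Date_Time")
print(merged)
```

The key column appears once in the result, followed by the remaining columns of df1 and then those of df2.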

The keywords are the same as for join, but join matches against the index of the other DataFrame, whereas merge can match on columns; see "Database-style DataFrame joining/merging".
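
To illustrate the difference, here is a small sketch in which the same inner join is written once with merge on a column and once with join on the index. The frames and the column names key, a and b are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical frames sharing a column named "key".
left = pd.DataFrame({"key": [1, 2], "a": ["x", "y"]})
right = pd.DataFrame({"key": [1, 3], "b": ["p", "q"]})

# merge matches on the shared column directly.
via_merge = left.merge(right, on="key")

# join matches on the index, so the key is moved into the index first
# and restored as a regular column afterwards.
via_join = (
    left.set_index("key")
    .join(right.set_index("key"), how="inner")
    .reset_index()
)

print(via_merge)
print(via_join)
```

Both results contain the single row where key is 1 in both frames; merge simply saves the set_index/reset_index round trip.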

Here's a simple example:

import pandas as pd
df1 = pd.DataFrame([[1, 2, 3]])
df2 = pd.DataFrame([[1, 7, 8],[4, 9, 9]], columns=[0, 3, 4])

In [4]: df1
Out[4]: 
   0  1  2
0  1  2  3

In [5]: df2
Out[5]: 
   0  3  4
0  1  7  8
1  4  9  9

In [6]: df1.merge(df2, on=0)
Out[6]: 
   0  1  2  3  4
0  1  2  3  7  8

In [7]: df1.merge(df2, on=0, how='outer')
Out[7]: 
   0   1   2  3  4
0  1   2   3  7  8
1  4 NaN NaN  9  9
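
The two calls above differ only in the how keyword. As a rough sketch, the loop below runs the same merge with each of the four common how values on the example frames and records the shape of each result; the shapes dictionary is just a helper for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]])
df2 = pd.DataFrame([[1, 7, 8], [4, 9, 9]], columns=[0, 3, 4])

# Collect (rows, columns) for each join type on key column 0.
shapes = {}
for how in ("inner", "left", "right", "outer"):
    shapes[how] = df1.merge(df2, on=0, how=how).shape

print(shapes)
```

Here inner and left give one row, because df1 has only the key 1, while right and outer give two rows, because df2 also contains the key 4, which has no match in df1.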

If you try and join on a column you get an error:

In [8]: df1.join(df2, on=0)
# error!
Exception: columns overlap: array([0], dtype=int64)

See "Joining key columns on an index".
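
One way to make join work here, sketched under the assumption that you want to keep using join rather than merge, is to move the key column of df2 into its index first. The overlap disappears because column 0 is no longer a regular column of df2.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]])
df2 = pd.DataFrame([[1, 7, 8], [4, 9, 9]], columns=[0, 3, 4])

# Column 0 of df1 is matched against the index of the other frame,
# so df2 is indexed by its own column 0 before joining.
joined = df1.join(df2.set_index(0), on=0)
print(joined)
```

Note that join defaults to how='left', so rows of df1 without a match would be kept with missing values, whereas merge defaults to an inner join.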