Python: Merging dataframes on index with pandas

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36538780/

Time: 2020-08-19 18:00:18  Source: igfitidea

Merging dataframes on index with pandas

python, pandas, merge, dataframe

Asked by km1234

I have two dataframes and each one has two index columns. I would like to merge them. For example, the first dataframe is the following:

                   V1

A      1/1/2012    12
       2/1/2012    14
B      1/1/2012    15
       2/1/2012    8
C      1/1/2012    17
       2/1/2012    9

The second dataframe is the following:

                   V2

A      1/1/2012    15
       3/1/2012    21             
B      1/1/2012    24
       2/1/2012    9
D      1/1/2012    7
       2/1/2012    16

As a result, I would like to get the following:

                   V1   V2

A      1/1/2012    12   15
       2/1/2012    14   N/A
       3/1/2012    N/A  21           
B      1/1/2012    15   24
       2/1/2012    8    9
C      1/1/2012    17   N/A
       2/1/2012    9    N/A
D      1/1/2012    N/A  7
       2/1/2012    N/A  16

I have tried a few versions using the pd.merge and .join methods, but nothing seems to work. Do you have any suggestions?

Answered by Alexander

You should be able to use join, which joins on the index by default. Given your desired result, you must use outer as the join type.

>>> df1.join(df2, how='outer')
            V1  V2
A 1/1/2012  12  15
  2/1/2012  14 NaN
  3/1/2012 NaN  21
B 1/1/2012  15  24
  2/1/2012   8   9
C 1/1/2012  17 NaN
  2/1/2012   9 NaN
D 1/1/2012 NaN   7
  2/1/2012 NaN  16

Signature: _.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
Docstring:
Join columns with other DataFrame either on index or on a key column.
Efficiently join multiple DataFrame objects by index at once by passing a list.
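
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames from the question and runs the outer join. The index level names "group" and "date" are my own assumption (the question does not name the index levels), and the dates are kept as plain strings exactly as posted.

```python
import pandas as pd

# Level names "group" and "date" are hypothetical; the question leaves them unnamed.
idx1 = pd.MultiIndex.from_tuples(
    [("A", "1/1/2012"), ("A", "2/1/2012"),
     ("B", "1/1/2012"), ("B", "2/1/2012"),
     ("C", "1/1/2012"), ("C", "2/1/2012")],
    names=["group", "date"],
)
idx2 = pd.MultiIndex.from_tuples(
    [("A", "1/1/2012"), ("A", "3/1/2012"),
     ("B", "1/1/2012"), ("B", "2/1/2012"),
     ("D", "1/1/2012"), ("D", "2/1/2012")],
    names=["group", "date"],
)
df1 = pd.DataFrame({"V1": [12, 14, 15, 8, 17, 9]}, index=idx1)
df2 = pd.DataFrame({"V2": [15, 21, 24, 9, 7, 16]}, index=idx2)

# Outer join on the shared two-level index; sort_index() only makes
# the row order predictable for display.
joined = df1.join(df2, how="outer").sort_index()
print(joined)
```

Rows that exist in only one frame (for example A / 3/1/2012, or everything under C and D) get NaN in the column coming from the other frame, which is also why the integer columns are shown as floats after the join.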

Answered by root

You can do this with merge:

df_merged = df1.merge(df2, how='outer', left_index=True, right_index=True)

The keyword argument how='outer' keeps all indices from both frames, filling in missing values with NaN. The left_index and right_index keyword arguments make the merge use the indices. If a column is all NaN after the merge, another troubleshooting step is to verify that your indices have the same dtypes.

The merge code above produces the following output for me:

                V1    V2
A 2012-01-01  12.0  15.0
  2012-02-01  14.0   NaN
  2012-03-01   NaN  21.0
B 2012-01-01  15.0  24.0
  2012-02-01   8.0   9.0
C 2012-01-01  17.0   NaN
  2012-02-01   9.0   NaN
D 2012-01-01   NaN   7.0
  2012-02-01   NaN  16.0
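
On the dtype troubleshooting step mentioned above, here is a rough sketch of what that check could look like. It is simplified to a single-level index, and the frames `left` and `right` are made-up examples, not the asker's data: one index holds date strings, the other holds real datetimes, so their labels would never match until they are converted to a common dtype.

```python
import pandas as pd

# Hypothetical frames: same dates, but stored with different index dtypes.
left = pd.DataFrame(
    {"V1": [12, 14]},
    index=pd.Index(["2012-01-01", "2012-02-01"], name="date"),  # strings
)
right = pd.DataFrame(
    {"V2": [15, 9]},
    index=pd.to_datetime(["2012-01-01", "2012-02-01"]).rename("date"),  # datetimes
)

# Step 1: compare the index dtypes before merging.
dtypes_differ = left.index.dtype != right.index.dtype
print(left.index.dtype, right.index.dtype, dtypes_differ)

# Step 2: convert to a common dtype, then merge on the indices.
left.index = pd.to_datetime(left.index)
merged = left.merge(right, how="outer", left_index=True, right_index=True)
print(merged)
```

With a two-level index like the one in the question, the same idea should apply per level: check each level's dtype and convert the offending level before merging.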