Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/33086881/

Time: 2020-08-19 12:44:59  Source: igfitidea

Merge two python pandas data frames of different length but keep all rows in output data frame

Tags: python, pandas, merge, dataframe

Asked by sequence_hard

I have the following problem: I have two pandas data frames of different lengths. Some of their rows and columns share common values and some do not, like this:

df1:                                 df2:

      Column1  Column2  Column3           ColumnA  ColumnB ColumnC
    0    a        x        x            0    c        y       y
    1    c        x        x            1    e        z       z
    2    e        x        x            2    a        s       s
    3    d        x        x            3    d        f       f
    4    h        x        x
    5    k        x        x            

What I want to do now is merge the two data frames so that, where ColumnA and Column1 have the same value, the rows from df2 are appended to the corresponding rows in df1, like this:

df1:
    Column1  Column2  Column3  ColumnB  ColumnC
  0    a        x        x        s        s
  1    c        x        x        y        y
  2    e        x        x        z        z
  3    d        x        x        f        f
  4    h        x        x        NaN      NaN
  5    k        x        x        NaN      NaN

I know that the merge can be done with:

df1.merge(df2, left_on='Column1', right_on='ColumnA')

but this command drops all rows whose values in Column1 and ColumnA do not match. Instead, I want to keep these rows in df1 and just assign NaN to them in the columns where other rows have a value from df2, as shown above. Is there a smooth way to do this in pandas?

Thanks in advance!

Accepted answer by Sina

You can read the documentation here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html

What you are looking for is a left join. The default option is an inner join. You can change this behavior by passing a different how argument:

df1.merge(df2, how='left', left_on='Column1', right_on='ColumnA')
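
To make the difference concrete, here is a minimal sketch that rebuilds the two data frames from the question and compares the default inner join with the left join. The variable names inner, left and result are only illustrative. Note that the left join keeps the key column ColumnA from df2; dropping it afterwards (an extra step not mentioned in the answer) gives the layout asked for in the question.

```python
import pandas as pd

# Data frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Column1": ["a", "c", "e", "d", "h", "k"],
    "Column2": ["x"] * 6,
    "Column3": ["x"] * 6,
})
df2 = pd.DataFrame({
    "ColumnA": ["c", "e", "a", "d"],
    "ColumnB": ["y", "z", "s", "f"],
    "ColumnC": ["y", "z", "s", "f"],
})

# Default (inner) join: only keys present in both frames survive.
inner = df1.merge(df2, left_on="Column1", right_on="ColumnA")

# Left join: every row of df1 is kept; unmatched rows get NaN.
left = df1.merge(df2, how="left", left_on="Column1", right_on="ColumnA")

# ColumnA duplicates Column1 for matched rows, so drop it.
result = left.drop(columns="ColumnA")

print(len(inner), len(left))
print(result)
```

The rows for h and k have no partner in df2, so they disappear from inner but stay in result with NaN in ColumnB and ColumnC.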

Answer by sgrg

Looks like you're looking for something like a left-join. See if this example helps: http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html#left-outer-join

You can basically pass the parameter how='left' to merge().
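
If it helps to check which rows of the left frame actually found a match, merge() also accepts indicator=True, which adds a _merge column. The sketch below uses a shortened, made-up version of the question's data; the names merged and unmatched are illustrative only.

```python
import pandas as pd

# Shortened sample data, loosely based on the question.
df1 = pd.DataFrame({"Column1": ["a", "c", "h"], "Column2": ["x", "x", "x"]})
df2 = pd.DataFrame({"ColumnA": ["c", "a"], "ColumnB": ["y", "s"]})

# indicator=True adds a "_merge" column marking each row as
# "both" (matched) or "left_only" (no partner in df2).
merged = df1.merge(
    df2, how="left", left_on="Column1", right_on="ColumnA", indicator=True
)

# Keys from df1 that had no match in df2.
unmatched = merged.loc[merged["_merge"] == "left_only", "Column1"].tolist()
print(unmatched)
```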

Answer by Nirali Khoda

You can also simply use merge with the on argument and a list of column names:

result = df1.merge(df2, on=['Column1'])

For more information, follow the link.
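
A caveat on the on= form: it needs the key column to exist under the same name in both frames, and merge() still defaults to an inner join. With the frames from the question, df2 has no Column1, so the call above would not work as written. The following is a sketch of one way to adapt it, assuming it is acceptable to rename ColumnA first; df2_renamed is a hypothetical name introduced here.

```python
import pandas as pd

# Data frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Column1": ["a", "c", "e", "d", "h", "k"],
    "Column2": ["x"] * 6,
    "Column3": ["x"] * 6,
})
df2 = pd.DataFrame({
    "ColumnA": ["c", "e", "a", "d"],
    "ColumnB": ["y", "z", "s", "f"],
    "ColumnC": ["y", "z", "s", "f"],
})

# Give the key column the same name in both frames so on= can be used.
df2_renamed = df2.rename(columns={"ColumnA": "Column1"})

# how="left" is still required to keep the unmatched rows of df1.
result = df1.merge(df2_renamed, on=["Column1"], how="left")
print(result)
```

Because the key is shared, the result contains a single Column1 and no leftover ColumnA to drop.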
