Original source: http://stackoverflow.com/questions/33086881/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Merge two python pandas data frames of different length but keep all rows in output data frame
Asked by sequence_hard
I have two pandas data frames of different lengths. Some of their rows and columns contain common values and some differ, like this:
df1:                             df2:
  Column1 Column2 Column3          ColumnA ColumnB ColumnC
0       a       x       x        0       c       y       y
1       c       x       x        1       e       z       z
2       e       x       x        2       a       s       s
3       d       x       x        3       d       f       f
4       h       x       x
5       k       x       x
What I want to do now is merge the two data frames so that, wherever ColumnA and Column1 have the same value, the row from df2 is appended to the corresponding row in df1, like this:
df1:
  Column1 Column2 Column3 ColumnB ColumnC
0       a       x       x       s       s
1       c       x       x       y       y
2       e       x       x       z       z
3       d       x       x       f       f
4       h       x       x     NaN     NaN
5       k       x       x     NaN     NaN
I know that the merge can be done with
df1.merge(df2, left_on='Column1', right_on='ColumnA')
but this command drops every row whose value in Column1 has no match in ColumnA (and vice versa). Instead, I want to keep these rows in df1 and assign NaN to them in the columns where the other rows get a value from df2, as shown above. Is there a smooth way to do this in pandas?
Thanks in advance!
Accepted answer by Sina
You can read the documentation here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html
What you are looking for is a left join. The default option is an inner join. You can change this behavior by passing a different how argument:
df1.merge(df2, how='left', left_on='Column1', right_on='ColumnA')
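To make the difference between the two join types concrete, here is a minimal sketch that rebuilds the frames from the question and runs both merges. The final drop of ColumnA is not part of the answer above; it is an assumption about how to arrive at the exact layout shown in the question, since the merge keeps both key columns.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({
    "Column1": ["a", "c", "e", "d", "h", "k"],
    "Column2": ["x"] * 6,
    "Column3": ["x"] * 6,
})
df2 = pd.DataFrame({
    "ColumnA": ["c", "e", "a", "d"],
    "ColumnB": ["y", "z", "s", "f"],
    "ColumnC": ["y", "z", "s", "f"],
})

# Default how='inner': only keys present in both frames survive.
inner = df1.merge(df2, left_on="Column1", right_on="ColumnA")

# how='left': every row of df1 is kept; rows without a match in df2
# get NaN in the columns that come from df2.
left = df1.merge(df2, how="left", left_on="Column1", right_on="ColumnA")

# Assumption: ColumnA is redundant once the merge is done, so drop it
# to match the layout requested in the question.
result = left.drop(columns="ColumnA")
print(result)
```

With the question's data, the inner join keeps four rows, while the left join keeps all six rows of df1.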
Answered by sgrg
Looks like you're looking for something like a left-join. See if this example helps: http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html#left-outer-join
You can basically pass the parameter how='left' to merge().
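If it helps to see which rows of the left frame found no partner, merge() also accepts indicator=True, which adds a _merge column. This goes beyond what the answers here mention, and the small frames below are made up purely for illustration, so treat it as a sketch rather than part of the original solution.

```python
import pandas as pd

# Hypothetical, shortened frames (not the full data from the question).
df1 = pd.DataFrame({"Column1": ["a", "c", "h"], "Column2": ["x", "x", "x"]})
df2 = pd.DataFrame({"ColumnA": ["c", "a"], "ColumnB": ["y", "s"]})

# Left join with an extra '_merge' column that records, per row,
# whether the key was found in both frames or only in the left one.
merged = pd.merge(
    df1, df2,
    how="left",
    left_on="Column1",
    right_on="ColumnA",
    indicator=True,
)

# Keys from df1 that had no match in df2.
unmatched = merged.loc[merged["_merge"] == "left_only", "Column1"].tolist()
print(unmatched)
```

Here pd.merge(df1, df2, ...) is the function form of the same operation as df1.merge(df2, ...).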