How to do a "(df1 & not df2)" dataframe merge in pandas?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32676027/

Date: 2020-09-13 23:54:22  Source: igfitidea

Tags: python, join, pandas, merge, dataframe

Asked by GeorgeOfTheRF

I have 2 pandas dataframes df1 & df2 with common columns/keys (x,y).

I want to do a "(df1 & not df2)" kind of merge on keys (x,y), meaning I want my code to return a dataframe containing only the rows whose (x,y) appears in df1 and not in df2.

SAS has equivalent functionality:

data final;
merge df1(in=a) df2(in=b);
by x y;
if a & not b;
run;

How can I replicate the same functionality in pandas elegantly? It would be great if we could specify how="left-right" in merge().

Answered by GeorgeOfTheRF

I just upgraded to version 0.17.0 RC1, which was released 10 days ago. I just found out that pd.merge() has a new argument in this release, indicator=True, that achieves this in an idiomatic pandas way!

# Outer-merge with an indicator column, then keep the rows found only in df1.
df = pd.merge(df1, df2, on=['x', 'y'], how="outer", indicator=True)
df = df[df['_merge'] == 'left_only']
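
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The helper name anti_join and the sample frames are made up for illustration and are not part of pandas or of the original question. It also makes two assumptions beyond the answer above: it uses how="left" instead of how="outer" (rows that exist only in df2 are thrown away by the filter anyway), and it merges against only the key columns of df2 so that no extra all-NaN columns end up in the result.

```python
import pandas as pd


def anti_join(left, right, keys):
    """Return the rows of `left` whose key combination is absent from `right`.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Only the key columns of `right` matter here. Dropping duplicate keys
    # keeps the merge from repeating rows of `left`.
    right_keys = right[keys].drop_duplicates()
    merged = left.merge(right_keys, on=keys, how="left", indicator=True)
    # Keep the unmatched rows and remove the helper column again.
    return merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")


# Made-up sample data: only the key (2, "b") appears in both frames.
df1 = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"], "v": [10, 20, 30]})
df2 = pd.DataFrame({"x": [2, 4], "y": ["b", "d"], "w": [0.5, 0.7]})

result = anti_join(df1, df2, ["x", "y"])
print(result)
```

The returned frame keeps only the columns of df1. Call reset_index(drop=True) on it if you want a fresh index after the filter.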

indicator: Add a column to the output DataFrame called _merge with information on the source of each row. _merge is Categorical-type and takes on a value of left_only for observations whose merge key only appears in 'left' DataFrame, right_only for observations whose merge key only appears in 'right' DataFrame, and both if the observation's merge key is found in both.
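
To see the three _merge values described above on concrete data, a small sketch along these lines may help. The frames and the column name "source" are invented for illustration. Passing a string to indicator to rename the column is documented pandas behavior, but check it against your installed version.

```python
import pandas as pd

# Made-up frames: (1, "a") is only on the left, (3, "c") only on the right,
# and (2, "b") is in both.
left = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
right = pd.DataFrame({"x": [2, 3], "y": ["b", "c"]})

merged = pd.merge(left, right, on=["x", "y"], how="outer", indicator=True)

# Map each (x, y) key to the label pandas put in the _merge column.
labels = {
    (x, y): str(m)
    for x, y, m in zip(merged["x"], merged["y"], merged["_merge"])
}
print(labels)

# A string instead of True gives the indicator column a custom name.
named = pd.merge(left, right, on=["x", "y"], how="outer", indicator="source")
print(named.columns.tolist())
```

Because _merge is categorical, comparing it with a plain string such as 'left_only' works directly, which is what the filter in the answer relies on.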

http://pandas-docs.github.io/pandas-docs-travis/merging.html#database-style-dataframe-joining-merging
