Merge Only When Value is Empty/Null in Pandas
Original question: http://stackoverflow.com/questions/30296798/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by Jonathan Kennedy
I have two dataframes in Pandas, df.A and df.B, which are being merged together. df.A is the original, and df.B has the new data I want to bring over. The merge works fine and, as expected, I get two columns, col_x and col_y, in the merged df.
However, in some rows the original df.A has values where df.B does not. My question is: how can I selectively take the values from col_x and col_y and place them into a new column such as col_z?
Here's what I mean. How can I merge df.A:
date     impressions   spend   col
1/1/15   100000        3.00    ABC123456
1/2/15   145000        5.00    ABCD00000
1/3/15   300000        15.00   (null)
with df.B:
date     col
1/1/15   (null)
1/2/15   (null)
1/3/15   DEF123456
To get:
date     impressions   spend   col_z
1/1/15   100000        3.00    ABC123456
1/2/15   145000        5.00    ABCD00000
1/3/15   300000        15.00   DEF123456
Any help or point in the right direction would be really appreciated!
Thanks
Accepted answer by EdChum
OK, assuming that your (null) values are in fact NaN values and not the string '(null)', the following works:
In [10]:
# create the merged df
merged = dfA.merge(dfB, on='date')
merged
Out[10]:
        date  impressions  spend      col_x      col_y
0 2015-01-01       100000      3  ABC123456        NaN
1 2015-01-02       145000      5  ABCD00000        NaN
2 2015-01-03       300000     15        NaN  DEF123456
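To reproduce the merged frame, the two inputs can be built by hand. This is only a minimal sketch: the original post does not show how dfA and dfB were created, so the construction below is an assumption based on the sample data in the question, with (null) represented as NaN.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data; (null) becomes NaN.
dates = pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-03'])
dfA = pd.DataFrame({
    'date': dates,
    'impressions': [100000, 145000, 300000],
    'spend': [3.00, 5.00, 15.00],
    'col': ['ABC123456', 'ABCD00000', np.nan],
})
dfB = pd.DataFrame({
    'date': dates,
    'col': [np.nan, np.nan, 'DEF123456'],
})

# 'col' exists in both frames and is not a join key, so merge
# disambiguates it with the default suffixes '_x' (left) and '_y' (right).
merged = dfA.merge(dfB, on='date')
print(merged.columns.tolist())
```

The suffixes are why the question ends up with col_x and col_y: col_x holds the values from the left frame (dfA) and col_y the values from the right frame (dfB).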
You can use where to conditionally assign a value from the _x and _y columns:
In [11]:
# now create col_z using where
merged['col_z'] = merged['col_x'].where(merged['col_x'].notnull(), merged['col_y'])
merged
Out[11]:
        date  impressions  spend      col_x      col_y      col_z
0 2015-01-01       100000      3  ABC123456        NaN  ABC123456
1 2015-01-02       145000      5  ABCD00000        NaN  ABCD00000
2 2015-01-03       300000     15        NaN  DEF123456  DEF123456
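Because the condition here is simply "col_x is not null", the same column can also be produced with fillna or combine_first, both of which take values from a second Series only where the first is missing. This is an alternative to the where call in the answer, not part of it, and the small frame below is an assumed stand-in for merged.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the merged frame shown above.
merged = pd.DataFrame({
    'col_x': ['ABC123456', 'ABCD00000', np.nan],
    'col_y': [np.nan, np.nan, 'DEF123456'],
})

# The answer's approach: keep col_x where it is not null, else use col_y.
via_where = merged['col_x'].where(merged['col_x'].notnull(), merged['col_y'])

# Alternatives: fill the gaps in col_x with the index-aligned values of col_y.
via_fillna = merged['col_x'].fillna(merged['col_y'])
via_combine = merged['col_x'].combine_first(merged['col_y'])

print(via_where.tolist())
print(via_fillna.tolist())
print(via_combine.tolist())
```

where remains the more general tool, since its condition can be any boolean mask rather than just a null check.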
You can then drop the extraneous columns:
In [13]:
# drop the two source columns now that col_z holds the combined values
merged = merged.drop(['col_x', 'col_y'], axis=1)
merged
Out[13]:
        date  impressions  spend      col_z
0 2015-01-01       100000      3  ABC123456
1 2015-01-02       145000      5  ABCD00000
2 2015-01-03       300000     15  DEF123456
Answer by Oskar_U
IMO the shortest yet still readable solution is something like this:
df.A.loc[df.A['col'].isna(), 'col'] = df.A.merge(df.B, how='left', on='date')['col_y']
What it basically does is assign values from the merged table's col_y column to the primary df.A table, for those rows where the col column is empty (the .isna() condition).
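The one-liner relies on index alignment: the right-hand side is a Series carrying the merge result's default 0..n-1 index, and the .loc assignment matches it to rows of the left frame by index label. Below is a minimal sketch of that behaviour. It assumes both frames use the default RangeIndex and that date is unique in the right frame, so the left merge keeps one row per row of the left frame, in order. The frames are written as dfA and dfB (as in the accepted answer), and the sample data is taken from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question; (null) is assumed to mean NaN.
dfA = pd.DataFrame({
    'date': ['1/1/15', '1/2/15', '1/3/15'],
    'impressions': [100000, 145000, 300000],
    'spend': [3.00, 5.00, 15.00],
    'col': ['ABC123456', 'ABCD00000', np.nan],
})
dfB = pd.DataFrame({
    'date': ['1/1/15', '1/2/15', '1/3/15'],
    'col': [np.nan, np.nan, 'DEF123456'],
})

# Left merge keeps every row of dfA in its original order; with unique
# dates in dfB the result has the same 0..n-1 index as dfA.
incoming = dfA.merge(dfB, how='left', on='date')['col_y']

# Only rows where dfA['col'] is missing are written; the right-hand
# Series is aligned to those rows by index label.
dfA.loc[dfA['col'].isna(), 'col'] = incoming

print(dfA)
```

If dfA carries a non-default index (for example after filtering or sorting), the labels may no longer line up with the merge result, so it is worth resetting the index first or using the where approach on the merged frame instead.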

