Original source: http://stackoverflow.com/questions/45175060/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Merge DataFrames with Matching Values From Two Different Columns - Pandas
Asked by i.n.n.m
I have two different DataFrames that I want to merge on the date and hours columns. I saw some existing threads, but I could not find a solution for my issue. I also read this document and tried different combinations; however, none of them worked well.
An example of my two DataFrames:
DF1
date hours var1 var2
0 2013-07-10 00:00:00 150.322617 52.225920
1 2013-07-10 01:00:00 155.250917 53.365296
2 2013-07-10 02:00:00 124.918667 51.158249
3 2013-07-10 03:00:00 143.839217 53.138251
.....
9 2013-09-10 09:00:00 148.135818 86.676341
10 2013-09-10 10:00:00 147.833517 53.658016
11 2013-09-10 12:00:00 149.580233 69.745368
12 2013-09-10 13:00:00 163.715317 14.524894
13 2013-09-10 14:00:00 168.856650 10.762779
DF2
date hours myvar1 myvar2
0 2013-07-10 09:00:00 1.617 98.56
1 2013-07-10 10:00:00 2.917 23.60
2 2013-07-10 12:00:00 19.667 36.15
3 2013-07-10 13:00:00 14.217 45.16
.....
20 2013-09-10 20:00:00 1.517 53.56
21 2013-09-10 21:00:00 5.233 69.47
22 2013-09-10 22:00:00 13.717 14.25
23 2013-09-10 23:00:00 18.850 10.69
As you can see in both DataFrames, DF2 starts at 09:00:00, and I want to join it with DF1 at 09:00:00, that is, on the matching dates and times. So far, I have tried many different combinations using the previous threads and the documentation mentioned above. An example:
merged_df = DF2.merge(DF1, how = 'left', on = ['date', 'hours'])
This introduces NaN values for the right DataFrame. I know I do not have to use both the date and hours columns; however, I still get the same result. I quickly tried the following in R, which works perfectly fine.
merged_df <- left_join(DF1, DF2, by = 'date')
Is there any way in pandas to merge DataFrames on matching values only, without getting NaN values?
Answered by Scott Boston
Use how='inner' in pd.merge:
merged_df = DF2.merge(DF1, how = 'inner', on = ['date', 'hours'])
This will perform an "inner join", thereby omitting the rows in each DataFrame that do not match. Hence, there is no NaN in either the right or the left part of the merged DataFrame.
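To make the difference concrete, here is a minimal sketch comparing the two join types. The small DF1 and DF2 below are hypothetical stand-ins for the frames in the question (shortened values, string-typed date and hours columns), not the asker's actual data.

```python
import pandas as pd

# Hypothetical miniature versions of DF1 and DF2 (illustrative values only).
DF1 = pd.DataFrame({
    "date": ["2013-07-10", "2013-07-10", "2013-07-10"],
    "hours": ["08:00:00", "09:00:00", "10:00:00"],
    "var1": [143.8, 148.1, 147.8],
})
DF2 = pd.DataFrame({
    "date": ["2013-07-10", "2013-07-10", "2013-07-10"],
    "hours": ["09:00:00", "10:00:00", "23:00:00"],
    "myvar1": [1.617, 2.917, 18.850],
})

# Left join: every DF2 row is kept. The 23:00:00 row has no partner in DF1,
# so its var1 value is filled with NaN.
left_df = DF2.merge(DF1, how="left", on=["date", "hours"])

# Inner join: only (date, hours) pairs present in both frames are kept.
inner_df = DF2.merge(DF1, how="inner", on=["date", "hours"])

# indicator=True adds a _merge column showing where each row came from,
# which helps to see which keys failed to match.
checked = DF2.merge(DF1, how="left", on=["date", "hours"], indicator=True)

print(left_df)
print(inner_df)
print(checked)
```

In this sketch the left join keeps three rows, one of them with NaN in var1, while the inner join keeps only the two rows whose date and hours appear in both frames.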