Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/20375561/
Joining pandas dataframes by column names
Asked by Alexis Eggermont
I have two dataframes with the following column names:
frame_1:
event_id, date, time, county_ID
frame_2:
countyid, state
I would like to get a dataframe with the following columns by joining (left) on county_ID = countyid:
joined_dataframe
event_id, date, time, county, state
I cannot figure out how to do it if the columns on which I want to join are not the index. What's the easiest way? Thanks!
Accepted answer by Woody Pride
You can use the left_on and right_on options as follows:
pd.merge(frame_1, frame_2, left_on='county_ID', right_on='countyid')
I was not sure from the question whether you wanted to keep every row of the left-hand dataframe, even when its key has no match on the right. If so, pass how='left' as shown next. The call above uses the default how='inner', which keeps only rows whose key appears in both frames.
pd.merge(frame_1, frame_2, how='left', left_on='county_ID', right_on='countyid')
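To make the difference between the two calls concrete, here is a minimal sketch. The sample values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
frame_1 = pd.DataFrame({
    "event_id": [1, 2, 3],
    "date": ["2013-12-01", "2013-12-02", "2013-12-03"],
    "time": ["10:00", "11:30", "09:15"],
    "county_ID": [10, 20, 99],   # 99 has no match in frame_2
})
frame_2 = pd.DataFrame({
    "countyid": [10, 20, 30],
    "state": ["CA", "NY", "TX"],
})

# Default how='inner': event 3 is dropped because county 99 is not in frame_2.
inner = pd.merge(frame_1, frame_2, left_on="county_ID", right_on="countyid")

# how='left': every row of frame_1 is kept; unmatched rows get NaN for state.
left = pd.merge(frame_1, frame_2, how="left",
                left_on="county_ID", right_on="countyid")

# Both key columns survive the merge, so drop the redundant right-hand one.
joined_dataframe = left.drop(columns="countyid")
print(joined_dataframe)
```

Because the two key columns have different names, the merge keeps both of them. Dropping countyid afterwards is one way to get close to the column layout asked for in the question.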
Answer by behzad.nouri
You need to make the join key the index of the right frame (here countyid in frame_2):
frame_1.join(frame_2.set_index('countyid', verify_integrity=True),
             on='county_ID', how='left')
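As a rough illustration of an index-based left join for the frames in the question, the sketch below moves the lookup key into the index of the county table and joins the event table against it. The data values and the variable name lookup are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
frame_1 = pd.DataFrame({
    "event_id": [1, 2, 3],
    "county_ID": [10, 20, 10],
})
frame_2 = pd.DataFrame({
    "countyid": [10, 20],
    "state": ["CA", "NY"],
})

# Move the lookup key into the index of the right-hand frame ...
lookup = frame_2.set_index("countyid")

# ... then match the left frame's county_ID column against that index.
joined = frame_1.join(lookup, on="county_ID", how="left")
print(joined)
```

Since countyid becomes the index of the right frame, it does not show up as an extra column in the result, so nothing has to be dropped afterwards.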
For your information, in pandas a left join breaks when the right frame has non-unique values in the joining column; see the bug report linked in the original answer.
So you need to verify integrity before joining, by passing verify_integrity=True to set_index.
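Here is a small sketch of what that check buys you, using made-up data in which one county appears twice in the lookup table. In recent pandas versions, as far as I can tell, an unchecked left merge does not fail in this situation; it returns one row per match, so the result has more rows than the left frame. The flag name duplicate_detected is just for this example.

```python
import pandas as pd

# Hypothetical lookup table in which county 10 appears twice.
frame_2 = pd.DataFrame({
    "countyid": [10, 10, 20],
    "state": ["CA", "NV", "NY"],
})
frame_1 = pd.DataFrame({"event_id": [1, 2], "county_ID": [10, 20]})

# Without a check, the left merge yields one row per match,
# so event 1 appears twice in the result.
merged = pd.merge(frame_1, frame_2, how="left",
                  left_on="county_ID", right_on="countyid")
print(len(frame_1), len(merged))

# With verify_integrity=True, set_index refuses the duplicated key.
try:
    frame_2.set_index("countyid", verify_integrity=True)
    duplicate_detected = False
except ValueError as err:
    duplicate_detected = True
    print("duplicate keys found:", err)
```

If you prefer pd.merge over join, its validate argument (for example validate='many_to_one') is another way to get an error instead of silently duplicated rows. That option is not part of the original answer.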
