vlookup between 2 Pandas dataframes
Original question: http://stackoverflow.com/questions/39993238/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original address, and attribute it to the original authors (not me): Stack Overflow
Asked by Andrei Cozma
I have 2 pandas DataFrames as follows.
DF1:
Security ISIN
ABC I1
DEF I2
JHK I3
LMN I4
OPQ I5
and DF2:
ISIN Value
I2 100
I3 200
I5 300
I would like to end up with a third dataframe looking like this:
DF3:
Security Value
DEF 100
JHK 200
OPQ 300
Answered by jezrael
You can use merge. By default it performs an inner join, so how='inner' can be omitted, and if there is only one common column in both DataFrames, you can also omit the parameter on='ISIN':
df3 = pd.merge(df1, df2)
# remove column ISIN
df3.drop('ISIN', axis=1, inplace=True)
print(df3)
  Security  Value
0      DEF    100
1      JHK    200
2      OPQ    300
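The defaults described above can also be written out in full. The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand (the DataFrame construction and the name df3_explicit are not part of this answer). It passes how='inner' and on='ISIN' explicitly and should select the same rows as the shorter call.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df1 = pd.DataFrame({
    'Security': ['ABC', 'DEF', 'JHK', 'LMN', 'OPQ'],
    'ISIN': ['I1', 'I2', 'I3', 'I4', 'I5'],
})
df2 = pd.DataFrame({
    'ISIN': ['I2', 'I3', 'I5'],
    'Value': [100, 200, 300],
})

# Explicit form of the default behaviour: inner join on the shared column
df3_explicit = pd.merge(df1, df2, how='inner', on='ISIN')

# Drop the key column by returning a new frame instead of mutating in place
df3_explicit = df3_explicit.drop(columns='ISIN')
print(df3_explicit)
```

Naming the key with on= is worth considering when the two frames might later gain another column with the same name, since merge would otherwise join on every shared column.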
Or map the column ISIN using a Series created from df1:
print(df1.set_index('ISIN')['Security'])
ISIN
I1    ABC
I2    DEF
I3    JHK
I4    LMN
I5    OPQ
Name: Security, dtype: object
# create new df as a copy of df2
df3 = df2.copy()
df3['Security'] = df3.ISIN.map(df1.set_index('ISIN')['Security'])
# remove column ISIN
df3.drop('ISIN', axis=1, inplace=True)
# change order of columns
df3 = df3[['Security','Value']]
print(df3)
  Security  Value
0      DEF    100
1      JHK    200
2      OPQ    300
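For comparison, map can also run in the other direction, which is closer to a spreadsheet VLOOKUP: look up each ISIN of df1 in df2 and keep only the rows that found a match. This is a minimal sketch rather than the answer's own method. It assumes the question's sample data and that each ISIN appears only once in df2; the name value_by_isin is introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df1 = pd.DataFrame({
    'Security': ['ABC', 'DEF', 'JHK', 'LMN', 'OPQ'],
    'ISIN': ['I1', 'I2', 'I3', 'I4', 'I5'],
})
df2 = pd.DataFrame({
    'ISIN': ['I2', 'I3', 'I5'],
    'Value': [100, 200, 300],
})

# Lookup table ISIN -> Value (assumes ISIN is unique in df2)
value_by_isin = df2.set_index('ISIN')['Value']

df3 = df1.copy()
# ISINs that are missing from df2 become NaN
df3['Value'] = df3['ISIN'].map(value_by_isin)

# Keep matched rows only, drop the key, and restore the integer dtype
df3 = (
    df3.dropna(subset=['Value'])
       .drop(columns='ISIN')
       .astype({'Value': int})
       .reset_index(drop=True)
)
print(df3)
```

The intermediate NaN values turn the Value column into floats, which is why the sketch casts it back to int after dropping the unmatched rows.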
Answered by Matt
You can use pd.merge to automatically do an inner join on ISIN. The following line of code should get you going:
df3 = pd.merge(df1, df2)[['Security', 'Value']]
This results in df3:
  Security  Value
0      DEF    100
1      JHK    200
2      OPQ    300
The fully reproducible code sample looks like:
import pandas as pd
df1 = pd.DataFrame({
    'Security': ['ABC', 'DEF', 'JHK', 'LMN', 'OPQ'],
    'ISIN': ['I1', 'I2', 'I3', 'I4', 'I5']
})
df2 = pd.DataFrame({
    'Value': [100, 200, 300],
    'ISIN': ['I2', 'I3', 'I5']
})
df3 = pd.merge(df1, df2)[['Security', 'Value']]
print(df3)
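The inner join used above drops securities that have no entry in df2. If the goal were instead to keep every row of df1, as a spreadsheet VLOOKUP would, a left join is one option. This is a hedged variation on the sample above, not something the question asked for; the name df_left is introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the reproducible example above
df1 = pd.DataFrame({
    'Security': ['ABC', 'DEF', 'JHK', 'LMN', 'OPQ'],
    'ISIN': ['I1', 'I2', 'I3', 'I4', 'I5'],
})
df2 = pd.DataFrame({
    'Value': [100, 200, 300],
    'ISIN': ['I2', 'I3', 'I5'],
})

# how='left' keeps every row of df1; ISINs without a match get NaN in Value
df_left = pd.merge(df1, df2, how='left', on='ISIN')[['Security', 'Value']]
print(df_left)
```

With this data, ABC and LMN would be kept with a missing Value, and the Value column would become floating point because of the NaN entries.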