Python - function similar to VLOOKUP (Excel)

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/41511730/

Date: 2020-08-20 01:06:49  Source: igfitidea

python pandas dataframe vlookup

Asked by Christian

I am trying to join two data frames but cannot get my head around the possibilities Python has to offer.

First dataframe:

ID MODEL   REQUESTS ORDERS
1  Golf    123      4
2  Passat  34       5
3  Model 3 500      8
4  M3      5        0

Second dataframe:

MODEL   TYPE  MAKE
Golf    Sedan Volkswagen
M3      Coupe BMW
Model 3 Sedan Tesla

What I want is to add another column in the first dataframe called "make" so that it looks like this:

ID MODEL   MAKE       REQUESTS ORDERS
1  Golf    Volkswagen 123      4
2  Passat  Volkswagen 34       5
3  Model 3 Tesla      500      8
4  M3      BMW        5        0

I already looked at merge, join and map but all examples just appended the required information at the end of the dataframe.

Answered by jezrael

I think you can use insert with map by a Series created from df2 (if a value in column MODEL is missing from df2, you get NaN):

df1.insert(2, 'MAKE', df1['MODEL'].map(df2.set_index('MODEL')['MAKE']))
print(df1)
   ID    MODEL        MAKE  REQUESTS  ORDERS
0   1     Golf  Volkswagen       123       4
1   2   Passat         NaN        34       5
2   3  Model 3       Tesla       500       8
3   4       M3         BMW         5       0
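
For a version that can be run end to end, here is a minimal sketch that rebuilds the two sample frames from the question and applies the same insert/map approach. The names lookup and unmatched are my own, and the sketch assumes the MODEL values in df2 are unique, since map needs an unambiguous lookup key.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "MODEL": ["Golf", "Passat", "Model 3", "M3"],
    "REQUESTS": [123, 34, 500, 5],
    "ORDERS": [4, 5, 8, 0],
})
df2 = pd.DataFrame({
    "MODEL": ["Golf", "M3", "Model 3"],
    "TYPE": ["Sedan", "Coupe", "Sedan"],
    "MAKE": ["Volkswagen", "BMW", "Tesla"],
})

# Build a MODEL -> MAKE lookup Series (assumes MODEL is unique in df2).
lookup = df2.set_index("MODEL")["MAKE"]

# Insert the looked-up values as the third column (position 2), in place.
df1.insert(2, "MAKE", df1["MODEL"].map(lookup))

# Models that had no match in df2 are left as NaN.
unmatched = df1.loc[df1["MAKE"].isna(), "MODEL"].tolist()
print(df1)
print(unmatched)
```

Rows with no match, such as Passat here, keep NaN in MAKE, and the isna check lists them so they can be handled separately.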

Answered by Bhagabat Behera

Although it may not apply in this case, there might be scenarios where df2 has more than two columns and you only want to add one of them to df1, using a specific column as the key. Here is some generic code that you may find useful.

df = pd.merge(df1, df2[['MODEL', 'MAKE']], on = 'MODEL', how = 'left')
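
Since merge appends the new column at the end, which is what the question wanted to avoid, a possible follow-up is to reorder the columns afterwards. The sketch below assumes the sample frames from the question; the validate argument is an optional extra I added to make pandas raise an error if MODEL is duplicated in df2, and the name merged is my own.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "MODEL": ["Golf", "Passat", "Model 3", "M3"],
    "REQUESTS": [123, 34, 500, 5],
    "ORDERS": [4, 5, 8, 0],
})
df2 = pd.DataFrame({
    "MODEL": ["Golf", "M3", "Model 3"],
    "TYPE": ["Sedan", "Coupe", "Sedan"],
    "MAKE": ["Volkswagen", "BMW", "Tesla"],
})

# Left merge keeps every row of df1; only MODEL and MAKE are taken from df2.
# validate="many_to_one" checks that each MODEL appears at most once in df2.
merged = pd.merge(
    df1,
    df2[["MODEL", "MAKE"]],
    on="MODEL",
    how="left",
    validate="many_to_one",
)

# Move MAKE next to MODEL by selecting the columns in the desired order.
merged = merged[["ID", "MODEL", "MAKE", "REQUESTS", "ORDERS"]]
print(merged)
```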

Answered by Ted Petrou

The join method acts very similarly to a VLOOKUP. It joins a column in the first dataframe with the index of the second dataframe, so you must set MODEL as the index in the second dataframe and only grab the MAKE column.

df1.join(df2.set_index('MODEL')['MAKE'], on='MODEL')

Take a look at the documentation for join, as it actually uses the word VLOOKUP.
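
As a rough illustration of the join approach on the sample data from the question, the sketch below treats df1 as the first dataframe and df2 as the second; the name result is my own. Note that join returns a new frame and places MAKE as the last column.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "MODEL": ["Golf", "Passat", "Model 3", "M3"],
    "REQUESTS": [123, 34, 500, 5],
    "ORDERS": [4, 5, 8, 0],
})
df2 = pd.DataFrame({
    "MODEL": ["Golf", "M3", "Model 3"],
    "TYPE": ["Sedan", "Coupe", "Sedan"],
    "MAKE": ["Volkswagen", "BMW", "Tesla"],
})

# Match df1's MODEL column against the MODEL index of the lookup Series.
# join defaults to a left join, so every row of df1 is kept.
result = df1.join(df2.set_index("MODEL")["MAKE"], on="MODEL")
print(result)
```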

Answered by Yona

I always found merge to be an easy way to do this:

df1.merge(df2[['MODEL', 'MAKE']], how = 'left')

However, I must admit it would not be as short and nice if you wanted to call the new column something other than 'MAKE'.
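
If a different column name is needed, one option is to rename the column on the lookup side before merging. This is only a sketch: the new name BRAND and the variable renamed are hypothetical choices of mine, and the sample frames are taken from the question.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "MODEL": ["Golf", "Passat", "Model 3", "M3"],
    "REQUESTS": [123, 34, 500, 5],
    "ORDERS": [4, 5, 8, 0],
})
df2 = pd.DataFrame({
    "MODEL": ["Golf", "M3", "Model 3"],
    "TYPE": ["Sedan", "Coupe", "Sedan"],
    "MAKE": ["Volkswagen", "BMW", "Tesla"],
})

# Rename MAKE to the (hypothetical) name BRAND on the lookup frame, then
# merge; with no "on" given, merge uses the shared column, here MODEL.
renamed = df1.merge(
    df2[["MODEL", "MAKE"]].rename(columns={"MAKE": "BRAND"}),
    how="left",
)
print(renamed)
```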
