Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/35469399/

Time: 2020-09-14 00:43:15  Source: igfitidea

Pandas table lookup

Tags: python, pandas, lookup

Asked by Zenvega

I have a pandas lookup table that looks like this:

Grade  Lower_Boundary  Upper_Boundary
    1            -110             -96
    2             -96             -91
    3             -91             -85
    4             -85             -81
    5             -81             -77
    6             -77             -72
    7             -72             -68
    8             -68             -63
    9             -63             -58
   10             -58             -54
   11             -54             -50
   12             -50             -46
   13             -46             -42
   14             -42             -38
   15             -38             -34
   16             -34             -28
   17             -28             -18
   18             -18             -11
   19             -11             -11
   20             -11              -9

I have another pandas dataframe that contains scores. I want to assign a 'Grade' to each value in the score column by looking it up in the lookup table: whichever interval between lower and upper boundary the score falls into, the grade should be taken from that row of the lookup table. Is there a way to do this without typing a bunch of if/then/else statements? I am thinking of something like Excel's INDEX/MATCH.

Score  Grade
  -75      6
  -75      6
  -60      9
  -66      8
  -66      8
  -98      1
  -60      9
  -82      4
  -70      7
  -60      9
  -60      9
  -60      9
  -56     10
  -70      7
  -70      7
  -70      7
  -66      8
  -56     10
  -66      8
  -66      8

Answered by IanS

A one-line solution (I call your lookup table lookup):

df['Score'].apply(lambda score: lookup['Grade'][(lookup['Lower_Boundary'] <= score) & (lookup['Upper_Boundary'] > score)].values[0])


Explanation:

For a given score, here is how to find the grade:

score = -75
match = (lookup['Lower_Boundary'] <= score) & (lookup['Upper_Boundary'] > score)
grade = lookup['Grade'][match]

This returns a Series of length 1. You can get its value with, for instance:

grade.values[0]
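
To make the single-score step concrete, here is a minimal self-contained sketch. It assumes a cut-down lookup table holding only grades 5 to 7 from the question; the data frame construction is an illustration, not part of the original answer.

```python
import pandas as pd

# Assumption: only three rows of the question's lookup table, for brevity.
lookup = pd.DataFrame({
    "Grade": [5, 6, 7],
    "Lower_Boundary": [-81, -77, -72],
    "Upper_Boundary": [-77, -72, -68],
})

score = -75

# Boolean mask: True for the row whose half-open interval
# [Lower_Boundary, Upper_Boundary) contains the score.
match = (lookup["Lower_Boundary"] <= score) & (lookup["Upper_Boundary"] > score)

# Filtering the Grade column with the mask leaves a Series of length 1.
grade = lookup["Grade"][match]

print(match.tolist())    # [False, True, False]
print(grade.values[0])   # 6
```

Because the comparison is `<=` on the lower side and `>` on the upper side, a score sitting exactly on a shared boundary goes to the row where it is the lower boundary.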

All you need to do is apply the above to the score column. If you want a one-liner, use a lambda function:

df['Score'].apply(lambda score: lookup['Grade'][(lookup['Lower_Boundary'] <= score) & (lookup['Upper_Boundary'] > score)].values[0])

Otherwise the following would be more readable:

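def lookup_grade(score):
    # Select the row whose interval [Lower_Boundary, Upper_Boundary) contains the score
    match = (lookup['Lower_Boundary'] <= score) & (lookup['Upper_Boundary'] > score)
    grade = lookup['Grade'][match]
    # Raises IndexError if no row matches
    return grade.values[0]

# Apply the lookup to every score in the column
df['Score'].apply(lookup_grade)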
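
For reference, the function above can be run end to end as in the following sketch. The lookup values are copied from the question; building the frames with `pd.DataFrame` and taking only seven of the sample scores are assumptions made for illustration.

```python
import pandas as pd

# Lookup table from the question, typed in as a DataFrame.
lookup = pd.DataFrame({
    "Grade": list(range(1, 21)),
    "Lower_Boundary": [-110, -96, -91, -85, -81, -77, -72, -68, -63, -58,
                       -54, -50, -46, -42, -38, -34, -28, -18, -11, -11],
    "Upper_Boundary": [-96, -91, -85, -81, -77, -72, -68, -63, -58, -54,
                       -50, -46, -42, -38, -34, -28, -18, -11, -11, -9],
})

# A handful of distinct scores taken from the question's sample data.
df = pd.DataFrame({"Score": [-75, -60, -66, -98, -82, -70, -56]})

def lookup_grade(score):
    # Row whose half-open interval [lower, upper) contains the score.
    match = (lookup["Lower_Boundary"] <= score) & (lookup["Upper_Boundary"] > score)
    return lookup["Grade"][match].values[0]

df["Grade"] = df["Score"].apply(lookup_grade)
print(df["Grade"].tolist())   # [6, 9, 8, 1, 4, 7, 10]
```

Note that the row for grade 19 has identical lower and upper boundaries (-11 and -11), so with these comparisons it can never match; a score of exactly -11 falls into grade 20.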

This approach would also make it easier to deal with cases when no match is found.
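
One possible way to handle the no-match case is sketched below. The `default` parameter and the name `lookup_grade_or_default` are hypothetical additions, not part of the original answer, and the lookup table is again cut down to grades 5 to 7.

```python
import pandas as pd

# Assumption: only three rows of the question's lookup table, for brevity.
lookup = pd.DataFrame({
    "Grade": [5, 6, 7],
    "Lower_Boundary": [-81, -77, -72],
    "Upper_Boundary": [-77, -72, -68],
})

def lookup_grade_or_default(score, default=None):
    # Hypothetical variant: return `default` instead of raising IndexError
    # when the score falls outside every interval.
    match = (lookup["Lower_Boundary"] <= score) & (lookup["Upper_Boundary"] > score)
    grade = lookup["Grade"][match]
    if grade.empty:
        return default
    return grade.values[0]

print(lookup_grade_or_default(-75))       # 6
print(lookup_grade_or_default(-200))      # None
print(lookup_grade_or_default(-200, -1))  # -1
```

It can be applied to a column in the same way, for example `df['Score'].apply(lookup_grade_or_default)`; whether a missing grade should be `None`, a sentinel value, or an error depends on how the results are used afterwards.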
