Problems with isin in pandas

Notice: this page reproduces a popular StackOverflow question, originally shown alongside a Chinese translation, under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/32978362/

Time: 2020-09-13 23:59:26  Source: igfitidea

Problems with isin pandas

Tags: python, pandas

Asked by qwertylpc

Sorry, I just asked this question: "Pythonic Way to have multiple Or's when conditioning in a dataframe", but marked it as answered prematurely because it passed my overly simplistic test case; it isn't working more generally. (If it is possible to merge and reopen the question, that would be great...)

Here is the full issue:

sum(data['Name'].isin(eligible_players))
> 0

sum(data['Name'] == "Antonio Brown")
> 68

"Antonio Brown" in eligible_players
> True

Basically, if I understand correctly, I am showing that Antonio Brown is in eligible_players and that he is in the dataframe. However, for some reason .isin() isn't working properly.

As I said in my prior question, I am looking for a way to check many ORs at once to select the proper rows.
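For reference, the "many ORs" pattern can be sketched as follows. This is a minimal illustration on made-up data: the frame contents and the `wanted` list are hypothetical, not the asker's actual data. It suggests that `isin` with a plain list of names should give the same mask as chaining `==` comparisons with `|`.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame.
data = pd.DataFrame({
    "Name": ["Antonio Brown", "Dez Bryant", "Adam Thielen", "Antonio Brown"],
    "Pts": [25, 25, 15, 18],
})

# A plain Python list of names (not a Series).
wanted = ["Antonio Brown", "Dez Bryant"]

# Chained ORs: one comparison per name.
mask_or = (data["Name"] == "Antonio Brown") | (data["Name"] == "Dez Bryant")

# The same selection expressed with isin.
mask_isin = data["Name"].isin(wanted)

print(mask_or.equals(mask_isin))  # both masks mark the same rows
print(int(mask_isin.sum()))       # number of matching rows
```

With a list, the names themselves are what gets compared, so the two masks agree.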

____ EDIT ____

In[14]:
eligible_players
Out[14]:
Name
Antonio Brown       378
Demaryius Thomas    334
Jordy Nelson        319
Dez Bryant          309
Emmanuel Sanders    293
Odell Beckham       289
Julio Jones         288
Randall Cobb        284
Jeremy Maclin       267
T.Y. Hilton         255
Alshon Jeffery      252
Golden Tate         250
Mike Evans          236
DeAndre Hopkins     223
Calvin Johnson      220
Kelvin Benjamin     218
Julian Edelman      213
Anquan Boldin       213
Steve Smith         213
Roddy White         208
Brandon LaFell      205
Mike Wallace        205
A.J. Green          203
DeSean Jackson      200
Jordan Matthews     194
Eric Decker         194
Sammy Watkins       190
Torrey Smith        186
Andre Johnson       186
Jarvis Landry       178
Eddie Royal         176
Brandon Marshall    175
Vincent Jackson     175
Rueben Randle       174
Marques Colston     173
Mohamed Sanu        171
Keenan Allen        170
James Jones         168
Malcom Floyd        168
Kenny Stills        167
Greg Jennings       162
Kendall Wright      162
Doug Baldwin        160
Michael Floyd       159
Robert Woods        158
Name: Pts, dtype: int64

and

In [31]:
data.tail(110)
Out[31]:
                   Name  Pts  year  week  pos  Team
28029        Dez Bryant   25  2014    17   WR   DAL
28030     Antonio Brown   25  2014    17   WR   PIT
28031   Jordan Matthews   24  2014    17   WR   PHI
28032      Randall Cobb   23  2014    17   WR    GB
28033     Rueben Randle   21  2014    17   WR   NYG
28034  Demaryius Thomas   19  2014    17   WR   DEN
28035    Calvin Johnson   19  2014    17   WR   DET
28036      Torrey Smith   18  2014    17   WR   BAL
28037       Roddy White   17  2014    17   WR   ATL
28038       Steve Smith   17  2014    17   WR   BAL
28039    DeSean Jackson   16  2014    17   WR   WAS
28040        Mike Evans   16  2014    17   WR    TB
28041     Anquan Boldin   16  2014    17   WR    SF
28042      Adam Thielen   15  2014    17   WR   MIN
28043      Cecil Shorts   15  2014    17   WR   JAC
28044        A.J. Green   15  2014    17   WR   CIN
28045      Jordy Nelson   14  2014    17   WR    GB
28046    Brian Hartline   14  2014    17   WR   MIA
28047      Robert Woods   13  2014    17   WR   BUF
28048      Kenny Stills   13  2014    17   WR    NO
28049  Emmanuel Sanders   13  2014    17   WR   DEN
28050       Eddie Royal   13  2014    17   WR    SD
28051   Marques Colston   13  2014    17   WR    NO
28052       Chris Owusu   12  2014    17   WR   NYJ
28053    Brandon LaFell   12  2014    17   WR    NE
28054   Dontrelle Inman   12  2014    17   WR    SD
28055      Reggie Wayne   11  2014    17   WR   IND
28056   Paul Richardson   11  2014    17   WR   SEA
28057      Cole Beasley   11  2014    17   WR   DAL
28058     Jarvis Landry   10  2014    17   WR   MIA

Answered by DSM

(Aside: once you posted what you were actually using, it only took seconds to see the problem.)

Series.isin(something) iterates over something to determine the set of things you want to test membership in. But your eligible_players isn't a list, it's a Series. And iteration over a Series is iteration over the values, even though membership (in) is with respect to the index:

In [72]: eligible_players = pd.Series([10,20,30], index=["A","B","C"])

In [73]: list(eligible_players)
Out[73]: [10, 20, 30]

In [74]: "A" in eligible_players
Out[74]: True

So in your case, you could use eligible_players.index instead to pass the right names:

In [75]: df = pd.DataFrame({"Name": ["A","B","C","D"]})

In [76]: df
Out[76]: 
  Name
0    A
1    B
2    C
3    D

In [77]: df["Name"].isin(eligible_players) # remember, this will be [10, 20, 30]
Out[77]: 
0    False
1    False
2    False
3    False
Name: Name, dtype: bool

In [78]: df["Name"].isin(eligible_players.index)
Out[78]: 
0     True
1     True
2     True
3    False
Name: Name, dtype: bool

In [79]: df["Name"].isin(eligible_players.index).sum()
Out[79]: 3
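
To tie this back to the original goal of selecting rows, here is a minimal sketch under stated assumptions: the `data` frame and the `eligible_players` Series below are small hypothetical stand-ins shaped like the ones in the question (a Series of point totals indexed by player name), not the asker's real data. The mask built from the index can be used for boolean indexing.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: one row per player per week.
data = pd.DataFrame({
    "Name": ["Antonio Brown", "Adam Thielen", "Dez Bryant", "Antonio Brown"],
    "Pts": [25, 15, 25, 18],
})

# Hypothetical eligible_players: point totals indexed by player name,
# similar in shape to the Series shown in the question.
eligible_players = pd.Series(
    [378, 309],
    index=pd.Index(["Antonio Brown", "Dez Bryant"], name="Name"),
    name="Pts",
)

# Passing the Series itself compares names against its values (378, 309).
wrong = data["Name"].isin(eligible_players)

# Passing the index compares names against names.
mask = data["Name"].isin(eligible_players.index)

# Boolean indexing keeps only the rows for eligible players.
selected = data[mask]

print(int(wrong.sum()))
print(selected)
```

If a plain list is preferred, list(eligible_players.index) or eligible_players.index.tolist() should work the same way as the index itself.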