Filter DataFrame based on Max value in Column - Pandas
Original source: http://stackoverflow.com/questions/18906530/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by DJElbow
Using pandas, I have a DataFrame that looks like this:
Hour            Browser     Metric1   Metric2   Metric3
2013-08-18 00   IE          1000      500       3000
2013-08-19 00   FF          2000      250       6000
2013-08-20 00   Opera       3000      450       9000
2001-03-21 00   Chrome/29   3000      450       9000
2013-08-21 00   Chrome/29   3000      450       9000
2014-01-22 00   Chrome/29   3000      750       9000
I want to create an array of the browsers whose maximum Metric1 value is greater than 2000. What is the best way to do this? The code below shows roughly what I am trying to do.
browsers = df[df.Metric1.max() > 2000]['Browser'].unique()
Answered by Andy Hayden
You could group by Browser and take the max of Metric1:
In [11]: g = df.groupby('Browser')
In [12]: g['Metric1'].max()
Out[12]:
Browser
Chrome/29    3000
FF           2000
IE           1000
Opera        3000
Name: Metric1, dtype: int64
In [13]: over2000 = g['Metric1'].max() > 2000
In [14]: over2000
Out[14]:
Browser
Chrome/29     True
FF           False
IE           False
Opera         True
Name: Metric1, dtype: bool
To get out the array, use this as a boolean mask:
In [15]: over2000[over2000].index.values
Out[15]: array(['Chrome/29', 'Opera'], dtype=object)
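For anyone who wants to try this outside an interactive session, the steps above can be put together as a short standalone script. This is only a sketch: the DataFrame is rebuilt by hand from the table in the question, and the variable names `max_metric1` and `browsers` are chosen here for illustration. It also hints at why the attempt in the question does not work: `df.Metric1.max() > 2000` evaluates to a single boolean for the whole column rather than one value per browser, so it cannot serve as a row mask.

```python
import pandas as pd

# Sample data copied by hand from the table in the question.
df = pd.DataFrame({
    "Hour": ["2013-08-18 00", "2013-08-19 00", "2013-08-20 00",
             "2001-03-21 00", "2013-08-21 00", "2014-01-22 00"],
    "Browser": ["IE", "FF", "Opera", "Chrome/29", "Chrome/29", "Chrome/29"],
    "Metric1": [1000, 2000, 3000, 3000, 3000, 3000],
    "Metric2": [500, 250, 450, 450, 450, 750],
    "Metric3": [3000, 6000, 9000, 9000, 9000, 9000],
})

# Largest Metric1 value per browser (a Series indexed by Browser).
max_metric1 = df.groupby("Browser")["Metric1"].max()

# Boolean-mask the Series with itself, then read the surviving index labels.
browsers = max_metric1[max_metric1 > 2000].index.values

print(browsers)
```

Because the comparison is strict, FF (whose maximum is exactly 2000) is excluded, matching the `False` shown for it in the output above.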

