Calculate Arbitrary Percentile on Pandas GroupBy
Original question: http://stackoverflow.com/questions/19894939/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Alex Rothberg
Currently there is a median method on Pandas's GroupBy objects.
Is there a way to calculate an arbitrary percentile (see: http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.percentile.html) on the groupings?
The median would be the calculation of the percentile with q=50.
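To make the q=50 remark concrete, here is a minimal sketch using plain NumPy. The array x is made-up illustration data, not something from the question; the snippet only shows that np.percentile with q=50 agrees with np.median, and that other values of q work the same way.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

# np.percentile takes q on a 0-100 scale.
p50 = np.percentile(x, 50)
med = np.median(x)

# Any other percentile is requested the same way.
p95 = np.percentile(x, 95)

print(p50, med, p95)
```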
Answered by TomAugspurger
You want the quantile method:
In [47]: df
Out[47]:
           A         B    C
0   0.719391  0.091693  one
1   0.951499  0.837160  one
2   0.975212  0.224855  one
3   0.807620  0.031284  one
4   0.633190  0.342889  one
5   0.075102  0.899291  one
6   0.502843  0.773424  one
7   0.032285  0.242476  one
8   0.794938  0.607745  one
9   0.620387  0.574222  one
10  0.446639  0.549749  two
11  0.664324  0.134041  two
12  0.622217  0.505057  two
13  0.670338  0.990870  two
14  0.281431  0.016245  two
15  0.675756  0.185967  two
16  0.145147  0.045686  two
17  0.404413  0.191482  two
18  0.949130  0.943509  two
19  0.164642  0.157013  two

In [48]: df.groupby('C').quantile(.95)
Out[48]:
            A         B
C
one  0.964541  0.871332
two  0.826112  0.969558
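As a rough sketch of how this relates to the median from the question: quantile takes q on a 0 to 1 scale, so 0.5 should reproduce the built-in median, and passing a list of values returns several quantiles at once. The DataFrame below is randomly generated stand-in data (an assumption for illustration), not the frame shown in the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the same layout as the answer's frame (columns A, B, C).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": rng.random(20),
    "B": rng.random(20),
    "C": ["one"] * 10 + ["two"] * 10,
})

# q is on a 0-1 scale here; 0.5 corresponds to the median.
q50 = df.groupby("C").quantile(0.5)
med = df.groupby("C").median()

# A list of quantiles gives one row per (group, quantile) pair.
multi = df.groupby("C").quantile([0.5, 0.95])
print(multi)
```

Note the difference in scale: quantile expects 0.95, whereas numpy.percentile expects 95.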
Answered by Anshuman Goel
I found another useful solution here.
If I have to use groupby, another approach can be:
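import numpy as np

def percentile(n):
    # Build an aggregation function that returns the n-th percentile (0-100 scale).
    def percentile_(x):
        return np.percentile(x, n)
    # Give each function a distinct name so agg() can label its output column.
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_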
Using the call below, I am able to achieve the same result as the solution given by @TomAugspurger:
df.groupby('C').agg([percentile(50), percentile(95)])
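For anyone who wants to check this claim, the following is a small self-contained sketch. It repeats the percentile factory from this answer and compares its output against quantile on randomly generated data; the frame and the variable names via_agg and via_quantile are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def percentile(n):
    # Same factory as in the answer: n is on a 0-100 scale.
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_

# Hypothetical data, not the frame from the earlier answer.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "A": rng.random(20),
    "C": ["one"] * 10 + ["two"] * 10,
})

# One output column per function, labelled by each function's __name__.
via_agg = df.groupby("C")["A"].agg([percentile(50), percentile(95)])

# The built-in quantile uses a 0-1 scale for the same percentile.
via_quantile = df.groupby("C")["A"].quantile(0.95)

print(via_agg)
print(via_quantile)
```

The agg route is handy when several percentiles should appear as separate, named columns. For a single percentile, quantile is shorter.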