nth percentile calculations in PostgreSQL
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/14316562/
Asked by Codek
I've been surprisingly unable to find an nth percentile function for postgresql.
I am using this via the Mondrian OLAP tool, so I just need an aggregate function that returns the 95th percentile.
I did find this link:
http://www.postgresql.org/message-id/[email protected]
But for some reason the code in that percentile function is returning nulls in some cases with certain queries. I've checked the data and there's nothing odd in the data that would seem to cause that!
Answered by alfonx
With PostgreSQL 9.4 there is native support for percentiles now, implemented in Ordered-Set Aggregate Functions:
percentile_cont(fraction) WITHIN GROUP (ORDER BY sort_expression)
continuous percentile: returns a value corresponding to the specified fraction in the ordering, interpolating between adjacent input items if needed
percentile_cont(fractions) WITHIN GROUP (ORDER BY sort_expression)
multiple continuous percentile: returns an array of results matching the shape of the fractions parameter, with each non-null element replaced by the value corresponding to that percentile
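To make the "interpolating between adjacent input items" part concrete, here is a minimal Python sketch of the calculation as I understand it. It is only a model of the semantics, not PostgreSQL's implementation: the function name mirrors the SQL aggregate for readability, and the sample scores are made-up illustration data.

```python
def percentile_cont(values, fraction):
    """Model of a continuous percentile: sort, then interpolate linearly."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Nulls (None here) are left out, as an aggregate would ignore them.
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None  # no input rows: the result is null
    # Fractional, zero-based position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = int(position)  # floor, since position is never negative
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


scores = [3, 5, 2, 10, 4, 8, 7, 12]  # hypothetical sample data

print(percentile_cont(scores, 0.5))   # median, halfway between 5 and 7
print(percentile_cont(scores, 0.95))  # roughly 11.3
# The array form simply evaluates every fraction in turn.
print([percentile_cont(scores, f) for f in (0.25, 0.5)])
```

In SQL the 95th percentile would be requested with a fraction of 0.95 in the first signature above.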
See the documentation for more details: http://www.postgresql.org/docs/current/static/functions-aggregate.html
and see here for some examples: https://github.com/michaelpq/michaelpq.github.io/blob/master/_posts/2014-02-27-postgres-9-4-feature-highlight-within-group.markdown
Answered by Mike
The ntile function is very useful here. I have a table test_temp:
select * from test_temp
score
integer
3
5
2
10
4
8
7
12
select score, ntile(4) over (order by score) as quartile from test_temp;
score    quartile
integer  integer
2        1
3        1
4        2
5        2
7        3
8        3
10       4
12       4
ntile(4) over (order by score) orders the rows by score, splits them into four equal groups (if the row count divides evenly), and assigns each row its group number based on that order.
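As a rough illustration of that bucketing, here is a small Python sketch. It assumes that when the row count does not divide evenly, the earlier buckets each take one extra row; the function name and the sample data are only for illustration.

```python
def ntile(sorted_rows, buckets):
    """Pair each row with a 1-based bucket number, as evenly as possible."""
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    base, extra = divmod(len(sorted_rows), buckets)
    result = []
    index = 0
    for bucket in range(1, buckets + 1):
        # Assumption: the first `extra` buckets hold one more row each.
        size = base + (1 if bucket <= extra else 0)
        for row in sorted_rows[index:index + size]:
            result.append((row, bucket))
        index += size
    return result


scores = sorted([3, 5, 2, 10, 4, 8, 7, 12])  # same values as test_temp
for score, quartile in ntile(scores, 4):
    print(score, quartile)
```

With 8 rows and 4 buckets every bucket gets exactly two rows. With 10 rows the bucket sizes under this assumption would be 3, 3, 2 and 2.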
Since I have 8 numbers here, they represent the 0th, 12.5th, 25th, 37.5th, 50th, 62.5th, 75th and 87.5th percentiles. So if I only take the results where the quartile is 2, I'll have the 25th and 37.5th percentiles.
with ranked_test as (
select score, ntile(4) over (order by score) as quartile from test_temp
)
select min(score) from ranked_test
where quartile = 2
group by quartile;
returns 4, the third lowest number in the list of 8.
If you had a larger table and used ntile(100), the column you filter on would be the percentile, and you could use the same query as above.
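The following Python sketch models the "minimum score in bucket N" idea for an arbitrary number of buckets. It is a simplified model under the assumption that earlier buckets absorb any leftover rows. The helper name bucket_minimum and the 200-row data set are hypothetical.

```python
def bucket_minimum(values, buckets, wanted):
    """Smallest value in bucket `wanted` (1-based) after an ntile-style split."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), buckets)
    # Rows that fall into the buckets before the wanted one.
    start = (wanted - 1) * base + min(wanted - 1, extra)
    size = base + (1 if wanted <= extra else 0)
    if size == 0:
        return None  # the bucket is empty, so the query would return no row
    return ordered[start]


small = [3, 5, 2, 10, 4, 8, 7, 12]
print(bucket_minimum(small, 4, 2))      # the quartile = 2 case from above

large = list(range(1, 201))             # hypothetical table with 200 scores
print(bucket_minimum(large, 100, 95))   # start of the 95th of 100 buckets

print(bucket_minimum(small, 100, 95))   # too few rows: bucket 95 is empty
```

The last call shows why the table needs to be large: with fewer rows than buckets, most of the 100 buckets stay empty and filtering on one of them finds nothing.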