postgresql 如何在 postgres 查询中排名
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/10265234/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to rank in postgres query
提问by WildBill
I'm trying to rank a subset of data within a table but I think I am doing something wrong. I cannot find much information about the rank() feature for postgres, maybe I'm looking in the wrong place. Either way:
我正在尝试对表中的数据子集进行排名,但我认为我做错了什么。我找不到关于 postgres rank() 功能的很多信息,也许我找错了地方。无论哪种方式:
I'd like to know the rank of an id that falls within a cluster of a table based on a date. My query is as follows:
我想知道基于日期的表集群中的 id 的等级。我的查询如下:
select cluster_id,feed_id,pub_date,rank
from (select feed_id,pub_date,cluster_id,rank()
over (order by pub_date asc) from url_info)
as bar where cluster_id = 9876 and feed_id = 1234;
I'm modeling this after the following stackoverflow post: postgres rank
我在以下stackoverflow帖子之后对此进行建模: postgres rank
The reason I think I am doing something wrong is that there are only 39 rows in url_info that are in cluster_id 9876 and this query ran for 10 minutes and never came back. (actually re-ran it for quite a while and it returned no results, yet there is a row in cluster 9876 for id 1234) I'm expecting this will tell me something like "id 1234 was 5th for the criteria given). It will return a relative rank according to my query constraints, correct?
我认为我做错了什么的原因是 url_info 中只有 39 行在 cluster_id 9876 中,并且这个查询运行了 10 分钟并且再也没有回来。(实际上重新运行它很长一段时间,它没有返回任何结果,但集群 9876 中有一行用于 id 1234)我希望这会告诉我类似“id 1234 是给定标准的第 5 行”。将根据我的查询约束返回相对排名,对吗?
This is postgres 8.4 btw.
这是 postgres 8.4 顺便说一句。
回答by dbenhur
By placing the rank() function in the subselect and not specifying a PARTITION BY in the over clause or any predicate in that subselect, your query is asking to produce a rank over the entire url_info table ordered by pub_date. This is likely why it ran so long as to rank over all of url_info, Pg must sort the entire table by pub_date, which will take a while if the table is very large.
通过将 rank() 函数放在 subselect 中,并且不在 over 子句或该 subselect 中的任何谓词中指定 PARTITION BY,您的查询要求对按 pub_date 排序的整个 url_info 表生成排名。这可能就是为什么它运行的时间长到对所有 url_info 进行排名,Pg 必须按 pub_date 对整个表进行排序,如果表非常大,这将需要一段时间。
It appears you want to generate a rank for just the set of records selected by the where clause, in which case, all you need do is eliminate the subselect and the rank function is implicitly over the set of records matching that predicate.
看起来您只想为 where 子句选择的记录集生成排名,在这种情况下,您需要做的就是消除子选择,并且排名函数隐式地覆盖与该谓词匹配的记录集。
select
cluster_id
,feed_id
,pub_date
,rank() over (order by pub_date asc) as rank
from url_info
where cluster_id = 9876 and feed_id = 1234;
If what you really wanted was the rank within the cluster, regardless of the feed_id, you can rank in a subselect which filters to that cluster:
如果您真正想要的是集群中的排名,无论 feed_id 如何,您都可以在筛选到该集群的子选择中进行排名:
select ranked.*
from (
select
cluster_id
,feed_id
,pub_date
,rank() over (order by pub_date asc) as rank
from url_info
where cluster_id = 9876
) as ranked
where feed_id = 1234;
回答by Anvesh
Sharing another example of DENSE_RANK() of PostgreSQL. Find top 3 students sample query. Reference taken from this blog:
分享另一个 PostgreSQL DENSE_RANK() 的例子。查找前 3 名学生示例查询。 参考来自此博客:
Create a table with sample data:
创建一个包含示例数据的表:
CREATE TABLE tbl_Students
(
StudID INT
,StudName CHARACTER VARYING
,TotalMark INT
);
INSERT INTO tbl_Students
VALUES
(1,'Anvesh',88),(2,'Neevan',78)
,(3,'Roy',90),(4,'Mahi',88)
,(5,'Maria',81),(6,'Jenny',90);
Using DENSE_RANK(), Calculate RANK of students:
使用 DENSE_RANK(),计算学生的 RANK:
;WITH cteStud AS
(
SELECT
StudName
,Totalmark
,DENSE_RANK() OVER (ORDER BY TotalMark DESC) AS StudRank
FROM tbl_Students
)
SELECT
StudName
,Totalmark
,StudRank
FROM cteStud
WHERE StudRank <= 3;
The Result:
结果:
studname | totalmark | studrank
----------+-----------+----------
Roy | 90 | 1
Jenny | 90 | 1
Anvesh | 88 | 2
Mahi | 88 | 2
Maria | 81 | 3
(5 rows)