postgresql 如何在 postgres 查询中排名

Question

提问by WildBill

I'm trying to rank a subset of data within a table but I think I am doing something wrong. I cannot find much information about the rank() feature for postgres, maybe I'm looking in the wrong place. Either way:

我正在尝试对表中的数据子集进行排名，但我认为我做错了什么。我找不到关于 postgres rank() 功能的很多信息，也许我找错了地方。无论哪种方式：

I'd like to know the rank of an id that falls within a cluster of a table based on a date. My query is as follows:

我想知道基于日期的表集群中的 id 的等级。我的查询如下：

select cluster_id,feed_id,pub_date,rank 
from (select feed_id,pub_date,cluster_id,rank() 
    over (order by pub_date asc) from url_info) 
as bar where cluster_id = 9876 and feed_id = 1234;

I'm modeling this after the following stackoverflow post: postgres rank

我在以下stackoverflow帖子之后对此进行建模： postgres rank

The reason I think I am doing something wrong is that there are only 39 rows in url_info that are in cluster_id 9876 and this query ran for 10 minutes and never came back. (actually re-ran it for quite a while and it returned no results, yet there is a row in cluster 9876 for id 1234) I'm expecting this will tell me something like "id 1234 was 5th for the criteria given). It will return a relative rank according to my query constraints, correct?

我认为我做错了什么的原因是 url_info 中只有 39 行在 cluster_id 9876 中，并且这个查询运行了 10 分钟并且再也没有回来。（实际上重新运行它很长一段时间，它没有返回任何结果，但集群 9876 中有一行用于 id 1234）我希望这会告诉我类似“id 1234 是给定标准的第 5 行”。将根据我的查询约束返回相对排名，对吗？

This is postgres 8.4 btw.

这是 postgres 8.4 顺便说一句。

Answer 1

回答by dbenhur

By placing the rank() function in the subselect and not specifying a PARTITION BY in the over clause or any predicate in that subselect, your query is asking to produce a rank over the entire url_info table ordered by pub_date. This is likely why it ran so long as to rank over all of url_info, Pg must sort the entire table by pub_date, which will take a while if the table is very large.

通过将 rank() 函数放在 subselect 中，并且不在 over 子句或该 subselect 中的任何谓词中指定 PARTITION BY，您的查询要求对按 pub_date 排序的整个 url_info 表生成排名。这可能就是为什么它运行的时间长到对所有 url_info 进行排名，Pg 必须按 pub_date 对整个表进行排序，如果表非常大，这将需要一段时间。

It appears you want to generate a rank for just the set of records selected by the where clause, in which case, all you need do is eliminate the subselect and the rank function is implicitly over the set of records matching that predicate.

看起来您只想为 where 子句选择的记录集生成排名，在这种情况下，您需要做的就是消除子选择，并且排名函数隐式地覆盖与该谓词匹配的记录集。

select 
  cluster_id
 ,feed_id
 ,pub_date
 ,rank() over (order by pub_date asc) as rank
from url_info
where cluster_id = 9876 and feed_id = 1234;

If what you really wanted was the rank within the cluster, regardless of the feed_id, you can rank in a subselect which filters to that cluster:

如果您真正想要的是集群中的排名，无论 feed_id 如何，您都可以在筛选到该集群的子选择中进行排名：

select ranked.*
from (
  select 
    cluster_id
   ,feed_id
   ,pub_date
   ,rank() over (order by pub_date asc) as rank
  from url_info
  where cluster_id = 9876
) as ranked
where feed_id = 1234;

Answer 2

回答by Anvesh

Sharing another example of DENSE_RANK() of PostgreSQL. Find top 3 students sample query. Reference taken from this blog:

分享另一个 PostgreSQL DENSE_RANK() 的例子。查找前 3 名学生示例查询。 参考来自此博客：

Create a table with sample data:

创建一个包含示例数据的表：

CREATE TABLE tbl_Students
(
    StudID INT
    ,StudName CHARACTER VARYING
    ,TotalMark INT
);

INSERT INTO tbl_Students 
VALUES 
(1,'Anvesh',88),(2,'Neevan',78)
,(3,'Roy',90),(4,'Mahi',88)
,(5,'Maria',81),(6,'Jenny',90);

Using DENSE_RANK(), Calculate RANK of students:

使用 DENSE_RANK()，计算学生的 RANK：

;WITH cteStud AS
(
    SELECT 
        StudName
        ,Totalmark
        ,DENSE_RANK() OVER (ORDER BY TotalMark DESC) AS StudRank
    FROM tbl_Students
)
SELECT 
    StudName
    ,Totalmark
    ,StudRank
FROM cteStud 
WHERE StudRank <= 3;

The Result:

结果：

studname | totalmark | studrank
----------+-----------+----------
 Roy      |        90 |        1
 Jenny    |        90 |        1
 Anvesh   |        88 |        2
 Mahi     |        88 |        2
 Maria    |        81 |        3
(5 rows)

postgresql 如何在 postgres 查询中排名

提问by WildBill

回答by dbenhur

回答by Anvesh

相关推荐

最近更新

标签

postgresql 如何在 postgres 查询中排名

提问by WildBill

回答by dbenhur

回答by Anvesh

相关推荐

postgresql “函数不存在”，但我真的认为它存在

postgresql 仅选择今天（从午夜开始）的时间戳

postgresql pgAdmin 中的主键和外键

使用命令行参数运行 PostgreSQL .sql 文件

相关推荐

最近更新

标签