SQL RANK() 与 ROW_NUMBER()

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7747327/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 12:28:05  来源:igfitidea点击:

SQL RANK() versus ROW_NUMBER()

sqlsql-servertsql

提问by dotNET Hobbiest

I'm confused about the differences between these. Running the following SQL gets me two idential result sets. Can someone please explain the differences?

我对这些之间的差异感到困惑。运行以下 SQL 会得到两个相同的结果集。有人可以解释这些差异吗?

SELECT ID, [Description], RANK()       OVER(PARTITION BY StyleID ORDER BY ID) as 'Rank'      FROM SubStyle
SELECT ID, [Description], ROW_NUMBER() OVER(PARTITION BY StyleID ORDER BY ID) as 'RowNumber' FROM SubStyle

回答by Martin Smith

You will only see the difference if you have ties within a partition for a particular ordering value.

如果您在特定排序值的分区内有联系,您将只会看到差异。

RANKand DENSE_RANKare deterministic in this case, all rows with the same value for both the ordering and partitioning columns will end up with an equal result, whereas ROW_NUMBERwill arbitrarily (non deterministically) assign an incrementing result to the tied rows.

RANK并且 DENSE_RANK在这种情况下是确定性的,排序和分区列具有相同值的所有行最终将得到相同的结果,而ROW_NUMBER将任意(非确定性地)将递增结果分配给绑定行。

Example:(All rows have the same StyleIDso are in the same partition and within that partition the first 3 rows are tied when ordered by ID)

示例:(所有行都相同,StyleID因此位于同一分区中,并且在该分区内,前 3 行在按 排序时并列ID

WITH T(StyleID, ID)
     AS (SELECT 1,1 UNION ALL
         SELECT 1,1 UNION ALL
         SELECT 1,1 UNION ALL
         SELECT 1,2)
SELECT *,
       RANK() OVER(PARTITION BY StyleID ORDER BY ID)       AS 'RANK',
       ROW_NUMBER() OVER(PARTITION BY StyleID ORDER BY ID) AS 'ROW_NUMBER',
       DENSE_RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'DENSE_RANK'
FROM   T  

Returns

退货

StyleID     ID       RANK      ROW_NUMBER      DENSE_RANK
----------- -------- --------- --------------- ----------
1           1        1         1               1
1           1        1         2               1
1           1        1         3               1
1           2        4         4               2

You can see that for the three identical rows the ROW_NUMBERincrements, the RANKvalue remains the same then it leaps to 4. DENSE_RANKalso assigns the same rank to all three rows but then the next distinct value is assigned a value of 2.

您可以看到,对于三个相同的行,ROW_NUMBER增量、RANK值保持不变然后跃升至4DENSE_RANK还为所有三行分配相同的等级,但随后为下一个不同的值分配值 2。

回答by Ritesh Mengji

ROW_NUMBER :Returns a unique number for each row starting with 1. For rows that have duplicate values,numbers are arbitarily assigned.

ROW_NUMBER :为每行返回一个唯一编号,从 1 开始。对于具有重复值的行,将任意分配编号。

Rank :Assigns a unique number for each row starting with 1,except for rows that have duplicate values,in which case the same ranking is assigned and a gap appears in the sequence for each duplicate ranking.

Rank :为每行分配一个唯一编号,从 1 开始,具有重复值的行除外,在这种情况下,分配相同的排名,并且每个重复排名的序列中都会出现间隙。

回答by Lukas Eder

This article covers an interesting relationship between ROW_NUMBER()and DENSE_RANK()(the RANK()function is not treated specifically). When you need a generated ROW_NUMBER()on a SELECT DISTINCTstatement, the ROW_NUMBER()will produce distinct values beforethey are removed by the DISTINCTkeyword. E.g. this query

这篇文章介绍了ROW_NUMBER()和之间的一个有趣的关系DENSE_RANK()(该RANK()函数没有特别处理)。当您需要ROW_NUMBER()SELECT DISTINCT语句上生成a 时,ROW_NUMBER()它们被DISTINCT关键字删除之前生成不同的值。例如这个查询

SELECT DISTINCT
  v, 
  ROW_NUMBER() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number

... might produce this result (DISTINCThas no effect):

... 可能会产生此结果(DISTINCT无效):

+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |
+---+------------+

Whereas this query:

而这个查询:

SELECT DISTINCT
  v, 
  DENSE_RANK() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number

... produces what you probably want in this case:

...在这种情况下产生您可能想要的内容:

+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a |          1 |
| b |          2 |
| c |          3 |
| d |          4 |
| e |          5 |
+---+------------+

Note that the ORDER BYclause of the DENSE_RANK()function will need all other columns from the SELECT DISTINCTclause to work properly.

请注意,该函数的ORDER BY子句DENSE_RANK()需要该子句中的所有其他列SELECT DISTINCT才能正常工作。

The reason for this is that logically, window functions are calculated before DISTINCTis applied.

这样做的原因是从逻辑上讲,窗口函数是在DISTINCT应用之前计算的

All three functions in comparison

三种功能对比

Using PostgreSQL / Sybase / SQL standard syntax (WINDOWclause):

使用 PostgreSQL/Sybase/SQL 标准语法(WINDOW子句):

SELECT
  v,
  ROW_NUMBER() OVER (window) row_number,
  RANK()       OVER (window) rank,
  DENSE_RANK() OVER (window) dense_rank
FROM t
WINDOW window AS (ORDER BY v)
ORDER BY v

... you'll get:

... 你会得到:

+---+------------+------+------------+
| V | ROW_NUMBER | RANK | DENSE_RANK |
+---+------------+------+------------+
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |
+---+------------+------+------------+

回答by NotMe

Quite a bit:

相当一点:

The rank of a row is one plus the number of ranks that come before the row in question.

行的排名是相关行之前的排名数加一。

Row_number is the distinct rank of rows, without any gap in the ranking.

Row_number 是行的不同排名,排名没有任何差距。

http://www.bidn.com/blogs/marcoadf/bidn-blog/379/ranking-functions-row_number-vs-rank-vs-dense_rank-vs-ntile

http://www.bidn.com/blogs/marcoadf/bidn-blog/379/ranking-functions-row_number-vs-rank-vs-dense_rank-vs-ntile

回答by DSR

Simple query without partition clause:

没有分区子句的简单查询:

select 
    sal, 
    RANK() over(order by sal desc) as Rank,
    DENSE_RANK() over(order by sal desc) as DenseRank,
    ROW_NUMBER() over(order by sal desc) as RowNumber
from employee 

Output:

输出:

    --------|-------|-----------|----------
    sal     |Rank   |DenseRank  |RowNumber
    --------|-------|-----------|----------
    5000    |1      |1          |1
    3000    |2      |2          |2
    3000    |2      |2          |3
    2975    |4      |3          |4
    2850    |5      |4          |5
    --------|-------|-----------|----------

回答by sansalk

Look this example.

看看这个例子。

CREATE TABLE [dbo].#TestTable(
    [id] [int] NOT NULL,
    [create_date] [date] NOT NULL,
    [info1] [varchar](50) NOT NULL,
    [info2] [varchar](50) NOT NULL,
)

Insert some data

插入一些数据

INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (1, '1/1/09', 'Blue', 'Green')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (1, '1/2/09', 'Red', 'Yellow')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (1, '1/3/09', 'Orange', 'Purple')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (2, '1/1/09', 'Yellow', 'Blue')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (2, '1/5/09', 'Blue', 'Orange')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (3, '1/2/09', 'Green', 'Purple')
INSERT INTO dbo.#TestTable (id, create_date, info1, info2)
VALUES (3, '1/8/09', 'Red', 'Blue')

Repeat same Values for 1

对 1 重复相同的值

INSERT INTO dbo.#TestTable (id, create_date, info1, info2) VALUES (1, '1/1/09', 'Blue', 'Green')

INSERT INTO dbo.#TestTable (id, create_date, info1, info2) VALUES (1, '1/1/09', 'Blue', 'Green')

Look All

查看全部

SELECT * FROM #TestTable

Look your results

看看你的结果

SELECT Id,
    create_date,
    info1,
    info2,
    ROW_NUMBER() OVER (PARTITION BY Id ORDER BY create_date DESC) AS RowId,
    RANK() OVER(PARTITION BY Id ORDER BY create_date DESC)    AS [RANK]
FROM #TestTable

Need to understand the different

需要了解不同

回答by user2629395

Also, pay attention to ORDER BY in PARTITION (Standard AdventureWorks db is used for example) when using RANK.

另外,在使用 RANK 时,请注意 PARTITION 中的 ORDER BY(例如使用标准 AdventureWorks db)。

SELECT as1.SalesOrderID, as1.SalesOrderDetailID, RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderID ) ranknoequal , RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderDetailId ) ranknodiff FROM Sales.SalesOrderDetail as1 WHERE SalesOrderId = 43659 ORDER BY SalesOrderDetailId;

SELECT as1.SalesOrderID, as1.SalesOrderDetailID, RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderID ) ranknoequal , RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderDetailId FROM .SalesOrderDetailId FROM ) SalesOrderId = 43659 ORDER BY SalesOrderDetailId;

Gives result:

给出结果:

的SalesOrderID SalesOrderDetailID rank_same_as_partition rank_salesorderdetailid


43659 1 1 1


43659 2 1 2


43659 3 1 3


43659 4 1 4


43659 5 1 5


43659 6 1 6


43659 7 1 7


43659 8 1 8


43659 9 1 9


43659 10 110


43659 11 1 11


43659 12 1 12


But if change order by to (use OrderQty :

但是如果将订单更改为(使用 OrderQty :

SELECT as1.SalesOrderID, as1.OrderQty, RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderID ) ranknoequal , RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.OrderQty ) rank_orderqty FROM Sales.SalesOrderDetail as1 WHERE SalesOrderId = 43659 ORDER BY OrderQty;

SELECT as1.SalesOrderID, as1.OrderQty, RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.SalesOrderID ) ranknoequal , RANK() OVER (PARTITION BY as1.SalesOrderID ORDER BY as1.OrderQHER ) rank.1SalesOrderQHER rank.1Salesqty SalesOrderId = 43659 ORDER BY OrderQty;

Gives:

给出:

的SalesOrderID OrderQty rank_salesorderid rank_orderqty


43659 1 1 1


43659 1 1 1


43659 1 1 1


43659 1 1 1


43659 1 1 1


43659 1 1 1


43659 2 1 7


43659 2 1 7


43659 3 1 9


43659 3 1 9


43659 4 1 11


43659 6 1 12


Notice how the Rank changes when we use OrderQty (rightmost column second table) in ORDER BY and how it changes when we use SalesOrderDetailID (rightmost column first table) in ORDER BY.

请注意当我们在 ORDER BY 中使用 OrderQty(最右边的第二列表)时排名如何变化,以及当我们在 ORDER BY 中使用 SalesOrderDetailID(最右边的第一列表)时排名如何变化。

回答by SarahLaMont

I haven't done anything with rank, but I discovered this today with row_number().

我没有对排名做过任何事情,但我今天用 row_number() 发现了这一点。

select item, name, sold, row_number() over(partition by item order by sold) as row from table_name

This will result in some repeating row numbers since in my case each name holds all items. Each item will be ordered by how many were sold.

这将导致一些重复的行号,因为在我的情况下,每个名称都包含所有项目。每件商品将按售出数量排序。

+--------+------+-----+----+
|glasses |store1|  30 | 1  |
|glasses |store2|  35 | 2  |
|glasses |store3|  40 | 3  |
|shoes   |store2|  10 | 1  |
|shoes   |store1|  20 | 2  |
|shoes   |store3|  22 | 3  |
+--------+------+-----+----+