SQL 如何组合 GROUP BY 和 ROW_NUMBER?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/9617616/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 14:44:22  来源:igfitidea点击:

How to combine GROUP BY and ROW_NUMBER?

sqlsql-server-2005tsqlgroup-by

提问by Tim Schmelter

I hope following sample code is self-explanatory:

我希望以下示例代码是不言自明的:

declare @t1 table (ID int,Price money, Name varchar(10))
declare @t2 table (ID int,Orders int,  Name varchar(10))
declare @relation  table (t1ID int,t2ID int)
insert into @t1 values(1, 200, 'AAA');
insert into @t1 values(2, 150, 'BBB');
insert into @t1 values(3, 100, 'CCC');
insert into @t2 values(1,25,'aaa');
insert into @t2 values(2,35,'bbb');
insert into @relation values(1,1);
insert into @relation values(2,1);
insert into @relation values(3,2);

select T2.ID AS T2ID
,T2.Name as T2Name
,T2.Orders
,T1.ID AS T1ID
,T1.Name As T1Name
,T1Sum.Price
FROM @t2 T2
INNER JOIN (
    SELECT Rel.t2ID
        ,MAX(Rel.t1ID)AS t1ID 
-- the MAX returns an arbitrary ID, what i need is: 
--      ,ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
        ,SUM(Price)AS Price
        FROM @t1 T1 
        INNER JOIN @relation Rel ON Rel.t1ID=T1.ID
        GROUP BY Rel.t2ID
)AS T1Sum ON  T1Sum.t2ID = T2.ID
INNER JOIN @t1 T1 ON T1Sum.t1ID=T1.ID

Result:

结果:

T2ID   T2Name   Orders  T1ID    T1Name  Price     
 1      aaa       25     2       BBB    350,00     
 2      bbb       35     3       CCC    100,00

What i need is commented above, a way to get the ROW_NUMBERbut also to Group Byin the first place. So i need the sumof all T1-prices grouped by T2.IDin the relation-table and in the outer query the t1IDwith the highest price.

我需要的是上面评论的,一种获得ROW_NUMBER但也Group By首先的方法。所以我需要在关系表和外部查询中按最高sum价格分组的所有 T1价格。T2.IDt1ID

In other words: How to change MAX(Rel.t1ID)AS t1IDto somewhat returning the ID with the highest price?

换句话说:如何更改MAX(Rel.t1ID)AS t1ID为有点返回价格最高的ID?

So the desired result is(notice that first T1ID changed from 2 to 1 since it has the higher price):

所以期望的结果是(注意第一个 T1ID 从 2 变为 1,因为它的价格更高):

T2ID   T2Name   Orders  T1ID    T1Name  Price     
 1      aaa       25     1       AAA    350,00     
 2      bbb       35     3       CCC    100,00

Note: in case you're wondering why i don't multiply Orderswith Price: they are not realated(so i should have left off this column since it's a bit ambiguous, please ignore it, i've just added it to make all less abstract). Actually Ordersmust remain unchanged, that's the reason for the sub-query approach to join both and the reason why i need to group by in the first place.

注意:如果你想知道为什么我不乘以Orders价格:它们没有被重新计算(所以我应该离开这个列,因为它有点模棱两可,请忽略它,我只是添加它以减少所有抽象的)。实际上Orders必须保持不变,这就是子查询方法加入两者的原因,也是我首先需要分组的原因。

Conclusion: obviously the core of my question can be answered by the OVERclausethat can be applied to anyaggregate function like SUM(see Damien's answer) what was new to me. Thank you all for your working approaches.

结论:显然,我的问题的核心可以通过可以应用于任何聚合函数的OVER子句来回答,例如(参见Damien 的回答)对我来说是新的。感谢大家的工作方法。SUM

回答by Damien_The_Unbeliever

Wow, the other answers look complex - so I'm hoping I've not missed something obvious.

哇,其他答案看起来很复杂 - 所以我希望我没有遗漏一些明显的东西。

You can use OVER/PARTITION BYagainst aggregates, and they'll then do grouping/aggregating without a GROUP BYclause. So I just modified your query to:

您可以对聚合使用OVER/ PARTITION BY,然后它们将在没有GROUP BY子句的情况下进行分组/聚合。所以我只是将您的查询修改为:

select T2.ID AS T2ID
    ,T2.Name as T2Name
    ,T2.Orders
    ,T1.ID AS T1ID
    ,T1.Name As T1Name
    ,T1Sum.Price
FROM @t2 T2
INNER JOIN (
    SELECT Rel.t2ID
        ,Rel.t1ID
 --       ,MAX(Rel.t1ID)AS t1ID 
-- the MAX returns an arbitrary ID, what i need is: 
      ,ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
        ,SUM(Price)OVER(PARTITION BY Rel.t2ID) AS Price
        FROM @t1 T1 
        INNER JOIN @relation Rel ON Rel.t1ID=T1.ID
--        GROUP BY Rel.t2ID
)AS T1Sum ON  T1Sum.t2ID = T2.ID
INNER JOIN @t1 T1 ON T1Sum.t1ID=T1.ID
where t1Sum.PriceList = 1

Which gives the requested result set.

这给出了请求的结果集。

回答by Lieven Keersmaekers

Undoubtly this can be simplified but the results match your expectations.

毫无疑问,这可以简化,但结果符合您的期望。

The gist of this is to

这样做的要点是

  • Calculate the maximum price in a seperate CTEfor each t2ID
  • Calculate the total price in a seperate CTEfor each t2ID
  • Combine the results of both CTE's
  • 计算出的最高价格在一个单独的CTE每个t2ID
  • 单独CTE计算每个的总价t2ID
  • 结合两者的结果CTE

SQL Statement

SQL语句

;WITH MaxPrice AS ( 
    SELECT  t2ID
            , t1ID
    FROM    (       
                SELECT  t2.ID AS t2ID
                        , t1.ID AS t1ID
                        , rn = ROW_NUMBER() OVER (PARTITION BY t2.ID ORDER BY t1.Price DESC)
                FROM    @t1 t1
                        INNER JOIN @relation r ON r.t1ID = t1.ID        
                        INNER JOIN @t2 t2 ON t2.ID = r.t2ID
            ) maxt1
    WHERE   maxt1.rn = 1                            
)
, SumPrice AS (
    SELECT  t2ID = t2.ID
            , Price = SUM(Price)
    FROM    @t1 t1
            INNER JOIN @relation r ON r.t1ID = t1.ID
            INNER JOIN @t2 t2 ON t2.ID = r.t2ID
    GROUP BY
            t2.ID           
)           
SELECT  t2.ID
        , t2.Name
        , t2.Orders
        , mp.t1ID
        , t1.ID
        , t1.Name
        , sp.Price
FROM    @t2 t2
        INNER JOIN MaxPrice mp ON mp.t2ID = t2.ID
        INNER JOIN SumPrice sp ON sp.t2ID = t2.ID
        INNER JOIN @t1 t1 ON t1.ID = mp.t1ID

回答by Mikael Eriksson

;with C as
(
  select Rel.t2ID,
         Rel.t1ID,
         t1.Price,
         row_number() over(partition by Rel.t2ID order by t1.Price desc) as rn
  from @t1 as T1
    inner join @relation as Rel
      on T1.ID = Rel.t1ID
)
select T2.ID as T2ID,
       T2.Name as T2Name,
       T2.Orders,
       T1.ID as T1ID,
       T1.Name as T1Name,
       T1Sum.Price
from @t2 as T2
  inner join (
              select C1.t2ID,
                     sum(C1.Price) as Price,
                     C2.t1ID
              from C as C1
                inner join C as C2 
                  on C1.t2ID = C2.t2ID and
                     C2.rn = 1
              group by C1.t2ID, C2.t1ID
             ) as T1Sum
    on T2.ID = T1Sum.t2ID
  inner join @t1 as T1
    on T1.ID = T1Sum.t1ID

回答by Ed Harper

The deduplication (to select the max T1) and the aggregation need to be done as distinct steps. I've used a CTE since I think this makes it clearer:

重复数据删除(选择最大 T1)和聚合需要作为不同的步骤完成。我使用了 CTE,因为我认为这更清楚:

;WITH sumCTE
AS
(
    SELECT  Rel.t2ID, SUM(Price) price
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
    GROUP 
    BY      Rel.t2ID
)
,maxCTE
AS
(
    SELECT  Rel.t2ID, Rel.t1ID, 
            ROW_NUMBER()OVER(Partition By Rel.t2ID Order By Price DESC)As PriceList
    FROM    @t1         AS T1
    JOIN    @relation   AS Rel 
    ON      Rel.t1ID=T1.ID
)
SELECT T2.ID AS T2ID
,T2.Name as T2Name
,T2.Orders
,T1.ID AS T1ID
,T1.Name As T1Name
,sumT1.Price
FROM    @t2 AS T2
JOIN    sumCTE AS sumT1
ON      sumT1.t2ID = t2.ID
JOIN    maxCTE AS maxT1
ON      maxT1.t2ID = t2.ID
JOIN    @t1 AS T1
ON      T1.ID = maxT1.t1ID
WHERE   maxT1.PriceList = 1