MySQL SQL OVER() clause - when and why is it useful?

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6218902/

Time: 2020-08-31 20:07:10  Source: igfitidea

The SQL OVER() clause - when and why is it useful?

mysql sql sql-server aggregate-functions clause

Asked by WithFlyingColors

USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
    ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
    ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail 
WHERE SalesOrderID IN(43659,43664);

I read about that clause and I don't understand why I need it. What does the OVER function do? What does PARTITION BY do? Why can't I write the query with GROUP BY SalesOrderID instead?

Answered by Andriy M

You can use GROUP BY SalesOrderID. The difference is that with GROUP BY you can only have aggregated values for the columns that are not included in GROUP BY.

In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, although you are not doing that in your example query, you could retrieve both individual OrderQty values and their sums, counts, averages etc. over groups of rows with the same SalesOrderID.
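
The difference in row counts can be sketched as follows. This is only an illustration, not the asker's setup: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or later, which added window functions) as a stand-in for SQL Server, and the table contents are made-up rows.

```python
import sqlite3

# Hypothetical stand-in for Sales.SalesOrderDetail; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 11, 3), (2, 10, 5)],
)

# GROUP BY collapses each order to a single row.
grouped = conn.execute(
    "SELECT SalesOrderID, SUM(OrderQty) FROM SalesOrderDetail "
    "GROUP BY SalesOrderID ORDER BY SalesOrderID"
).fetchall()

# The windowed SUM keeps every detail row and repeats the order total on it.
windowed = conn.execute(
    "SELECT SalesOrderID, ProductID, OrderQty, "
    "       SUM(OrderQty) OVER (PARTITION BY SalesOrderID) AS Total "
    "FROM SalesOrderDetail ORDER BY SalesOrderID, ProductID"
).fetchall()

print(grouped)   # two rows, one per order
print(windowed)  # three rows, one per detail line, each carrying its order total
```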

Here's a practical example of why windowed aggregates are great. Suppose you need to calculate what percent of a total every value is. Without windowed aggregates you'd have to first derive a list of aggregated values and then join it back to the original rowset, i.e. like this:

SELECT
  orig.[Partition],
  orig.Value,
  orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
  INNER JOIN (
    SELECT
      [Partition],
      SUM(Value) AS TotalValue
    FROM OriginalRowset
    GROUP BY [Partition]
  ) agg ON orig.[Partition] = agg.[Partition]

Now look how you can do the same with a windowed aggregate:

SELECT
  [Partition],
  Value,
  Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig

Much easier and cleaner, isn't it?
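
To check that the two forms really return the same rows, here is a small sketch. It assumes SQLite (3.25+) through Python's sqlite3 module rather than SQL Server, and the three rows in OriginalRowset are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite accepts the [bracketed] identifier quoting used in the queries above.
conn.execute("CREATE TABLE OriginalRowset ([Partition] TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO OriginalRowset VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 50)],  # made-up sample rows
)

# Derived table of totals joined back to the original rowset.
join_sql = """
SELECT orig.[Partition], orig.Value,
       orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
  INNER JOIN (
    SELECT [Partition], SUM(Value) AS TotalValue
    FROM OriginalRowset
    GROUP BY [Partition]
  ) agg ON orig.[Partition] = agg.[Partition]
ORDER BY orig.[Partition], orig.Value
"""

# Same calculation with a windowed aggregate, no join needed.
window_sql = """
SELECT [Partition], Value,
       Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset
ORDER BY [Partition], Value
"""

join_rows = conn.execute(join_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()
print(join_rows == window_rows)
```

The ORDER BY clauses are there only so the two result lists can be compared directly.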

Answered by gbn

The OVER clause is powerful in that you can have aggregates over different ranges ("windowing"), whether you use a GROUP BY or not.

Example: get count per SalesOrderID and count of all

SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) AS 'Count'
    ,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail 
WHERE
     SalesOrderID IN(43659,43664)
GROUP BY
     SalesOrderID, ProductID, OrderQty

Get different COUNTs, no GROUP BY

SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
    ,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct'
    ,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail 
WHERE
     SalesOrderID IN(43659,43664)
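
A runnable sketch of the same idea, with each window counting over a different range in a single pass. It assumes SQLite (3.25+) via Python's sqlite3 instead of SQL Server, and the rows are hypothetical, not AdventureWorks data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 11, 3), (2, 10, 5)],  # made-up rows
)

# Three windows in one SELECT: per order, per product, and the whole result set.
counts = conn.execute("""
SELECT SalesOrderID, ProductID,
       COUNT(OrderQty) OVER (PARTITION BY SalesOrderID) AS CountQtyPerOrder,
       COUNT(OrderQty) OVER (PARTITION BY ProductID)    AS CountQtyPerProduct,
       COUNT(*)        OVER ()                          AS CountAll
FROM SalesOrderDetail
ORDER BY SalesOrderID, ProductID
""").fetchall()

for row in counts:
    print(row)
```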

Answered by Tom H

If you only wanted to GROUP BY the SalesOrderID then you wouldn't be able to include the ProductID and OrderQty columns in the SELECT clause.

The PARTITION BY clause lets you break up your aggregate functions. One obvious and useful example would be if you wanted to generate line numbers for order lines on an order:

SELECT
    O.order_id,
    O.order_date,
    ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,
    OL.product_id
FROM
    Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id

(My syntax might be off slightly. In SQL Server, ROW_NUMBER() also requires an ORDER BY inside the OVER clause.)

You would then get back something like:

order_id    order_date    line_item_no    product_id
--------    ----------    ------------    ----------
    1       2011-05-02         1              5
    1       2011-05-02         2              4
    1       2011-05-02         3              7
    2       2011-05-12         1              8
    2       2011-05-12         2              1
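
Here is a sketch that reproduces the numbering above with an explicit ORDER BY inside OVER. It runs on SQLite (3.25+) through Python's sqlite3, which is an assumption made for runnability, and the line_id column is hypothetical: it is added only to give the order lines a stable order within each order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_date TEXT);
-- line_id is a hypothetical column, assumed here so the lines have a stable order.
CREATE TABLE Order_Lines (line_id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER);
INSERT INTO Orders VALUES (1, '2011-05-02'), (2, '2011-05-12');
INSERT INTO Order_Lines VALUES (1, 1, 5), (2, 1, 4), (3, 1, 7), (4, 2, 8), (5, 2, 1);
""")

# The numbering restarts at 1 whenever order_id changes (a new partition).
line_items = conn.execute("""
SELECT O.order_id, O.order_date,
       ROW_NUMBER() OVER (PARTITION BY O.order_id ORDER BY OL.line_id) AS line_item_no,
       OL.product_id
FROM Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id
ORDER BY O.order_id, line_item_no
""").fetchall()

for row in line_items:
    print(row)
```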

Answered by Sanjay Singh

Let me explain with an example and you would be able to see how it works.

Assuming you have the following table DIM_EQUIPMENT:

VIN         MAKE    MODEL   YEAR    COLOR
-----------------------------------------
1234ASDF    Ford    Taurus  2008    White
1234JKLM    Chevy   Truck   2005    Green
5678ASDF    Ford    Mustang 2008    Yellow

Run the SQL below:

SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR ,
  COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT

The result would be as below

VIN         MAKE    MODEL   YEAR    COLOR     COUNT2
----------------------------------------------------
1234JKLM    Chevy   Truck   2005    Green     1
5678ASDF    Ford    Mustang 2008    Yellow    2
1234ASDF    Ford    Taurus  2008    White     2

See what happened.

You are able to count per YEAR without a GROUP BY, and the count is matched to each row.

Another interesting way to get the same result is to use a WITH clause, as below. WITH works as an in-line view and can simplify queries, especially complex ones, which is not the case here since I am just trying to show the usage.

 WITH EQ AS
  ( SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
  )
SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR,
  COUNT2
FROM DIM_EQUIPMENT,
  EQ
WHERE EQ.YEAR2=DIM_EQUIPMENT.YEAR;
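
As a quick check that the two approaches agree, the sketch below loads the DIM_EQUIPMENT rows shown earlier and runs both queries. It assumes SQLite (3.25+) via Python's sqlite3 as the engine, and it writes the join with explicit JOIN ... ON syntax instead of the comma join, which is a stylistic swap on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DIM_EQUIPMENT (VIN TEXT, MAKE TEXT, MODEL TEXT, YEAR INTEGER, COLOR TEXT)"
)
conn.executemany(
    "INSERT INTO DIM_EQUIPMENT VALUES (?, ?, ?, ?, ?)",
    [
        ("1234ASDF", "Ford", "Taurus", 2008, "White"),
        ("1234JKLM", "Chevy", "Truck", 2005, "Green"),
        ("5678ASDF", "Ford", "Mustang", 2008, "Yellow"),
    ],
)

# Windowed count: one pass, no join.
window_rows = conn.execute("""
SELECT VIN, MAKE, MODEL, YEAR, COLOR,
       COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT
ORDER BY YEAR, VIN
""").fetchall()

# WITH clause: aggregate per year first, then join the counts back.
cte_rows = conn.execute("""
WITH EQ AS (
    SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
)
SELECT VIN, MAKE, MODEL, YEAR, COLOR, COUNT2
FROM DIM_EQUIPMENT
JOIN EQ ON EQ.YEAR2 = DIM_EQUIPMENT.YEAR
ORDER BY YEAR, VIN
""").fetchall()

print(window_rows == cte_rows)
```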

Answered by maple_shaft

The OVER clause, when combined with PARTITION BY, states that the preceding function call must be done analytically by evaluating the returned rows of the query. Think of it as an inline GROUP BY statement.

OVER (PARTITION BY SalesOrderID) is stating that for the SUM, AVG, etc. functions, return the value OVER a subset of the returned records from the query, and PARTITION that subset BY the foreign key SalesOrderID.

So we will SUM every OrderQty record for EACH UNIQUE SalesOrderID, and that column name will be called 'Total'.

It is a MUCH more efficient means than using multiple inline views to find out the same information. You can put this query within an inline view and filter on Total then.

SELECT ...
FROM (your query) inlineview
WHERE Total < 200
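
The inline view is needed because a windowed aggregate cannot be referenced directly in WHERE. A minimal sketch of both points follows, assuming SQLite (3.25+) via Python's sqlite3 rather than SQL Server, with invented quantities chosen so that one order totals 250 and the other 80.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 10, 150), (1, 11, 100), (2, 10, 50), (2, 12, 30)],  # made-up rows
)

# The windowed query goes inside an inline view so the outer WHERE can see Total.
filtered = conn.execute("""
SELECT SalesOrderID, ProductID, OrderQty, Total
FROM (
    SELECT SalesOrderID, ProductID, OrderQty,
           SUM(OrderQty) OVER (PARTITION BY SalesOrderID) AS Total
    FROM SalesOrderDetail
) inlineview
WHERE Total < 200
ORDER BY SalesOrderID, ProductID
""").fetchall()
print(filtered)  # only the lines of order 2 remain

# Putting the window function straight into WHERE is rejected by the engine.
try:
    conn.execute(
        "SELECT SalesOrderID FROM SalesOrderDetail "
        "WHERE SUM(OrderQty) OVER (PARTITION BY SalesOrderID) < 200"
    )
    rejected = False
except sqlite3.Error:
    rejected = True
print(rejected)
```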

Answered by Elshan

  • Also called the query partition clause.
  • Similar to the GROUP BY clause:

    • breaks up data into chunks (or partitions)
    • chunks are separated by partition bounds
    • the function performs within partitions
    • the function is re-initialised when crossing a partition boundary

Syntax:
function (...) OVER (PARTITION BY col1, col3, ...)

  • Functions

    • Familiar functions such as COUNT(), SUM(), MIN(), MAX(), etc.
    • New functions as well (e.g. ROW_NUMBER(), RATIO_TO_REPORT(), etc.)


More info with examples: http://msdn.microsoft.com/en-us/library/ms189461.aspx


Answered by Алексей Неудачин

prkey   whatsthat               cash   
890    "abb                "   32  32
43     "abbz               "   2   34
4      "bttu               "   1   35
45     "gasstuff           "   2   37
545    "gasz               "   5   42
80009  "hoo                "   9   51
2321   "ibm                "   1   52
998    "krk                "   2   54
42     "kx-5010            "   2   56
32     "lto                "   4   60
543    "mp                 "   5   65
465    "multipower         "   2   67
455    "O.N.               "   1   68
7887   "prem               "   7   75
434    "puma               "   3   78
23     "retractble         "   3   81
242    "Trujillo's stuff   "   4   85

That's the result of a query. The source table is the same except that it has no last column. That column is a running sum of the third one.

Query:

SELECT prkey,whatsthat,cash,SUM(cash) over (order by whatsthat)
    FROM public.iuk order by whatsthat,prkey
    ;

(table goes as public.iuk)
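
A small sketch of the running sum, using the first four rows of the result above. It assumes SQLite (3.25+) via Python's sqlite3 in place of the original database, stores the names without their trailing padding, and gives the last column the alias running_sum, which is my own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE iuk (prkey INTEGER, whatsthat TEXT, cash INTEGER)")
conn.executemany(
    "INSERT INTO iuk VALUES (?, ?, ?)",
    [(4, "bttu", 1), (890, "abb", 32), (45, "gasstuff", 2), (43, "abbz", 2)],
)

# ORDER BY inside OVER turns SUM into a cumulative total in that order.
running = conn.execute("""
SELECT prkey, whatsthat, cash,
       SUM(cash) OVER (ORDER BY whatsthat) AS running_sum
FROM iuk
ORDER BY whatsthat, prkey
""").fetchall()

for row in running:
    print(row)
```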

sql version:  2012

It's a little above dBase (1986) level; I don't know why 25+ years were needed to finish it.
