Oracle MIN 作为分析函数 - ORDER BY 的奇怪行为?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7393974/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 00:18:52  来源:igfitidea点击:

Oracle MIN as analytic function - odd behavior with ORDER BY?

sqlsql-serveroracleoracle10ganalytic-functions

提问by Cade Roux

This particular case was distilled from an example where the programmer assumed that for two shipments into a tank car, line #1 would be loaded first. I corrected this to allow for the loading to be performed in any order - however, I discovered that MIN() OVER (PARTITION BY)allows an ORDER BYin Oracle (this is not allowed in SQL Server), and additionally, it alters the behavior of the function, causing the ORDER BYto apparently be added to the PARTITION BY.

这个特殊情况是从一个例子中提炼出来的,在这个例子中,程序员假设将两次装运到油罐车中,第 1 行将首先装载。我更正了这个以允许以任何顺序执行加载 - 但是,我发现MIN() OVER (PARTITION BY)允许ORDER BY在 Oracle 中(这在 SQL Server 中是不允许的),此外,它改变了函数的行为,导致ORDER BY显然是添加到PARTITION BY.

WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
 FROM data

A SQL Server example (with non-applicable code commented out):

SQL Server 示例(注释掉不适用的代码):

WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS -- FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS -- FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
-- ,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
-- ,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
-- ,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
-- ,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
 FROM data

So question: What is Oracle doing and why and is it right?

所以问题:甲骨文在做什么,为什么,它是对的?

回答by Justin Cave

If you add an ORDER BYto the MINanalytic function, you turn it into a "min so far" function rather than an overall minimum. For the final row for whatever you're partitioning by, the results will be the same. But the prior rows may have a different "min so far" than the overall minimum.

如果ORDER BYMIN解析函数添加,则会将其转换为“到目前为止的最小值”函数,而不是整体最小值。对于您分区的最后一行,结果将是相同的。但是前面的行可能具有与总体最小值不同的“到目前为止的最小值”。

Using the EMPtable as an example, you can see that the minimum salary so far for the department eventually converges on the overall minimum for the department. And you can see that the "min so far" value for any given department decreases as lower values are encountered.

以该EMP表为例,您可以看到该部门迄今为止的最低工资最终收敛于该部门的总体最低工资。并且您可以看到任何给定部门的“min so far”值随着遇到较低的值而减少。

SQL> ed
Wrote file afiedt.buf

  1  select ename,
  2         deptno,
  3         sal,
  4         min(sal) over (partition by deptno order by ename) min_so_far,
  5         min(sal) over (partition by deptno) min_overall
  6    from emp
  7*  order by deptno, ename
SQL> /

ENAME          DEPTNO        SAL MIN_SO_FAR MIN_OVERALL
---------- ---------- ---------- ---------- -----------
CLARK              10       2450       2450        1300
KING               10       5000       2450        1300
MILLER             10       1300       1300        1300
ADAMS              20       1110       1110         800
FORD               20       3000       1110         800
JONES              20       2975       1110         800
SCOTT              20       3000       1110         800
smith              20        800        800         800
ALLEN              30       1600       1600         950
BLAKE              30       2850       1600         950
MARTIN             30       1250       1250         950
SM0                30        950        950         950
TURNER             30       1500        950         950
WARD               30       1250        950         950
BAR
PAV

16 rows selected.

Of course, it would make more sense to use this form of the analytic function when you're trying to do something like compute a personal best that you can use as a comparison in future periods. If you're tracking an individual's decreasing golf scores, mile times, or weight, displaying personal bests can be a form of motivation.

当然,当您尝试执行诸如计算个人最佳成绩之类的事情时,使用这种形式的分析函数会更有意义,以便您可以将其用作未来时期的比较。如果您要跟踪个人高尔夫成绩、英里次数或体重的下降情况,展示个人最好成绩可能是一种动力。

SQL> ed
Wrote file afiedt.buf

  1  with golf_scores as
  2  (  select 1 golfer_id, 80 score, sysdate dt from dual union all
  3     select 1, 82, sysdate+1 dt from dual union all
  4     select 1, 72, sysdate+2 dt from dual union all
  5     select 1, 75, sysdate+3 dt from dual union all
  6     select 1, 71, sysdate+4 dt from dual union all
  7     select 2, 74, sysdate from dual )
  8  select golfer_id,
  9         score,
 10         dt,
 11         (case when score=personal_best
 12               then 'New personal best'
 13               else null
 14           end) msg
 15    from (
 16  select golfer_id,
 17         score,
 18         dt,
 19         min(score) over (partition by golfer_id
 20                              order by dt) personal_best
 21    from golf_scores
 22*        )
SQL> /

 GOLFER_ID      SCORE DT        MSG
---------- ---------- --------- -----------------
         1         80 12-SEP-11 New personal best
         1         82 13-SEP-11
         1         72 14-SEP-11 New personal best
         1         75 15-SEP-11
         1         71 16-SEP-11 New personal best
         2         74 12-SEP-11 New personal best

6 rows selected.