Oracle MIN 作为分析函数 - ORDER BY 的奇怪行为?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/7393974/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle MIN as analytic function - odd behavior with ORDER BY?
提问by Cade Roux
This particular case was distilled from an example where the programmer assumed that for two shipments into a tank car, line #1 would be loaded first. I corrected this to allow for the loading to be performed in any order - however, I discovered that MIN() OVER (PARTITION BY)
allows an ORDER BY
in Oracle (this is not allowed in SQL Server), and additionally, it alters the behavior of the function, causing the ORDER BY
to apparently be added to the PARTITION BY
.
这个特殊情况是从一个例子中提炼出来的,在这个例子中,程序员假设将两次装运到油罐车中,第 1 行将首先装载。我更正了这个以允许以任何顺序执行加载 - 但是,我发现MIN() OVER (PARTITION BY)
允许ORDER BY
在 Oracle 中(这在 SQL Server 中是不允许的),此外,它改变了函数的行为,导致ORDER BY
显然是添加到PARTITION BY
.
WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
FROM data
A SQL Server example (with non-applicable code commented out):
SQL Server 示例(注释掉不适用的代码):
WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS -- FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS -- FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
-- ,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
-- ,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
-- ,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
-- ,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
FROM data
So question: What is Oracle doing and why and is it right?
所以问题:甲骨文在做什么,为什么,它是对的?
回答by Justin Cave
If you add an ORDER BY
to the MIN
analytic function, you turn it into a "min so far" function rather than an overall minimum. For the final row for whatever you're partitioning by, the results will be the same. But the prior rows may have a different "min so far" than the overall minimum.
如果ORDER BY
向MIN
解析函数添加,则会将其转换为“到目前为止的最小值”函数,而不是整体最小值。对于您分区的最后一行,结果将是相同的。但是前面的行可能具有与总体最小值不同的“到目前为止的最小值”。
Using the EMP
table as an example, you can see that the minimum salary so far for the department eventually converges on the overall minimum for the department. And you can see that the "min so far" value for any given department decreases as lower values are encountered.
以该EMP
表为例,您可以看到该部门迄今为止的最低工资最终收敛于该部门的总体最低工资。并且您可以看到任何给定部门的“min so far”值随着遇到较低的值而减少。
SQL> ed
Wrote file afiedt.buf
1 select ename,
2 deptno,
3 sal,
4 min(sal) over (partition by deptno order by ename) min_so_far,
5 min(sal) over (partition by deptno) min_overall
6 from emp
7* order by deptno, ename
SQL> /
ENAME DEPTNO SAL MIN_SO_FAR MIN_OVERALL
---------- ---------- ---------- ---------- -----------
CLARK 10 2450 2450 1300
KING 10 5000 2450 1300
MILLER 10 1300 1300 1300
ADAMS 20 1110 1110 800
FORD 20 3000 1110 800
JONES 20 2975 1110 800
SCOTT 20 3000 1110 800
smith 20 800 800 800
ALLEN 30 1600 1600 950
BLAKE 30 2850 1600 950
MARTIN 30 1250 1250 950
SM0 30 950 950 950
TURNER 30 1500 950 950
WARD 30 1250 950 950
BAR
PAV
16 rows selected.
Of course, it would make more sense to use this form of the analytic function when you're trying to do something like compute a personal best that you can use as a comparison in future periods. If you're tracking an individual's decreasing golf scores, mile times, or weight, displaying personal bests can be a form of motivation.
当然,当您尝试执行诸如计算个人最佳成绩之类的事情时,使用这种形式的分析函数会更有意义,以便您可以将其用作未来时期的比较。如果您要跟踪个人高尔夫成绩、英里次数或体重的下降情况,展示个人最好成绩可能是一种动力。
SQL> ed
Wrote file afiedt.buf
1 with golf_scores as
2 ( select 1 golfer_id, 80 score, sysdate dt from dual union all
3 select 1, 82, sysdate+1 dt from dual union all
4 select 1, 72, sysdate+2 dt from dual union all
5 select 1, 75, sysdate+3 dt from dual union all
6 select 1, 71, sysdate+4 dt from dual union all
7 select 2, 74, sysdate from dual )
8 select golfer_id,
9 score,
10 dt,
11 (case when score=personal_best
12 then 'New personal best'
13 else null
14 end) msg
15 from (
16 select golfer_id,
17 score,
18 dt,
19 min(score) over (partition by golfer_id
20 order by dt) personal_best
21 from golf_scores
22* )
SQL> /
GOLFER_ID SCORE DT MSG
---------- ---------- --------- -----------------
1 80 12-SEP-11 New personal best
1 82 13-SEP-11
1 72 14-SEP-11 New personal best
1 75 15-SEP-11
1 71 16-SEP-11 New personal best
2 74 12-SEP-11 New personal best
6 rows selected.