Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/5636507/
SELECTing top N rows without ROWNUM?
Asked by Pew
I hope you can help me with my homework :)
We need to build a query that outputs the top N best-paid employees.
My version works perfectly fine. For example, the top 3:
SELECT name, salary
FROM staff
WHERE salary IN ( SELECT *
                  FROM ( SELECT salary
                         FROM staff
                         ORDER BY salary DESC )
                  WHERE ROWNUM <= 3 )
ORDER BY salary DESC
;
Note that this also outputs every employee whose salary ties with one of the top 3 salaries.
1: Mike, 4080
2: Steve, 2800
2: Susan, 2800
2: Hyman, 2800
3: Chloe, 1400
But now our teacher does not allow us to use ROWNUM.
I searched far and wide and didn't find anything usable.
My second solution, thanks to Justin Cave's hint.
First I tried this:
SELECT name, salary, ( rank() OVER ( ORDER BY salary DESC ) ) as myorder
FROM staff
WHERE myorder <= 3
;
The error message is: "myorder: invalid identifier"
Thanks to DCookie it's now clear:
"[...] Analytics are applied AFTER the where clause is evaluated, which is why you get the error that myorder is an invalid identifier."
Wrapping another SELECT around it solves this:
SELECT *
FROM ( SELECT name, salary, rank() OVER ( ORDER BY salary DESC ) as myorder FROM staff )
WHERE myorder <= 3
;
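To try the scoping rule without an Oracle instance, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in. Assumptions on my part: SQLite 3.25 or later (needed for window functions), an error message that differs from Oracle's, and a staff table filled with the rows from the sample output above.

```python
import sqlite3

# Sample rows copied from the question's example output.
STAFF = [("Mike", 4080), ("Steve", 2800), ("Susan", 2800),
         ("Hyman", 2800), ("Chloe", 1400)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", STAFF)

# Filtering on the window-function alias in the same query block is rejected,
# because the WHERE clause is evaluated before the window function.
try:
    conn.execute("""
        SELECT name, salary, rank() OVER (ORDER BY salary DESC) AS myorder
        FROM staff
        WHERE myorder <= 3
    """).fetchall()
    direct_filter_failed = False
except sqlite3.OperationalError as exc:
    direct_filter_failed = True
    print("direct filter failed:", exc)

# Wrapping the ranked query in an outer SELECT makes the alias an ordinary column.
wrapped = conn.execute("""
    SELECT name, salary, myorder
    FROM ( SELECT name, salary, rank() OVER (ORDER BY salary DESC) AS myorder
           FROM staff )
    WHERE myorder <= 3
    ORDER BY myorder, name
""").fetchall()
print(wrapped)
```

With RANK, the three employees at 2800 all get rank 2, so the wrapped query returns four rows for this data.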
My teacher strikes again and doesn't allow such exotic analytic functions.
Third solution, from @Justin Cave.
"If analytic functions are also disallowed, the other option I could imagine-- one that you would never, ever, ever actually write in practice, would be something like"
SELECT name, salary
FROM staff s1
WHERE (SELECT COUNT(*)
       FROM staff s2
       WHERE s1.salary < s2.salary) <= 3
Answered by Justin Cave
Since this is homework, here is a hint rather than an answer. You'll want to use analytic functions. ROW_NUMBER, RANK, or DENSE_RANK can work depending on how you want to handle ties.
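The difference between the three functions only shows up on ties. A small sketch, again assuming Python's sqlite3 (SQLite 3.25 or later) as a stand-in for Oracle and the sample rows from the question. The extra sort on name inside ROW_NUMBER is my own addition to make the tie order deterministic.

```python
import sqlite3

# Sample rows copied from the question's example output.
STAFF = [("Mike", 4080), ("Steve", 2800), ("Susan", 2800),
         ("Hyman", 2800), ("Chloe", 1400)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", STAFF)

rows = conn.execute("""
    SELECT name, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS rn,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS drnk
    FROM staff
    ORDER BY salary DESC, name
""").fetchall()

for row in rows:
    print(row)

# Names kept by a "<= 3" filter on each ranking column.
top3_row_number = [r[0] for r in rows if r[2] <= 3]  # exactly 3 rows, ties cut arbitrarily
top3_rank       = [r[0] for r in rows if r[3] <= 3]  # ties share a rank, next rank skips
top3_dense_rank = [r[0] for r in rows if r[4] <= 3]  # ties share a rank, no gaps
```

On this data ROW_NUMBER keeps 3 rows, RANK keeps 4 (ranks 1, 2, 2, 2, then 5), and DENSE_RANK keeps all 5 (ranks 1, 2, 2, 2, 3), which is the numbering shown in the question's sample output.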
If analytic functions are also disallowed, the other option I could imagine-- one that you would never, ever, ever actually write in practice, would be something like
SELECT name, salary
FROM staff s1
WHERE (SELECT COUNT(*)
       FROM staff s2
       WHERE s1.salary < s2.salary) <= 3
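The counting trick works because the number of strictly higher salaries, plus one, is exactly what RANK() returns. So "fewer than N higher salaries" selects the same rows as RANK() <= N, while a threshold of <= N on the count corresponds to RANK() <= N + 1. A minimal sketch of that equivalence, assuming Python's sqlite3 (SQLite 3.25 or later) as a stand-in for Oracle and the sample rows from the question:

```python
import sqlite3

# Sample rows copied from the question's example output.
STAFF = [("Mike", 4080), ("Steve", 2800), ("Susan", 2800),
         ("Hyman", 2800), ("Chloe", 1400)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", STAFF)

N = 3

# Correlated subquery: keep a row if fewer than N rows have a strictly higher salary.
by_count = conn.execute("""
    SELECT name, salary
    FROM staff s1
    WHERE (SELECT COUNT(*)
           FROM staff s2
           WHERE s2.salary > s1.salary) < ?
    ORDER BY salary DESC, name
""", (N,)).fetchall()

# Same selection expressed with the analytic function, for comparison.
by_rank = conn.execute("""
    SELECT name, salary
    FROM ( SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rnk
           FROM staff )
    WHERE rnk <= ?
    ORDER BY salary DESC, name
""", (N,)).fetchall()

print(by_count)
print(by_count == by_rank)
```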
With regard to performance, I wouldn't rely on the COST number from the query plan-- that's only an estimate and it is not generally possible to compare the cost between plans for different SQL statements. You're much better off looking at something like the number of consistent gets the query actually does and considering how the query performance will scale as the number of rows in the table increases. The third option is going to be radically less efficient than the other two simply because it needs to scan the STAFF table twice.
I don't have your STAFF table, so I'll use the EMP table from the SCOTT schema.
The analytic function solution actually does 7 consistent gets, as does the ROWNUM solution:
SQL> select ename, sal
  2  from( select ename,
  3               sal,
  4               rank() over (order by sal) rnk
  5        from emp )
  6  where rnk <= 3;
ENAME SAL
---------- ----------
smith 800
SM0 950
ADAMS 1110
Execution Plan
----------------------------------------------------------
Plan hash value: 3291446077
--------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |   14 |   672 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |      |   14 |   672 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |   14 |   140 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | EMP  |   14 |   140 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNK"<=3)
2 - filter(RANK() OVER ( ORDER BY "SAL")<=3)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
668 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed
SQL> select ename, sal
2 from( select ename, sal
3 from emp
4 order by sal )
5 where rownum <= 3;
ENAME SAL
---------- ----------
smith 800
SM0 950
ADAMS 1110
Execution Plan
----------------------------------------------------------
Plan hash value: 1744961472
-------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |    3 |   105 |     4  (25)| 00:00:01 |
|*  1 |  COUNT STOPKEY          |      |      |       |            |          |
|   2 |   VIEW                  |      |   14 |   490 |     4  (25)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY|      |   14 |   140 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL   | EMP  |   14 |   140 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=3)
3 - filter(ROWNUM<=3)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
668 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed
The COUNT(*) solution, however, actually does 99 consistent gets and has to do a full scan of the table twice so it is more than 10 times less efficient. And it will scale much worse as the number of rows in the table increases
SQL> select ename, sal
2 from emp e1
3 where (select count(*) from emp e2 where e1.sal < e2.sal) <= 3;
ENAME SAL
---------- ----------
JONES 2975
SCOTT 3000
KING 5000
FORD 3000
FOO
Execution Plan
----------------------------------------------------------
Plan hash value: 2649664444
---------------------------------------------------------------------------
| Id  | Operation           | Name | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   14 |   140 |    24   (0)| 00:00:01 |
|*  1 |  FILTER             |      |      |       |            |          |
|   2 |   TABLE ACCESS FULL | EMP  |   14 |   140 |     3   (0)| 00:00:01 |
|   3 |   SORT AGGREGATE    |      |    1 |     4 |            |          |
|*  4 |    TABLE ACCESS FULL| EMP  |    1 |     4 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( (SELECT COUNT(*) FROM "EMP" "E2" WHERE
"E2"."SAL">:B1)<=3)
4 - filter("E2"."SAL">:B1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
99 consistent gets
0 physical reads
0 redo size
691 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed
Answered by DCookie
The reason you must wrap the statement in another SELECT is that the outer SELECT is the one that limits your result set to the row numbers desired. Here's a helpful link on analytics. If you run the inner SELECT by itself you'll see why you have to do this. Analytics are applied AFTER the where clause is evaluated, which is why you get the error that myorder is an invalid identifier.
Answered by Andrey Frolov
Oracle? What about window functions?
select * from
(SELECT s.*, row_number() over (order by salary desc) as rn FROM staff s)
where rn <= 3
Answered by Epicurist
When you use count(distinct <exp>), equal top salaries are treated as tied ranks.
select NAME, SALARY
from STAFF STAFF1
where 3 >= ( select count(distinct STAFF2.SALARY) RANK
             from STAFF STAFF2
             where STAFF2.SALARY >= STAFF1.SALARY)
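Counting the distinct salaries at or above a row's own salary gives the same number as DENSE_RANK() over salaries in descending order, so this query keeps every employee on the three highest distinct salary levels. A minimal sketch of the comparison, assuming Python's sqlite3 (SQLite 3.25 or later) as a stand-in for Oracle and the sample rows from the question:

```python
import sqlite3

# Sample rows copied from the question's example output.
STAFF = [("Mike", 4080), ("Steve", 2800), ("Susan", 2800),
         ("Hyman", 2800), ("Chloe", 1400)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", STAFF)

# Correlated subquery: number of distinct salary levels at or above this row's salary.
by_distinct_count = conn.execute("""
    SELECT name, salary
    FROM staff s1
    WHERE 3 >= ( SELECT COUNT(DISTINCT s2.salary)
                 FROM staff s2
                 WHERE s2.salary >= s1.salary )
    ORDER BY salary DESC, name
""").fetchall()

# Same selection expressed with DENSE_RANK, for comparison.
by_dense_rank = conn.execute("""
    SELECT name, salary
    FROM ( SELECT name, salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS drnk
           FROM staff )
    WHERE drnk <= 3
    ORDER BY salary DESC, name
""").fetchall()

print(by_distinct_count)
print(by_distinct_count == by_dense_rank)
```

On the sample data both queries return all five employees, since there are only three distinct salary levels.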
Answered by russ
You could solve this in Oracle 12c:
select NAME, SALARY
from STAFF
order by SALARY DESC
FETCH FIRST 3 ROWS ONLY
(FETCH FIRST syntax is new with Oracle 12c)
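Note that ROWS ONLY cuts the result at exactly N rows, so tied salaries at the boundary can be dropped. The 12c row limiting clause also accepts WITH TIES in place of ONLY, which keeps the rows that tie with the last one. The following is only a plain-Python emulation of those two semantics, not Oracle code. The fetch_first helper is hypothetical, and the order among tied rows (input order here) is an assumption, since a database gives no such guarantee without a tie-breaking sort key.

```python
# Sample rows copied from the question's example output.
STAFF = [("Mike", 4080), ("Steve", 2800), ("Susan", 2800),
         ("Hyman", 2800), ("Chloe", 1400)]

def fetch_first(rows, n, with_ties=False):
    """Emulate ORDER BY salary DESC FETCH FIRST n ROWS ONLY / WITH TIES."""
    # sorted() is stable, so rows with equal salaries keep their input order.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    head = ordered[:n]
    if with_ties and head:
        cutoff = head[-1][1]
        # Also keep the following rows whose salary equals the last kept salary.
        head += [r for r in ordered[n:] if r[1] == cutoff]
    return head

only_rows = fetch_first(STAFF, 3)                   # exactly three rows
ties_rows = fetch_first(STAFF, 3, with_ties=True)   # three rows plus boundary ties

print(only_rows)
print(ties_rows)
```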