oracle 在什么条件下 ROWNUM=1 会显着提高“存在”syle 查询的性能
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1089961/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Under what conditions does ROWNUM=1 significantly increase performance in an "exists" syle query
提问by darreljnz
I read some of the discussion in this questionand thought to myself that in my PL/SQL code I have "exists" style queries all over the place that don't use the ROWNUM=1 optimisation.
我阅读了这个问题中的一些讨论,并在心里想,在我的 PL/SQL 代码中,我到处都有不使用 ROWNUM=1 优化的“存在”样式查询。
The questions I have are:
我的问题是:
- Does the introduction of ROWNUM=1 significantly increase performance?
- If so, under what conditions would performance be particularly improved (e.g lots of joins, constraints on unindexed columns, large tables, large result sets)
- 引入 ROWNUM=1 是否显着提高了性能?
- 如果是这样,在什么条件下性能会特别提高(例如,大量连接、对未索引列的约束、大表、大结果集)
I'm trying to determine of it is worth rewriting all of my existing queries to add a ROWNUM=1 optimisation.
我正在尝试确定是否值得重写我所有现有的查询以添加 ROWNUM=1 优化。
The queries I'm thinking of are ones that may have multiple joins and may query large tables. They have the general form of:
我正在考虑的查询可能有多个连接并且可能查询大表。它们的一般形式为:
SELECT 1
INTO ln_count
FROM table_1, table_2...., table_n
WHERE <various joins and conditions>;
IF ln_count > 0 THEN
<do stuff>
END IF;
I'm considering changing them to:
我正在考虑将它们更改为:
SELECT 1
INTO ln_count
FROM table_1, table_2...., table_n
WHERE <various joins and conditions>
AND ROWNUM = 1;
IF <local variable> > 0 THEN
<do stuff>
END IF;
回答by Michal Pravda
It does improve performance significantly (tens of percent average) on queries which cannnot be solved by simple single index lookup e.g. table joins. However it has a potential to hide data/application error.
它确实显着提高了查询的性能(平均百分之几十),而这些查询无法通过简单的单索引查找(例如表连接)来解决。然而,它有可能隐藏数据/应用程序错误。
Lets have a table:
让我们有一张桌子:
create table t (id number(10,0), padding varchar2(1000));
--intentionally don't use PK to make the example as simple as possible. The padding is used to simulate real data load in each record
-- 故意不使用 PK 使示例尽可能简单。padding 用于模拟每条记录中的真实数据负载
with a many records:
有很多记录:
insert into t (id, padding)
select rownum, rpad(' ', 1000) from dual connect by level < 10000
Now if you ask something like
现在如果你问类似的问题
select 1 into ll_exists
from t where id = 5;
the DB must go through the whole table whether it found the only matching record in the first data block (which by the way we cannot know because it could be inserted by many different ways) or in the last. That's because it doesn't know that there is only one matching record. On the other hand if you use ... and rownum = 1 than it can stop traversing through data after the record is found because you told it that there is not (or not needed) another matching record.
DB 必须遍历整个表,无论它是在第一个数据块中找到唯一匹配的记录(顺便说一下,我们无法知道,因为它可以通过多种不同的方式插入)还是在最后一个。那是因为它不知道只有一个匹配的记录。另一方面,如果您使用 ... 并且 rownum = 1 则它可以在找到记录后停止遍历数据,因为您告诉它没有(或不需要)另一个匹配的记录。
The drawback is that with the rownum constraint you may get undeterministic results if the data contains more than one possible record. If the query was
缺点是使用 rownum 约束,如果数据包含多个可能的记录,您可能会得到不确定的结果。如果查询是
select id into ll_id
from t where mod (id, 2) = 1
and rownum = 1;
then I may receive from the DB answer 1 as well as 3 as well as 123 ... order is not guaranteed and this is the consequence. (without the rownum clause I would get a TOO_MANY_ROWS exception. It depends on situation which one is worse)
那么我可能会从数据库中收到 1 和 3 以及 123 的答案......订单无法保证,这就是结果。(如果没有 rownum 子句,我会得到一个 TOO_MANY_ROWS 异常。这取决于哪种情况更糟)
If you really want query which tests existence then WRITE IT THAT WAY.
如果你真的想查询哪个测试存在,那么就这样写。
begin
select 'It does'
into ls_exists
from dual where
exists (your_original_query_without_rownum);
do_something_when_it_does_exist
exception
when no_data_found then
do_something_when_it_doesn't_exist
end;
回答by akf
One rule of thumb in optimization is to not do it unless you have a hotspot you need to fix. However, if you are curious about the performance benefits, you might want to run some tests using both to see if you can measure any improved performance.
优化的一个经验法则是不要这样做,除非您有一个需要修复的热点。但是,如果您对性能优势感到好奇,您可能希望同时使用两者运行一些测试,看看是否可以衡量任何改进的性能。
wikipedia quotesDonald Knuth as saying:
维基百科引用Donald Knuth 的话说:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
“我们应该忘记小效率,比如大约 97% 的时间:过早的优化是万恶之源。”
回答by Brian
I'm going to guess that this is not going to be worth your time. Modern optimizers are very good at what they do so I'd find it a bit amazing that a query that only is ALLOWED to return one row is going to see a significant performance boost from adding ROWNUM=1.
我猜这不值得你花时间。现代优化器非常擅长他们所做的事情,所以我发现只允许返回一行的查询通过添加 ROWNUM=1 会看到显着的性能提升,这有点令人惊讶。
Is the performance gain supposedly from removing the need to check for this constraint?
是否可以通过消除检查此约束的需要来提高性能?
I find when I stop trusting the optimizer I often dig myself a deeper grave ;)
我发现当我不再相信优化器时,我经常给自己挖一个更深的坟墓;)
Additionally: When in doubt try it out. Find a large join, run it several times without rownum=1, several times with rownum=1 and see if you're noticing a large percentage improvement. To ensure that there are no caching issues I would suggest doing this on a database that you are able to restart.
另外:如有疑问,请尝试。找到一个大连接,在没有 rownum=1 的情况下运行几次,在 rownum=1 的情况下运行几次,看看你是否注意到有很大的百分比改进。为了确保没有缓存问题,我建议在您能够重新启动的数据库上执行此操作。
回答by djb
While I like the highest voted answer, in an effort to avoid exceptions, I do something like this:
虽然我喜欢投票最高的答案,但为了避免例外,我会这样做:
begin
select count(*)
into ls_exists
from dual
where exists (select null from ... where ...);
if ls_exists = 1 then
do_something;
else
do_something_else;
end if;
end;
This will alwaysreturn a single row with a 1 or a 0. No exceptions.
这将始终返回带有 1 或 0 的单行。没有例外。
Note also the use of SELECT NULL. This does a couple of things; when the optimizer thinks it's appropriate, Oracle will only look at indexes. If you put table columns in the SELECT clause, and the columns are not part of an index, Oracle will do the index lookup, then get the row from the table, which may be totally useless depending on the query. Since you're only checking the existence of a row, you probably don't need actual data.
还要注意 SELECT NULL 的使用。这做了几件事;当优化器认为合适时,Oracle 将只查看索引。如果将表列放在 SELECT 子句中,并且这些列不是索引的一部分,Oracle 将进行索引查找,然后从表中获取行,这可能完全无用,具体取决于查询。由于您只是检查行是否存在,因此您可能不需要实际数据。
SELECT NULL doesn't return any data (an optimization), and the EXISTS clause looks at whether a row is returned, not the data in the row.
SELECT NULL 不返回任何数据(优化),并且 EXISTS 子句查看是否返回行,而不是行中的数据。
回答by WW.
If you ask for COUNT(1) then Oracle must find all the matching rows to satisfy your exact answer.
如果您要求 COUNT(1),那么 Oracle 必须找到所有匹配的行来满足您的确切答案。
SELECT COUNT(1) FROM ....
If you ask for 1 from the first row, Oracle can stop once it has found one matching row.
如果您从第一行请求 1,Oracle 会在找到一个匹配行后停止。
SELECT 1 FROM ... WHERE ROWNUM = 1
It is good practice to only ask for the data that you actually need. Why get Oracle to tell you there are 1,203,499 matching results when you only care about the first one? People have mentioned that the optimizer can improve things. However, it still has to answer the question you asked. If you ask an easier question it can come up with the answer quicker.
仅询问您实际需要的数据是一种很好的做法。只关心第一个匹配结果,为什么要让 Oracle 告诉你有 1,203,499 个匹配结果?人们提到优化器可以改进事物。但是,它仍然必须回答您提出的问题。如果你问一个更简单的问题,它可以更快地给出答案。
The times it is likely to have a significant impact on performance:- * The actual count of records you find is high, * Oracle switches from a HASH JOIN plan to a NESTED LOOP, and the nested loop plan is better for finding the first row
它可能对性能产生重大影响的次数:- * 你找到的实际记录数很高, * Oracle 从 HASH JOIN 计划切换到 NESTED LOOP,嵌套循环计划更适合查找第一行
回答by Jeffrey Kemp
Does the introduction of ROWNUM=1 significantly increase performance?
引入 ROWNUM=1 是否显着提高了性能?
It can make a very significant difference. If you are only interested in the first row the database happens to find when it runs the query, then it is best practice to tell Oracle that, by adding "ROWNUM=1". If you don't, Oracle will assume you intend to fetch all the rows from the query eventually, and will optimise the query accordingly.
它可以产生非常显着的差异。如果您只对数据库在运行查询时碰巧找到的第一行感兴趣,那么最佳做法是通过添加“ROWNUM=1”来告诉 Oracle。如果不这样做,Oracle 将假定您打算最终从查询中获取所有行,并相应地优化查询。
In the case of COUNT(), if you only want to know if there is at least one record, the query optimiser won't know that and will count each and every row - a waste of time. If you add ROWNUM=1, you have given the optimiser the chance to stop as soon as it finds a row.
在 COUNT() 的情况下,如果您只想知道是否至少有一条记录,则查询优化器不会知道这一点,并且会计算每一行 - 浪费时间。如果您添加 ROWNUM=1,您就给了优化器在发现行后立即停止的机会。
If so, under what conditions would performance be particularly improved (e.g lots of joins, constraints on unindexed columns, large tables, large result sets)
如果是这样,在什么条件下性能会特别提高(例如,大量连接、对未索引列的约束、大表、大结果集)
The more data the query needs to plow through to answer your query, the more the ROWNUM=1 predicate is likely to improve performance. In a multi-table join, for example, adding ROWNUM=1 could change the plan from using a lot of expensive hash joins, to make it use nested loops instead, which are far quicker when rows exist.
查询需要处理的数据越多来回答您的查询,ROWNUM=1 谓词就越有可能提高性能。例如,在多表连接中,添加 ROWNUM=1 可以将计划从使用大量昂贵的散列连接更改为使用嵌套循环,当行存在时,嵌套循环要快得多。