Original source: http://stackoverflow.com/questions/234622/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
How to use Explain Plan to optimize queries?
Asked by Jacob Schoen
I have been tasked with optimizing some SQL queries at work. Everything I have found points to using Explain Plan to identify problem areas. The problem is that I cannot find out exactly what the explain plan is telling me. You get Cost, Cardinality, and Bytes.
What do these indicate, and how should I be using them as a guide? Are low numbers better? High better? Any input would be greatly appreciated.
Or if you have a better way to go about optimizing a query, I would be interested.
Accepted answer by carson
You actually get more than that, depending on what you are doing. Check out this explain plan page. I'm assuming here that you are using Oracle and know how to run the script to display the plan output. What may be more important to start with is looking at the left-hand side of the output to see whether a particular index is used and how it is being used. You should see things like "(Full)", "(By Index Rowid)", etc. if you are doing joins. The cost is the next thing to look at: lower costs are better, and a join that is not using an index may show a very large cost. You may also want to read the details about the explain plan columns.
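For readers who have not produced a plan yet, here is a minimal sketch of generating and displaying one in Oracle. The `customers` table, its columns, and the bind variables are assumptions chosen for illustration; `EXPLAIN PLAN` and `DBMS_XPLAN.DISPLAY` are the standard Oracle facilities.

```sql
-- Ask the optimizer for its plan without actually running the query.
-- The customers table and the bind variables are hypothetical.
EXPLAIN PLAN FOR
SELECT *
  FROM customers
 WHERE state   = :state
   AND zipcode = :zipcode;

-- Display the plan that was just written to the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the output, the Operation column on the left is where entries such as TABLE ACCESS FULL, TABLE ACCESS BY INDEX ROWID, or INDEX RANGE SCAN appear. The Rows (cardinality), Bytes, and Cost columns are the optimizer's estimates, not measurements.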
Answer by Walter Mitty
I also assume you are using Oracle. And I also recommend that you check out the explain plan web page, for starters. There is a lot to optimization, but it can be learned.
A few tips follow:
First, when somebody tasks you to optimize, they are almost always looking for acceptable performance rather than ultimate performance. If you can reduce a query's running time from 3 minutes down to 3 seconds, don't sweat reducing it down to 2 seconds, until you are asked to.
Second, do a quick check to make sure the queries you are optimizing are logically correct. It sounds absurd, but I can't tell you the number of times I've been asked for advice on a slow-running query, only to find out that it was occasionally giving wrong answers! And as it turned out, debugging the query often sped it up as well.
In particular, look for the phrase "Cartesian Join" in the explain plan. If you see it there, the chances are awfully good that you've found an unintentional cartesian join. The usual pattern for an unintentional cartesian join is that the FROM clause lists tables separated by commas and the join conditions are in the WHERE clause, except that one of the join conditions is missing, so Oracle has no choice but to perform a cartesian join. With large tables, this is a performance disaster.
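The pattern just described can be sketched as follows. The three tables and their columns are hypothetical; in Oracle plans the operation usually shows up as MERGE JOIN CARTESIAN.

```sql
-- Hypothetical schema: customers, orders, order_items.
-- Unintentional Cartesian join: order_items is never tied to orders,
-- so every order row is paired with every order_items row.
SELECT c.name, o.order_date, i.product_id
  FROM customers c, orders o, order_items i
 WHERE c.customer_id = o.customer_id;   -- join condition for i is missing

-- The same query with explicit JOIN syntax. Here a forgotten ON clause
-- is a syntax error rather than a silent row explosion.
SELECT c.name, o.order_date, i.product_id
  FROM customers c
  JOIN orders o      ON o.customer_id = c.customer_id
  JOIN order_items i ON i.order_id    = o.order_id;
```

Writing joins with explicit `JOIN ... ON` is one way to make this mistake harder to commit. It does not change how the optimizer treats a correctly written query.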
It is possible to see a Cartesian Join in the explain plan where the query is logically correct, but I associate this with older versions of Oracle.
Also look for the unused compound index. If the first column of a compound index is not used in the query, Oracle may use the index inefficiently, or not at all. Let me give an example:
The query was:
select * from customers
where State = @State
  and ZipCode = @ZipCode
(The DBMS was not Oracle, so the syntax was different, and I've forgotten the original syntax).
A quick peek at the indexes revealed an index on Customers with the columns (Country, State, ZipCode) in that order. I changed the query to read
select * from customers
where Country = @Country
  and State = @State
  and ZipCode = @ZipCode
and now it ran in about 6 seconds instead of about 6 minutes, because the optimizer was able to use the index to good advantage. I asked the application programmers why they had omitted the country from the criteria, and this was their answer: they knew that all the addresses had country equal to 'USA' so they figured they could speed up the query by leaving that criterion out!
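As a rough Oracle-flavored sketch of the same situation (the anecdote itself came from a different DBMS), the index and the two query shapes might look like this. The index name `customers_loc_idx` and the bind variables are assumptions for illustration.

```sql
-- Hypothetical index matching the one described above.
CREATE INDEX customers_loc_idx
    ON customers (country, state, zipcode);

-- Leading column (country) is absent, so the optimizer cannot start a
-- plain range scan at the front of the index.
SELECT *
  FROM customers
 WHERE state = :state
   AND zipcode = :zipcode;

-- Leading column present: all three predicates can be used to
-- navigate the index.
SELECT *
  FROM customers
 WHERE country = :country
   AND state   = :state
   AND zipcode = :zipcode;
```

In Oracle specifically, the first query may still use the index through an INDEX SKIP SCAN when the leading column has few distinct values, so it is worth checking the plan rather than assuming either outcome.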
Unfortunately, optimizing database retrieval is not really the same as shaving microseconds off of computing time. It involves understanding the database design, especially indexes, and at least an overview of how the optimizer does its job.
You generally get better results from the optimizer when you learn to collaborate with it instead of trying to outsmart it.
Good luck coming up to speed at optimization!
Answer by carson
You got the fuzzy end of the lollipop.
There is absolutely no way, in isolation, without a ton of additional information and experience, to look at an explain plan and determine what (if anything) is causing less than optimum performance. If query tuning could be reduced to a 10 step process it would be done by an automated process. I was about to list all of the things you need to understand to be effective at this but that would be a very long list.
The only short answer I can think of is to look for steps in the plan that are going through far more bytes than you'd guess. Then think about how you can reduce that number, for example via an index or partitioning.
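One way to spot the byte-heavy steps is to query the plan table directly and sort by the estimated bytes. This is only a sketch: it assumes the default `PLAN_TABLE` is in use, and the statement id `'demo'` and the `customers` query are made up for illustration.

```sql
-- Tag the plan so its rows can be picked out of the plan table.
EXPLAIN PLAN SET STATEMENT_ID = 'demo' FOR
SELECT *
  FROM customers
 WHERE state = :state;

-- List the plan steps, largest estimated byte volume first.
SELECT id, operation, options, object_name, cardinality, bytes, cost
  FROM plan_table
 WHERE statement_id = 'demo'
 ORDER BY bytes DESC NULLS LAST;
```

The figures are optimizer estimates, so a surprisingly large number can point to a step that reads too much or to statistics that are off.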
Seriously, get Jonathan Lewis's book Cost-Based Oracle Fundamentals.
Get Tom Kyte's book on Oracle database architecture and rent a cabin in the woods for a few weeks.
Answer by WW.
This is a massive area of expertise (aka a black art).
The approach I generally take is:
- Run the SQL statement in question,
- Get the actual plan (look up dbms_xplan),
- Compare the estimated number of rows (cardinality) vs actual number of rows. A big difference indicates a problem to be fixed (e.g. index, histogram)
- Consider if you can create an index to speed part of the process (generally where you conceptually think the plan should go first). Try some indexes.
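The first three steps above can be sketched as follows. The `customers` query and the literal `'CA'` are placeholders; the `GATHER_PLAN_STATISTICS` hint, `DBMS_XPLAN.DISPLAY_CURSOR`, and `DBMS_STATS.GATHER_TABLE_STATS` are real Oracle features, though the histogram settings shown are just one plausible choice.

```sql
-- 1. Run the statement and collect row counts for each plan step.
SELECT /*+ GATHER_PLAN_STATISTICS */ *
  FROM customers
 WHERE state = 'CA';

-- 2. Show the actual plan of the last statement run in this session.
--    E-Rows is the estimate, A-Rows is what really happened.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

-- 3. If E-Rows and A-Rows differ widely on a skewed column, refreshing
--    statistics with a histogram on that column is one possible fix.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'CUSTOMERS',
    method_opt => 'FOR COLUMNS state SIZE 254');
END;
/
```

In SQL*Plus, `SET SERVEROUTPUT OFF` is typically needed first, otherwise `DISPLAY_CURSOR` with NULL arguments may report on an internal call instead of your query.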
You need to understand the O() impacts of different indexes in the context of what you are asking the database. It helps if you understand data structures like B-trees, hash tables, etc. Then, create an index that might work and repeat the process.
If Oracle decides not to use your index, apply an INDEX() hint and look at the new plan. The cost will normally be greater than that of the plan it did choose, which is why it didn't pick your index. The hinted plan might give you some insight into why your index is not good.
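A hinted query might look like the sketch below. The index name `customers_loc_idx` and the `customers` table are hypothetical. Note that the hint has to reference the table alias used in the query, not the underlying table name, when an alias is present.

```sql
-- Force the optimizer to consider the (hypothetical) index, then
-- compare the cost of this plan with the plan it chose on its own.
EXPLAIN PLAN FOR
SELECT /*+ INDEX(c customers_loc_idx) */ *
  FROM customers c
 WHERE c.state   = :state
   AND c.zipcode = :zipcode;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Treat the hint as a diagnostic tool here. Once you understand why the optimizer avoided the index, it is usually better to fix the index or the statistics than to leave the hint in production code.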