了解 Oracle SQL Developer 中 Execute Explain Plan 的结果
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/860450/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Understanding the results of Execute Explain Plan in Oracle SQL Developer
提问by Kevin Babcock
I'm trying to optimize a query but don't quite understand some of the information returned from Explain Plan. Can anyone tell me the significance of the OPTIONS and COST columns? In the OPTIONS column, I only see the word FULL. In the COST column, I can deduce that a lower cost means a faster query. But what exactly does the cost value represent and what is an acceptable threshold?
我正在尝试优化查询,但不太明白从Explain Plan返回的一些信息。谁能告诉我 OPTIONS 和 COST 列的重要性?在 OPTIONS 列中,我只看到 FULL 一词。在 COST 列中,我可以推断出较低的成本意味着更快的查询。但是成本值究竟代表什么,可接受的阈值是多少?
回答by Jeffrey Kemp
The output of EXPLAIN PLAN is a debug output from Oracle's query optimiser. The COST is the final output of the Cost-based optimiser (CBO), the purpose of which is to select which of the many different possible plans should be used to run the query. The CBO calculates a relative Cost for each plan, then picks the plan with the lowest cost.
EXPLAIN PLAN 的输出是 Oracle 查询优化器的调试输出。COST 是基于成本的优化器 (CBO) 的最终输出,其目的是选择应使用许多不同的可能计划中的哪一个来运行查询。CBO 计算每个计划的相对成本,然后选择成本最低的计划。
(Note: in some cases the CBO does not have enough time to evaluate every possible plan; in these cases it just picks the plan with the lowest cost found so far)
(注意:在某些情况下,CBO 没有足够的时间来评估每个可能的计划;在这些情况下,它只会选择迄今为止成本最低的计划)
In general, one of the biggest contributors to a slow query is the number of rows read to service the query (blocks, to be more precise), so the cost will be based in parton the number of rows the optimiser estimates will need to be read.
一般来说,导致慢查询的最大因素之一是为查询服务而读取的行数(更准确地说是块),因此成本将部分基于优化器估计需要的行数被阅读。
For example, lets say you have the following query:
例如,假设您有以下查询:
SELECT emp_id FROM employees WHERE months_of_service = 6;
(The months_of_service
column has a NOT NULL constraint on it and an ordinary index on it.)
(该months_of_service
列上有一个 NOT NULL 约束和一个普通索引。)
There are two basic plans the optimiser might choose here:
优化器可以在这里选择两个基本计划:
- Plan 1: Read all the rows from the "employees" table, for each, check if the predicate is true (
months_of_service=6
). - Plan 2: Read the index where
months_of_service=6
(this results in a set of ROWIDs), then access the table based on the ROWIDs returned.
- 方案一:从“employees”表中读取所有的行,对于每一行,检查谓词是否为真(
months_of_service=6
)。 - 方案二:读取索引where
months_of_service=6
(这会产生一组ROWID),然后根据返回的ROWID访问表。
Let's imagine the "employees" table has 1,000,000 (1 million) rows. Let's further imagine that the values for months_of_service range from 1 to 12 and are fairly evenly distributed for some reason.
让我们假设“员工”表有 1,000,000(100 万)行。让我们进一步想象一下,months_of_service 的值范围从 1 到 12,并且由于某种原因分布相当均匀。
The cost of Plan 1, which involves a FULL SCAN, will be the cost of reading all the rows in the employees table, which is approximately equal to 1,000,000; but since Oracle will often be able to read the blocks using multi-block reads, the actual cost will be lower (depending on how your database is set up) - e.g. let's imagine the multi-block read count is 10 - the calculated cost of the full scan will be 1,000,000 / 10; Overal cost = 100,000.
计划 1的成本,涉及全扫描,将是读取员工表中所有行的成本,大约等于 1,000,000;但是由于 Oracle 通常能够使用多块读取来读取块,因此实际成本会更低(取决于您的数据库的设置方式) - 例如,假设多块读取计数为 10 - 计算成本全扫描将是 1,000,000 / 10;总成本 = 100,000。
The cost of Plan 2, which involves an INDEX RANGE SCAN and a table lookup by ROWID, will be the cost of scanning the index, plus the cost of accessing the table by ROWID. I won't go into how index range scans are costed but let's imagine the cost of the index range scan is 1 per row; we expect to find a match in 1 out of 12 cases, so the cost of the index scan is 1,000,000 / 12 = 83,333; plus the cost of accessing the table (assume 1 block read per access, we can't use multi-block reads here) = 83,333; Overall cost = 166,666.
计划 2的成本,涉及索引范围扫描和按 ROWID 查找表,将是扫描索引的成本,加上按 ROWID 访问表的成本。我不会讨论索引范围扫描的成本,但让我们想象一下索引范围扫描的成本是每行 1;我们希望在 12 种情况中的 1 种情况下找到匹配项,因此索引扫描的成本是 1,000,000 / 12 = 83,333;加上访问表的成本(假设每次访问读取 1 个块,我们这里不能使用多块读取)= 83,333;总成本 = 166,666。
As you can see, the cost of Plan 1 (full scan) is LESS than the cost of Plan 2 (index scan + access by rowid) - which means the CBO would choose the FULL scan.
如您所见,计划 1(完全扫描)的成本低于计划 2(索引扫描 + 按 rowid 访问)的成本 - 这意味着 CBO 将选择完全扫描。
If the assumptions made here by the optimiser are true, then in fact Plan 1 will be preferable and much more efficient than Plan 2 - which disproves the myth that FULL scans are "always bad".
如果优化器在此做出的假设是正确的,那么实际上计划 1 将比计划 2 更可取且效率更高——这反驳了 FULL 扫描“总是不好”的神话。
The results would be quite different if the optimiser goal was FIRST_ROWS(n) instead of ALL_ROWS - in which case the optimiser would favour Plan 2 because it will often return the first few rows quicker, at the cost of being less efficient for the entire query.
如果优化器的目标是 FIRST_ROWS(n) 而不是 ALL_ROWS,结果会大不相同——在这种情况下,优化器会支持计划 2,因为它通常会更快地返回前几行,代价是整个查询的效率较低.
回答by MichaelN
The CBO builds a decision tree, estimating the costs of each possible execution path available per query. The costs are set by the CPU_cost or I/O_cost parameter set on the instance. And the CBO estimates the costs, as best it can with the existing statistics of the tables and indexes that the query will use. You should not tune your query based on cost alone. Cost allows you to understand WHY the optimizer is doing what it does. Without cost you could figure out why the optimizer chose the plan it did. Lower cost does not mean a faster query. There are cases where this is true and there will be cases where this is wrong. Cost is based on your table stats and if they are wrong the cost is going to be wrong.
CBO 构建决策树,估计每个查询可用的每个可能执行路径的成本。成本由实例上设置的 CPU_cost 或 I/O_cost 参数设置。并且 CBO 会根据查询将使用的表和索引的现有统计信息尽可能地估计成本。您不应仅根据成本来调整查询。成本让您了解优化器为什么要这样做。无需成本,您就可以找出优化器选择它所做的计划的原因。更低的成本并不意味着更快的查询。有些情况下这是正确的,也会有错误的情况。成本基于您的表统计数据,如果它们是错误的,成本就会是错误的。
When tuning your query, you should take a look at the cardinality and the number of rows of each step. Do they make sense? Is the cardinality the optimizer is assuming correct? Is the rows being return reasonable. If the information present is wrong then its very likely the optimizer doesn't have the proper information it needs to make the right decision. This could be due to stale or missing statistics on the table and index as well as cpu-stats. Its best to have stats updated when tuning a query to get the most out of the optimizer. Knowing your schema is also of great help when tuning. Knowing when the optimizer chose a really bad decision and pointing it in the correct path with a small hint can save a load of time.
在调整查询时,您应该查看每一步的基数和行数。它们有意义吗?优化器假设的基数是否正确?返回的行是否合理。如果存在的信息是错误的,那么优化器很可能没有做出正确决策所需的正确信息。这可能是由于表和索引以及 cpu-stats 上的统计信息过时或丢失。最好在调整查询以充分利用优化器时更新统计信息。了解您的架构在调整时也有很大帮助。知道优化器何时选择了一个非常糟糕的决定并用一个小提示将其指向正确的路径可以节省大量时间。
回答by drowe
Here is a reference for using EXPLAIN PLAN with Oracle: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm), with specific information about the columns found here: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm#i18300
以下是在 Oracle 中使用 EXPLAIN PLAN 的参考:http: //download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm),以及有关此处找到的列的具体信息:http:// /download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm#i18300
Your mention of 'FULL' indicates to me that the query is doing a full-table scan to find your data. This is okay, in certain situations, otherwise an indicator of poor indexing / query writing.
您提到“FULL”向我表明该查询正在执行全表扫描以查找您的数据。在某些情况下,这是可以的,否则表明索引/查询编写不佳。
Generally, with explain plans, you want to ensure your query is utilizing keys, thus Oracle can find the data you're looking for with accessing the least number of rows possible. Ultimately, you can sometime only get so far with the architecture of your tables. If the costs remain too high, you may have to think about adjusting the layout of your schema to be more performance based.
通常,对于解释计划,您希望确保您的查询使用键,因此 Oracle 可以通过访问尽可能少的行数来找到您要查找的数据。最终,有时您只能了解表的架构。如果成本仍然过高,您可能需要考虑调整架构的布局以更加基于性能。
回答by David Aldridge
In recent Oracle versions the COST represent the amount of time that the optimiser expects the query to take, expressed in units of the amount of time required for a single block read.
在最近的 Oracle 版本中,COST 表示优化器期望查询花费的时间量,以单个块读取所需的时间量单位表示。
So if a single block read takes 2ms and the cost is expressed as "250", the query could be expected to take 500ms to complete.
因此,如果单个块读取需要 2 毫秒并且成本表示为“250”,则查询可能需要 500 毫秒才能完成。
The optimiser calculates the cost based on the estimated number of single block and multiblock reads, and the CPU consumption of the plan. the latter can be very useful in minimising the cost by performing certain operations before others to try and avoid high CPU cost operations.
优化器根据估计的单块和多块读取次数以及计划的 CPU 消耗来计算成本。通过在其他操作之前执行某些操作来尝试避免高 CPU 成本操作,后者在最小化成本方面非常有用。
This raises the question of how the optimiser knows how long operations take. recent Oracle versions allow the collections of "system statistics", which are definitely not to be confused with statistics on tables or indexes. The system statistics are measurements of the performance of the hardware, mostly importantly:
这就提出了优化器如何知道操作需要多长时间的问题。最近的 Oracle 版本允许收集“系统统计信息”,绝对不要将其与表或索引的统计信息混淆。系统统计数据是对硬件性能的衡量,最重要的是:
- How long a single block read takes
- How long a multiblock read takes
- How large a multiblock read is (often different to the maximum possible due to table extents being smaller than the maximum, and other reasons).
- CPU performance
- 单个块读取需要多长时间
- 多块读取需要多长时间
- 多块读取的大小(由于表范围小于最大值和其他原因,通常与可能的最大值不同)。
- CPU性能
These numbers can vary greatly according to the operating environment of the system, and different sets of statistics can be stored for "daytime OLTP" operations and "nighttime batch reporting" operations, and for "end of month reporting" if you wish.
根据系统的操作环境,这些数字可能会有很大差异,并且可以为“白天 OLTP”操作和“夜间批量报告”操作以及“月末报告”(如果您愿意)存储不同的统计数据集。
Given these sets of statistics, a given query execution plan can be evaluated for cost in different operating environments, which might promote use of full table scans at some times or index scans at others.
给定这些统计数据集,可以评估给定查询执行计划在不同操作环境中的成本,这可能会在某些时候促进使用全表扫描或在其他时候使用索引扫描。
The cost is not perfect, but the optimiser gets better at self-monitoring with every release, and can feedback the actual cost in comparison to the estimated cost in order to make better decisions for the future. this also makes it rather more difficult to predict.
成本并不完美,但优化器在每次发布时都会更好地进行自我监控,并且可以反馈实际成本与估计成本相比,以便为未来做出更好的决策。这也使得预测变得相当困难。
Note that the cost is not necessarily wall clock time, as parallel query operations consume a total amount of time across multiple threads.
请注意,成本不一定是挂钟时间,因为并行查询操作跨多个线程消耗总时间。
In older versions of Oracle the cost of CPU operations was ignored, and the relative costs of single and multiblock reads were effectively fixed according to init parameters.
在旧版本的 Oracle 中,CPU 操作的成本被忽略了,单块读取和多块读取的相对成本根据 init 参数有效地固定。
回答by stili
FULL is probably referring to a full table scan, which means that no indexes are in use. This is usually indicating that something is wrong, unless the query is supposed to use all the rows in a table.
FULL 可能指的是全表扫描,这意味着没有使用索引。这通常表示有问题,除非查询应该使用表中的所有行。
Cost is a number that signals the sum of the different loads, processor, memory, disk, IO, and high numbers are typically bad. The numbers are added up when moving to the root of the plan, and each branch should be examined to locate the bottlenecks.
成本是一个数字,表示不同负载、处理器、内存、磁盘、IO 的总和,而高的数字通常是不好的。当移动到计划的根时,这些数字会相加,并且应该检查每个分支以定位瓶颈。
You may also want to query v$sql and v$session to get statistics about SQL statements, and this will have detailed metrics for all kind of resources, timings and executions.
您可能还想查询 v$sql 和 v$session 以获取有关 SQL 语句的统计信息,这将包含所有类型资源、时间和执行的详细指标。