oracle 解释计划成本与执行时间
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/102477/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Explain Plan Cost vs Execution Time
提问by DrFloyd5
Before, I have found the "Cost" in the execution plan to be a good indicator of relative execution time. Why is this case different? Am I a fool for thinking the execution plan has relevance? What specifically can I try to improve v_test performance?
之前,我发现执行计划中的“成本”是相对执行时间的一个很好的指标。为什么这个案例不一样?我是一个认为执行计划具有相关性的傻瓜吗?我可以尝试提高 v_test 性能的具体方法是什么?
Thank you.
谢谢你。
Using Oracle 10g I have a simple query view defined below
使用 Oracle 10g 我有一个简单的查询视图定义如下
create or replace view v_test as
select distinct u.bo_id as bo_id, upper(trim(d.dept_id)) as dept_id
from
cust_bo_users u
join cust_bo_roles r on u.role_name=r.role_name
join cust_dept_roll_up_tbl d on
(r.region is null or trim(r.region)=trim(d.chrgback_reg)) and
(r.prod_id is null or trim(r.prod_id)=trim(d.prod_id)) and
(r.div_id is null or trim(r.div_id)=trim(d.div_id )) and
(r.clus_id is null or trim(r.clus_id )=trim( d.clus_id)) and
(r.prod_ln_id is null or trim(r.prod_ln_id)=trim(d.prod_ln_id)) and
(r.dept_id is null or trim(r.dept_id)=trim(d.dept_id))
defined to replace the following view
定义为替换以下视图
create or replace view v_bo_secured_detail
select distinct Q.BO_ID, Q.DEPT_ID
from (select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'REGION' and
trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'RG_PROD' and
trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
trim(R.PROD_ID) = UPPER(trim(D.PROD_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'PROD' and
trim(R.PROD_ID) = UPPER(trim(D.PROD_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'DIV' and
trim(R.DIV_ID) = UPPER(trim(D.DIV_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'RG_DIV' and
trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
trim(R.DIV_ID) = UPPER(trim(D.DIV_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'CLUS' and
trim(R.CLUS_ID) = UPPER(trim(D.CLUS_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'RG_CLUS' and
trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
trim(R.CLUS_ID) = UPPER(trim(D.CLUS_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'PROD_LN' and
trim(R.PROD_LN_ID) = UPPER(trim(D.PROD_LN_ID))
union all
select U.BO_ID BO_ID, UPPER(trim(R.DEPT_ID)) DEPT_ID
from CUST_BO_USERS U, CUST_BO_ROLES R
where U.ROLE_NAME = R.ROLE_NAME and
R.ROLE_LEVEL = 'DEPT') Q
with the goal of removing the dependency on the ROLE_LEVEL column.
目的是消除对 ROLE_LEVEL 列的依赖。
The execution plan for v_test is significantly lower than v_bo_secured_detail for simple
v_test 的执行计划明显低于 v_bo_secured_detail for simple
select * from <view> where bo_id='value'
queries. And is significantly lower when used in a real world query
查询。并且在实际查询中使用时要低得多
select CT_REPORT.RPT_KEY,
CT_REPORT_ENTRY.RPE_KEY,
CT_REPORT_ENTRY.CUSTOM16,
Exp_Sub_Type.value,
min(CT_REPORT_PAYMENT_CONF.PAY_DATE),
CT_REPORT.PAID_DATE
from CT_REPORT,
<VIEW> SD,
CT_REPORT_ENTRY,
CT_LIST_ITEM_LANG Exp_Sub_Type,
CT_REPORT_PAYMENT_CONF,
CT_STATUS_LANG Payment_Status
where (CT_REPORT_ENTRY.RPT_KEY = CT_REPORT.RPT_KEY) and
(Payment_Status.STAT_KEY = CT_REPORT.PAY_KEY) and
(Exp_Sub_Type.LI_KEY = CT_REPORT_ENTRY.CUSTOM9 and Exp_Sub_Type.LANG_CODE = 'en') and
(CT_REPORT.RPT_KEY = CT_REPORT_PAYMENT_CONF.RPT_KEY) and
(SD.BO_ID = 'JZHU9') and
(SD.DEPT_ID = UPPER(CT_REPORT_ENTRY.CUSTOM5)) and
(Payment_Status.name = 'Payment Confirmed' and (Payment_Status.LANG_CODE = 'en') and
CT_REPORT.PAID_DATE > to_date('01/01/2008', 'mm/dd/yyyy') and Exp_Sub_Type.value != 'Korea')
group by CT_REPORT.RPT_KEY,
CT_REPORT_ENTRY.RPE_KEY,
CT_REPORT_ENTRY.CUSTOM16,
Exp_Sub_Type.value,
CT_REPORT.PAID_DATE
The execution times are WILDLY different. The v_test view taking 15 hours, and the v_bo_secured_detail taking a few seconds.
执行时间大不相同。v_test 视图需要 15 个小时,v_bo_secured_detail 需要几秒钟。
Thank you all who responded
谢谢所有回复的人
This is one to remember for me. The places where the theory and mathematics of the expressions meets the reality of hardware based execution. Ouch.
这是我要记住的。表达式的理论和数学与基于硬件的执行现实相结合的地方。哎哟。
采纳答案by user11318
As the Oracle documentation says, the cost is the estimated cost relative to a particular execution plan. When you tweak the query, the particular execution plan that costs are calculated relative to can change. Sometimes dramatically.
正如Oracle 文档所说,成本是相对于特定执行计划的估计成本。当您调整查询时,计算成本的特定执行计划可能会发生变化。有时戏剧性。
The problem with v_test's performance is that Oracle can think of no way to execute it other than performing a nested loop, for each cust_bo_roles, scan all of cust_dept_roll_up_tbl to find a match. If the table are of size n and m, this takes n*m time, which is slow for large tables. By contrast v_bo_secured_detail is set up so that it is a series of queries, each of which can be done through some other mechanism. (Oracle has a number it may use, including using an index, building a hash on the fly, or sorting the datasets and merging them. These operations are all O(n*log(n)) or better.) A small series of fast queries is fast.
v_test的性能问题是Oracle除了执行嵌套循环,想不到其他的执行方法,对于每个cust_bo_roles,扫描所有的cust_dept_roll_up_tbl,寻找匹配。如果表的大小为 n 和 m,这需要 n*m 时间,这对于大表来说很慢。相比之下, v_bo_secured_detail 被设置为一系列查询,每个查询都可以通过某种其他机制完成。(Oracle 有一个可以使用的数字,包括使用索引、动态构建散列或对数据集进行排序和合并。这些操作都是 O(n*log(n)) 或更好的。)一小部分快速查询很快。
As painful as it is, if you want this query to be fast then you need to break it out like the previous query did.
尽管很痛苦,但如果您希望这个查询更快,那么您需要像上一个查询那样将其分解。
回答by dland
An execution plan is theory, the execution time is reality.
执行计划是理论,执行时间是现实。
The plan shows you how the engine goes about performing your query, but some steps might cause an inordinate amount of work to resolve the query. The use of "x is null or x = y" smells bad. If r and d are big tables you might have some sort of combinatorial explosion hitting you and the request cycles endlessly through large lists of disk blocks. I imagine you're seeing lots of I/O during the execution.
该计划向您展示了引擎如何执行您的查询,但某些步骤可能会导致解决查询的工作量过大。“x 为空或 x = y”的使用很糟糕。如果 r 和 d 是大表,您可能会遇到某种组合爆炸,并且请求在大磁盘块列表中无休止地循环。我想您在执行过程中会看到大量 I/O。
On the other hand, the unioned selects are short and sweet, and so probably reuse lots of disk blocks that are still lying around from the earlier selects, and/or you have some degree of parallelism benefitting from reads on the same disk blocks.
另一方面,联合选择又短又好,因此可能会重用大量仍然存在于早期选择中的磁盘块,和/或您有某种程度的并行性受益于对相同磁盘块的读取。
Also using trim() and upper() everywhere looks a bit suspicious. If your data are so unclean it might be worth running some periodic housecleaning from time to time, so that you can say "x = y" and know it works.
到处使用trim() 和upper() 看起来有点可疑。如果您的数据如此不干净,可能值得不时进行一些定期清理,以便您可以说“x = y”并知道它有效。
update: you asked for tips to improve v_test. Clean your data so that trim() and upper() are unnecessay. They may preclude indexes from being used (although that would be affecting the unioned select version as well).
更新:您要求提供改进 v_test 的提示。清理您的数据,以便trim() 和upper() 是不必要的。它们可能会阻止使用索引(尽管这也会影响联合选择版本)。
If you can't get rid of "x is null or x = y" then y = nvl(x,'does-not-exist') might have better characteristics (assuming 'does-not-exist' is a "can't happen" id value).
如果您无法摆脱“x 为空或 x = y”,则 y = nvl(x,'does-not-exist') 可能具有更好的特征(假设 'does-not-exist' 是“can” t发生”id值)。
回答by Nick Johnson
When you say the "query plan is lower", do you mean it is shorter, or that the actual cost estimates are lower? One obvious problem with your replacement view is that the join with cust_dept_roll_up_tbl uses almost exclusively unindexable criteria (the "is null" tests can be satisfied by an index, but the ones involving calling trim on each argument can't be), so the planner has to make at least one, and probably several sequential scans of the table to satisfy the query.
当您说“查询计划较低”时,您的意思是它较短,还是实际成本估算较低?替换视图的一个明显问题是,与 cust_dept_roll_up_tbl 的连接几乎完全使用不可索引的标准(索引可以满足“为空”测试,但不能满足对每个参数调用 trim 的测试),因此规划器必须至少对表进行一次,可能还有多次顺序扫描以满足查询。
I'm not sure if Oracle has this limitation, but many DBs can only do a single index scan per included table, so even if you clean up your join conditions to be indexable, it may be able to satisfy only one condition with an index scan and have to use sequential scans for the remainder.
我不确定 Oracle 是否有此限制,但许多 DB 只能对每个包含的表进行一次索引扫描,因此即使您清理连接条件以使其可索引,它也可能只能满足一个索引条件扫描并且必须对剩余部分使用顺序扫描。
回答by Zorkus
To elaborate about a cost a bit.
详细说明一下成本。
In Oracle 9/10g, simplifying a bit, the cost is determined by formula:
在 Oracle 9/10g 中,稍微简化一下,成本由公式决定:
Cost = (SrCount * SrTime + MbrCount * MbrTime + CpuCyclesCount * CpuCycleTime) / SrTime
成本 = (SrCount * SrTime + MbrCount * MbrTime + CpuCyclesCount * CpuCycleTime) / SrTime
Where SrCount - count total single block reads made, SrTime - average time of one single block read according to gathered system statistics, MbrCount and MbrTime, the same for multiblock read correspondingly (ones use during full table scans and index fast full scans), Cpu related metrics are self-explanatory..and all divided by single block read time.
其中 SrCount - 进行的单块读取总数,SrTime - 根据收集的系统统计信息,单块读取的平均时间,MbrCount 和 MbrTime,对应的多块读取相同(在全表扫描和索引快速全扫描期间使用),Cpu相关指标是不言自明的......并且全部除以单个块读取时间。
回答by Tony Andrews
Have you gathered optimiser stats on all the underlying tables? Without them the optimiser's estimates may be wildly out of kilter with reality.
您是否收集了所有底层表的优化器统计信息?没有它们,优化者的估计可能与现实大相径庭。
回答by Greg Ogle
One aspect of low-cost -- high execution time is that when you are looking at large data-sets, it is often more efficient on the whole to do things in bulk, whereas if you want a quick results, it is more efficient to do as little work as possible to get the first record. The repetitiveness of doing the small operations that give the appearance of a quick response will not likely give a good result when working on the large sets.
低成本——高执行时间的一个方面是,当您查看大型数据集时,整体上执行批量操作通常更高效,而如果您想要快速获得结果,则更高效做尽可能少的工作来获得第一条记录。在处理大型集合时,重复性地进行看似快速响应的小操作可能不会产生好的结果。
Many times, when you want a quick result, the USE_NL optimizer hint will help.
很多时候,当您想要快速获得结果时,USE_NL 优化器提示会有所帮助。
Also, in your test view, it is relying on IS NULL... IS NULL cannot use an index nor can using a function such as trim on the 'table-side' parameter.
此外,在您的测试视图中,它依赖于 IS NULL... IS NULL 不能使用索引,也不能在“表侧”参数上使用诸如修剪之类的函数。