SQL Oracle 11g - 如何优化慢速并行插入选择?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/20047610/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle 11g - How to optimize slow parallel insert select?
提问by robertdi
we want to speed up the run of the parallel insert statement below. We are expecting to insert around 80M records and it is taking around 2 hours to finish.
我们想加快下面并行插入语句的运行速度。我们预计插入大约 8000 万条记录,大约需要 2 个小时才能完成。
INSERT /*+ PARALLEL(STAGING_EX,16) APPEND NOLOGGING */ INTO STAGING_EX (ID, TRAN_DT,
RECON_DT_START, RECON_DT_END, RECON_CONFIG_ID, RECON_PM_ID)
SELECT /*+PARALLEL(PM,16) */ SEQ_RESULT_ID.nextval, sysdate, sysdate, sysdate,
'8a038312403e859201405245eed00c42', T1.ID FROM PM T1 WHERE STATUS = 1 and not
exists(select 1 from RESULT where T1.ID = RECON_PM_ID and CREATE_DT >= sysdate - 60) and
UPLOAD_DT >= sysdate - 1 and (FUND_SRC_TYPE = :1)
We think that caching the results of the not exist column will speed up the inserts. How do we perform the caching? Any ideas how else to speed up the insert?
我们认为缓存不存在列的结果会加快插入速度。我们如何执行缓存?任何想法如何加速插入?
Please see below for plan statistics from Enterprise Manager. Also we noticed that the statements are not being run in parallel. Is this normal?
请参阅下面的 Enterprise Manager 计划统计信息。我们还注意到这些语句不是并行运行的。这是正常的吗?
Edit: btw, the sequence is already cached to 1M
编辑:顺便说一句,序列已经缓存到 1M
采纳答案by Avias
Try using more bind variables, especially where nested loops might happen. I've noticed that you can use it in cases like
尝试使用更多的绑定变量,尤其是在可能发生嵌套循环的地方。我注意到你可以在这样的情况下使用它
CREATE_DT >= :YOUR_DATE instead of CREATE_DT >= sysdate - 60
I think this would explain why you have 180 million executions in the lowest part of your execution plan even though the whole other part of the update query is still at 8 million out of your 79 million.
我认为这可以解释为什么您在执行计划的最低部分有 1.8 亿次执行,即使更新查询的整个其他部分仍然是 7900 万次中的 800 万次。
回答by Jon Heller
Improve statistics.The estimated number of rows is 1, but the actual number of rows is over 7 million and counting. This causes the execution plan to use a nested loop instead of a hash join. A nested loop works better for small amounts of data and a hash join works better for large amounts of data. Fixing that may be as easy as ensuring the relevant tables have accurate, current statistics. This can usually be done by gathering statistics with the default settings, for example: exec dbms_stats.gather_table_stats('SIRS_UATC1', 'TBL_RECON_PM');
.
改进统计。估计的行数是 1,但实际的行数超过 700 万并且还在增加。这会导致执行计划使用嵌套循环而不是散列连接。嵌套循环更适用于少量数据,散列连接更适用于大量数据。解决这个问题可能就像确保相关表具有准确的最新统计数据一样简单。这通常可以通过收集统计信息使用默认设置,例如进行:exec dbms_stats.gather_table_stats('SIRS_UATC1', 'TBL_RECON_PM');
。
If that doesn't improve the cardinality estimate try using a dynamic sampling hint, such as /*+ dynamic_sampling(5) */
. For such a long-running query it is worth spending a little extra time up-front sampling data if it leads to a better plan.
如果这不能改善基数估计,请尝试使用动态采样提示,例如/*+ dynamic_sampling(5) */
. 对于这样一个长时间运行的查询,如果能产生更好的计划,那么花一点额外的时间预先采样数据是值得的。
Use statement-level parallelism instead of object-level parallelism.This is probably the most common mistake with parallel SQL. If you use object-level parallelism the hint must reference the aliasof the object. Since 11gR2 there is no need to worry about specifying objects. This statement only needs a single hint: INSERT /*+ PARALLEL(16) APPEND */ ...
. Note that NOLOGGING
is not a real hint.
使用语句级并行而不是对象级并行。这可能是并行 SQL 最常见的错误。如果您使用对象级并行,则提示必须引用对象的别名。从 11gR2 开始,无需担心指定对象。这个语句只需要一个提示:INSERT /*+ PARALLEL(16) APPEND */ ...
。请注意,这NOLOGGING
不是真正的提示。
回答by usergray
I can see 2 big problems:
我可以看到两个大问题:
1 - hint parallel (in select) NO NOT work, beacuse it should be like this +PARALLEL(T1,16)
1 - 提示并行(在选择中)不工作,因为它应该是这样的 +PARALLEL(T1,16)
2 - SELECT DO NOT optimal, it would be better if avoid expression NOT IN
2 - SELECT DO NOT 最优,如果避免表达式 NOT IN 会更好