如何优化在 700M 行的 Oracle 表上运行的更新 SQL
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2964422/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to optimize an update SQL that runs on a Oracle table with 700M rows
提问by b.roth
UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL
[TABLE] is an Oracle database table with more than 700 million rows. I cancelled the SQL execution after it had been running for 6 hours.
[TABLE] 是 Oracle 数据库表,行数超过 7 亿。我在运行 6 小时后取消了 SQL 执行。
Is there any SQL hint that could improve performance? Or any other solution to speed that up?
是否有任何可以提高性能的 SQL 提示?或者任何其他解决方案来加快速度?
EDIT:This query will be run once and then never again.
编辑:此查询将运行一次,然后再也不会运行。
回答by Vincent Malgrat
First of all is it a one-time query or is it a recurrent query ? If you only have to do it once you may want to look into running the query in parallel mode. You will have to scan all rows anyway, you could either divide the workload yourself with ranges of ROWID (do-it-yourself parallelism) or use Oracle built-in features.
首先,它是一次性查询还是重复查询?如果您只需要执行一次,您可能需要考虑以并行模式运行查询。无论如何,您都必须扫描所有行,您可以使用 ROWID(自己动手做的并行性)范围自己划分工作负载,也可以使用 Oracle 内置功能。
Assuming you want to run it frequently and want to optimize this query, the number of rows with the field
column as NULL will eventually be small compared to the total number of rows. In that case an index could speed things up. Oracle doesn't index rows that have all indexed columns as NULL so an index on field
won't get used by your query (since you want to find all rows where field
is NULL).
假设您想经常运行它并想优化此查询,那么与field
总行数相比,列为 NULL的行数最终会很小。在这种情况下,索引可以加快速度。Oracle 不会对所有索引列都为 NULL 的行进行索引,因此field
您的查询不会使用索引(因为您想查找所有field
为 NULL 的行)。
Either:
任何一个:
- create an index on
(FIELD, 0)
, the0
will act as a non-NULL pseudocolumn and all rows will be indexed on the table. create a function-based index on
(CASE WHEN field IS NULL THEN 1 END)
, this will only index the rows that are NULLs (the index would therefore be very compact). In that case you would have to rewrite your query:UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1
- 在 上创建索引
(FIELD, 0)
,0
将充当非 NULL 伪列,并且所有行都将在表上建立索引。 在 上创建基于函数的索引
(CASE WHEN field IS NULL THEN 1 END)
,这只会索引为 NULL 的行(因此索引会非常紧凑)。在这种情况下,您将不得不重写您的查询:UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1
Edit:
编辑:
Since this is a one-time scenario, you may want to use the PARALLEL
hint:
由于这是一次性方案,您可能需要使用以下PARALLEL
提示:
SQL> EXPLAIN PLAN FOR
2 UPDATE /*+ PARALLEL(test_table 4)*/ test_table
3 SET field=0
4 WHERE field IS NULL;
Explained
SQL> select * from table( dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 4026746538
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 22793 | 289K| 12 (9)| 00:00:
| 1 | UPDATE | TEST_TABLE | | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM)| :TQ10000 | 22793 | 289K| 12 (9)| 00:00:
| 4 | PX BLOCK ITERATOR | | 22793 | 289K| 12 (9)| 00:00:
|* 5 | TABLE ACCESS FULL| TEST_TABLE | 22793 | 289K| 12 (9)| 00:00:
--------------------------------------------------------------------------------
回答by Gary Myers
Are other users are updating the same rows in the table at the same time ?
其他用户是否同时更新表中的相同行?
If so, you could be hitting lots of concurrency issues (waiting for locks) and it may be worth breaking it into smaller transactions.
如果是这样,您可能会遇到许多并发问题(等待锁定),可能值得将其分解为较小的事务。
DECLARE
v_cnt number := 1;
BEGIN
WHILE v_cnt > 0 LOOP
UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL AND ROWNUM < 50000;
v_cnt := SQL%ROWCOUNT;
COMMIT;
END LOOP;
END;
/
The smaller the ROWNUM limit the less concurrency/locking issues you'll hit, but the more time you'll spend in table scanning.
ROWNUM 限制越小,您遇到的并发/锁定问题就越少,但您花在表扫描上的时间就越多。
回答by Rob van Wijk
Vincent already answered your question perfectly, but I'm curious about the "why" behind this action. Why are you updating all NULL's to 0?
Vincent 已经完美地回答了你的问题,但我很好奇这个行动背后的“为什么”。为什么要将所有 NULL 更新为 0?
Regards, Rob.
问候,罗布。
回答by Bob Jarvis - Reinstate Monica
Some suggestions:
一些建议:
Drop any indexes that contain FIELD before running your UPDATE statement, and then re-add them later.
Write a PL/SQL procedure to do this that commits after every 1000 or 10000 rows.
在运行 UPDATE 语句之前删除所有包含 FIELD 的索引,然后稍后重新添加它们。
编写一个 PL/SQL 过程来执行此操作,该过程在每 1000 或 10000 行之后提交。
Hope this helps.
希望这可以帮助。
回答by James Anderson
You could acheive the same result without updating by using an ALTER table to set the columns "DEFAULT" value to 0.
通过使用 ALTER 表将列“DEFAULT”值设置为 0,您无需更新即可获得相同的结果。