如何优化在 700M 行的 Oracle 表上运行的更新 SQL

Question

提问by b.roth

UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL

[TABLE] is an Oracle database table with more than 700 million rows. I cancelled the SQL execution after it had been running for 6 hours.

[TABLE] 是 Oracle 数据库表，行数超过 7 亿。我在运行 6 小时后取消了 SQL 执行。

Is there any SQL hint that could improve performance? Or any other solution to speed that up?

是否有任何可以提高性能的 SQL 提示？或者任何其他解决方案来加快速度？

EDIT:This query will be run once and then never again.

编辑：此查询将运行一次，然后再也不会运行。

Answer 1

回答by Vincent Malgrat

First of all is it a one-time query or is it a recurrent query ? If you only have to do it once you may want to look into running the query in parallel mode. You will have to scan all rows anyway, you could either divide the workload yourself with ranges of ROWID (do-it-yourself parallelism) or use Oracle built-in features.

首先，它是一次性查询还是重复查询？如果您只需要执行一次，您可能需要考虑以并行模式运行查询。无论如何，您都必须扫描所有行，您可以使用 ROWID（自己动手做的并行性）范围自己划分工作负载，也可以使用 Oracle 内置功能。

Assuming you want to run it frequently and want to optimize this query, the number of rows with the fieldcolumn as NULL will eventually be small compared to the total number of rows. In that case an index could speed things up. Oracle doesn't index rows that have all indexed columns as NULL so an index on fieldwon't get used by your query (since you want to find all rows where fieldis NULL).

假设您想经常运行它并想优化此查询，那么与field总行数相比，列为 NULL的行数最终会很小。在这种情况下，索引可以加快速度。Oracle 不会对所有索引列都为 NULL 的行进行索引，因此field您的查询不会使用索引（因为您想查找所有field为 NULL 的行）。

Either:

任何一个：

create an index on (FIELD, 0), the 0will act as a non-NULL pseudocolumn and all rows will be indexed on the table.
create a function-based index on (CASE WHEN field IS NULL THEN 1 END), this will only index the rows that are NULLs (the index would therefore be very compact). In that case you would have to rewrite your query:
UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1

在上创建索引(FIELD, 0)，0将充当非 NULL 伪列，并且所有行都将在表上建立索引。
在上创建基于函数的索引(CASE WHEN field IS NULL THEN 1 END)，这只会索引为 NULL 的行（因此索引会非常紧凑）。在这种情况下，您将不得不重写您的查询：
UPDATE [TABLE] SET [FIELD]=0 WHERE (CASE WHEN field IS NULL THEN 1 END)=1

Edit:

编辑：

Since this is a one-time scenario, you may want to use the PARALLELhint:

由于这是一次性方案，您可能需要使用以下PARALLEL提示：

SQL> EXPLAIN PLAN FOR
  2  UPDATE /*+ PARALLEL(test_table 4)*/ test_table
  3     SET field=0
  4   WHERE field IS NULL;

Explained

SQL> select * from table( dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 4026746538
--------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |            | 22793 |   289K|    12   (9)| 00:00:
|   1 |  UPDATE               | TEST_TABLE |       |       |            |
|   2 |   PX COORDINATOR      |            |       |       |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000   | 22793 |   289K|    12   (9)| 00:00:
|   4 |     PX BLOCK ITERATOR |            | 22793 |   289K|    12   (9)| 00:00:
|*  5 |      TABLE ACCESS FULL| TEST_TABLE | 22793 |   289K|    12   (9)| 00:00:
--------------------------------------------------------------------------------

Answer 2

回答by Gary Myers

Are other users are updating the same rows in the table at the same time ?

其他用户是否同时更新表中的相同行？

If so, you could be hitting lots of concurrency issues (waiting for locks) and it may be worth breaking it into smaller transactions.

如果是这样，您可能会遇到许多并发问题（等待锁定），可能值得将其分解为较小的事务。

DECLARE
  v_cnt number := 1;
BEGIN
 WHILE v_cnt > 0 LOOP
   UPDATE [TABLE] SET [FIELD]=0 WHERE [FIELD] IS NULL AND ROWNUM < 50000;
   v_cnt := SQL%ROWCOUNT;
   COMMIT;
 END LOOP;
END;
/

The smaller the ROWNUM limit the less concurrency/locking issues you'll hit, but the more time you'll spend in table scanning.

ROWNUM 限制越小，您遇到的并发/锁定问题就越少，但您花在表扫描上的时间就越多。

Answer 3

回答by Rob van Wijk

Vincent already answered your question perfectly, but I'm curious about the "why" behind this action. Why are you updating all NULL's to 0?

Vincent 已经完美地回答了你的问题，但我很好奇这个行动背后的“为什么”。为什么要将所有 NULL 更新为 0？

Regards, Rob.

问候，罗布。

Answer 4

回答by Bob Jarvis - Reinstate Monica

Some suggestions:

一些建议：

Drop any indexes that contain FIELD before running your UPDATE statement, and then re-add them later.
Write a PL/SQL procedure to do this that commits after every 1000 or 10000 rows.

在运行 UPDATE 语句之前删除所有包含 FIELD 的索引，然后稍后重新添加它们。
编写一个 PL/SQL 过程来执行此操作，该过程在每 1000 或 10000 行之后提交。

Hope this helps.

希望这可以帮助。

Answer 5

回答by James Anderson

You could acheive the same result without updating by using an ALTER table to set the columns "DEFAULT" value to 0.

通过使用 ALTER 表将列“DEFAULT”值设置为 0，您无需更新即可获得相同的结果。

如何优化在 700M 行的 Oracle 表上运行的更新 SQL

提问by b.roth

回答by Vincent Malgrat

Edit:

编辑：

回答by Gary Myers

回答by Rob van Wijk

回答by Bob Jarvis - Reinstate Monica

回答by James Anderson

相关推荐

最近更新

标签

如何优化在 700M 行的 Oracle 表上运行的更新 SQL

提问by b.roth

回答by Vincent Malgrat

Edit:

编辑：

回答by Gary Myers

回答by Rob van Wijk

回答by Bob Jarvis - Reinstate Monica

回答by James Anderson

相关推荐

SQL TSQL - 如何 URL 编码

SQL (Oracle) 使用分页查询时如何获取结果总数？

使用动态变量名创建 SQL 表

使用 t-sql 更新单行

相关推荐

最近更新

标签