SQL 更新表中所有行的有效方法

Question

提问by m_pGladiator

I have a table with a lot of records (could be more than 500 000 or 1 000 000). I added a new column in this table and I need to fill a value for every row in the column, using the corresponding row value of another column in this table.

我有一张有很多记录的表（可能超过 500 000 或 1 000 000）。我在这个表中添加了一个新列，我需要为该列中的每一行填充一个值，使用该表中另一列的相应行值。

I tried to use separate transactions for selecting every next chunk of 100 records and update the value for them, but still this takes hours to update all records in Oracle10 for example.

我尝试使用单独的事务来选择 100 条记录的每一个下一个块并更新它们的值，但这仍然需要数小时才能更新 Oracle10 中的所有记录。

What is the most efficient way to do this in SQL, without using some dialect-specific features, so it works everywhere (Oracle, MSSQL, MySQL, PostGre etc.)?

在 SQL 中执行此操作的最有效方法是什么，而不使用某些特定于方言的功能，因此它可以在任何地方使用（Oracle、MSSQL、MySQL、PostGre 等）？

ADDITIONAL INFO: There are no calculated fields. There are indexes. Used generated SQL statements which update the table row by row.

附加信息：没有计算字段。有索引。使用生成的 SQL 语句逐行更新表。

Answer 1

回答by Marcelo Cantos

The usual way is to use UPDATE:

通常的方法是使用 UPDATE：

UPDATE mytable
   SET new_column = <expr containing old_column>

You should be able to do this is a single transaction.

你应该能够做到这是一个单一的事务。

Answer 2

回答by Jeffrey Kemp

As Marcelo suggests:

正如马塞洛所说：

UPDATE mytable
SET new_column = <expr containing old_column>;

If this takes too long and fails due to "snapshot too old" errors (e.g. if the expression queries another highly-active table), and if the new value for the column is always NOT NULL, you could update the table in batches:

如果这花费太长时间并且由于“快照太旧”错误而失败（例如，如果表达式查询另一个高度活跃的表），并且如果该列的新值始终为 NOT NULL，则可以批量更新该表：

UPDATE mytable
SET new_column = <expr containing old_column>
WHERE new_column IS NULL
AND ROWNUM <= 100000;

Just run this statement, COMMIT, then run it again; rinse, repeat until it reports "0 rows updated". It'll take longer but each update is less likely to fail.

只需运行这个语句，COMMIT，然后再次运行它；冲洗，重复直到报告“0 行更新”。这将需要更长的时间，但每次更新失败的可能性较小。

EDIT:

编辑：

A better alternative that should be more efficient is to use the DBMS_PARALLEL_EXECUTEAPI.

一个应该更有效的更好的替代方法是使用DBMS_PARALLEL_EXECUTEAPI。

Sample code (from Oracle docs):

示例代码（来自 Oracle 文档）：

DECLARE
  l_sql_stmt VARCHAR2(1000);
  l_try NUMBER;
  l_status NUMBER;
BEGIN

  -- Create the TASK
  DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask');

  -- Chunk the table by ROWID
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('mytask', 'HR', 'EMPLOYEES', true, 100);

  -- Execute the DML in parallel
  l_sql_stmt := 'update EMPLOYEES e 
      SET e.salary = e.salary + 10
      WHERE rowid BETWEEN :start_id AND :end_id';
  DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask', l_sql_stmt, DBMS_SQL.NATIVE,
                                 parallel_level => 10);

  -- If there is an error, RESUME it for at most 2 times.
  l_try := 0;
  l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
  WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED) 
  LOOP
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.RESUME_TASK('mytask');
    l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
  END LOOP;

  -- Done with processing; drop the task
  DBMS_PARALLEL_EXECUTE.DROP_TASK('mytask');

END;
/

Oracle Docs: https://docs.oracle.com/database/121/ARPLS/d_parallel_ex.htm#ARPLS67333

Oracle 文档：https: //docs.oracle.com/database/121/ARPLS/d_parallel_ex.htm#ARPLS67333

Answer 3

回答by Timothy

You could drop any indexes on the table, then do your insert, and then recreate the indexes.

您可以删除表上的任何索引，然后进行插入，然后重新创建索引。

Answer 4

回答by David

Might not work you for, but a technique I've used a couple times in the past for similar circumstances.

可能对您不起作用，但我过去曾在类似情况下使用过几次技术。

created updated_{table_name}, then select insert into this table in batches. Once finished, and this hinges on Oracle ( which I don't know or use ) supporting the ability to rename tables in an atomic fashion. updated_{table_name} becomes {table_name} while {table_name} becomes original_{table_name}.

创建了updated_{table_name}，然后选择批量插入到这个表中。完成后，这取决于 Oracle（我不知道或不使用）支持以原子方式重命名表的能力。updated_{table_name} 变为 {table_name}，而 {table_name} 变为 original_{table_name}。

Last time I had to do this was for a heavily indexed table with several million rows that absolutely positively could not be locked for the duration needed to make some serious changes to it.

上次我不得不这样做是针对一个包含数百万行的重索引表，在对其进行一些重大更改所需的持续时间内，绝对无法锁定。

Answer 5

回答by Stellios

What is the database version? Check out virtual columns in 11g:

什么是数据库版本？查看 11g 中的虚拟列：

Adding Columns with a Default Value http://www.oracle.com/technology/pub/articles/oracle-database-11g-top-features/11g-schemamanagement.html

添加具有默认值的列 http://www.oracle.com/technology/pub/articles/oracle-database-11g-top-features/11g-schemamanagement.html

Answer 6

回答by user3098137

update Hotels set Discount=30 where Hotelid >= 1 and Hotelid <= 5504

更新酒店设置 Discount=30，其中 Hotelid >= 1 和 Hotelid <= 5504

SQL 更新表中所有行的有效方法

提问by m_pGladiator

回答by Marcelo Cantos

回答by Jeffrey Kemp

回答by Timothy

回答by David

回答by Stellios

回答by user3098137

相关推荐

最近更新

标签

SQL 更新表中所有行的有效方法

提问by m_pGladiator

回答by Marcelo Cantos

回答by Jeffrey Kemp

回答by Timothy

回答by David

回答by Stellios

回答by user3098137

相关推荐

在 T-SQL 中结合存储过程和查询

使用 SQL 过滤存储过程的结果

带有嵌套选择的 SQL Server ORDER BY/WHERE

如何在 Grails 中记录 SQL 语句

相关推荐

最近更新

标签