Original question: http://stackoverflow.com/questions/3543105/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How do I UPDATE a large table in Oracle PL/SQL in batches to avoid running out of undo space?
Asked by Steve M
I have a very large table (5 million records). I'm trying to obfuscate the table's VARCHAR2 columns with random alphanumerics for every record in the table. My procedure executes successfully on smaller datasets, but it will eventually be used on a remote database whose settings I can't control, so I'd like to execute the UPDATE statement in batches to avoid running out of undo space.
Is there some kind of option I can enable, or a standard way to do the update in chunks?
I'll add that nothing will distinguish the records that haven't been obfuscated yet from those that have, so my one thought of using rownum in a loop won't work (I think).
Answered by Gary Myers
If you are going to update every row in a table, you are better off doing a CREATE TABLE AS SELECT, then dropping or truncating the original table and reloading it with the new data. If you've got the partitioning option, you can create your new table as a table with a single partition and simply swap it in with EXCHANGE PARTITION.
Inserts require a LOT less undo, and a direct-path insert with NOLOGGING (the /*+ APPEND */ hint) won't generate much redo either.
With either mechanism, there would probably still be 'forensic' evidence of the old values (e.g. preserved in undo, or in "available" space allocated to the table due to row movement).
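The truncate-and-reload variant described in this answer might look roughly like the sketch below. The names my_table_stage and id are hypothetical, random_function stands in for whatever obfuscation routine is used, and the sketch assumes my_table has no enabled foreign keys referencing it (TRUNCATE would fail otherwise).

```sql
-- Build the obfuscated copy first; NOLOGGING keeps redo low for the CTAS.
-- my_table_stage, id and random_function are placeholder names.
CREATE TABLE my_table_stage NOLOGGING AS
SELECT id, random_function(my_column) AS my_column
FROM   my_table;

-- Empty the original table. TRUNCATE is DDL and generates almost no undo.
TRUNCATE TABLE my_table;

-- Reload with a direct-path insert.
INSERT /*+ APPEND */ INTO my_table (id, my_column)
SELECT id, my_column
FROM   my_table_stage;

-- A direct-path insert must be committed before the session reads the table again.
COMMIT;

DROP TABLE my_table_stage PURGE;
```

Whether the insert actually skips redo depends on the table's logging attribute and on database-level settings such as force logging, which may not be under your control on a remote database.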
Answered by Adam Musch
The following is untested, but should work:
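declare
    l_fetchsize number := 10000;  -- rows per batch; tune to the available undo

    -- Select each row's ROWID together with its replacement value.
    cursor cur_getrows is
        select rowid, random_function(my_column)
        from my_table;

    type rowid_tbl_type is table of urowid;
    type my_column_tbl_type is table of my_table.my_column%type;

    rowid_tbl rowid_tbl_type;
    my_column_tbl my_column_tbl_type;
begin
    open cur_getrows;
    loop
        fetch cur_getrows bulk collect
            into rowid_tbl, my_column_tbl
            limit l_fetchsize;
        exit when rowid_tbl.count = 0;

        -- Apply the whole batch in one bulk-bound update.
        forall i in rowid_tbl.first..rowid_tbl.last
            update my_table
            set my_column = my_column_tbl(i)
            where rowid = rowid_tbl(i);

        -- Committing per batch caps undo usage, but fetching across commits
        -- can raise ORA-01555 (snapshot too old) on a long-running cursor.
        commit;
    end loop;
    close cur_getrows;
end;
/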
This isn't optimally efficient -- a single update would be -- but it'll do smaller, user-tunable batches, using ROWID.
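As a related alternative (not part of this answer), Oracle 11g Release 2 and later ship the DBMS_PARALLEL_EXECUTE package, which splits a table into ROWID ranges and commits each range separately. The following is an untested sketch that reuses my_table, my_column and random_function from the block above; the task name is hypothetical, and the table is assumed to live in the current user's schema.

```sql
declare
    l_task varchar2(30) := 'OBFUSCATE_MY_TABLE';  -- hypothetical task name
    l_sql  clob;
begin
    dbms_parallel_execute.create_task(task_name => l_task);

    -- Split the table into ROWID ranges of roughly 10000 rows each.
    dbms_parallel_execute.create_chunks_by_rowid(
        task_name   => l_task,
        table_owner => user,
        table_name  => 'MY_TABLE',
        by_row      => true,
        chunk_size  => 10000);

    -- :start_id and :end_id are bound by the package for each chunk.
    l_sql := 'update my_table
              set my_column = random_function(my_column)
              where rowid between :start_id and :end_id';

    -- parallel_level => 0 runs the chunks serially in this session;
    -- each chunk is committed as it completes.
    dbms_parallel_execute.run_task(
        task_name      => l_task,
        sql_stmt       => l_sql,
        language_flag  => dbms_sql.native,
        parallel_level => 0);

    -- Keep the task (and its chunk status) around if anything failed.
    if dbms_parallel_execute.task_status(l_task) = dbms_parallel_execute.finished then
        dbms_parallel_execute.drop_task(l_task);
    end if;
end;
/
```

Because each chunk runs as its own statement rather than through one long-lived cursor, this avoids fetching across commits. Raising parallel_level runs chunks as scheduler jobs, which needs the CREATE JOB privilege.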
Answered by Srujan Kumar Gulla
If I had to update millions of records I would probably opt to NOT update.
I would more likely create a new table and then insert the data from the old table, since an insert doesn't take up a lot of redo space and uses less undo.
CREATE TABLE new_table AS SELECT <do the update "here"> FROM old_table;
-- index new_table
-- grant on new_table
-- add constraints on new_table
-- etc. on new_table
DROP TABLE old_table;
RENAME new_table TO old_table;
You can do that using parallel query, with NOLOGGING on most operations, generating very little redo and no undo at all, in a fraction of the time it would take to update the data.
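Filled in with concrete statements, the outline above might read as follows. This is only a sketch: the columns id and my_column, the function random_function, the constraint name, the role some_role and the degree of parallelism are all placeholders, and the original table's real indexes, grants and constraints would have to be recreated one by one.

```sql
-- Build the obfuscated copy in one pass, in parallel and with minimal redo.
CREATE TABLE new_table
    NOLOGGING
    PARALLEL 4
AS
SELECT id, random_function(my_column) AS my_column
FROM   old_table;

-- Recreate whatever the old table had (placeholder examples).
ALTER TABLE new_table ADD CONSTRAINT new_table_pk PRIMARY KEY (id);
GRANT SELECT ON new_table TO some_role;

-- Reset the attributes that were only wanted for the load.
ALTER TABLE new_table NOPARALLEL;
ALTER TABLE new_table LOGGING;

-- Swap the tables.
DROP TABLE old_table;
RENAME new_table TO old_table;
```

A user-defined PL/SQL function called from the SELECT generally has to be declared PARALLEL_ENABLE for the query side to run in parallel.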
Answered by Kyle Lahnakoski
I do this by mapping the primary key to an integer (mod n), and then performing the update once for each x, where 0 <= x < n.
For example, maybe you are unlucky and the primary key is a string. You can hash it with your favorite hash function, and break it into three partitions:
UPDATE myTable SET a = doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3) = 0;
UPDATE myTable SET a = doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3) = 1;
UPDATE myTable SET a = doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3) = 2;
You may have more partitions, and may want to put this into a loop (with some commits).
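Put into a loop with a commit per partition, that could look like the sketch below. The bucket count of 16 is an arbitrary illustration; myTable, ID and doMyUpdate are the placeholder names from the statements above.

```sql
DECLARE
    c_buckets CONSTANT PLS_INTEGER := 16;  -- hypothetical number of partitions
BEGIN
    FOR x IN 0 .. c_buckets - 1 LOOP
        -- Each pass touches only the rows whose hashed key falls in bucket x.
        UPDATE myTable
        SET    a = doMyUpdate(a)
        WHERE  MOD(ORA_HASH(ID), c_buckets) = x;

        -- Commit per bucket so undo is needed for one bucket at a time.
        COMMIT;
    END LOOP;
END;
/
```

Each pass most likely scans the whole table to find its bucket, so more buckets mean less undo per transaction but more total read work.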