SQL: how to batch-delete a large amount of data from an Oracle table
Note: This page is based on a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same CC BY-SA terms, link to the original, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/26930441/
How to delete large amount of data from Oracle table in batches
Asked by dusk7
I'm using Oracle 10g and want to delete approximately 5 million records from the Persons table (15 million records in total) that have no reference in the Orders table (5 million records in total).
Since deleting 5 million records in one go would cause undo log issues, I've decided to delete in batches of 100k. I'm using the below query:
DELETE FROM Persons p
WHERE City = 'ABC'
AND NOT EXISTS
(SELECT O_Id
FROM Orders o
WHERE p.P_Id = o.P_Id)
AND ROWNUM <= 100000
Now the problem is that this query takes as long to execute for 100k records as it would for all 5 million, because full table scans and joins on both tables still happen.
Is there an efficient way to rewrite this query for faster execution? Or replace the NOT EXISTS clause with a better join condition? Or use some better way to limit the records to 100k?
P.S. This is a one-time operation only and I can't use any DDL operations for this; however, PL/SQL is fine.
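The asker's "delete in batches until nothing is left" idea can be sketched outside Oracle. The snippet below uses SQLite via Python purely as a stand-in (table shapes, row counts, and the 250-row batch size are illustrative assumptions); SQLite's `LIMIT` in the subquery plays the role of Oracle's `ROWNUM <= 100000`.

```python
# Sketch of the batching idea: delete at most `batch` matching rows per
# statement, commit each batch, stop when a pass deletes nothing.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (p_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders  (o_id INTEGER PRIMARY KEY, p_id INTEGER);
""")
conn.executemany("INSERT INTO persons VALUES (?, ?)",
                 [(i, "ABC") for i in range(1, 1001)])
# persons 1..100 are referenced by an order and must survive
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i) for i in range(1, 101)])

batch = 250  # stand-in for the 100k batch size in the question
while True:
    cur = conn.execute("""
        DELETE FROM persons
         WHERE rowid IN (SELECT p.rowid FROM persons p
                          WHERE p.city = 'ABC'
                            AND NOT EXISTS (SELECT 1 FROM orders o
                                             WHERE o.p_id = p.p_id)
                          LIMIT ?)
    """, (batch,))
    conn.commit()              # end each batch in its own transaction
    if cur.rowcount == 0:      # nothing left to delete
        break

print(conn.execute("SELECT COUNT(*) FROM persons").fetchone()[0])  # → 100
```

The key point the answers below address is making the inner SELECT cheap, since it is re-evaluated on every batch.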
Answered by Cyryl1972
From my experience, the fastest way to delete a lot of rows is:
Solution 1 (recommended by Tom Kyte)
SET TRANSACTION USE ROLLBACK SEGMENT <your_seg>;
DELETE FROM <tab1> WHERE <cond>;
COMMIT;
OR
Solution 2
create table new_table unrecoverable as select * from old_table where ....;
drop table old_table;
rename new_table to old_table;
create index old_table_idx1 on old_table(c1,c2) unrecoverable parallel 5;
I used the second solution in different contexts: it is always the fastest way to delete a huge number of rows.
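The keep-and-swap pattern of solution 2 can be sketched with SQLite via Python (the table, its contents, and the `keep` flag are illustrative assumptions; `UNRECOVERABLE` and the parallel index build are Oracle-specific and have no SQLite equivalent):

```python
# Copy the rows you want to KEEP into a new table, drop the old table,
# rename the new one into place, then rebuild indexes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (id INTEGER PRIMARY KEY, keep INTEGER)")
# 1000 rows; every 10th row (100 rows) is marked to be kept
conn.executemany("INSERT INTO old_table VALUES (?, ?)",
                 [(i, i % 10 == 0) for i in range(1000)])

conn.executescript("""
    CREATE TABLE new_table AS SELECT * FROM old_table WHERE keep = 1;
    DROP TABLE old_table;
    ALTER TABLE new_table RENAME TO old_table;
    CREATE INDEX old_table_idx1 ON old_table(id);
""")
print(conn.execute("SELECT COUNT(*) FROM old_table").fetchone()[0])  # → 100
```

Because the kept rows are written once and the old segment is simply dropped, almost no undo is generated, which is why this beats a mass DELETE.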
An alternative is to put the data to delete in a partition and then drop the partition (each partition has its own rollback segment, can use parallelism, ...).
Answered by Gordon Linoff
If you want this query to run faster, add the following two indexes:
create index idx_persons_city_pid on persons(city, p_id);
create index idx_orders_pid on orders(p_id);
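The effect of these indexes can be observed with SQLite's `EXPLAIN QUERY PLAN` via Python (a stand-in: Oracle's optimizer behaves analogously for this shape of query, but its plan output differs, and the exact SQLite plan text varies by version):

```python
# With an index on (city, p_id), the matching persons rows are located
# via an index search instead of a full table scan.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (p_id INTEGER, city TEXT);
    CREATE TABLE orders  (o_id INTEGER, p_id INTEGER);
    CREATE INDEX idx_persons_city_pid ON persons(city, p_id);
    CREATE INDEX idx_orders_pid ON orders(p_id);
""")
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT p.rowid FROM persons p
     WHERE p.city = 'ABC'
       AND NOT EXISTS (SELECT 1 FROM orders o WHERE o.p_id = p.p_id)
""").fetchall()
for row in plan:
    print(row[3])  # one human-readable line per plan step
```

The `(city, p_id)` index covers both the filter and the correlation column, and the `orders(p_id)` index makes each NOT EXISTS probe an index lookup.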
Answered by Gaurav Soni
DECLARE
   v_limit PLS_INTEGER := 100000;

   CURSOR person_deleted_cur
   IS
      SELECT p.rowid
        FROM Persons p
       WHERE City = 'ABC'
         AND NOT EXISTS
                (SELECT O_Id
                   FROM Orders o
                  WHERE p.P_Id = o.P_Id);

   -- collect plain ROWIDs: on Oracle 10g, FORALL cannot reference
   -- individual fields of a record collection
   TYPE person_deleted_ntt IS TABLE OF ROWID
      INDEX BY PLS_INTEGER;

   person_deleted_nt   person_deleted_ntt;
BEGIN
   OPEN person_deleted_cur;

   LOOP
      FETCH person_deleted_cur
         BULK COLLECT INTO person_deleted_nt
         LIMIT v_limit;

      FORALL indx IN 1 .. person_deleted_nt.COUNT
         DELETE FROM Persons
          WHERE rowid = person_deleted_nt (indx);

      COMMIT;   -- commit each batch to keep undo usage bounded
      EXIT WHEN person_deleted_cur%NOTFOUND;
   END LOOP;

   CLOSE person_deleted_cur;
END;
/
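The BULK COLLECT ... LIMIT / FORALL pattern has a rough Python analogue, sketched below with SQLite (a stand-in; the tables, row counts, and the 100-row chunk size are illustrative assumptions). The rowids to delete are collected once, then deleted chunk by chunk, with one commit per chunk:

```python
# Fetch the rowids of all doomed rows, then delete them in fixed-size
# chunks: the slicing mirrors BULK COLLECT ... LIMIT, and executemany
# plays the role of FORALL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (p_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders  (o_id INTEGER PRIMARY KEY, p_id INTEGER);
""")
conn.executemany("INSERT INTO persons VALUES (?, ?)",
                 [(i, "ABC") for i in range(1, 501)])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i) for i in range(1, 51)])   # persons 1..50 survive

rowids = conn.execute("""
    SELECT p.rowid FROM persons p
     WHERE p.city = 'ABC'
       AND NOT EXISTS (SELECT 1 FROM orders o WHERE o.p_id = p.p_id)
""").fetchall()

limit = 100
for i in range(0, len(rowids), limit):
    conn.executemany("DELETE FROM persons WHERE rowid = ?",
                     rowids[i:i + limit])
    conn.commit()                       # one transaction per chunk

print(conn.execute("SELECT COUNT(*) FROM persons").fetchone()[0])  # → 50
```

Deleting by rowid means the expensive NOT EXISTS query runs only once, instead of once per batch as in the asker's original approach.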
Answered by Dmitriy
One more way to delete:
begin
   dbms_errlog.create_error_log('PERSONS');
end;
/

-- an index on the foreign key is useful in many cases, not only here
create index idx_orders_pid on orders(p_id);

declare
   min_id number;
   max_id number;
   cur_id number;
begin
   select min(p_id), max(p_id)
     into min_id, max_id
     from persons;

   -- sweep the id range in windows of 100000 rows; rows still referenced
   -- from orders fail the delete and are captured in err$_persons
   cur_id := min_id;
   while cur_id <= max_id loop
      delete from persons
       where p_id between cur_id and cur_id + 99999
         log errors into err$_persons reject limit unlimited;
      commit;
      cur_id := cur_id + 100000;
   end loop;
end;
/

drop table err$_persons;
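The id-range sweep above can be sketched with SQLite via Python (a stand-in: the tables, row counts, and the 500-id window are illustrative assumptions, and since SQLite has no LOG ERRORS clause, the rows that must survive are excluded with NOT EXISTS instead of being caught by the foreign-key constraint and logged):

```python
# Step through [min_id, max_id] in fixed-size windows so each DELETE
# touches a bounded slice of the table, committing after each window.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (p_id INTEGER PRIMARY KEY);
    CREATE TABLE orders  (o_id INTEGER PRIMARY KEY, p_id INTEGER);
""")
conn.executemany("INSERT INTO persons VALUES (?)",
                 [(i,) for i in range(1, 2001)])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i * 20) for i in range(1, 101)])  # every 20th kept

step = 500  # stand-in for the 100000-id window in the answer
min_id, max_id = conn.execute(
    "SELECT MIN(p_id), MAX(p_id) FROM persons").fetchone()
cur_id = min_id
while cur_id <= max_id:
    conn.execute("""
        DELETE FROM persons
         WHERE p_id BETWEEN ? AND ?
           AND NOT EXISTS (SELECT 1 FROM orders o
                            WHERE o.p_id = persons.p_id)
    """, (cur_id, cur_id + step - 1))
    conn.commit()
    cur_id += step

print(conn.execute("SELECT COUNT(*) FROM persons").fetchone()[0])  # → 100
```

A range scan on the primary key keeps each window's DELETE cheap, at the cost of assuming ids are reasonably dense; with sparse ids many windows delete nothing.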