从 Oracle 中删除指定行的最佳方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/10092407/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-10 04:07:43  来源:igfitidea点击:

Optimal way to DELETE specified rows from Oracle

oracle

提问by GIS-Jonathan

I have a project that needs to occasionally delete several tens of thousands of rows from one of six tables of varying sizes but that have about 30million rows between them. Because of the structure of the data I've been given, I don't know which of the six tables has the row that needs to be deleted in it so I have to run all deletes against all tables. I've built an INDEX against the ID column to try and speed things up, but it can be removed if that'll speed things up.

我有一个项目需要偶尔从六个不同大小的表之一中删除数万行,但它们之间有大约 3000 万行。由于我得到的数据结构,我不知道六个表中哪一个有需要删除的行,所以我必须对所有表运行所有删除。我已经针对 ID 列构建了一个 INDEX 以尝试加快速度,但如果这会加快速度,则可以将其删除。

My problem is, that I can't seem to find an efficient way to actually perform the delete. For the purposes of my testing I'm running 7384 delete rows against single test-table which has about 9400 rows. I've tested a number of possible query solutions in Oracle SQL Developer:

我的问题是,我似乎找不到实际执行删除的有效方法。出于测试的目的,我正在针对具有大约 9400 行的单个测试表运行 7384 个删除行。我在 Oracle SQL Developer 中测试了许多可能的查询解决方案:

7384 separate DELETEstatements took 203seconds:

7384 条单独的DELETE语句耗时203秒:

delete from TABLE1 where ID=1000001356443294;
delete from TABLE1 where ID=1000001356443296;
etc...

7384 separate SELECTstatements took 57seconds:

7384 条单独的SELECT语句耗时57秒:

select ID from TABLE1 where ID=1000001356443294
select ID from TABLE1 where ID=1000001356443296
etc...

7384 separate DELETE from (SELECT)statements took 214seconds:

7384 条单独的DELETE from (SELECT)语句耗时214秒:

delete from (select ID from TABLE1 where ID=1000001356443294);
delete from (select ID from TABLE1 where ID=1000001356443296);
etc...

1 SELECTstatement that has 7384 ORclauses in the where took 127.4s:

1 条SELECT语句OR在 where中包含 7384 个子句,耗时127.4s

select ID from TABLE1 where ID=1000001356443294 or ID = 1000001356443296 or ...

1 DELETE from (SELECT)statement that has 7384 ORclauses in the where took 74.4s:

1 个在 whereDELETE from (SELECT)中包含 7384 个OR子句的语句需要74.4s

delete from (select ID from TABLE1 where ID=1000001356443294 or ID = 1000001356443296 or ...)

While the last may be the fastest, upon further testing its still very slow when scaled up from the 9000 row table to even just a 200,000 row table (which is still < 1% of the final tableset size) where the same statement takes 14minsto run. While > 50% faster per row, that still extrapolates up to about a day when being run against the full dataset. I have it on good authority that the piece of software we used to us to do this task could do it in about 20mins.

虽然最后一个可能是最快的,但在进一步测试时,当从 9000 行表扩展到甚至只有 200,000 行表(仍然小于最终表集大小的 1%)时,它仍然非常慢,其中相同的语句需要14 分钟才能完成跑。虽然每行速度提高了 50% 以上,但在针对完整数据集运行时仍然可以推断出大约一天的时间。我确信我们用来完成这项任务的软件可以在大约20 分钟内完成

So my questions are:

所以我的问题是:

  • Is there a better way to delete?
  • Should I use a round of SELECTstatements (i.e., like the second test) to discover which table any given row is in and then shoot off delete queries? Even that looks quite slow but...
  • Is there anything else I can do to speed the deletes up? I don't have DBA-level access or knowledge.
  • 有没有更好的删除方法?
  • 我是否应该使用一轮SELECT语句(即,像第二个测试一样)来发现任何给定的行在哪个表中,然后执行删除查询?即使这样看起来很慢,但是......
  • 我还能做些什么来加快删除速度?我没有 DBA 级别的访问权限或知识。

采纳答案by Adam Musch

In advance of my questions being answered, this is how I'd go about it:

在回答我的问题之前,我会这样做:

Minimize the number of statements and the work they do issued in relative terms.

尽量减少相关声明的数量和他们所做的工作。

All scenarios assume you have a table of IDs (PURGE_IDS) to delete from TABLE_1, TABLE_2, etc.

所有情景假设你有标识(一个表PURGE_IDS)中进行删除TABLE_1TABLE_2等等。

Consider Using CREATE TABLE AS SELECT for really large deletes

考虑对真正大的删除使用 CREATE TABLE AS SELECT

If there's no concurrent activity, and you're deleting 30+ % of the rows in one or more of the tables, don't delete; perform a create table as selectwith the rows you wish to keep, and swap the new table out for the old table. INSERT /*+ APPEND */ ... NOLOGGINGis surprisingly cheap if you can afford it. Even if you do have some concurrent activity, you may be able to use Online Table Redefinition to rebuild the table in-place.

如果没有并发活动,并且您要删除一个或多个表中超过 30% 的行,请不要删除;create table as select对您希望保留的行执行 a ,并将新表换成旧表。 INSERT /*+ APPEND */ ... NOLOGGING如果你负担得起,它的价格非常便宜。即使您确实有一些并发活动,您也可以使用 Online Table Redefinition 就地重建表。

Don't run DELETE statements you know won't delete any rows

不要运行您知道不会删除任何行的 DELETE 语句

If an ID value exists in at most one of the six tables, then keep track of which IDs you've deleted - and don't try to delete those IDs from any of the other tables.

如果一个 ID 值最多存在于六个表中的一个中,那么请跟踪您删除了哪些 ID - 不要尝试从任何其他表中删除这些 ID。

CREATE TABLE TABLE1_PURGE NOLOGGING
AS 
SELECT ID FROM PURGE_IDS INNER JOIN TABLE_1 ON PURGE_IDS.ID = TABLE_1.ID;

DELETE FROM TABLE1 WHERE ID IN (SELECT ID FROM TABLE1_PURGE);

DELETE FROM PURGE_IDS WHERE ID IN (SELECT ID FROM TABLE1_PURGE);

DROP TABLE TABLE1_PURGE;

and repeat.

并重复。

Manage Concurrency if you have to

如果必须,请管理并发

Another way is to use PL/SQL looping over the tables, issuing a rowcount-limited delete statement. This is most likely appropriate if there's significant insert/update/delete concurrent load against the tables you're running the deletes against.

另一种方法是在表上使用 PL/SQL 循环,发出行数限制的删除语句。如果针对您运行删除的表存在大量的插入/更新/删除并发负载,这很可能是合适的。

declare
  l_sql varchar2(4000);
begin
  for i in (select table_name from all_tables 
             where table_name in ('TABLE_1', 'TABLE_2', ...)
             order by table_name);
  loop
    l_sql := 'delete from ' || i.table_name || 
             ' where id in (select id from purge_ids) ' || 
             '   and rownum <= 1000000';
    loop
      commit;
      execute immediate l_sql;
      exit when sql%rowcount <> 1000000;  -- if we delete less than 1,000,000
    end loop;                             -- no more rows need to be deleted!
  end loop;
  commit;
end;

回答by Richard Chung

Store all the to be deleted ID's into a table. Then there are 3 ways. 1) loop through all the ID's in the table, then delete one row at a time for X commit interval. X can be a 100 or 1000. It works on OLTP environment and you can control the locks.

将所有要删除的 ID 存储到一个表中。然后有3种方法。1) 遍历表中的所有 ID,然后在 X 提交间隔内一次删除一行。X 可以是 100 或 1000。它适用于 OLTP 环境,您可以控制锁。

2) Use Oracle Bulk Delete

2)使用Oracle批量删除

3) Use correlated delete query.

3) 使用相关删除查询。

Single query is usually faster than multiple queries because of less context switching, and possibly less parsing.

单个查询通常比多个查询更快,因为上下文切换更少,解析也可能更少。

回答by mcha

First, disabling the index during the deletion would be helpful.

首先,在删除过程中禁用索引会有所帮助。

Try with a MERGE INTO statement :
1) create a temp table with IDs and an additional column from TABLE1 and test with the following

尝试使用 MERGE INTO 语句:
1) 创建一个带有 ID 和来自 TABLE1 的附加列的临时表,并使用以下内容进行测试

MERGE INTO table1 src
USING (SELECT id,col1
         FROM test_merge_delete) tgt
ON (src.id = tgt.id)
WHEN MATCHED THEN
  UPDATE
     SET src.col1 = tgt.col1
  DELETE
   WHERE src.id = tgt.id

回答by jatin Goyal

I have tried this code and It's working fine in my case.

我已经尝试过这段代码,在我的情况下它工作正常。

DELETE FROM NG_USR_0_CLIENT_GRID_NEW WHERE rowid IN
( SELECT rowid FROM
  (
      SELECT wi_name, relationship, ROW_NUMBER() OVER (ORDER BY rowid DESC) RN
      FROM NG_USR_0_CLIENT_GRID_NEW
      WHERE wi_name = 'NB-0000001385-Process'
  )
  WHERE RN=2
);