
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/5805277/

Date: 2020-09-18 23:30:15  Source: igfitidea

Oracle partitioning solution for DELETE performance problem

Tags: oracle, oracle-11g

Asked by user7116

This is a follow-up question to Strategy to improve Oracle DELETE performance. To recap, we have a large DB with a hierarchy of tables representing 1D through 4D output data from an optimization system. Reading and writing this data is fast and provides a convenient means for our various systems to utilize the information.

However, deleting unused data has become a bear. The current table hierarchy is below.

/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId

/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantaneous here */

/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */

/* Data not only per run, but per step */
TwoDDataY2(StepId, ...)  /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...)  /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */

What I am looking for is a means of partitioning the Case data such that I could drop a partition relating to the case to remove its data. Ideally, OptimizationRun would have an interval partition based on CaseId and this would filter down through to its children. However, 11g doesn't support the combination of INTERVAL and REF partitioning.

I'm fairly certain ENABLE ROW MOVEMENT is out of the question based on the DB size and the requirement that the tablespaces live in ASSM. Maybe RANGE partitioning on OptimizationRun and REF partitioning on the rest?

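That RANGE-plus-REF layout could be sketched as follows. This is only a sketch: the column types, partition names, and range bounds are illustrative, not the actual schema.

```sql
-- Parent: RANGE-partitioned on CaseId. Partitions are added manually,
-- since 11g does not allow combining INTERVAL with REF partitioning.
CREATE TABLE OptimizationRun (
    OptId  NUMBER PRIMARY KEY,
    CaseId NUMBER NOT NULL
    -- ...
)
PARTITION BY RANGE (CaseId) (
    PARTITION case_optpart_1 VALUES LESS THAN (2)
);

-- Child: REF-partitioned through the foreign key, so each child row
-- lands in its parent's partition and is dropped along with it.
CREATE TABLE OptimizationStep (
    StepId NUMBER PRIMARY KEY,
    OptId  NUMBER NOT NULL,
    -- ...
    CONSTRAINT fk_step_run FOREIGN KEY (OptId)
        REFERENCES OptimizationRun (OptId)
)
PARTITION BY REFERENCE (fk_step_run);
```

With this layout, dropping a partition of OptimizationRun would also drop the matching partitions of every REF-partitioned descendant.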
My guess is with that strategy I would need a trigger that accomplishes something like the following:

CREATE OR REPLACE TRIGGER Case_BeforeInsert_MakePartitions
BEFORE INSERT
    ON Case
    FOR EACH ROW
DECLARE
    v_PartName varchar(64)       := 'CASE_OPTPART_' || :new.CaseId;
    v_PartRange Case.CaseId%type := :new.CaseId;
BEGIN
    -- Take :new.CaseId and create the partition
    -- (DDL inside PL/SQL has to go through EXECUTE IMMEDIATE)
    EXECUTE IMMEDIATE
        'ALTER TABLE OptimizationRun ADD PARTITION ' || v_PartName ||
        ' VALUES LESS THAN (' || v_PartRange || ')';
END;

And then the requisite trigger for before deletion:

CREATE OR REPLACE TRIGGER Case_BeforeDelete_RemovePartitions
BEFORE DELETE
    ON Case
    FOR EACH ROW
DECLARE
    v_PartName varchar(64) := 'CASE_OPTPART_' || :old.CaseId;
BEGIN
    -- Drop the partitions associated with the case
    EXECUTE IMMEDIATE
        'ALTER TABLE OptimizationRun DROP PARTITION ' || v_PartName;
END;

Good idea? Or is this an idea out of the SNL Bad Idea Jeans commercial?

Update, for size reference:

  • 1D data tables ~1.7G
  • 2D data tables ~12.5G
  • 3D data tables ~117.3G
  • 4D data tables ~315.2G

Accepted answer by Vincent Malgrat

I'm pretty sure that you're on the right track with partitioning to deal with your delete performance problem. However, I don't think you'll be able to mix this with triggers. Complex logic in triggers has always bothered me, but aside from that, here are the problems you are likely to encounter:

  • DDL statements break transaction logic since Oracle performs a commit of the current transaction before any DDL statement.
  • Fortunately, you can't commit in a trigger (since Oracle is in the middle of an operation and the DB is not in a consistent state).
  • Using autonomous transactions to perform DDL would be a (poor?) workaround for the insert but is unlikely to work for the DELETE since this would probably interfere with the ON DELETE CASCADE logic.
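For illustration, the autonomous-transaction workaround would look roughly like this (the trigger name is hypothetical). As the bullets above explain, the DDL commits independently of the triggering statement, which is exactly why this is fragile:

```sql
CREATE OR REPLACE TRIGGER trg_case_add_partition  -- hypothetical name
BEFORE INSERT ON Case
FOR EACH ROW
DECLARE
    PRAGMA AUTONOMOUS_TRANSACTION;  -- lets the DDL commit in its own transaction
BEGIN
    EXECUTE IMMEDIATE
        'ALTER TABLE OptimizationRun ADD PARTITION case_optpart_'
        || :new.CaseId || ' VALUES LESS THAN (' || (:new.CaseId + 1) || ')';
END;
/
```

If the outer INSERT then rolls back, the partition still exists, and nothing comparable can safely be done for the DELETE path.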

It would be easier to code and maintain procedures that deal with dropping and creating partitions, such as:

CREATE PROCEDURE add_case (case_id IN Case.CaseId%TYPE, ...) AS
BEGIN
   EXECUTE IMMEDIATE 'ALTER TABLE OptimizationRun ADD PARTITION ...';
   /* repeat for each child table */
   INSERT INTO Case VALUES (...);
END;
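A matching drop procedure (hypothetical, reusing the partition-naming convention from the question) would mirror it:

```sql
CREATE PROCEDURE drop_case (case_id IN Case.CaseId%TYPE) AS
BEGIN
   -- Delete the metadata row first, then drop the partition that held its data
   DELETE FROM Case WHERE CaseId = case_id;
   EXECUTE IMMEDIATE
      'ALTER TABLE OptimizationRun DROP PARTITION case_optpart_' || case_id;
   /* repeat for each child table that is partitioned independently */
END;
```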

Concerning the drop of partitions, you'll have to check whether it works with referential integrity. You may need to disable the foreign key constraints before dropping a parent-table partition in a parent-child table relationship.

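As a sketch (the constraint and partition names are hypothetical), the disable/drop/re-enable sequence would be:

```sql
-- Disable the FK pointing at the parent, drop the parent partition, re-enable
ALTER TABLE OptimizationRun MODIFY CONSTRAINT fk_run_case DISABLE;
ALTER TABLE Case DROP PARTITION case_part_42;
ALTER TABLE OptimizationRun MODIFY CONSTRAINT fk_run_case ENABLE NOVALIDATE;
```

ENABLE NOVALIDATE re-enables the constraint without re-checking existing rows; validate it only once the corresponding child rows are gone as well.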
Also note that global indexes will be left in an UNUSABLE state after a partition drop. You'll have to rebuild them unless you specify UPDATE GLOBAL INDEXES in your drop statement (the indexes are then maintained as part of the drop, which keeps them usable but makes the drop itself take longer).

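Both variants as a sketch (partition and index names are illustrative):

```sql
-- Fast drop: global indexes go UNUSABLE and must be rebuilt afterwards
ALTER TABLE OptimizationRun DROP PARTITION case_optpart_42;
ALTER INDEX optimizationrun_pk REBUILD;

-- Slower drop that maintains global indexes within the same statement
ALTER TABLE OptimizationRun DROP PARTITION case_optpart_42
    UPDATE GLOBAL INDEXES;
```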
Answered by Adam Musch

Not possible - you can't issue DDL like that in a row-level trigger.

[possible design issue commentary redacted, as addressed]

Have you considered parallelizing your script? Rather than a sweeper that relies on DELETE CASCADE, leverage DBMS_SCHEDULER to parallelize the job. You can safely run parallel deletes against tables at the same level of the dependency tree.

begin
  dbms_scheduler.create_program
    (program_name => 'snapshot_purge_cases',
     program_type => 'PLSQL_BLOCK',
     program_action => 
      'BEGIN
         delete from purge$Case;
         insert into purge$Case
         select CaseId 
           from Case
          where deleteFlag = 1;

         delete from purge$Opt;
         insert into purge$Opt
         select OptId 
           from OptimizationRun
          where CaseId in (select CaseId from purge$Case);

         delete from purge$Step;
         insert into purge$Step
         select StepId 
           from OptimizationStep
          where OptId in (select OptId from purge$Opt);

         commit;
       END;',
     enabled => true,
     comments => 'Program to snapshot keys for purging'
    );

  dbms_scheduler.create_program 
    (program_name => 'purge_case',
     program_type => 'PLSQL_BLOCK',
     program_action => 'BEGIN 
                          loop
                             delete from Case
                              where CaseId in (select CaseId from purge$Case)
                                and rownum <= 50000;
                            exit when sql%rowcount = 0;
                            commit;
                          end loop;
                          commit;
                        END;',
     enabled => true,
     comments => 'Program to purge the Case Table'
    );

  -- repeat for each table being purged

end;
/

That only sets up the programs. What we need to do next is set up a job chain so we can put them together.

BEGIN
  dbms_scheduler.create_chain 
   (chain_name => 'purge_case_chain');
END;
/

Now we make steps in the job chain using the programs from before:

BEGIN
  dbms_scheduler.define_chain_step
   (chain_name => 'purge_case_chain',
    step_name  => 'step_snapshot_purge_cases',
    program_name => 'snapshot_purge_cases'
   );

  dbms_scheduler.define_chain_step
   (chain_name => 'purge_case_chain',
    step_name  => 'step_purge_cases',
    program_name => 'purge_case'
   );

  -- repeat for every table
END;
/

Now we have to link the chain steps together. The jobs would fan out, like so:

  1. Snapshot the CaseIds, OptIds and StepIds to purge.
  2. Purge all the tables dependent on OptimizationStep.
  3. Purge all the tables dependent on OptimizationRun.
  4. Purge all the tables dependent on Case.
  5. Purge Case.

So the code would then be:

begin
  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'TRUE',
    action     => 'START step_snapshot_purge_cases',
    rule_name  => 'rule_snapshot_purge_cases'
   );

  -- repeat for every table dependent on OptimizationStep
  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'step_snapshot_purge_cases COMPLETED',
    action     => 'START step_purge_TwoDDataY2',
    rule_name  => 'rule_purge_TwoDDataY2'
   );

  -- repeat for every table dependent on OptimizationRun     
  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'step_purge_TwoDDataY2  COMPLETED and
                   step_purge_ThreeDDataZ COMPLETED and
                   ... ',
    action     => 'START step_purge_OnedDataX',
    rule_name  => 'rule_purge_OnedDataX'
   );

  -- repeat for every table dependent on Case  
  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'step_purge_OnedDataX  COMPLETED and
                   step_purge_TwoDDataY1 COMPLETED and
                   ... ',
    action     => 'START step_purge_Files',
    rule_name  => 'rule_purge_Files'
   );

  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'step_purge_Files           COMPLETED and
                   step_purge_OptimizationRun COMPLETED and 
                   ... ',
    action     => 'START step_purge_Case',
    rule_name  => 'rule_purge_Case'
   );

  -- add a rule to end the chain
  dbms_scheduler.define_chain_rule
   (chain_name => 'purge_case_chain',
    condition  => 'step_purge_Case COMPLETED',
    action     => 'END',
    rule_name  => 'rule_end_chain'
   );

end;
/

Enable the job chain:

BEGIN
  DBMS_SCHEDULER.enable ('purge_case_chain');
END;
/

You can run the chain manually:

BEGIN
  DBMS_SCHEDULER.RUN_CHAIN
   (chain_name => 'purge_case_chain',
    job_name   => 'purge_case_chain_run'
   );
END;
/
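While a run is in flight, the chain's progress can be watched through the standard scheduler data-dictionary views, for example:

```sql
-- One row per step of any chain currently running under this schema
SELECT chain_name, step_name, state
  FROM user_scheduler_running_chains;
```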

Or create a job to schedule it:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB (
    job_name        => 'job_purge_case',
    job_type        => 'CHAIN',
    job_action      => 'purge_case_chain',
    repeat_interval => 'freq=daily',
    start_date      => ...,
    end_date        => ...,
    enabled         => TRUE);
END;
/