Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5792425/

Time: 2020-09-10 03:14:34  Source: igfitidea

Strategy to improve Oracle DELETE performance

oracle, oracle11g

Asked by user7116

We've got an Oracle 11g installation that is starting to get big. This database is the backend to a parallel optimization system running on a cluster. Input to the process is contained in the database along with output from the optimization steps. The input includes rote configuration data and some binary files (using 11g's SecureFiles). The output includes 1D, 2D, 3D, and 4D data currently stored in the DB.

DB Structure:

/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId

/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantaneous here */

/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */

/* Data not only per run, but per step */
TwoDDataY2(StepId, ...)  /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...)  /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */

A reaper script comes around daily, looks for cases with DeleteFlag = 1, and proceeds with DELETE FROM Case WHERE DeleteFlag = 1, allowing the cascades to continue.

This strategy works great for read/write, but is now outstripping our capabilities when we want to purge data! The rub is deleting a Case takes ~20-40 minutes depending on the size and often overloads our archiver space. The next major version of the product will take a "from the ground up" approach to solving the problem. The next minor release needs to stay within the confines of data stored in the database.

So, for the minor release we need an approach that can improve delete performance and at most require moderate changes to the database.

  1. REF partitioning, but the question is HOW? I would love to do INTERVAL on Case and REF on the rest, but that isn't supported. Is there some way to manually partition OptimizationRun by CaseId through a trigger?
  2. Disable archiving/redo logs for deletes? Couldn't find a HINT to go with this one. Not sure it is even feasible.
  3. Truncate? This likely would need some sort of complicated table setup. But maybe I'm not considering all of my options. (Stricken, per answer.)

To help illustrate the issue, the data in question per case ranges from 15MiB to 1.5GiB with anywhere from 20k to 2M rows.

Update: Current size of the DB is ~1.5TB.

Accepted answer by ik_zelf

Deleting data is a hell of a job for the database. It has to create before images, update indexes, write redo logs and remove the data, which makes it a slow process. If you can get a window to perform this task, the easiest and fastest way is to build new tables containing the wanted data, drop the old tables and rename the new ones. This obviously requires some setup work, but it is very feasible. One step less drastic is to drop the indexes before the delete takes place. My vote would go to CTAS (Create Table As Select) to build the new tables. A nice partitioning scheme would certainly be helpful; maybe in the next release Oracle can combine interval and reference partitioning. It would be very nice to have.

Logging cannot be disabled for deletes, but CTAS can use NOLOGGING. Make a backup when ready and make sure to transfer the datafiles to the standby database, if you have one.
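
As a rough sketch of the CTAS-and-swap idea for a single leaf table: this assumes DeleteFlag is 0 for cases that should be kept, and writes the parent table as Cases, a stand-in name used here to avoid any clash with the CASE keyword. The _keep suffix and the index and constraint names are made up for illustration.

```sql
-- Build a copy holding only the rows that belong to cases NOT flagged for deletion.
-- NOLOGGING keeps redo for the load to a minimum; it is ignored when the
-- database runs in FORCE LOGGING mode (often the case with a standby).
CREATE TABLE FourDDataZ_keep NOLOGGING AS
SELECT d.*
FROM   FourDDataZ d
WHERE  EXISTS (
         SELECT 1
         FROM   OptimizationStep s
         JOIN   OptimizationRun  r ON r.OptId  = s.OptId
         JOIN   Cases            c ON c.CaseId = r.CaseId
         WHERE  s.StepId     = d.StepId
         AND    c.DeleteFlag = 0
       );

-- Swap the tables. DROP and RENAME are DDL, so each one commits implicitly.
DROP TABLE FourDDataZ PURGE;
ALTER TABLE FourDDataZ_keep RENAME TO FourDDataZ;

-- CTAS does not carry over indexes or foreign keys, so recreate them
-- (hypothetical names).
CREATE INDEX ix_fourd_stepid ON FourDDataZ (StepId) NOLOGGING;
ALTER TABLE FourDDataZ
  ADD CONSTRAINT fk_fourd_step FOREIGN KEY (StepId)
  REFERENCES OptimizationStep (StepId) ON DELETE CASCADE;
```

The same pattern would be repeated for each large data table. Once the bulky children are gone, the flagged rows in the metadata tables still have to be deleted, but that delete has far less to cascade through.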

Answer by tbone

Just some thoughts:

  1. I assume you have indexes on all foreign keys. ON DELETE CASCADE will hold row-level locks until the Case delete is complete; without indexes I believe it will hold table locks and, of course, be super slow.

  2. Do you have any deferred constraints? These would most likely slow things down as Oracle cascades through the various table deletes.

  3. Have you tried doing the deletes separately for all affected tables (instead of relying on ON DELETE CASCADE)? Not as easy, but you may be surprised.
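
To check the first two points, something along these lines could be run in the schema that owns the tables. It is only a sketch: the index names are hypothetical, and only three of the tables from the question are shown.

```sql
-- List every foreign key in the schema with its delete rule and
-- whether it is deferrable / initially deferred.
SELECT table_name, constraint_name, delete_rule, deferrable, deferred
FROM   user_constraints
WHERE  constraint_type = 'R'
ORDER  BY table_name;

-- Index the foreign key columns so each cascaded delete can find its
-- child rows without scanning (or locking) the whole child table.
CREATE INDEX ix_optrun_caseid ON OptimizationRun  (CaseId);
CREATE INDEX ix_optstep_optid ON OptimizationStep (OptId);
CREATE INDEX ix_fourd_stepid  ON FourDDataZ       (StepId);
```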

EDIT:

One more thought. You may consider doing a SOFT delete on the Case table, meaning you have a status field that tells your app whether that Case should be considered. This flag could have many different values, but maybe 'A' for active and 'I' for inactive. Assuming you always use Case as the driving/primary table in joins to other tables, you can avoid the HARD deletes altogether (and occasionally do a cleanup off-hours on whatever schedule you like). Apps would need to be aware of this flag of course, and you'd be tied to joining back to the Case table. May or may not fit your situation...
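
A minimal sketch of the soft-delete idea, reusing the DeleteFlag column from the question instead of a new 'A'/'I' status column. The view name ActiveCases, the stand-in table name Cases (used to avoid the CASE keyword) and the :case_id bind placeholder are assumptions for illustration.

```sql
-- Soft delete: flag the case instead of removing it.
UPDATE Cases
SET    DeleteFlag = 1
WHERE  CaseId = :case_id;

-- Applications read through a view that hides flagged cases.
CREATE OR REPLACE VIEW ActiveCases AS
SELECT *
FROM   Cases
WHERE  DeleteFlag = 0;

-- Child data is only reachable by joining back through the view.
SELECT r.*
FROM   ActiveCases     c
JOIN   OptimizationRun r ON r.CaseId = c.CaseId;
```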

Answer by Adam Musch

CASCADE DELETE runs internally slow-by-slow, er, row-by-row.

Some options:

  1. Have your purge job snapshot all the cases to be purged into a scratch table with a CTAS. Then have your purge job loop over that table, deleting each case (and its children) individually. This can be unpleasant, especially if you run into millions of descendant rows. We recently had to change one of the processes at [business redacted] that did this: it now determines which ultimate parents have problematic child counts, and then uses a rownum limiter on a delete against the problematic child table(s). It's not fast, but at least it's safer from an undo/redo management perspective because it places an upper bound on how big any transaction can be.

  2. If you're using CASCADE DELETE as a convenience, you could always not do so. You'd have to write a more sophisticated purge routine that deletes from your dependency tree "bottom up".

  3. If you can afford the undo/redo generation on the soft delete, you could range-partition the ultimate parent on DeleteFlag, then partition the children BY REFERENCE, all tables using ENABLE ROW MOVEMENT. You'd incur undo/redo costs for moving the rows when soft-deleted, but when it came time to finally purge, it would be truncating partitions where DeleteFlag = 1, nothing more.

  4. Adding storage is relatively cheap. If there's a date-based retention option, use it, and just have the soft delete option hide the data from the application front end. It's inelegant, but then, so is CASCADE DELETE.
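
Options 1 and 2 can be combined roughly as follows. This is a sketch under assumptions, not a tested purge routine: PurgeCases is a hypothetical scratch table, Cases stands in for the Case table (to avoid the CASE keyword), the batch size of 50000 is arbitrary, and only one of the large child tables is shown.

```sql
-- Snapshot the cases to purge into a scratch table.
CREATE TABLE PurgeCases AS
SELECT CaseId
FROM   Cases
WHERE  DeleteFlag = 1;

-- Bottom-up: empty the large leaf table in bounded batches so that no
-- single transaction generates an unbounded amount of undo and redo.
BEGIN
  LOOP
    DELETE FROM FourDDataZ d
    WHERE  d.StepId IN (
             SELECT s.StepId
             FROM   OptimizationStep s
             JOIN   OptimizationRun  r ON r.OptId  = s.OptId
             JOIN   PurgeCases       p ON p.CaseId = r.CaseId
           )
    AND    ROWNUM <= 50000;        -- upper bound per transaction

    EXIT WHEN SQL%ROWCOUNT = 0;    -- nothing left to delete
    COMMIT;
  END LOOP;
  COMMIT;
END;
/

-- After every large child table has been emptied the same way, the
-- parent delete only has small metadata rows left to cascade through.
DELETE FROM Cases
WHERE  CaseId IN (SELECT CaseId FROM PurgeCases);
COMMIT;

DROP TABLE PurgeCases PURGE;
```

Committing between batches means a failed run leaves a case partially purged, which is harmless here only because the case is already flagged as deleted and the job can simply be rerun.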

Answer by keiki

Not advised for live database.

  1. I disabled the foreign key constraints referencing the table which is slow to delete.
  2. I executed the delete.
  3. Enabled the foreign keys again.
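
In SQL the three steps might look like this. The constraint name fk_run_case is hypothetical, Cases stands in for the Case table (to avoid the CASE keyword), and only the first level of the hierarchy is shown.

```sql
-- 1. Disable the foreign key that references the slow-to-delete table.
ALTER TABLE OptimizationRun DISABLE CONSTRAINT fk_run_case;

-- 2. Run the deletes. A disabled constraint no longer cascades, so the
--    child rows have to be removed explicitly.
DELETE FROM OptimizationRun
WHERE  CaseId IN (SELECT CaseId FROM Cases WHERE DeleteFlag = 1);

DELETE FROM Cases
WHERE  DeleteFlag = 1;
COMMIT;

-- 3. Re-enable the foreign key. Oracle revalidates every child row here
--    and raises ORA-02298 if any orphaned rows were left behind.
ALTER TABLE OptimizationRun ENABLE CONSTRAINT fk_run_case;
```

While the constraint is disabled nothing stops other sessions from inserting orphaned rows, which is one reason this is not advised on a live database.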

Answer by HAL 9000

Use Enterprise Manager to create an AWR report and run it through statspack analyzer, which will give you detailed instructions about the bottlenecks in your system. An AWR report is a text file containing all kinds of data about what the database has done during a certain time and how long it took... That statspack analyzer is sort of an automatic DBA telling you what to do.
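
If Enterprise Manager is not at hand, an AWR text report can also be produced from a SQL session with the DBMS_WORKLOAD_REPOSITORY package. The sketch below assumes the session has the required privileges and that the site is licensed for AWR (it is part of the Diagnostics Pack); the four bind placeholders must be filled in from the dba_hist_snapshot query.

```sql
-- Take a snapshot immediately before the purge job...
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
END;
/

-- ... run the purge here, then take a second snapshot.
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
END;
/

-- Look up the database id, instance number and the two snapshot ids.
SELECT snap_id, dbid, instance_number, end_interval_time
FROM   dba_hist_snapshot
ORDER  BY snap_id DESC;

-- Produce the text report covering the interval between the snapshots.
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
               :dbid, :inst_num, :begin_snap, :end_snap));
```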

Forget partitions until Statspack Analyzer tells you that they could be useful and you've got a few idle disks that you can use to distribute the I/O.

Don't think about truncate. It forces a commit...

BTW, I'm not affiliated with Statspack Analyzer, but I think it's a very viable general tuning approach for Oracle, especially if there's no DBA around.
