Original source: http://stackoverflow.com/questions/11419536/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Postgresql Truncation speed
Asked by brad
We're using Postgresql 9.1.4 as our db server. I've been trying to speed up my test suite so I've started profiling the db a bit to see exactly what's going on. We are using database_cleaner to truncate tables at the end of tests. YES I know transactions are faster; I can't use them in certain circumstances so I'm not concerned with that.
What I AM concerned with, is why TRUNCATION takes so long (longer than using DELETE) and why it takes EVEN LONGER on my CI server.
Right now, locally (on a Macbook Air) a full test suite takes 28 minutes. Tailing the logs, each time we truncate tables... ie:
TRUNCATE TABLE table1, table2 -- ... etc
it takes over 1 second to perform the truncation. Tailing the logs on our CI server (Ubuntu 10.04 LTS), it takes a full 8 seconds to truncate the tables and a build takes 84 minutes.
When I switched over to the :deletion strategy, my local build took 20 minutes and the CI server went down to 44 minutes. This is a significant difference and I'm really blown away as to why this might be. I've tuned the DB on the CI server; it has 16gb system ram, 4gb shared_buffers... and an SSD. All the good stuff. How is it possible:
a. that it's SO much slower than my Macbook Air with 2gb of ram
b. that TRUNCATION is so much slower than DELETE when the postgresql docs state explicitly that it should be much faster.
Any thoughts?
Answered by Craig Ringer
This has come up a few times recently, both on SO and on the PostgreSQL mailing lists.
The TL;DR for your last two points:
(a) The bigger shared_buffers may be why TRUNCATE is slower on the CI server. Different fsync configuration or the use of rotational media instead of SSDs could also be at fault.
(b) TRUNCATE has a fixed cost, but is not necessarily faster than DELETE, plus it does more work. See the detailed explanation that follows.
UPDATE: A significant discussion on pgsql-performance arose from this post. See this thread.
UPDATE 2: Improvements have been added to 9.2beta3 that should help with this; see this post.
Detailed explanation of TRUNCATE vs DELETE FROM:
While not an expert on the topic, my understanding is that TRUNCATE has a nearly fixed cost per table, while DELETE is at least O(n) for n rows; worse if there are any foreign keys referencing the table being deleted.
I always assumed that the fixed cost of a TRUNCATE was lower than the cost of a DELETE on a near-empty table, but this isn't true at all.
TRUNCATE table; does more than DELETE FROM table;
The state of the database after a TRUNCATE table is much the same as if you'd instead run:
DELETE FROM table;
VACUUM (FULL, ANALYZE) table; (9.0+ only, see footnote)
... though of course TRUNCATE doesn't actually achieve its effects with a DELETE and a VACUUM.
The point is that DELETE and TRUNCATE do different things, so you're not just comparing two commands with identical outcomes.
A DELETE FROM table; allows dead rows and bloat to remain, allows the indexes to carry dead entries, doesn't update the table statistics used by the query planner, etc.
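To see this bloat accumulating for yourself, you can query the statistics collector's dead-tuple counters; a small sketch (the table name 'table1' is just an example):

```sql
-- n_dead_tup counts rows DELETEd (or updated away) but not yet vacuumed;
-- last_autoanalyze shows when planner statistics were last refreshed.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'table1';  -- example table name
```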
A TRUNCATE gives you a completely new table and indexes as if they were just CREATEed. It's like you deleted all the records, reindexed the table and did a VACUUM FULL.
If you don't care if there's crud left in the table because you're about to go and fill it up again, you may be better off using DELETE FROM table;.
Because you aren't running VACUUM you will find that dead rows and index entries accumulate as bloat that must be scanned then ignored; this slows all your queries down. If your tests don't actually create and delete all that much data you may not notice or care, and you can always do a VACUUM or two part-way through your test run if you do. Better, let aggressive autovacuum settings ensure that autovacuum does it for you in the background.
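For example, autovacuum can be made more aggressive per table via storage parameters; the table name and values below are illustrative, not tuned recommendations:

```sql
-- Vacuum 'table1' (example name) once ~1% of its rows are dead, instead
-- of the default 20%, and refresh planner statistics sooner as well:
ALTER TABLE table1 SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 50,
    autovacuum_analyze_scale_factor = 0.01
);
```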
You can still TRUNCATE all your tables after the whole test suite runs to make sure no effects build up across many runs. On 9.0 and newer, VACUUM (FULL, ANALYZE); globally on the table is at least as good if not better, and it's a whole lot easier.
IIRC Pg has a few optimisations that mean it might notice when your transaction is the only one that can see the table and immediately mark the blocks as free anyway. In testing, when I've wanted to create bloat I've had to have more than one concurrent connection to do it. I wouldn't rely on this, though.
DELETE FROM table; is very cheap for small tables with no f/k refs
To DELETE all records from a table with no foreign key references to it, all Pg has to do is a sequential table scan that sets the xmax of the tuples encountered. This is a very cheap operation - basically a linear read and a semi-linear write. AFAIK it doesn't have to touch the indexes; they continue to point to the dead tuples until they're cleaned up by a later VACUUM that also marks blocks in the table containing only dead tuples as free.
DELETE only gets expensive if there are lots of records, if there are lots of foreign key references that must be checked, or if you count the subsequent VACUUM (FULL, ANALYZE) table; needed to match TRUNCATE's effects within the cost of your DELETE.
In my tests here, a DELETE FROM table; was typically 4x faster than TRUNCATE at 0.5ms vs 2ms. That's a test DB on an SSD, running with fsync=off because I don't care if I lose all this data. Of course, DELETE FROM table; isn't doing all the same work, and if I follow up with a VACUUM (FULL, ANALYZE) table; it's a much more expensive 21ms, so the DELETE is only a win if I don't actually need the table pristine.
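If you want to repeat this kind of measurement on your own hardware, psql's \timing meta-command is enough; a sketch (the scratch table t is an assumption):

```sql
-- Run in psql against a throwaway database:
\timing on
CREATE TABLE t (id int PRIMARY KEY, payload text);
INSERT INTO t SELECT g, 'x' FROM generate_series(1, 1000) g;
DELETE FROM t;              -- cheap: just marks the tuples dead
TRUNCATE t;                 -- fixed cost: new files, catalog work, sync
VACUUM (FULL, ANALYZE) t;   -- what DELETE needs to match TRUNCATE's effects
DROP TABLE t;
```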
TRUNCATE table; does a lot more fixed-cost work and housekeeping than DELETE
By contrast, a TRUNCATE has to do a lot of work. It must allocate new files for the table, its TOAST table if any, and every index the table has. Headers must be written into those files and the system catalogs may need updating too (not sure on that point, haven't checked). It then has to replace the old files with the new ones or remove the old ones, and has to ensure the file system has caught up with the changes with a synchronization operation - fsync() or similar - that usually flushes all buffers to the disk. I'm not sure whether the sync is skipped if you're running with the (data-eating) option fsync=off.
I learned recently that TRUNCATE must also flush all PostgreSQL's buffers related to the old table. This can take a non-trivial amount of time with huge shared_buffers. I suspect this is why it's slower on your CI server.
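If you want to check how much of shared_buffers a given table occupies before truncating it, the pg_buffercache contrib extension can show you; a sketch (the table name is an example):

```sql
-- Requires the pg_buffercache contrib extension (available on 9.1+).
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT count(*) AS cached_buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE c.relname = 'table1';  -- example table name
```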
The balance
Anyway, you can see that a TRUNCATE of a table that has an associated TOAST table (most do) and several indexes could take a few moments. Not long, but longer than a DELETE from a near-empty table.
Consequently, you might be better off doing a DELETE FROM table;.
--
Note: on DBs before 9.0, CLUSTER table_id_seq ON table; ANALYZE table; or VACUUM FULL ANALYZE table; REINDEX table; would be a closer equivalent to TRUNCATE. The VACUUM FULL impl changed to a much better one in 9.0.
Answered by Stanislav Pankevich
Brad, just to let you know. I've looked fairly deeply into a very similar question.
Related question: 30 tables with few rows - TRUNCATE the fastest way to empty them and reset attached sequences?
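On the sequence-reset part of that related question: TRUNCATE can reset attached sequences itself (the table names below are examples):

```sql
-- RESTART IDENTITY resets sequences owned by columns of the truncated
-- tables; CASCADE also truncates tables with FKs pointing at them.
TRUNCATE TABLE table1, table2 RESTART IDENTITY CASCADE;
```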
Please also look at this issue and this pull request:
https://github.com/bmabey/database_cleaner/issues/126
https://github.com/bmabey/database_cleaner/pull/127
Also this thread: http://archives.postgresql.org/pgsql-performance/2012-07/msg00047.php
I am sorry for writing this as an answer, but I didn't find any comment links, maybe because there are too many comments already there.
Answered by Maksym Kammerer
I've encountered a similar issue lately, i.e.:
- The time to run the test suite which used DatabaseCleaner varied widely between different systems with comparable hardware,
- Changing the DatabaseCleaner strategy to :deletion provided ~10x improvement.
The root cause of the slowness was a filesystem with journaling (ext4) used for database storage. During the TRUNCATE operation the journaling daemon (jbd2) was using ~90% of disk IO capacity. I am not sure if this is a bug, an edge case or actually normal behaviour in these circumstances. This explains however why TRUNCATE was a lot slower than DELETE - it generated a lot more disk writes. As I did not want to actually use DELETE I resorted to setting fsync=off and it was enough to mitigate this issue (data safety was not important in this case).
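For a throwaway test cluster only, the relevant postgresql.conf durability settings look roughly like this (never use these on a cluster whose data you care about; a crash can corrupt it):

```
fsync = off                 # skips the sync calls that kept jbd2 busy here
synchronous_commit = off
full_page_writes = off
```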
Answered by Mark Stosberg
A couple of alternate approaches to consider:
- Create an empty database with static "fixture" data in it, and run the tests in that. When you are done, just drop the database, which should be fast.
- Create a new table called "test_ids_to_delete" that contains columns for table names and primary key ids. Update your deletion logic to insert the ids/table names in this table instead, which will be much faster than running deletes. Then, write a script to run "offline" to actually delete the data, either after an entire test run has finished, or overnight.
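A minimal sketch of that second approach (the table layout and names are illustrative):

```sql
-- Queue rows for later deletion instead of deleting them inline:
CREATE TABLE test_ids_to_delete (
    table_name text   NOT NULL,
    record_id  bigint NOT NULL
);

-- During a test's cleanup phase, a fast insert replaces the DELETE:
INSERT INTO test_ids_to_delete (table_name, record_id)
VALUES ('users', 42);  -- 'users' / 42 are example values

-- The offline script later deletes for real, per table, e.g.:
DELETE FROM users
WHERE id IN (SELECT record_id
             FROM test_ids_to_delete
             WHERE table_name = 'users');
```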
The former is a "clean room" approach, while the latter means some test data will persist in the database for longer. The "dirty" approach with offline deletes is what I'm using for a test suite with about 20,000 tests. Yes, there are sometimes problems due to having "extra" test data in the dev database. But sometimes this "dirtiness" has helped us find and fix bugs, because the "messiness" better simulated a real-world situation, in a way that a clean-room approach never will.