Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/34230208/

UUID Primary Key in Postgres, What Insert Performance Impact?

Tags: database, performance, postgresql, indexing, uuid

Asked by philofinfinitejest

I am wondering about the performance impact of using a non-sequential UUID as the primary key in a table that will become quite large in PostgreSQL.

In DBMSs that use clustered storage for table records, it is a given that using a UUID increases the cost of inserts once the table is too big to hold in memory, because the data page that receives the insert has to be read from disk first. As I understand it, Postgres does not maintain row clustering on inserts, so I imagine that in Postgres using a UUID PK does not hurt the performance of that insert.

But I would think that it makes the insert into the index that the primary key constraint creates much more expensive once the table is large, because index pages will constantly have to be read from disk to update the index on insertion of new data, whereas with a sequential key the index will only be updated at the tip, which will always be in memory.

Assuming that I understand the performance impact on the index correctly, is there any way to remedy that or are UUIDs simply not a good PK on a large, un-partitioned table?

Accepted answer by Craig Ringer

> As I understand it, Postgres does not maintain row clustering on inserts

Correct at the moment. Unfortunately.

> so I imagine that in Postgres using a UUID PK does not hurt the performance of that insert.

It still does have a performance cost because of the need to maintain the PK, and because the inserted tuple is bigger.

  • The uuid is 16 bytes, 4 times as wide as a typical 32-bit integer synthetic key, so the row to write is 12 bytes bigger and you can fit fewer rows into a given amount of RAM.

  • The b-tree index that implements the primary key stores keys 4x as wide (vs a 32-bit key), so it is larger, takes longer to search, and requires more memory to cache. It also needs more frequent page splits.

  • Writes will tend to be random within indexes, not appends to hot, recently accessed pages.
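
The width and locality points in the list above can be sketched in Python. This is a toy model under stated assumptions, not a measurement of Postgres: the index is a sorted list cut into fixed-size leaf pages, `KEYS_PER_LEAF` is an assumed capacity chosen for illustration, page splits are ignored, and random 128-bit integers stand in for version-4 UUIDs. The function names are hypothetical.

```python
import bisect
import random
import uuid

INT4_BYTES = 4                          # 32-bit integer key
UUID_BYTES = len(uuid.uuid4().bytes)    # a UUID is 128 bits wide

KEYS_PER_LEAF = 200  # assumed leaf-page capacity, illustration only


def leaf_pages_touched(index, new_keys, keys_per_leaf=KEYS_PER_LEAF):
    """Insert new_keys into the sorted list `index` and count the
    distinct leaf pages (position // keys_per_leaf) the inserts land on."""
    touched = set()
    for key in new_keys:
        position = bisect.bisect_left(index, key)
        touched.add(position // keys_per_leaf)
        index.insert(position, key)
    return len(touched)


def compare(existing=100_000, batch=1_000, seed=0):
    """Return (pages touched by sequential keys, pages touched by random keys)."""
    rng = random.Random(seed)

    # Sequential keys: every new value is larger than all existing ones,
    # so inserts only ever hit the right-hand edge of the index.
    seq_index = list(range(existing))
    seq_new = range(existing, existing + batch)

    # Random 128-bit keys: each new value can land anywhere in the index.
    rand_index = sorted(rng.getrandbits(128) for _ in range(existing))
    rand_new = [rng.getrandbits(128) for _ in range(batch)]

    return (leaf_pages_touched(seq_index, seq_new),
            leaf_pages_touched(rand_index, rand_new))


if __name__ == "__main__":
    print("extra bytes per key:", UUID_BYTES - INT4_BYTES)
    sequential_pages, random_pages = compare()
    print("leaf pages touched, sequential keys:", sequential_pages)
    print("leaf pages touched, random keys:", random_pages)
```

In this model the sequential batch stays on a handful of pages at the tip of the index, while the random batch spreads over hundreds of pages, which is the access pattern that stops fitting in cache once the index outgrows memory.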

> is there any way to remedy [the performance impact on the index] or are UUIDs simply not a good PK on a large, un-partitioned table?

If you need a UUID key, you need a UUID key. You shouldn't use one if you don't require one, but if you cannot rely on a central source of synthetic keys and there is no suitable natural key to use, it's still the way to go.

Partitioning won't help much unless you can confine writes to one partition. Also, you won't be able to usefully use constraint exclusion on searches for the key if writing only to one partition at a time, so you'll still have to search all the partitions' indexes for a key when doing queries. I can only see it being useful if your UUID forms part of a composite key and you can partition on the other part of the composite key.
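
As a rough illustration of the composite-key case, the sketch below models a table partitioned on one part of the key, with a separate UUID index per partition. It is plain Python rather than Postgres, and the `region` column and the `PartitionedTable` class are hypothetical names chosen for the example.

```python
import uuid
from collections import defaultdict


class PartitionedTable:
    """Toy table partitioned on `region`, with one UUID index per partition."""

    def __init__(self):
        self.partitions = defaultdict(dict)  # region -> {uuid: row}

    def insert(self, region, row):
        key = uuid.uuid4()
        self.partitions[region][key] = row
        return key

    def find_by_uuid(self, key):
        """Lookup by UUID alone: every partition's index has to be probed,
        because nothing in the key says which partition holds the row."""
        found = None
        probed = 0
        for index in self.partitions.values():
            probed += 1
            if key in index:
                found = index[key]
        return found, probed

    def find_by_composite(self, region, key):
        """Lookup by (region, uuid): the partition key rules out all
        other partitions, so only one index is probed."""
        return self.partitions.get(region, {}).get(key), 1


if __name__ == "__main__":
    table = PartitionedTable()
    for region in ("eu", "us", "ap", "sa"):
        table.insert(region, {"region": region})
    key = table.insert("eu", {"region": "eu", "note": "target"})
    print(table.find_by_uuid(key))             # probes all four partitions
    print(table.find_by_composite("eu", key))  # probes one partition
```

The same reasoning applies to writes: inserts stay within one partition's index only if the other part of the composite key confines them there.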

Answer by pensnarik

It should be mentioned that you will get more WAL generated if you have a btree index on a UUID column with the full_page_writes option enabled. This happens because of UUID randomness: the values are not sequential, so each insert is likely to touch a completely new index leaf page. You can read more in the article "On the impact of full-page writes".
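
The effect can be approximated with a small simulation. With full_page_writes on, the first change to a page after a checkpoint logs the whole page, and later changes to that page log only the change itself. The sketch below counts those full-page images for sequential and random leaf-page hits. It assumes the default 8 kB block size; `KEYS_PER_LEAF`, the number of leaf pages, and the checkpoint spacing are illustrative assumptions, and the function names are hypothetical.

```python
import random

PAGE_BYTES = 8192     # PostgreSQL's default block size
KEYS_PER_LEAF = 200   # assumed leaf-page capacity, illustration only


def full_page_images(leaf_pages_hit, inserts_per_checkpoint):
    """Count full-page images logged for a stream of leaf-page hits.

    A page is logged in full the first time it is changed after a
    checkpoint; repeat changes before the next checkpoint are not counted.
    """
    images = 0
    logged_since_checkpoint = set()
    for i, page in enumerate(leaf_pages_hit):
        if i % inserts_per_checkpoint == 0:
            logged_since_checkpoint.clear()  # a checkpoint just completed
        if page not in logged_since_checkpoint:
            logged_since_checkpoint.add(page)
            images += 1
    return images


def simulate(inserts=10_000, leaf_pages=5_000,
             inserts_per_checkpoint=2_000, seed=0):
    """Return (bytes of full-page images for sequential keys,
    bytes of full-page images for random keys)."""
    rng = random.Random(seed)
    # Sequential keys fill the rightmost leaf, then move on to a fresh one.
    sequential_hits = [i // KEYS_PER_LEAF for i in range(inserts)]
    # Random keys land on an arbitrary existing leaf page.
    random_hits = [rng.randrange(leaf_pages) for _ in range(inserts)]
    seq = full_page_images(sequential_hits, inserts_per_checkpoint)
    rnd = full_page_images(random_hits, inserts_per_checkpoint)
    return seq * PAGE_BYTES, rnd * PAGE_BYTES


if __name__ == "__main__":
    seq_bytes, rnd_bytes = simulate()
    print("full-page image bytes, sequential keys:", seq_bytes)
    print("full-page image bytes, random keys:", rnd_bytes)
```

Under these assumptions the random stream logs far more full-page images than the sequential one, because almost every insert is the first change to its leaf page since the last checkpoint.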
