postgresql 理解 Postgres 行大小

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/13570613/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 00:34:44  来源:igfitidea点击:

Making sense of Postgres row sizes

postgresqldatabase-designtypesstorage

提问by Arman

I got a large (>100M rows) Postgres table with structure {integer, integer, integer, timestamp without time zone}. I expected the size of a row to be 3*integer + 1*timestamp = 3*4 + 1*8 = 20 bytes.

我得到了一个大型(> 100M 行)Postgres 表,其结构为 {integer, integer, integer, timestamp without time zone}。我预计一行的大小为 3*整数 + 1*时间戳 = 3*4 + 1*8 = 20 个字节。

In reality the row size is pg_relation_size(tbl) / count(*)= 52 bytes. Why?

实际上,行大小是pg_relation_size(tbl) / count(*)= 52 字节。为什么?

(No deletes are done against the table: pg_relation_size(tbl, 'fsm')~= 0)

(没有对表进行删除:pg_relation_size(tbl, 'fsm')~= 0)

回答by Erwin Brandstetter

Calculation of row size is much more complex than that.

行大小的计算比这复杂得多。

Storage is typically partitioned in 8 kB data pages. There is a small fixed overhead per page, possible remainders not big enough to fit another tuple, and more importantly dead rows or a percentage initially reserved with the FILLFACTORsetting.

存储通常以 8 kB 的数据页进行分区。每页有一个小的固定开销,可能的剩余部分不足以容纳另一个元组,更重要的是死行或最初由FILLFACTOR设置保留的百分比。

And there is even more overhead per row(tuple): an item identifier of 4 bytes at the start of the page, the HeapTupleHeaderof 23 bytes and alignment padding. The start of the tuple header as well as the start of tuple data are aligned at a multiple of MAXALIGN, which is 8 bytes on a typical 64-bit machine. Some data types require alignment to the next multiple of 2, 4 or 8 bytes.

而且每行(元组)的开销甚至更多:页面开头的 4 个字节的项目标识符HeapTupleHeader,23 个字节的项目标识符和对齐填充。元组标头的开头以及元组数据的开头以 的倍数对齐,MAXALIGN在典型的 64 位机器上为 8 个字节。某些数据类型需要与 2、4 或 8 个字节的下一个倍数对齐。

Quoting the manual on the system table pg_tpye:

引用系统表上的手册pg_tpye

typalignis the alignment required when storing a value of this type. It applies to storage on disk as well as most representations of the value inside PostgreSQL. When multiple values are stored consecutively, such as in the representation of a complete row on disk, padding is inserted before a datum of this type so that it begins on the specified boundary. The alignment reference is the beginning of the first datum in the sequence.

Possible values are:

  • c= charalignment, i.e., no alignment needed.

  • s= shortalignment (2 bytes on most machines).

  • i= intalignment (4 bytes on most machines).

  • d= doublealignment (8 bytes on many machines, but by no means all).

typalign是存储这种类型的值时所需的对齐方式。它适用于磁盘上的存储以及 PostgreSQL 内部值的大多数表示。当多个值连续存储时,例如在磁盘上的完整行的表示中,填充插入到这种类型的数据之前,以便它从指定的边界开始。对齐参考是序列中第一个数据的开始。

可能的值为:

  • c=char对齐,即不需要对齐。

  • s=short对齐(大多数机器上为 2 个字节)。

  • i=int对齐(大多数机器上为 4 个字节)。

  • d=double对齐(许多机器上为 8 个字节,但绝不是全部)。

Read about the basics in the manual here.

此处阅读手册中的基础知识。

Your example

你的榜样

This results in 4 bytes of padding after your 3 integercolumns, because the timestampcolumn requires doublealignment and needs to start at the next multiple of 8 bytes.

这会在 3integer列之后产生 4 个字节的填充,因为该timestamp列需要double对齐并且需要从 8 个字节的下一个倍数开始。

So, one row occupies:

因此,一行占据:

   23   -- heaptupleheader
 +  1   -- padding or NULL bitmap
 + 12   -- 3 * integer (no alignment padding here)
 +  4   -- padding after 3rd integer
 +  8   -- timestamp
 +  0   -- no padding since tuple ends at multiple of MAXALIGN

Plus item identifier per tuple in the page header (as pointed out by @A.H. in the comment):

加上页眉中每个元组的项目标识符(正如@AH 在评论中指出的那样):

 +  4   -- item identifier in page header
------
 = 52 bytes

So we arrive at the observed 52 bytes.

所以我们得到了观察到的52 个字节

The calculation pg_relation_size(tbl) / count(*)is a pessimistic estimation. pg_relation_size(tbl)includes bloat (dead rows) and space reserved by fillfactor, as well as overhead per data page and per table. (And we didn't even mention compression for long varlenadata in TOAST tables, since it doesn't apply here.)

该计算pg_relation_size(tbl) / count(*)是一个悲观的估计。pg_relation_size(tbl)包括膨胀(死行)和由 保留的空间fillfactor,以及每个数据页和每个表的开销。(我们甚至没有提到对TOAST 表中的varlena数据进行压缩,因为它在这里不适用。)

You can install the additional module pgstattupleand call SELECT * FROM pgstattuple('tbl_name');for more information on table and tuple size.

您可以安装附加模块pgstattuple并调用SELECT * FROM pgstattuple('tbl_name');有关表和元组大小的更多信息。

Related:

有关的:

回答by Sean

Each row has metadata associated with it. The correct formula is (assuming na?ve alignment):

每行都有与之关联的元数据。正确的公式是(假设天真对齐):

3 * 4 + 1 * 8 == your data
24 bytes == row overhead
total size per row: 23 + 20

Or roughly 53 bytes. I actually wrote postgresql-varintspecifically to help with this problem with this exact use case. You may want to look at a similar postfor additional details re: tuple overhead.

或大约 53 个字节。我实际上专门编写了postgresql-varint来帮助解决这个确切用例的问题。您可能想查看类似的帖子以获取更多详细信息:元组开销。