postgresql 理解 Postgres 行大小
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/13570613/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Making sense of Postgres row sizes
提问by Arman
I got a large (>100M rows) Postgres table with structure {integer, integer, integer, timestamp without time zone}. I expected the size of a row to be 3*integer + 1*timestamp = 3*4 + 1*8 = 20 bytes.
我得到了一个大型(> 100M 行)Postgres 表,其结构为 {integer, integer, integer, timestamp without time zone}。我预计一行的大小为 3*整数 + 1*时间戳 = 3*4 + 1*8 = 20 个字节。
In reality the row size is pg_relation_size(tbl) / count(*)
= 52 bytes. Why?
实际上,行大小是pg_relation_size(tbl) / count(*)
= 52 字节。为什么?
(No deletes are done against the table: pg_relation_size(tbl, 'fsm')
~= 0)
(没有对表进行删除:pg_relation_size(tbl, 'fsm')
~= 0)
回答by Erwin Brandstetter
Calculation of row size is much more complex than that.
行大小的计算比这复杂得多。
Storage is typically partitioned in 8 kB data pages. There is a small fixed overhead per page, possible remainders not big enough to fit another tuple, and more importantly dead rows or a percentage initially reserved with the FILLFACTOR
setting.
存储通常以 8 kB 的数据页进行分区。每页有一个小的固定开销,可能的剩余部分不足以容纳另一个元组,更重要的是死行或最初由FILLFACTOR
设置保留的百分比。
And there is even more overhead per row(tuple): an item identifier of 4 bytes at the start of the page, the HeapTupleHeader
of 23 bytes and alignment padding. The start of the tuple header as well as the start of tuple data are aligned at a multiple of MAXALIGN
, which is 8 bytes on a typical 64-bit machine. Some data types require alignment to the next multiple of 2, 4 or 8 bytes.
而且每行(元组)的开销甚至更多:页面开头的 4 个字节的项目标识符HeapTupleHeader
,23 个字节的项目标识符和对齐填充。元组标头的开头以及元组数据的开头以 的倍数对齐,MAXALIGN
在典型的 64 位机器上为 8 个字节。某些数据类型需要与 2、4 或 8 个字节的下一个倍数对齐。
Quoting the manual on the system table pg_tpye
:
typalign
is the alignment required when storing a value of this type. It applies to storage on disk as well as most representations of the value inside PostgreSQL. When multiple values are stored consecutively, such as in the representation of a complete row on disk, padding is inserted before a datum of this type so that it begins on the specified boundary. The alignment reference is the beginning of the first datum in the sequence.Possible values are:
c
=char
alignment, i.e., no alignment needed.
s
=short
alignment (2 bytes on most machines).
i
=int
alignment (4 bytes on most machines).
d
=double
alignment (8 bytes on many machines, but by no means all).
typalign
是存储这种类型的值时所需的对齐方式。它适用于磁盘上的存储以及 PostgreSQL 内部值的大多数表示。当多个值连续存储时,例如在磁盘上的完整行的表示中,填充插入到这种类型的数据之前,以便它从指定的边界开始。对齐参考是序列中第一个数据的开始。可能的值为:
c
=char
对齐,即不需要对齐。
s
=short
对齐(大多数机器上为 2 个字节)。
i
=int
对齐(大多数机器上为 4 个字节)。
d
=double
对齐(许多机器上为 8 个字节,但绝不是全部)。
Read about the basics in the manual here.
在此处阅读手册中的基础知识。
Your example
你的榜样
This results in 4 bytes of padding after your 3 integer
columns, because the timestamp
column requires double
alignment and needs to start at the next multiple of 8 bytes.
这会在 3integer
列之后产生 4 个字节的填充,因为该timestamp
列需要double
对齐并且需要从 8 个字节的下一个倍数开始。
So, one row occupies:
因此,一行占据:
23 -- heaptupleheader
+ 1 -- padding or NULL bitmap
+ 12 -- 3 * integer (no alignment padding here)
+ 4 -- padding after 3rd integer
+ 8 -- timestamp
+ 0 -- no padding since tuple ends at multiple of MAXALIGN
Plus item identifier per tuple in the page header (as pointed out by @A.H. in the comment):
加上页眉中每个元组的项目标识符(正如@AH 在评论中指出的那样):
+ 4 -- item identifier in page header
------
= 52 bytes
So we arrive at the observed 52 bytes.
所以我们得到了观察到的52 个字节。
The calculation pg_relation_size(tbl) / count(*)
is a pessimistic estimation. pg_relation_size(tbl)
includes bloat (dead rows) and space reserved by fillfactor
, as well as overhead per data page and per table. (And we didn't even mention compression for long varlena
data in TOAST tables, since it doesn't apply here.)
该计算pg_relation_size(tbl) / count(*)
是一个悲观的估计。pg_relation_size(tbl)
包括膨胀(死行)和由 保留的空间fillfactor
,以及每个数据页和每个表的开销。(我们甚至没有提到对TOAST 表中的长varlena
数据进行压缩,因为它在这里不适用。)
You can install the additional module pgstattupleand call SELECT * FROM pgstattuple('tbl_name');
for more information on table and tuple size.
您可以安装附加模块pgstattuple并调用SELECT * FROM pgstattuple('tbl_name');
有关表和元组大小的更多信息。
Related:
有关的:
回答by Sean
Each row has metadata associated with it. The correct formula is (assuming na?ve alignment):
每行都有与之关联的元数据。正确的公式是(假设天真对齐):
3 * 4 + 1 * 8 == your data
24 bytes == row overhead
total size per row: 23 + 20
Or roughly 53 bytes. I actually wrote postgresql-varintspecifically to help with this problem with this exact use case. You may want to look at a similar postfor additional details re: tuple overhead.
或大约 53 个字节。我实际上专门编写了postgresql-varint来帮助解决这个确切用例的问题。您可能想查看类似的帖子以获取更多详细信息:元组开销。