PostgreSQL Table in Memory

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/20737876/

PostgreSQL Table in memory

postgresql, memory

Asked by Kapil

I created a database containing a total of 3 tables for a specific purpose. The total size of all tables is about 850 MB - very lean... out of which one single table contains about 800 MB (including index) of data and 5 million records (daily addition of about 6000 records).

The system is PostgreSQL on Windows: a Windows 7 laptop with 8 GB RAM and an SSD. I allocated 2048 MB as shared_buffers, 256 MB as temp_buffers and 128 MB as work_mem. I execute a single query multiple times against the single table, hoping that the table stays in RAM (hence the parameters above). But although I see a spike in memory usage during execution (by about 200 MB), memory consumption does not stay at the 500 MB or more that would indicate the data is being held in memory. All running postgres.exe processes show only 2-6 MB each in Task Manager. Hence, I suspect the LRU does not keep the data in memory.
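For reference, a minimal sketch of how those settings might be applied (assuming PostgreSQL 9.4 or later for ALTER SYSTEM; on older versions the same values go directly into postgresql.conf). The values are simply the ones quoted above, not recommendations:

    -- shared_buffers only takes effect after a server restart;
    -- temp_buffers and work_mem can be picked up with a reload.
    ALTER SYSTEM SET shared_buffers = '2048MB';
    ALTER SYSTEM SET temp_buffers = '256MB';
    ALTER SYSTEM SET work_mem = '128MB';
    SELECT pg_reload_conf();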

Average query execution time is about 2 seconds (a very simple single-table query), but I need to get it down to about 10-20 ms, or even less if possible, purely because the same query is going to be executed so many times, and that can only be achieved by keeping the data in memory. Any advice?

Regards, Kapil

Answered by Craig Ringer

You should not expect postgres processes to show large memory use, even if the whole database is cached in RAM.

That is because PostgreSQL relies on buffered reads from the operating system buffer cache. In simplified terms, when PostgreSQL does a read(), the OS looks to see whether the requested blocks are cached in the "free" RAM that it uses for disk cache. If the block is in cache, the OS returns it almost instantly. If the block is not in cache the OS reads it from disk, adds it to the disk cache, and returns the block. Subsequent reads will fetch it from the cache unless it's displaced from the cache by other blocks.

That means that if you have enough free memory to fit the whole database in "free" operating system memory, you won't tend to hit the disk for reads.

Depending on the OS, behaviour for disk writes may differ. Linux will write-back cache "dirty" buffers, and will still return blocks from cache even if they've been written to. It'll write these back to the disk lazily unless forced to write them immediately by an fsync(), as Pg uses at COMMIT time. When it does that it marks the cached blocks clean, but doesn't flush them. I don't know how Windows behaves here.

The point is that PostgreSQL can be running entirely from RAM with a 1 GB database, even though no PostgreSQL process seems to be using much RAM. Setting shared_buffers too high just leads to double-caching and can reduce the amount of RAM available for the OS to cache blocks.
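As an aside (not something the answer depends on), one way to see how much of a table actually sits in shared_buffers, as opposed to the OS cache, is the standard pg_buffercache contrib extension; the table name below is hypothetical:

    -- Count the 8 kB buffers of 'big_table' currently held in shared_buffers.
    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    SELECT count(*) AS buffers,
           pg_size_pretty(count(*) * 8192) AS cached
    FROM pg_buffercache
    WHERE relfilenode = pg_relation_filenode('big_table')
      AND reldatabase = (SELECT oid FROM pg_database
                         WHERE datname = current_database());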

It isn't easy to see exactly what's cached in RAM because Pg relies on the OS cache. That's why I referred you to pg_fincore.
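A minimal sketch of what that looks like, assuming the pgfincore extension is installed (it relies on POSIX mincore(), so it is effectively Linux/Unix-only, which is why the next paragraph falls back to observing disk activity on Windows). The table name is hypothetical:

    -- Reports, per relation segment, how many OS pages are resident in the
    -- operating system page cache (compare the pages_mem and rel_os_pages columns).
    CREATE EXTENSION IF NOT EXISTS pgfincore;
    SELECT * FROM pgfincore('big_table');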

If you're on Windows and this won't work, you really just have to rely on observing disk activity. Does performance monitor show lots of uncached disk reads? Does operating system memory monitoring show lots of memory used for disk cache in the OS?

Make sure that effective_cache_size correctly reflects the RAM used for disk cache. It will help PostgreSQL choose appropriate query plans.
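For illustration only (the 6 GB figure is a rough guess for an 8 GB machine, not part of the answer, and ALTER SYSTEM again assumes PostgreSQL 9.4+): effective_cache_size is just a planner hint, it allocates nothing, and it can be changed with a reload rather than a restart:

    -- Tell the planner roughly how much data it can expect to find cached
    -- (shared_buffers plus the OS disk cache combined).
    ALTER SYSTEM SET effective_cache_size = '6GB';
    SELECT pg_reload_conf();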

You are making the assumption, without apparent evidence, that the query performance you are experiencing is explained by disk read delays, and that it can be improved by in-memory caching. This may not be the case at all. You need to look at explain analyze output and system performance metrics to see what's going on.
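For example, running EXPLAIN with the ANALYZE and BUFFERS options shows whether a query's pages came from PostgreSQL's own cache ("shared hit") or had to be requested from the OS ("read"); the table and predicate below are placeholders:

    -- "Buffers: shared hit=N" are pages found in shared_buffers;
    -- "read=N" are pages fetched from the OS (its cache or the disk).
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM big_table WHERE some_column = 42;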
