如何禁用 postgresql“缓存”优化?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/24252455/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 01:29:58  来源:igfitidea点击:

How disable postgresql "Cache" optimization?

postgresql

提问by Juan Carlos Oropeza

I'm trying to optimize my function.
The thing is when you run the query once you get one result.
Run the query a second time or third time the process time is much smaller.

我正在尝试优化我的功能。
问题是当您获得一个结果时运行查询。
第二次或第三次运行查询的处理时间要小得多。

SELECT  map.get_near_link(a.X, a.Y, a.azimuth)
FROM traffic.avl;

First time 17 seg

第一次 17 段

Total query runtime: 17188 ms.
801 rows retrieved.

Second time 11 seg

第二次11段

Total query runtime: 11406 ms.
801 rows retrieved.

I'm guessing there is some kind of cache doing optimization behind scene. How can i disable this behavior so i can get a more realistic runtime value?

我猜有某种缓存在幕后进行优化。如何禁用此行为以便获得更现实的运行时值?

采纳答案by Craig Ringer

PostgreSQL doesn't have a "cache" optimisation, in the sense of a query result cache.

在查询结果缓存的意义上,PostgreSQL 没有“缓存”优化。

It does cache table blocks that were recently read in shared_buffers, but for most installs that only has a small effect. The main cache is the operating system's disk read cache. For more information see:

它确实缓存了最近读入的表块shared_buffers,但对于大多数安装来说只有很小的影响。主缓存是操作系统的磁盘读取缓存。有关更多信息,请参阅:

See and clear Postgres caches/buffers?

查看并清除 Postgres 缓存/缓冲区?

It sounds to me like you have a system with a reasonable amount of RAM and a fast CPU but a terribly slow disk. So queries that only hit the OS's disk cache are very fast, but queries that go to disk take a couple of seconds to read the data in. So caching effects are very strong.

在我看来,您的系统具有合理数量的 RAM 和快速的 CPU,但磁盘速度非常慢。因此,只访问操作系统磁盘缓存的查询非常快,但访问磁盘的查询需要几秒钟才能读取数据。因此缓存效果非常强。

You should explain (buffers, analyze, verbose) SELECT ...your queries. Try with a couple of different input values until you get a slow one. Compare plans.

你应该explain (buffers, analyze, verbose) SELECT ...你的查询。尝试使用几个不同的输入值,直到得到一个缓慢的输入值。比较计划。

If the plans are the same, that's probably it.

如果计划相同,那可能就是这样。

If the plans are different, you're probably hitting a situation where the query planner is making bad choices based on variations in the table statistics. Increasing the statistics targets for the columns of interest may help (see the manual). If you get different plans and are stuck / want help, feel free to post a new question on dba.stackexchange.com with details.

如果计划不同,您可能会遇到查询计划程序根据表统计信息的变化做出错误选择的情况。增加感兴趣的列的统计目标可能会有所帮助(请参阅手册)。如果您有不同的计划并且遇到困难/需要帮助,请随时在 dba.stackexchange.com 上发布一个包含详细信息的新问题。

回答by rusllonrails

sync; sudo service postgresql stop; echo 1 > /proc/sys/vm/drop_caches; sudo service postgresql start

回答by jjanes

Since you said you want to optimize your function, not your disk layout, then the 2nd timings are probably the ones you want to focus on for that purpose, as they are the purest measure of the run time of your function itself, rather than the time needed to gather the data to feed into your function.

既然你说你想优化你的函数,而不是你的磁盘布局,那么第二个时间可能是你想要为此目的而关注的,因为它们是函数本身运行时间的最纯粹的度量,而不是收集数据以提供给您的函数所需的时间。

And since the 2nd-execution time is still 2/3 of the 1st execution time, that would still be where to focus your attention even if what you want to optimize total execution time.

而且由于第二次执行时间仍然是第一次执行时间的 2/3,即使您想优化总执行时间,这仍然是您关注的焦点。

If you really need to do this, you can clear the cache as explained in another answers, but that is usually too onerous to be used for routine work. Better would be to create a program/script that picks random (but generally realistic) values of a.X, a.Y, a.azimuth and runs the query with them. By changing the values each time from among a set of realistics examples, you get results most representative of the real world. Making such drivers is a little more work upfront but usually pays off.

如果您确实需要这样做,您可以按照其他答案中的说明清除缓存,但这通常太麻烦,无法用于日常工作。更好的是创建一个程序/脚本,选择 aX、aY、a.azimuth 的随机(但通常是现实的)值并使用它们运行查询。通过每次从一组现实示例中更改值,您可以获得最能代表现实世界的结果。制作这样的驱动程序需要更多的前期工作,但通常会得到回报。