PostgreSQL 的解释究竟告诉我什么?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/117262/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-10 22:08:24  来源:igfitidea点击:

What is PostgreSQL explain telling me exactly?

postgresql

提问by kjg

MySQL's explain output is pretty straightforward. PostgreSQL's is a little more complicated. I haven't been able to find a good resource that explains it either.

MySQL 的解释输出非常简单。PostgreSQL 的有点复杂。我也找不到解释它的好资源。

Can you describe what exactly explain is saying or at least point me in the direction of a good resource?

你能描述一下具体解释的内容,或者至少给我指明一个好的资源的方向吗?

采纳答案by Milen A. Radev

Explaining_EXPLAIN.pdfcould help too.

Explaining_EXPLAIN.pdf也有帮助。

回答by Mark E. Haase

The part I always found confusing is the startup cost vs total cost. I Google this every time I forget about it, which brings me back to here, which doesn't explain the difference, which is why I'm writing this answer. This is what I have gleaned from the Postgres EXPLAINdocumentation, explained as I understand it.

我总是感到困惑的部分是启动成本与总成本。每次我忘记它时我都会谷歌这个,这让我回到这里,这并没有解释差异,这就是我写这个答案的原因。这是我从PostgresEXPLAIN文档中收集到的内容按照我的理解进行了解释。

Here's an example from an application that manages a forum:

以下是管理论坛的应用程序的示例:

EXPLAIN SELECT * FROM post LIMIT 50;

Limit  (cost=0.00..3.39 rows=50 width=422)
  ->  Seq Scan on post  (cost=0.00..15629.12 rows=230412 width=422)

Here's the graphical explanation from PgAdmin:

这是 PgAdmin 的图形解释:

graphical explanation of first query

第一个查询的图形解释

(When you're using PgAdmin, you can point your mouse at a component to read the cost details.)

(当您使用 PgAdmin 时,您可以将鼠标指向某个组件以阅读成本详细信息。)

The cost is represented as a tuple, e.g. the cost of the LIMITis cost=0.00..3.39and the cost of sequentially scanning postis cost=0.00..15629.12. The first number in the tuple is the startup costand the second number is the total cost. Because I used EXPLAINand not EXPLAIN ANALYZE, these costs are estimates, not actual measures.

成本表示为一个元组,例如LIMITiscost=0.00..3.39的成本和顺序扫描的成本postcost=0.00..15629.12。元组中的第一个数字是启动成本,第二个数字是总成本。因为我使用EXPLAIN而不是EXPLAIN ANALYZE,这些成本是估计值,而不是实际测量值。

  • Startup costis a tricky concept. It doesn't just represent the amount of time before that component starts. It represents the amount of time between when the component starts executing (reading in data) and when the component outputs its first row.
  • Total costis the entire execution time of the component, from when it begins reading in data to when it finishes writing its output.
  • 启动成本是一个棘手的概念。它不仅仅代表该组件启动之前的时间量。它表示组件开始执行(读入数据)和组件输出第一行之间的时间量。
  • 总成本是组件的整个执行时间,从开始读取数据到完成写入输出。

As a complication, each "parent" node's costs includes the cost's of its child nodes. In the text representation, the tree is represented by indentation, e.g. LIMITis a parent node and Seq Scanis its child. In the PgAdmin representation, the arrows point from child to parent — the direction of the flow of data — which might be counterintuitive if you are familiar with graph theory.

复杂的是,每个“父”节点的成本包括其子节点的成本。在文本表示中,树由缩进表示,例如LIMIT是父节点并且Seq Scan是其子节点。在 PgAdmin 表示中,箭头指向从孩子到父母——数据流的方向——如果你熟悉图论,这可能违反直觉。

The documentation says that costs are inclusive of all child nodes, but notice that the total cost of the parent 3.39is much smaller than the total cost of it's child 15629.12. Total cost is not inclusive because a component like LIMITdon't need to process its entire input. See the EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;example in Postgres EXPLAINdocumentation.

文档说成本包括所有子节点,但请注意父节点的总成本3.39远小于其子节点的总成本15629.12。总成本不包括在内,因为像LIMIT这样的组件不需要处理其全部输入。请参阅Postgres文档中EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;示例。EXPLAIN

In the example above, startup time is zero for both components, because neither component needs to do any processing before it starts writing rows: a sequential scan reads the first row of the table and emits it. The LIMITreads its first row and then emits it.

在上面的示例中,两个组件的启动时间都为零,因为两个组件在开始写入行之前都不需要进行任何处理:顺序扫描读取表的第一行并发出它。在LIMIT读取它的第一行,然后发射它。

When would a component need to do a lot of processing before it can start to output any rows? There are a lot of possible reasons, but let's look at one clear example. Here's the same query from before but now containing an ORDER BYclause:

组件何时需要进行大量处理才能开始输出任何行?可能的原因有很多,但让我们看一个明显的例子。这是之前的相同查询,但现在包含一个ORDER BY子句:

EXPLAIN SELECT * FROM post ORDER BY body LIMIT 50;

Limit  (cost=23283.24..23283.37 rows=50 width=422)
  ->  Sort  (cost=23283.24..23859.27 rows=230412 width=422)
        Sort Key: body
        ->  Seq Scan on post  (cost=0.00..15629.12 rows=230412 width=422)

And graphically:

并以图形方式:

graphical explanation of second query

第二个查询的图形解释

Once again, the sequential scan on posthas no startup cost: it starts outputting rows immediately. But the sort has a significant startup cost 23283.24because it has to sort the entire table before it can output even a single row. The total cost of the sort 23859.27is only slightly higher than the startup cost, reflecting the fact that once the entire dataset has been sorted, the sorted data can be emitted very quickly.

再一次,顺序扫描post没有启动成本:它立即开始输出行。但是排序的启动成本很高,23283.24因为它必须先对整个表进行排序,然后才能输出一行。排序的总成本23859.27仅略高于启动成本,反映了一旦整个数据集排序完成,排序后的数据可以非常快地发出。

Notice that the startup time of the LIMIT23283.24is exactly equal to the startup time of the sort. This is not because LIMITitself has a high startup time. It actually has zero startup time by itself, but EXPLAINrolls up all of the child costs for each parent, so the LIMITstartup time includes the sum startup times of its children.

请注意, 的启动时间LIMIT23283.24正好等于排序的启动时间。这并不是因为LIMIT它本身具有很高的启动时间。它本身的启动时间实际上为零,但是EXPLAIN汇总了每个父级的所有子项成本,因此LIMIT启动时间包括其子项的总启动时间。

This rollup of costs can make it difficult to understand the execution cost of each individual component. For example, our LIMIThas zero startup time, but that's not obvious at first glance. For this reason, several other people linked to explain.depesz.com, a tool created by Hubert Lubaczewski (a.k.a. depesz) that helps understand EXPLAINby — among other things — subtracting out child costs from parent costs. He mentions some other complexities in a short blog postabout his tool.

成本的这种累积可能使理解每个单独组件的执行成本变得困难。例如,我们的LIMIT启动时间为零,但这乍一看并不明显。出于这个原因,其他几个人挂explain.depesz.com,由休伯特Lubaczewski(又名depesz),有助于理解创建的工具EXPLAIN的-除其他事项外-减去从父成本孩子的费用。他在一篇关于他的工具的简短博客文章中提到了其他一些复杂性。

回答by user11318

It executes from most indented to least indented, and I believe from the bottom of the plan to the top. (So if there are two indented sections, the one farther down the page executes first, then when they meet the other executes, then the rule joining them executes.)

它从最缩进到最不缩进执行,我相信从计划的底部到顶部。(因此,如果有两个缩进的部分,页面下方的一个先执行,然后当它们遇到另一个时执行,然后执行加入它们的规则。)

The idea is that at each step there are 1 or 2 datasets that arrive and get processed by some rule. If just one dataset, that operation is done to that data set. (For instance scan an index to figure out what rows you want, filter a dataset, or sort it.) If two, the two datasets are the two things that are indented further, and they are joined by the rule you see. The meaning of most of the rules can be reasonably easily guessed (particularly if you have read a bunch of explain plans before), however you can try to verify individual items either by looking in the documentation or (easier) by just throwing the phrase into Google along with a few keywords like EXPLAIN.

这个想法是,在每一步都有 1 或 2 个数据集到达并通过某些规则进行处理。如果只有一个数据集,则对该数据集执行该操作。(例如,扫描索引以找出您想要的行,过滤数据集或对其进行排序。)如果是两个,则两个数据集是进一步缩进的两个事物,并且它们由您看到的规则连接。大多数规则的含义很容易猜到(特别是如果您之前阅读过一堆解释计划),但是您可以尝试通过查看文档或(更简单)将短语放入来验证单个项目谷歌以及一些关键字,如EXPLAIN

This is obviously not a full explanation, but it provides enough context that you can usually figure out whatever you want. For example consider this plan from an actual database:

这显然不是一个完整的解释,但它提供了足够的上下文,您通常可以弄清楚您想要的任何内容。例如,从实际数据库中考虑此计划:

explain analyze
select a.attributeid, a.attributevalue, b.productid
from orderitemattribute a, orderitem b
where a.orderid = b.orderid
and a.attributeid = 'display-album'
and b.productid = 'ModernBook';

------------------------------------------------------------------------------------------------------------------------------------------------------------

 Merge Join  (cost=125379.14..125775.12 rows=3311 width=29) (actual time=841.478..841.478 rows=0 loops=1)
   Merge Cond: (a.orderid = b.orderid)
   ->  Sort  (cost=109737.32..109881.89 rows=57828 width=23) (actual time=736.163..774.475 rows=16815 loops=1)
         Sort Key: a.orderid
         Sort Method:  quicksort  Memory: 1695kB
         ->  Bitmap Heap Scan on orderitemattribute a  (cost=1286.88..105163.27 rows=57828 width=23) (actual time=41.536..612.731 rows=16815 loops=1)
               Recheck Cond: ((attributeid)::text = 'display-album'::text)
               ->  Bitmap Index Scan on (cost=0.00..1272.43 rows=57828 width=0) (actual time=25.033..25.033 rows=16815 loops=1)
                     Index Cond: ((attributeid)::text = 'display-album'::text)
   ->  Sort  (cost=15641.81..15678.73 rows=14769 width=14) (actual time=14.471..16.898 rows=1109 loops=1)
         Sort Key: b.orderid
         Sort Method:  quicksort  Memory: 76kB
         ->  Bitmap Heap Scan on orderitem b  (cost=310.96..14619.03 rows=14769 width=14) (actual time=1.865..8.480 rows=1114 loops=1)
               Recheck Cond: ((productid)::text = 'ModernBook'::text)
               ->  Bitmap Index Scan on id_orderitem_productid  (cost=0.00..307.27 rows=14769 width=0) (actual time=1.431..1.431 rows=1114 loops=1)
                     Index Cond: ((productid)::text = 'ModernBook'::text)
 Total runtime: 842.134 ms
(17 rows)

Try reading it for yourself and see if it makes sense.

尝试自己阅读它,看看它是否有意义。

What I read is that the database first scans the id_orderitem_productidindex, using that to find the rows it wants from orderitem, then sorts that dataset using a quicksort (the sort used will change if data doesn't fit in RAM), then sets that aside.

我读到的是数据库首先扫描id_orderitem_productid索引,使用它来查找它想要的行orderitem,然后使用快速排序对该数据集进行排序(如果数据不适合 RAM,则使用的排序将改变),然后将其放在一边。

Next, it scans orditematt_attributeid_idxto find the rows it wants from orderitemattributeand then sorts that dataset using a quicksort.

接下来,它会扫描orditematt_attributeid_idx以查找所需的行orderitemattribute,然后使用快速排序对该数据集进行排序。

It then takes the two datasets and merges them. (A merge join is a sort of "zipping" operation where it walks the two sorted datasets in parallel, emitting the joined row when they match.)

然后它获取两个数据集并合并它们。(合并连接是一种“压缩”操作,它并行遍历两个已排序的数据集,并在它们匹配时发出连接的行。)

As I said, you work through the plan inner part to outer part, bottom to top.

正如我所说,您从内部到外部,从下到上完成计划。

回答by Stephen Denne

There is an online helper tool available too, Depesz, which will highlight where the expensive parts of the analysis results are.

还有一个在线帮助工具Depesz可用,它将突出显示分析结果中昂贵部分的位置。

also has one, here's the same results, which to me make it clearer where the problem is.

也有一个,这是相同的结果,这让我更清楚问题出在哪里。

回答by Grant Johnson

PgAdminwill show you a graphical representation of the explain plan. Switching back and forth between the two can really help you understand what the text representation means. However, if you just want to know what it is going todo, you may be able to just always use the GUI.

PgAdmin将向您显示解释计划的图形表示。在两者之间来回切换可以真正帮助您理解文本表示的含义。但是,如果您只想知道它要做什么,您可以始终使用 GUI。

回答by Franck Dernoncourt

PostgreSQL's official documentationprovides an interesting, thorough explanation on how to understand explain's output.

PostgreSQL 的官方文档提供了关于如何理解解释输出的有趣而详尽的解释。

回答by Greg

If you install pgadmin, there's an Explain button that as well as giving the text output draws diagrams of what's happening, showing the filters, sorts and sub-set merges that I find really useful to see what's happening.

如果您安装 pgadmin,就会有一个解释按钮,它不仅可以提供文本输出,还可以绘制正在发生的事情的图表,显示过滤器、排序和子集合并,我认为这对于查看正在发生的事情非常有用。