SQL: Extremely slow PostgreSQL query with ORDER and LIMIT clauses

Question

Asked by jakeboxer

I have a table, let's call it "foos", with almost 6 million records in it. I am running the following query:

SELECT "foos".*
FROM "foos"
INNER JOIN "bars" ON "foos".bar_id = "bars".id
WHERE (("bars".baz_id = 13266))
ORDER BY "foos"."id" DESC
LIMIT 5 OFFSET 0;

This query takes a very long time to run (Rails times out while running it). There is an index on every ID column involved. The curious part is, if I remove either the ORDER BY clause or the LIMIT clause, it runs almost instantaneously.

I'm assuming that the presence of both ORDER BY and LIMIT is making PostgreSQL make some bad choices in query planning. Does anyone have any ideas on how to fix this?

In case it helps, here is the EXPLAIN output for all 3 cases:

//////// Both ORDER and LIMIT
SELECT "foos".*
FROM "foos"
INNER JOIN "bars" ON "foos".bar_id = "bars".id
WHERE (("bars".baz_id = 13266))
ORDER BY "foos"."id" DESC
LIMIT 5 OFFSET 0;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..16663.44 rows=5 width=663)
   ->  Nested Loop  (cost=0.00..25355084.05 rows=7608 width=663)
         Join Filter: (foos.bar_id = bars.id)
         ->  Index Scan Backward using foos_pkey on foos  (cost=0.00..11804133.33 rows=4963477 width=663)
               Filter: (((NOT privacy_protected) OR (user_id = 67962)) AND ((status)::text = 'DONE'::text))
         ->  Materialize  (cost=0.00..658.96 rows=182 width=4)
               ->  Index Scan using index_bars_on_baz_id on bars  (cost=0.00..658.05 rows=182 width=4)
                     Index Cond: (baz_id = 13266)
(8 rows)

//////// Just LIMIT
SELECT "foos".*
FROM "foos"
INNER JOIN "bars" ON "foos".bar_id = "bars".id
WHERE (("bars".baz_id = 13266))
LIMIT 5 OFFSET 0;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..22.21 rows=5 width=663)
   ->  Nested Loop  (cost=0.00..33788.21 rows=7608 width=663)
         ->  Index Scan using index_bars_on_baz_id on bars  (cost=0.00..658.05 rows=182 width=4)
               Index Cond: (baz_id = 13266)
         ->  Index Scan using index_foos_on_bar_id on foos  (cost=0.00..181.51 rows=42 width=663)
               Index Cond: (foos.bar_id = bars.id)
               Filter: (((NOT foos.privacy_protected) OR (foos.user_id = 67962)) AND ((foos.status)::text = 'DONE'::text))
(7 rows)

//////// Just ORDER
SELECT "foos".*
FROM "foos"
INNER JOIN "bars" ON "foos".bar_id = "bars".id
WHERE (("bars".baz_id = 13266))
ORDER BY "foos"."id" DESC;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=36515.17..36534.19 rows=7608 width=663)
   Sort Key: foos.id
   ->  Nested Loop  (cost=0.00..33788.21 rows=7608 width=663)
         ->  Index Scan using index_bars_on_baz_id on bars  (cost=0.00..658.05 rows=182 width=4)
               Index Cond: (baz_id = 13266)
         ->  Index Scan using index_foos_on_bar_id on foos  (cost=0.00..181.51 rows=42 width=663)
               Index Cond: (foos.bar_id = bars.id)
               Filter: (((NOT foos.privacy_protected) OR (foos.user_id = 67962)) AND ((foos.status)::text = 'DONE'::text))
(8 rows)

Answer 1

Answered by Andrew Lazarus

When you have both the LIMIT and ORDER BY, the optimizer has decided it is faster to limp through the unfiltered records on foos by key descending until it gets five matches for the rest of the criteria. In the other cases, it simply runs the query as a nested loop and returns all the records.
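
The planner's reasoning can be read off the EXPLAIN output in the question. A Limit node's upper cost is roughly the child's total cost scaled by the fraction of rows requested, here 5 out of an estimated 7608. The arithmetic below is only a sanity check of that reading using the posted figures, not a description of PostgreSQL's full costing logic:

```sql
-- All cost figures are copied from the EXPLAIN output in the question.
SELECT
    round(25355084.05 * 5 / 7608, 2) AS limit_cost_backward_scan, -- plan with ORDER BY and LIMIT
    round(33788.21 * 5 / 7608, 2)    AS limit_cost_nested_loop,   -- plan with LIMIT only
    36515.17                         AS sort_startup_cost;        -- plan with ORDER BY only
```

The first two values come out to 16663.44 and 22.21, matching the Limit lines in the posted plans. The planner expects the backward scan to find five matches after reading a small fraction of foos_pkey, so its estimate of 16663.44 looks cheaper than the 36515.17 a sort needs before it can emit its first row. If the matching rows are rare near the top of the id range, the scan reads far more of the index than estimated.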

Offhand, I'd say the problem is that PG doesn't grok the joint distribution of the various ids and that's why the plan is so sub-optimal.

For possible solutions: I'll assume that you have run ANALYZE recently. If not, do so. That may explain why your estimated costs are high even on the version that returns fast. If the problem persists, perhaps run the ORDER BY as a subselect and slap the LIMIT on in an outer query.
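
A minimal sketch of both suggestions, using the table and column names from the question. The alias ordered_foos is made up here for illustration, and whether the planner actually picks a different plan for this form depends on the PostgreSQL version, so it is worth checking with EXPLAIN before relying on it:

```sql
-- Refresh the planner statistics for both tables.
ANALYZE foos;
ANALYZE bars;

-- ORDER BY inside a subselect, LIMIT on the outer query.
-- "ordered_foos" is an illustrative alias, not a name from the question.
EXPLAIN
SELECT *
FROM (
    SELECT "foos".*
    FROM "foos"
    INNER JOIN "bars" ON "foos".bar_id = "bars".id
    WHERE "bars".baz_id = 13266
    ORDER BY "foos"."id" DESC
) AS ordered_foos
LIMIT 5;
```

Plain EXPLAIN only shows the chosen plan. EXPLAIN ANALYZE would also report actual timings, but it executes the query, so it will be as slow as the original if the plan has not changed.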

Answer 2

Answered by Davide Ungari

This probably happens because it tries to order first and then select. Why not try sorting the result in an outer select? Something like: SELECT * FROM (SELECT ... INNER JOIN ETC...) ORDER BY ... DESC
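
Spelled out against the query from the question, that idea might look like the sketch below. Two things here are assumptions rather than part of the answer: the alias matching_foos (PostgreSQL versions before 16 require an alias on a subquery in FROM), and the OFFSET 0 line, a commonly used way to keep the planner from flattening the subquery back into the outer query.

```sql
-- "matching_foos" is an illustrative alias, not a name from the question.
SELECT *
FROM (
    SELECT "foos".*
    FROM "foos"
    INNER JOIN "bars" ON "foos".bar_id = "bars".id
    WHERE "bars".baz_id = 13266
    OFFSET 0  -- assumption: added so the subquery is planned on its own
) AS matching_foos
ORDER BY matching_foos.id DESC
LIMIT 5;
```

Without something like OFFSET 0, the planner is free to merge the inner and outer queries, in which case this form would be planned the same way as the original.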

Answer 3

Answered by ic3b3rg

Your query plan indicates a filter on

(((NOT privacy_protected) OR (user_id = 67962)) AND ((status)::text = 'DONE'::text))

which doesn't appear in the SELECT - where is it coming from?

Also, note that the expression is listed as a "Filter" and not an "Index Cond", which would seem to indicate there's no index applied to it.
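
If an index covering part of that filter is wanted, one possible shape is a partial index like the sketch below. The index name is hypothetical, and it assumes that queries on foos usually restrict status to 'DONE', which the question does not confirm. Whether the planner would use it for this query is something only EXPLAIN on the real data can show.

```sql
-- Hypothetical index name; assumes most lookups filter on status = 'DONE'.
CREATE INDEX index_foos_on_bar_id_and_id_done
    ON foos (bar_id, id DESC)
    WHERE status = 'DONE';
```

The (NOT privacy_protected) OR (user_id = 67962) part of the filter depends on a per-user value, so it is left out of the index predicate here and would still be applied as a Filter.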

Answer 4

Answered by Christian Noel

It may be running a full-table scan on "foos". Did you try changing the order of the tables and using a left join instead of an inner join, to see if it returns results faster?

say...

SELECT "bars"."id", "foos".*
FROM "bars"
LEFT JOIN "foos" ON "bars"."id" = "foos"."bar_id"
WHERE "bars"."baz_id" = 13266
ORDER BY "foos"."id" DESC
LIMIT 5 OFFSET 0;
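
One caveat with the LEFT JOIN form: bars rows that have no matching foos come back with NULL in every foos column, and PostgreSQL sorts NULLs first by default under DESC, so those rows could fill the first five results. A variant that pushes them to the end, offered as a sketch rather than as part of the original suggestion:

```sql
SELECT "bars"."id", "foos".*
FROM "bars"
LEFT JOIN "foos" ON "bars"."id" = "foos"."bar_id"
WHERE "bars"."baz_id" = 13266
ORDER BY "foos"."id" DESC NULLS LAST  -- keep unmatched bars rows out of the top results
LIMIT 5 OFFSET 0;
```

This still differs from the INNER JOIN query when fewer than five foos match, because the unmatched bars rows then appear at the end of the result.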
