
Warning: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow, original question: http://stackoverflow.com/questions/13543004/

Date: 2020-10-21 00:34:21  Source: igfitidea

Optimize SELECT query with ORDER BY, OFFSET and LIMIT of postgresql

postgresql

Asked by Tan Nguyen

This is my table schema


Column       |          Type          |                      Modifiers                      
-------------+------------------------+------------------------------------------------------
id           | integer                | not null default nextval('message_id_seq'::regclass)
date_created | bigint                 |
content      | text                   |
user_name    | character varying(128) |
user_id      | character varying(128) |
user_type    | character varying(8)   |
user_ip      | character varying(128) |
user_avatar  | character varying(128) |
chatbox_id   | integer                | not null
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
    "idx_message_chatbox_id" btree (chatbox_id)
    "indx_date_created" btree (date_created)
Foreign-key constraints:
    "message_chatbox_id_fkey" FOREIGN KEY (chatbox_id) REFERENCES chatboxes(id) ON UPDATE CASCADE ON DELETE CASCADE

This is the query


SELECT * 
FROM message 
WHERE chatbox_id = 
ORDER BY date_created 
OFFSET 0 
LIMIT 20;

($1 will be replaced by the actual ID)


It runs pretty well, but when the table reaches 3.7 million records, all SELECT queries start consuming a lot of CPU and RAM and then the whole system goes down. I have to temporarily back up all the current messages and truncate the table. I am not sure what is going on, because everything was fine when I had about 2 million records.


I am using PostgreSQL Server 9.1.5 with default options.




Update: output of EXPLAIN ANALYZE


Limit  (cost=0.00..6.50 rows=20 width=99) (actual time=0.107..0.295 rows=20 loops=1)
  ->  Index Scan Backward using indx_date_created on message  (cost=0.00..3458.77 rows=10646 width=99) (actual time=0.105..0.287 rows=20 loops=1)
        Filter: (chatbox_id = 25065)
Total runtime: 0.376 ms
(4 rows)


Update: server specification


Intel Xeon 5620 8x2.40GHz+HT
12GB DDR3 1333 ECC
SSD Intel X25-E Extreme 64GB


Final solution


Finally I can go above 3 million messages. I had to optimize the postgresql configuration as wildplasser suggested, and also create a new index as A.H. suggested.
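The answer does not show which configuration values wildplasser suggested. As a rough sketch only, tuning on a 12 GB machine with an SSD typically touches settings like these (the numbers below are illustrative assumptions, not the poster's actual configuration):

```ini
# postgresql.conf sketch -- illustrative values, not the settings from the
# question. Adjust to your own RAM, workload, and PostgreSQL version.
shared_buffers = 2GB          # a few GB on a 12GB machine, not the tiny default
effective_cache_size = 8GB    # how much the OS file cache can hold
work_mem = 16MB               # per-sort memory before spilling to disk files
random_page_cost = 2.0        # lower than the 4.0 default is reasonable on SSD
```

Raising work_mem in particular matters for this question: it decides when a sort stops being an in-memory top-N heapsort and starts spilling to disk.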


Answered by A.H.

You could try to give PostgreSQL a better index for that query. I propose something like this:


create index invent_suitable_name on message(chatbox_id, date_created);

or

或者

create index invent_suitable_name on message(chatbox_id, date_created desc);
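Why the composite index helps: a btree on (chatbox_id, date_created) keeps entries sorted by that pair, so PostgreSQL can seek to the first entry for one chatbox_id and read the next 20 in date order, with no filter step and no sort. A conceptual sketch of that lookup (illustrative data and names, not PostgreSQL internals):

```python
import bisect

# Entries kept in index order: (chatbox_id, date_created, payload).
index = sorted([
    (25065, 100, "msg-a"),
    (25065, 120, "msg-b"),
    (99999, 50,  "msg-x"),
    (25065, 90,  "msg-c"),
])

def top_n_for_chatbox(chatbox_id, n):
    # Seek to the first index entry for this chatbox_id ...
    start = bisect.bisect_left(index, (chatbox_id,))
    out = []
    # ... then scan forward: matching entries are already sorted by
    # date_created, so the first n found are the answer.
    for entry in index[start:]:
        if entry[0] != chatbox_id or len(out) == n:
            break
        out.append(entry)
    return out

print(top_n_for_chatbox(25065, 2))
# -> [(25065, 90, 'msg-c'), (25065, 100, 'msg-a')]
```

Contrast this with the single-column indexes in the schema: indx_date_created alone forces a filter over every chatbox, and idx_message_chatbox_id alone forces a sort of every matching row.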

Answered by Igor Romanchenko

Try adding an index on (chatbox_id, date_created). For this particular query it will give you maximum performance.


For the case when postgres "starts consuming a lot of CPU and RAM", try to get more details. It could be a bug (with the default configuration postgres normally doesn't consume much RAM).


UPD: My guess at the reason for the bad performance:


At some point in time the table became too big for a full scan to collect accurate statistics. After another ANALYZE, PostgreSQL got bad statistics for the table. As a result it got a bad plan that consisted of:


  1. Index scan on chatbox_id;
  2. Ordering of returned records to get top 20.

Because of the default configs and the large number of records returned in step 1, postgres was forced to do the sort in files on disk. As a result: bad performance.

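The cost difference between the two plans can be sketched in miniature: sorting every matching row (which may spill to disk files once it exceeds work_mem) versus keeping only the 20 best candidates in a small heap, which is what PostgreSQL's top-N heapsort does when the result fits in memory. Illustrative data, not a benchmark:

```python
import heapq
import random

random.seed(1)
# Stand-in for date_created values of all rows matching one chatbox_id.
date_created = [random.randrange(10**9) for _ in range(100_000)]

# Full sort: orders all 100k values just to keep 20 of them; in the
# database this is the step that can spill to disk files.
full = sorted(date_created)[:20]

# Top-N selection: a 20-element heap, one pass, tiny constant memory.
top = heapq.nsmallest(20, date_created)

print(full == top)  # prints True: same 20 oldest timestamps either way
```

With the composite index suggested above, PostgreSQL can skip even the top-N step, because the rows already come back in date_created order.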

UPD2: EXPLAIN ANALYZE shows a 0.376 ms run time and a good plan. Can you give details about a case with bad performance?
