Why does a higher MySQL LIMIT offset slow the query down?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA license, link to the original, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/4481388/
Asked by Rahman
Scenario in short: a table with more than 16 million records [2GB in size]. When using ORDER BY *primary_key*, the higher the LIMIT offset in a SELECT, the slower the query becomes.
So
SELECT * FROM large ORDER BY `id` LIMIT 0, 30
takes far less time than
SELECT * FROM large ORDER BY `id` LIMIT 10000, 30
Only 30 records are ordered, the same either way, so the slowdown is not the overhead of ORDER BY.
Now, fetching the latest 30 rows takes around 180 seconds. How can I optimize this simple query?
Answered by Nikos Kyr
I had the exact same problem myself. Given that you want to collect a large amount of this data rather than one specific set of 30 rows, you are probably running a loop and incrementing the offset by 30.
So what you can do instead is:
- Hold the last id of a set of data (30) (e.g. lastId = 530)
- Add the condition WHERE id > lastId LIMIT 0, 30
So you can always have a ZERO offset. You will be amazed by the performance improvement.
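A minimal sketch of that loop, written as the successive queries it issues (the table name large and primary key id follow the question; the concrete value 530 is just an example):

-- page 1: start from a zero offset
SELECT * FROM large ORDER BY id LIMIT 30;
-- suppose the last row returned had id = 530; page 2 starts right after it
SELECT * FROM large WHERE id > 530 ORDER BY id LIMIT 30;
-- every later page repeats the pattern with the newest last id,
-- so the offset stays at zero no matter how deep you page

Note that this only works when you page forward in id order; jumping to an arbitrary page number still requires an offset.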
Answered by Quassnoi
It's normal that higher offsets slow the query down, since the query needs to count off the first OFFSET + LIMIT records (and take only LIMIT of them). The higher this value is, the longer the query runs.
The query cannot go right to OFFSET because, first, the records can be of different lengths and, second, there can be gaps from deleted records. It needs to check and count each record on its way.
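A small hypothetical illustration of why the position of the Nth row is unpredictable (the demo table and its contents are invented for this example):

CREATE TABLE demo (id INT PRIMARY KEY, payload TEXT);
INSERT INTO demo (id, payload) VALUES (1, 'a'), (2, 'bb'), (3, 'ccc'), (4, 'dddd');
DELETE FROM demo WHERE id = 2;   -- rows vary in length and id 2 leaves a gap
-- the engine cannot compute where "the third row" lives; it must walk past
-- the two surviving rows (id 1 and id 3) before it can return id 4:
SELECT * FROM demo ORDER BY id LIMIT 2, 1;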
Assuming that id is the PRIMARY KEY of a MyISAM table, you can speed it up by using this trick:
SELECT  t.*
FROM    (
        -- still counts off 10030 entries, but only in the index on id,
        -- which is far cheaper than reading the full rows
        SELECT  id
        FROM    mytable
        ORDER BY id
        LIMIT 10000, 30
        ) q
JOIN    mytable t
ON      t.id = q.id   -- fetch the complete rows for just those 30 ids
See this article:
Answered by Riedsio
MySQL cannot go directly to the 10000th record (or the 80000th byte as you're suggesting) because it cannot assume that the data is packed/ordered like that (or that it has continuous values from 1 to 10000). Although it might be that way in actuality, MySQL cannot assume that there are no holes/gaps/deleted ids.
So, as bobs noted, MySQL will have to fetch 10000 rows (or traverse the first 10000 entries of the index on id) before finding the 30 to return.
EDIT: To illustrate my point
Note that although
SELECT * FROM large ORDER BY id LIMIT 10000, 30
would be slow(er),
SELECT * FROM large WHERE id > 10000 ORDER BY id LIMIT 30
would be fast(er), and would return the same results provided that there are no missing ids (i.e. gaps).
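A tiny hypothetical counterexample for that proviso, showing that a single missing id makes the two forms diverge:

CREATE TABLE gappy (id INT PRIMARY KEY);
INSERT INTO gappy (id) VALUES (1), (3), (4), (5);      -- id 2 is missing
SELECT * FROM gappy ORDER BY id LIMIT 2, 1;            -- skips 2 rows, returns id 4
SELECT * FROM gappy WHERE id > 2 ORDER BY id LIMIT 1;  -- returns id 3

With no gaps, both queries would return the same row; the keyset form is only equivalent to the offset form when ids are contiguous (or when you feed it the last id actually seen, as the loop-based answer above does).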
Answered by sym
I found an interesting way to optimize SELECT queries with ORDER BY id LIMIT X, Y. I have 35 million rows, so it took about 2 minutes to find a range of rows.
Here is the trick:
SELECT id, name, address, phone
FROM customers
WHERE id > 990
ORDER BY id LIMIT 1000;
Just put the last id you got in the WHERE clause; it increases the performance a lot. For me it went from 2 minutes to 1 second :)
Other interesting tricks here: http://www.iheavy.com/2013/06/19/3-ways-to-optimize-for-paging-in-mysql/
It works with strings too.
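A sketch of the same pattern on a string column (the column and the literal here are hypothetical):

SELECT id, name, address, phone
FROM customers
WHERE name > 'Mendez'      -- last name value seen on the previous page
ORDER BY name LIMIT 1000;
-- if the string column is not unique, break ties with the key, e.g.
-- WHERE (name, id) > ('Mendez', 990) ORDER BY name, id LIMIT 1000;

The tuple comparison in the comment is standard MySQL row-constructor syntax and keeps duplicate names from being skipped or repeated across pages.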
Answered by bobs
The time-consuming part of the two queries is retrieving the rows from the table. Logically speaking, in the LIMIT 0, 30 version, only 30 rows need to be retrieved. In the LIMIT 10000, 30 version, 10000 rows are evaluated and 30 rows are returned. Some optimization can be done in the data-reading process, but consider the following:
What if you had a WHERE clause in the queries? The engine must find all rows that qualify, then sort the data, and finally get the 30 rows.
Also consider the case where rows are not processed in the ORDER BY sequence. All qualifying rows must be sorted to determine which rows to return.
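A hypothetical example of that worst case (the status and created_at columns are invented for illustration):

SELECT * FROM large
WHERE status = 'active'   -- every qualifying row must be located first,
ORDER BY created_at       -- then all of them sorted on a non-key column,
LIMIT 10000, 30;          -- before the requested 30-row slice can be cut

Unless an index covers both the filter and the sort order, the offset cannot be satisfied by a simple index walk.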
Answered by ch271828n
For those who are interested in a comparison and figures :)
Experiment 1: The dataset contains about 100 million rows. Each row contains several BIGINT and TINYINT columns, as well as two TEXT fields (deliberately) containing about 1k characters each.
- Blue := SELECT * FROM post ORDER BY id LIMIT {offset}, 5
- Orange := @Quassnoi's method: SELECT t.* FROM (SELECT id FROM post ORDER BY id LIMIT {offset}, 5) AS q JOIN post t ON t.id = q.id
- Of course, the third method, ... WHERE id > xxx LIMIT 0, 5, does not appear here since it should run in constant time.
Experiment 2: A similar setup, except that each row has only 3 BIGINTs.
- green := the blue method above
- red := the orange method above