MySQL 哪个最快?SELECT SQL_CALC_FOUND_ROWS FROM `table`,或 SELECT COUNT(*)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/186588/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)
提问by Jrgns
When you limit the number of rows to be returned by a SQL query, usually used in paging, there are two methods to determine the total number of records:
当你限制一个 SQL 查询返回的行数时,通常用于分页,有两种方法可以确定总记录数:
Method 1
方法一
Include the SQL_CALC_FOUND_ROWS
option in the original SELECT
, and then get the total number of rows by running SELECT FOUND_ROWS()
:
SQL_CALC_FOUND_ROWS
在原文件中包含该选项SELECT
,然后通过运行获取总行数SELECT FOUND_ROWS()
:
SELECT SQL_CALC_FOUND_ROWS * FROM table WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
Method 2
方法二
Run the query normally, and then get the total number of rows by running SELECT COUNT(*)
正常运行查询,然后通过运行获取总行数 SELECT COUNT(*)
SELECT * FROM table WHERE id > 100 LIMIT 10;
SELECT COUNT(*) FROM table WHERE id > 100;
Which method is the best / fastest?
哪种方法最好/最快?
采纳答案by nathan
It depends. See the MySQL Performance Blog post on this subject: http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
这取决于。请参阅有关此主题的 MySQL 性能博客文章:http: //www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
Just a quick summary: Peter says that it depends on your indexes and other factors. Many of the comments to the post seem to say that SQL_CALC_FOUND_ROWS is almost always slower - sometimes up to 10x slower - than running two queries.
只是一个快速总结:彼得说这取决于您的索引和其他因素。这篇文章的许多评论似乎都说 SQL_CALC_FOUND_ROWS 几乎总是比运行两个查询慢——有时最多慢 10 倍。
回答by Jeff Clemens
When choosing the "best" approach, a more important consideration than speed might be the maintainability and correctness of your code. If so, SQL_CALC_FOUND_ROWS is preferable because you only need to maintain a single query. Using a single query completely precludes the possibility of a subtle difference between the main and count queries, which may lead to an inaccurate COUNT.
在选择“最佳”方法时,比速度更重要的考虑因素可能是代码的可维护性和正确性。如果是这样,SQL_CALC_FOUND_ROWS 更可取,因为您只需要维护一个查询。使用单个查询完全排除了主查询和计数查询之间存在细微差异的可能性,这可能导致 COUNT 不准确。
回答by Madhur Bhaiya
MySQL has started deprecating SQL_CALC_FOUND_ROWS
functionality with version 8.0.17 onwards.
MySQL 从SQL_CALC_FOUND_ROWS
8.0.17 版本开始弃用功能。
So, it is always preferredto consider executing your query with LIMIT
, and then a second query with COUNT(*)
and without LIMIT
to determine whether there are additional rows.
因此,始终优先考虑使用 执行查询LIMIT
,然后使用COUNT(*)
和不使用第二个查询LIMIT
来确定是否还有其他行。
From docs:
从文档:
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version.
COUNT(*) is subject to certain optimizations. SQL_CALC_FOUND_ROWS causes some optimizations to be disabled.
Use these queries instead:
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10; SELECT COUNT(*) WHERE id > 100;
SQL_CALC_FOUND_ROWS 查询修饰符和随附的 FOUND_ROWS() 函数自 MySQL 8.0.17 起已弃用,并将在未来的 MySQL 版本中删除。
COUNT(*) 受某些优化的影响。SQL_CALC_FOUND_ROWS 会导致一些优化被禁用。
请改用这些查询:
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10; SELECT COUNT(*) WHERE id > 100;
Also, SQL_CALC_FOUND_ROWS
has been observed to having more issues generally, as explained in the MySQL WL# 12615:
此外,SQL_CALC_FOUND_ROWS
已观察到通常存在更多问题,如MySQL WL# 12615 中所述:
SQL_CALC_FOUND_ROWS has a number of problems. First of all, it's slow. Frequently, it would be cheaper to run the query with LIMIT and then a separate SELECT COUNT() for the same query, since COUNT() can make use of optimizations that can't be done when searching for the entire result set (e.g. filesort can be skipped for COUNT(*), whereas with CALC_FOUND_ROWS, we must disable some filesort optimizations to guarantee the right result)
More importantly, it has very unclear semantics in a number of situations. In particular, when a query has multiple query blocks (e.g. with UNION), there's simply no way to calculate the number of “would-have-been” rows at the same time as producing a valid query. As the iterator executor is progressing towards these kinds of queries, it is genuinely difficult to try to retain the same semantics. Furthermore, if there are multiple LIMITs in the query (e.g. for derived tables), it's not necessarily clear to which of them SQL_CALC_FOUND_ROWS should refer to. Thus, such nontrivial queries will necessarily get different semantics in the iterator executor compared to what they had before.
Finally, most of the use cases where SQL_CALC_FOUND_ROWS would seem useful should simply be solved by other mechanisms than LIMIT/OFFSET. E.g., a phone book should be paginated by letter (both in terms of UX and in terms of index use), not by record number. Discussions are increasingly infinite-scroll ordered by date (again allowing index use), not by paginated by post number. And so on.
SQL_CALC_FOUND_ROWS 有很多问题。首先,它很慢。通常,使用 LIMIT 运行查询,然后为同一查询运行单独的 SELECT COUNT( )会更便宜,因为 COUNT() 可以利用在搜索整个结果集时无法完成的优化(例如文件排序可以跳过 COUNT(*),而对于 CALC_FOUND_ROWS,我们必须禁用一些文件排序优化以保证正确的结果)
更重要的是,它在许多情况下的语义非常不清楚。特别是,当查询有多个查询块时(例如使用 UNION),根本无法在生成有效查询的同时计算“would-have-been”行的数量。随着迭代器执行器朝着这些类型的查询发展,尝试保留相同的语义确实很困难。此外,如果查询中有多个 LIMIT(例如,对于派生表),则 SQL_CALC_FOUND_ROWS 应该引用其中的哪一个不一定很清楚。因此,与以前相比,这种非平凡的查询在迭代器执行器中必然会获得不同的语义。
最后,大多数 SQL_CALC_FOUND_ROWS 看起来有用的用例应该简单地通过 LIMIT/OFFSET 以外的其他机制解决。例如,电话簿应该按字母分页(在用户体验和索引使用方面),而不是按记录编号。讨论越来越按日期无限滚动(再次允许使用索引),而不是按帖子编号分页。等等。
回答by patapouf_ai
According to the following article: https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
根据以下文章:https: //www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
If you have an INDEXon your where clause (if id is indexed in your case), then it is better not to use SQL_CALC_FOUND_ROWSand use 2 queries instead, but if you don't have an index on what you put in your where clause (id in your case) then using SQL_CALC_FOUND_ROWSis more efficient.
如果您的 where 子句上有一个INDEX(如果 id 在您的情况下被索引),那么最好不要使用SQL_CALC_FOUND_ROWS而是使用 2 个查询,但是如果您没有关于您在 where 子句中放置的内容的索引(id 在你的情况下)然后使用SQL_CALC_FOUND_ROWS更有效。
回答by Pierre-Olivier Vares
IMHO, the reason why 2 queries
恕我直言,2个查询的原因
SELECT * FROM count_test WHERE b = 666 ORDER BY c LIMIT 5;
SELECT count(*) FROM count_test WHERE b = 666;
are faster than using SQL_CALC_FOUND_ROWS
比使用 SQL_CALC_FOUND_ROWS 更快
SELECT SQL_CALC_FOUND_ROWS * FROM count_test WHERE b = 555 ORDER BY c LIMIT 5;
has to be seen as a particular case.
必须被视为一个特例。
It in facts depends on the selectivity of the WHERE clause compared to the selectivity of the implicit one equivalent to the ORDER + LIMIT.
事实上,它取决于 WHERE 子句的选择性与等效于 ORDER + LIMIT 的隐式子句的选择性。
As Arvids told in comment (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-1174394), the fact that the EXPLAIN use, or not, a temporay table, should be a good base for knowing if SCFR will be faster or not.
正如 Arvids 在评论中所说(http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-1174394),EXPLAIN使用或不使用的事实,临时表,应该是了解 SCFR 是否更快的良好基础。
But, as I added (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-8166482), the result really, really depends on the case. For a particular paginator, you could get to the conclusion that “for the 3 first pages, use 2 queries; for the following pages, use a SCFR” !
但是,正如我补充说的(http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-8166482),结果真的,真的取决于情况。对于特定的分页器,您可以得出这样的结论:“对于 3 个首页,使用 2 个查询;对于以下页面,请使用 SCFR”!
回答by Jessé Catrinck
Removing some unnecessary SQL and then COUNT(*)
will be faster than SQL_CALC_FOUND_ROWS
. Example:
删除一些不必要的 SQL 然后COUNT(*)
会比SQL_CALC_FOUND_ROWS
. 例子:
SELECT Person.Id, Person.Name, Job.Description, Card.Number
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
LEFT JOIN Card ON Card.Person_Id = Person.Id
WHERE Job.Name = 'WEB Developer'
ORDER BY Person.Name
Then count without unnecessary part:
然后计算没有不必要的部分:
SELECT COUNT(*)
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
WHERE Job.Name = 'WEB Developer'
回答by Code4R7
There are other options for you to benchmark:
还有其他选项供您进行基准测试:
1.)A window function will return the actual size directly (tested in MariaDB):
1.)一个窗口函数会直接返回实际大小(在 MariaDB 中测试过):
SELECT
`mytable`.*,
COUNT(*) OVER() AS `total_count`
FROM `mytable`
ORDER BY `mycol`
LIMIT 10, 20
2.)Thinking out of the box, most of the time users don't need to know the EXACTsize of the table, an approximate is often good enough.
2.)开箱即用,大多数时候用户不需要知道表的确切大小,一个近似值通常就足够了。
SELECT `TABLE_ROWS` AS `rows_approx`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = DATABASE()
AND `TABLE_TYPE` = "BASE TABLE"
AND `TABLE_NAME` = ?