Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors (not me) and cite the original StackOverflow post: http://stackoverflow.com/questions/19267507/

Date: 2020-08-31 19:05:36  Source: igfitidea

How to optimize COUNT(*) performance on InnoDB by using index

Tags: mysql, innodb

Asked by andig

I have a largish but narrow InnoDB table with ~9m records. Doing count(*) or count(id) on the table is extremely slow (6+ seconds):

DROP TABLE IF EXISTS `perf2`;

CREATE TABLE `perf2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `channel_id` int(11) DEFAULT NULL,
  `timestamp` bigint(20) NOT NULL,
  `value` double NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `ts_uniq` (`channel_id`,`timestamp`),
  KEY `IDX_CHANNEL_ID` (`channel_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

RESET QUERY CACHE;
SELECT COUNT(*) FROM perf2;

While the statement is not run too often, it would be nice to optimize it. According to http://www.cloudspace.com/blog/2009/08/06/fast-mysql-innodb-count-really-fast/ this should be possible by forcing InnoDB to use an index:

SELECT COUNT(id) FROM perf2 USE INDEX (PRIMARY);

The explain plan seems fine:

id  select_type table   type    possible_keys   key     key_len ref     rows    Extra
1   SIMPLE      perf2   index   NULL            PRIMARY 4       NULL    8906459 Using index

Unfortunately the statement is as slow as before. Following the question "SELECT COUNT(*)" is slow, even with where clause, I've also tried optimizing the table, without success.

Is there a way to optimize COUNT(*) performance on InnoDB?

Accepted answer by andig

For the time being I've solved the problem by using this approximation:

EXPLAIN SELECT COUNT(id) FROM data USE INDEX (PRIMARY)

The approximate number of rows can be read from the rows column of the explain plan when using InnoDB, as shown above. When using MyISAM this will remain empty because the table reference is optimized away, so if it is empty, fall back to a traditional SELECT COUNT instead.
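
The same kind of estimate can also be read without parsing EXPLAIN output. As a sketch, assuming the table `data` lives in the current default schema, MySQL exposes its row estimate through information_schema.TABLES. For InnoDB this figure is approximate, like the rows column of the explain plan, and may differ from the true count.

```sql
-- Approximate row count from table statistics (an estimate for InnoDB).
-- Assumes the table `data` lives in the current default schema.
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'data';
```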

Answer by Che

As of MySQL 5.1.6 you can use the Event Scheduler and insert the count into a stats table regularly.

First create a table to hold the count:

CREATE TABLE stats (
`key` varchar(50) NOT NULL PRIMARY KEY,
`value` varchar(100) NOT NULL);

Then create an event to update the table:

CREATE EVENT update_stats
ON SCHEDULE
  EVERY 5 MINUTE
DO
  INSERT INTO stats (`key`, `value`)
  VALUES ('data_count', (select count(id) from data))
  ON DUPLICATE KEY UPDATE value=VALUES(value);

It's not perfect, but it offers a self-contained solution (no cron job or queue) that can easily be tuned to run as often as the required freshness of the count demands.
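
A sketch of how the cached count might be consumed, assuming the stats table and the update_stats event defined above. Events only fire while the event scheduler is running, so the first statement enables it (this needs sufficient privileges). With the 5 minute schedule, the value read back can be up to five minutes stale.

```sql
-- The event scheduler must be running for update_stats to fire.
SET GLOBAL event_scheduler = ON;

-- Read the cached count; `value` is stored as varchar, so cast it to a number.
SELECT CAST(`value` AS UNSIGNED) AS row_count
FROM stats
WHERE `key` = 'data_count';
```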

Answer by MQuirion

Based on @Che's code, you can also use triggers on INSERT and on DELETE on perf2 to keep the value in the stats table up to date.

CREATE TRIGGER `count_up` AFTER INSERT ON `perf2` FOR EACH ROW UPDATE `stats`
SET 
  `stats`.`value` = `stats`.`value` + 1 
WHERE
  `stats`.`key` = "perf2_count";

CREATE TRIGGER `count_down` AFTER DELETE ON `perf2` FOR EACH ROW UPDATE `stats`
SET 
  `stats`.`value` = `stats`.`value` - 1 
WHERE
  `stats`.`key` = "perf2_count";

This has the advantage of eliminating the performance cost of running count(*); the counter is only updated when rows are inserted into or deleted from perf2.
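
The triggers only UPDATE an existing row in stats and never create it, so the counter row has to exist first. A minimal sketch of seeding it, assuming the stats table from the previous answer and the 'perf2_count' key used by the triggers:

```sql
-- Seed (or reset) the counter row once; the triggers only UPDATE it.
-- Run this while no other session is writing to perf2, otherwise the
-- counter may start out slightly off.
INSERT INTO stats (`key`, `value`)
VALUES ('perf2_count', (SELECT COUNT(*) FROM perf2))
ON DUPLICATE KEY UPDATE `value` = VALUES(`value`);
```

Re-running the same statement later is one way to correct the counter if it ever drifts from the real count.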

Answer by newbee

select max(id) - min(id) from xxx_table where....

EXPLAIN reports "Select tables optimized away" for this query, and it is very fast.

Note that max(id) - min(id) is not an exact count: it can be larger than count(1) whenever the id sequence has gaps, for example after deletes.
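
A small illustration of how the two numbers drift apart, using a hypothetical temporary table gap_demo that is not part of the original question:

```sql
-- gap_demo is a throwaway table used only for this illustration.
CREATE TEMPORARY TABLE gap_demo (
  id INT NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

INSERT INTO gap_demo (id) VALUES (1), (2), (3), (4), (5);

-- Deleting rows leaves gaps in the id sequence (remaining ids: 1, 4, 5).
DELETE FROM gap_demo WHERE id IN (2, 3);

-- id_span is 4 (5 - 1) while exact_count is 3.
SELECT MAX(id) - MIN(id) AS id_span,
       COUNT(*)          AS exact_count
FROM gap_demo;
```

With no gaps at all the difference would be one less than the row count, so the expression is only useful as a rough estimate.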
