Using an index on an inner-joined table in MySQL

Question

Asked by Yurii Shylov

I have table Foo with 200 million records and table Bar with 1000 records; they are connected many-to-one (many Foo rows per Bar row). There are indexes on columns Foo.someTime and Bar.someField. In Bar, 900 records have someField = 1 and 100 have someField = 2.

(1) This query executes immediately:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 1 limit 20;
...
20 rows in set (0.00 sec)

(2) This one takes forever (the only change is b.someField = 2):

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;

(3) But if I drop the where clause on someTime, then it also executes immediately:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

(4) Also I can speed it up by forcing the index usage:

mysql> select * from Foo f inner join Bar b force index(someField) on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

Here is the explain on query (2) (which takes forever)

+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type   | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | g     | range  | bar_id,bar_id_2,someTime      | someTime  | 4       | NULL                     | 95022220 | Using where |
|  1 | SIMPLE      | t     | eq_ref | PRIMARY,someField,bar_id      | PRIMARY   | 4       | db.f.bar_id              |        1 | Using where |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

Here is the explain on (4) (which has force index)

+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | t     | ref  | someField                     | someField | 1       |   const                  |       92 |             |
|  1 | SIMPLE      | g     | ref  | bar_id,bar_id_2,someTime      | bar_id    | 4       | db.f.foo_id              | 10558024 | Using where |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

So the question is: how do I teach MySQL to use the right index? The query is generated by an ORM and is not limited to only these two fields. It would also be nice to avoid changing the query much (though I'm not sure that an inner join fits here).

UPDATE:

mysql> create index index_name on Foo (bar_id, someTime);

After that, query (2) executes in 0.00 sec.

Answer 1

Accepted answer by mvp

If you create a compound index on foo(table_id, sometime), it should help a lot. This is because the server will be able to narrow down the result set by table_id first, and then by sometime.
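
The compound index described above could be sketched roughly as follows. The index name idx_foo_table_id_sometime is made up for illustration, and the column names follow the queries in the question; the EXPLAIN output there shows the join column as bar_id, so the names would need adjusting to the real schema.

```sql
-- Hypothetical index name; columns as written in the question's queries.
-- The join column comes first (equality lookup), the range column second.
CREATE INDEX idx_foo_table_id_sometime ON Foo (table_id, someTime);

-- Re-check the plan for query (2) after creating the index.
EXPLAIN
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
LIMIT 20;
```

Whether the optimizer actually picks this index depends on table statistics, so it is worth confirming with EXPLAIN rather than assuming it.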

Note that when using LIMIT, the server does not guarantee which rows will be fetched if many rows satisfy your WHERE constraint. Technically, every execution can give you a slightly different result. If you want to avoid ambiguity, you should always use ORDER BY when you use LIMIT. However, that also means you should be more careful in creating appropriate indexes.
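
As a rough illustration of the ORDER BY advice, query (2) could be made deterministic like this. The sort column is an assumption (the question does not say which order is wanted), and f.id is a hypothetical unique column used only to break ties between rows with equal someTime.

```sql
-- Assumed ordering: by someTime, with a hypothetical unique column f.id
-- as a tie-breaker so the same 20 rows come back on every execution.
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
ORDER BY f.someTime, f.id
LIMIT 20;
```

If no index can supply rows in this order, MySQL may have to sort all matching rows before applying LIMIT, which is why the choice of indexes matters more once ORDER BY is added.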
