How do I get MySQL to use an INDEX for view query?
Original source: http://stackoverflow.com/questions/13944946/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Roger Ray
I'm working on a web project with a MySQL database on Java EE. We needed a view to summarize data from 3 tables with over 3M rows overall. Each table was created with an index. But I haven't found a way to take advantage of those indexes when running a conditional SELECT against the view, which we created with [group by].
People have suggested to me that using views in MySQL is not a good idea, because you can't create an index on a view in MySQL the way you can in Oracle. But in some tests that I ran, indexes can be used by a SELECT against a view. Maybe I've created those views the wrong way.
I'll use an example to describe my problem.
We have a table that records data for high scores in NBA games, with an index on column [happened_in]:
CREATE TABLE `highscores` (
`tbl_id` int(11) NOT NULL auto_increment,
`happened_in` int(4) default NULL,
`player` int(3) default NULL,
`score` int(3) default NULL,
PRIMARY KEY (`tbl_id`),
KEY `index_happened_in` (`happened_in`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Insert data (8 rows):
INSERT INTO highscores(happened_in, player, score)
VALUES (2006, 24, 61),(2006, 24, 44),(2006, 24, 81),
(1998, 23, 51),(1997, 23, 46),(2006, 3, 55),(2007, 24, 34), (2008, 24, 37);
Then I create a view to see the highest score that Kobe Bryant got in each year:
CREATE OR REPLACE VIEW v_kobe_highscores
AS
SELECT player, max(score) AS highest_score, happened_in
FROM highscores
WHERE player = 24
GROUP BY happened_in;
I wrote a conditional statement to see the highest score that Kobe got in 2006:
select * from v_kobe_highscores where happened_in = 2006;
When I ran EXPLAIN on it in Toad for MySQL, I found that MySQL scans all rows to form the view and then filters that result by the condition, without using the index on [happened_in].
explain select * from v_kobe_highscores where happened_in = 2006;
The view that we use in our project is built on tables with millions of rows. Scanning all the rows of a table on every retrieval from the view is unacceptable. Please help! Thanks!
@zerkms Here is the result I got when testing on real-life data. I don't see much difference. I think @spencer7593 has the right point: the MySQL optimizer doesn't "push" that predicate down into the view query.
Accepted answer by spencer7593
How do you get MySQL to use an index for a view query? The short answer: provide an index that MySQL can use.
In this case, the optimum index is likely a "covering" index:
... ON highscores (player, happened_in, score)
It's likely that MySQL will use that index, and the EXPLAIN will show "Using index", due to the WHERE player = 24 (an equality predicate on the leading column in the index). The GROUP BY happened_in (the second column in the index) may allow MySQL to optimize that operation using the index and avoid a sort. Including the score column in the index will allow the query to be satisfied entirely from the index, without having to visit (look up) the data pages referenced by the index.
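As a rough sketch of the index described above, the full statement might look like the following. The index name highscores_ix1 is made up for illustration (the answer does not name the index); the table and columns are the ones from the question.

```sql
-- Hypothetical index name. Column order follows the answer:
-- equality column first, then the GROUP BY column, then the aggregated column.
CREATE INDEX highscores_ix1
    ON highscores (player, happened_in, score);

-- Check the plan for the query that defines the view.
-- The answer expects "Using index" in the Extra column, but the optimizer's
-- actual choice depends on the MySQL version and on data volume.
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
 GROUP BY happened_in;
```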
That's the quick answer. The longer answer is that MySQL is very unlikely to use an index with a leading column of happened_in for the view query.
Why the view causes a performance issue
One of the issues you have with the MySQL view is that MySQL does not "push" the predicate from the outer query down into the view query.
Your outer query specifies WHERE happened_in = 2006. The MySQL optimizer does not consider that predicate when it runs the inner "view query". The query for the view gets executed separately, before the outer query. The result set from the execution of that query gets "materialized"; that is, the results are stored as an intermediate MyISAM table. (MySQL calls it a "derived table", and that name makes sense once you understand the operations that MySQL performs.)
The bottom line is that the index you have defined on happened_in is not being used by MySQL when it runs the query that forms the view definition.
After the intermediate "derived table" is created, THEN the outer query is executed, using that "derived table" as a row source. It's when that outer query runs that the happened_in = 2006 predicate is evaluated.
Note that all of the rows from the view query are stored, which (in your case) is a row for EVERY value of happened_in, not just the one value you specify in the equality predicate of the outer query.
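One way to picture the processing described above is to write the view as an inline derived table. This is only an illustration of the order of operations, assuming the view is materialized the way the answer describes; it is not a statement that MySQL literally generates.

```sql
SELECT v.player, v.highest_score, v.happened_in
  FROM ( -- Step 1: the view query runs on its own and produces a row for every year.
         SELECT player, MAX(score) AS highest_score, happened_in
           FROM highscores
          WHERE player = 24
          GROUP BY happened_in
       ) v
 WHERE v.happened_in = 2006;  -- Step 2: the predicate is applied to the materialized rows.
```

Written this way, it is easier to see that the index on happened_in in the base table has no opportunity to help with the outer predicate.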
The way that view queries are processed may be "unexpected" to some, and this is one reason that using "views" in MySQL can lead to performance problems, compared to the way view queries are processed by other relational databases.
Improving performance of the view query with a suitable covering index
Given your view definition and your query, about the best you are going to get would be a "Using index" access method for the view query. To get that, you'd need a covering index, e.g.
... ON highscores (player, happened_in, score).
That's likely to be the most beneficial index (performance wise) for your existing view definition and your existing query. The player column is the leading column because you have an equality predicate on that column in the view query. The happened_in column is next, because you've got a GROUP BY operation on that column, and MySQL is going to be able to use this index to optimize the GROUP BY operation. We also include the score column, because that is the only other column referenced in your query. That makes the index a "covering" index, because MySQL can satisfy that query directly from index pages, without a need to visit any pages in the underlying table. And that's as good as we're going to get out of that query plan: "Using index" with no "Using filesort".
Compare performance to standalone query with no derived table
You could compare the execution plan for your query against the view vs. an equivalent standalone query:
SELECT player
, MAX(score) AS highest_score
, happened_in
FROM highscores
WHERE player = 24
AND happened_in = 2006
GROUP
BY player
, happened_in
The standalone query can also make use of a covering index, e.g.
... ON highscores (player, happened_in, score)
but without a need to materialize an intermediate MyISAM table.
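A minimal way to run that comparison is to EXPLAIN both forms side by side. This sketch uses only the view and table from the question. If the view is materialized as described, the first plan would typically contain an additional row for the derived table, but the exact output varies by MySQL version, so no expected output is shown here.

```sql
-- Plan for the query against the view.
EXPLAIN
SELECT * FROM v_kobe_highscores WHERE happened_in = 2006;

-- Plan for the equivalent standalone query, where both predicates
-- are visible to the optimizer when it reads the base table.
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
   AND happened_in = 2006
 GROUP BY player, happened_in;
```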
I am not sure that any of the above provides a direct answer to the question you were asking.
Q: How do I get MySQL to use an INDEX for view query?
A: Define a suitable INDEX that the view query can use.
The short answer is to provide a "covering index" (an index that includes all columns referenced in the view query). The leading columns in that index should be the columns that are referenced with equality predicates (in your case, the column player would be a leading column because you have a player = 24 predicate in the query). Also, the columns referenced in the GROUP BY should be leading columns in the index (after any equality columns), which allows MySQL to optimize the GROUP BY operation by making use of the index rather than using a sort operation.
The key point here is that the view query is basically a standalone query; the results from that query get stored in an intermediate "derived" table (a MyISAM table that gets created when a query against the view is run).
Using views in MySQL is not necessarily a "bad idea", but I would strongly caution those who choose to use views within MySQL to be AWARE of how MySQL processes queries that reference those views. And the way MySQL processes view queries differs (significantly) from the way view queries are handled by other databases (e.g. Oracle, SQL Server).
Answer by zerkms
Creating a composite index on the player + happened_in columns (in this particular order) is the best you can do in this case.
PS: don't test MySQL optimizer behaviour on such a small number of rows, because it's likely to prefer a full scan over indexes. If you want to see what will happen in real life, fill the table with a realistic amount of data.
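One possible way to follow that advice is to generate a large amount of random test data directly in SQL. The sketch below is only an assumption about how one might do it: the helper table digits is hypothetical, and the value ranges for years, players, and scores are arbitrary. Six-way joining a 10-row table produces 1,000,000 rows.

```sql
-- Hypothetical helper table holding the digits 0 through 9.
CREATE TABLE digits (d INT NOT NULL PRIMARY KEY);
INSERT INTO digits (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- 10^6 combinations, each with random (arbitrary-range) values.
INSERT INTO highscores (happened_in, player, score)
SELECT 1990 + FLOOR(RAND() * 20),   -- years 1990..2009
       1 + FLOOR(RAND() * 50),      -- player numbers 1..50
       20 + FLOOR(RAND() * 62)      -- scores 20..81
  FROM digits a
 CROSS JOIN digits b
 CROSS JOIN digits c
 CROSS JOIN digits d
 CROSS JOIN digits e
 CROSS JOIN digits f;

-- Refresh index statistics before looking at EXPLAIN output again.
ANALYZE TABLE highscores;
```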
Answer by ray.gurganus
This doesn't directly answer the question, but it is a directly related workaround for others running into this issue. It achieves the same benefits as using a view, while minimizing the disadvantages.
I set up a PHP function to which I can send parameters: the conditions that should be pushed inside the derived table to maximize index usage, rather than being applied in a join or WHERE clause outside a view. In the function you can formulate the SQL syntax for a derived table, and return that syntax. Then in the calling program, you can do something like this:
$table = tablesyntax(parameters);
select field1, field2 from {$table} as x... + other SQL
Thus you get the encapsulation benefits of the view and the ability to call it as if it were a view, but without the index limitations.
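The answer does not show the function body or the SQL it produces. As a guess at what the final statement might look like for the question's example, assuming the function was given the player and year as parameters, the predicate ends up inside the derived table, where the base-table indexes can be used:

```sql
-- Hypothetical output of the generated derived table, aliased as x
-- in the same way as the calling code above.
SELECT x.player, x.highest_score, x.happened_in
  FROM ( SELECT player, MAX(score) AS highest_score, happened_in
           FROM highscores
          WHERE player = 24
            AND happened_in = 2006   -- parameter pushed inside by the function
          GROUP BY player, happened_in
       ) AS x;
```

If the parameter values come from user input, the function that builds this text should bind or escape them rather than concatenate them directly.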