MySQL 子查询与连接
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/141278/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Subqueries vs joins
提问by palmsey
I refactored a slow section of an application we inherited from another company to use an inner join instead of a subquery like:
我重构了我们从另一家公司继承的应用程序的一个缓慢部分,以使用内部联接而不是子查询,例如:
WHERE id IN (SELECT id FROM ...)
The refactored query runs about 100x faster.(~50 seconds to ~0.3) I expected an improvement, but can anyone explain why it was so drastic? The columns used in the where clause were all indexed. Does SQL execute the query in the where clause once per row or something?
重构后的查询运行速度提高了大约 100 倍。(约 50 秒到约 0.3 秒)我预计会有改进,但谁能解释为什么它如此激烈?where 子句中使用的列都已编入索引。SQL 是否每行执行一次 where 子句中的查询?
Update- Explain results:
更新- 解释结果:
The difference is in the second part of the "where id in ()" query -
不同之处在于“where id in ()”查询的第二部分——
2 DEPENDENT SUBQUERY submission_tags ref st_tag_id st_tag_id 4 const 2966 Using where
vs 1 indexed row with the join:
与 1 个带连接的索引行:
SIMPLE s eq_ref PRIMARY PRIMARY 4 newsladder_production.st.submission_id 1 Using index
回答by Jeffrey L Whitledge
A "correlated subquery" (i.e., one in which the where condition depends on values obtained from the rows of the containing query) will execute once for each row. A non-correlated subquery (one in which the where condition is independent of the containing query) will execute once at the beginning. The SQL engine makes this distinction automatically.
“相关子查询”(即,其中 where 条件取决于从包含查询的行中获得的值的查询)将对每一行执行一次。非相关子查询(其中 where 条件独立于包含查询的查询)将在开始时执行一次。SQL 引擎会自动进行这种区分。
But, yeah, explain-plan will give you the dirty details.
但是,是的,解释计划会给你肮脏的细节。
回答by Sklivvz
You are running the subquery once for every rowwhereas the join happens on indexes.
您为每一行运行一次子查询,而连接发生在索引上。
回答by Giuseppe Maxia
Here's an example of how subqueries are evaluated in MySQL 6.0.
这是在 MySQL 6.0中如何评估子查询的示例。
The new optimizer will convert this kind of subqueries into joins.
新的优化器会将这种子查询转换为连接。
回答by scotta
Run the explain-plan on each version, it will tell you why.
在每个版本上运行解释计划,它会告诉你原因。
回答by pfranza
before the queries are run against the dataset they are put through a query optimizer, the optimizer attempts to organize the query in such a fashion that it can remove as many tuples (rows) from the result set as quickly as it can. Often when you use subqueries (especially bad ones) the tuples can't be pruned out of the result set until the outer query starts to run.
在对数据集运行查询之前,它们通过查询优化器,优化器尝试以这样一种方式组织查询,即它可以尽快从结果集中删除尽可能多的元组(行)。通常,当您使用子查询(尤其是坏的)时,在外部查询开始运行之前,无法从结果集中修剪元组。
With out seeing the the query its hard to say what was so bad about the original, but my guess would be it was something that the optimizer just couldn't make much better. Running 'explain' will show you the optimizers method for retrieving the data.
如果没有看到查询,很难说原始的有什么不好,但我的猜测是优化器无法做得更好。运行“explain”将向您展示用于检索数据的优化器方法。
回答by Shawn
The where subquery has to run 1 query for each returned row. The inner join just has to run 1 query.
where 子查询必须为每个返回的行运行 1 个查询。内连接只需要运行 1 个查询。
回答by Pete Karl II
This question is somewhat general, so here's a general answer:
这个问题有点笼统,所以这里有一个通用的答案:
Basically, queries take longer when MySQL has tons of rows to sort through.
基本上,当 MySQL 有大量行要排序时,查询需要更长的时间。
Do this:
做这个:
Run an EXPLAIN on each of the queries (the JOIN'ed one, then the Subqueried one), and post the results here.
对每个查询(加入的查询,然后是子查询的查询)运行 EXPLAIN,并在此处发布结果。
I think seeing the difference in MySQL's interpretation of those queries would be a learning experience for everyone.
我认为看到 MySQL 对这些查询的解释的不同对每个人来说都是一种学习经验。
回答by Amy B
Look at the query plan for each query.
查看每个查询的查询计划。
Where inand Joincan typicallybe implemented using the same execution plan, so typicallythere is zero speed-up from changing between them.
凡在和加入可通常使用相同的执行计划来实现,所以通常有零加速从他们之间发生变化。
回答by Mark Roddy
Usually its the result of the optimizer not being able to figure out that the subquery can be executed as a join in which case it executes the subquery for each record in the table rather then join the table in the subquery against the table you are querying. Some of the more "enterprisey" database are better at this, but they still miss it sometimes.
通常,优化器的结果是无法确定子查询可以作为连接执行,在这种情况下,它会为表中的每条记录执行子查询,而不是将子查询中的表连接到您正在查询的表中。一些更“企业”的数据库在这方面做得更好,但有时他们仍然会错过。
回答by Cade Roux
Optimizer didn't do a very good job. Usually they can be transformed without any difference and the optimizer can do this.
优化器没有做得很好。通常它们可以在没有任何区别的情况下进行转换,优化器可以做到这一点。