In what order are MySQL JOINs evaluated?
Original question: http://stackoverflow.com/questions/228424/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and keep it under the same license.
Asked by Kyle Noland
I have the following query:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123;
I have the following questions:
1) Is the USING syntax synonymous with ON syntax?
2) Are these joins evaluated left to right? In other words, does this query say: x = companies JOIN users; y = x JOIN jobs; z = y JOIN useraccounts;
3) If the answer to question 2 is yes, is it safe to assume that the companies table has companyid, userid and jobid columns?
4) I don't understand how the WHERE clause can be used to pick rows on the companies table when it is referring to the alias "j".
Any help would be appreciated!
Answered by staticsan
1) USING (fieldname) is shorthand for ON table1.fieldname = table2.fieldname.
2) SQL doesn't define the 'order' in which JOINs are done, because that is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to regard each JOIN as joining the new table to the result of all the JOINs before it.
More rarely, the specified order can influence the behaviour of the query optimizer, because of the way it feeds into the heuristics.
3) No. The way the query is written, it requires that companies and users both have a companyid, that jobs has a userid and a jobid, and that useraccounts has a userid. However, only one of companies or users needs a userid for the JOIN to work.
4) The WHERE clause filters the whole result (i.e. all JOINed columns) using a column provided by the jobs table.
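To make the USING shorthand concrete, here is a minimal sketch using the companies and users tables from the question. Any columns other than companyid are assumptions. Both statements match the same rows; the visible difference (in MySQL 5.0.12 and later) is in what SELECT * returns.

```sql
-- USING form: the shared companyid column appears once in the output.
SELECT *
FROM companies AS c
JOIN users AS u USING (companyid);

-- ON form: same matching rows, but both c.companyid and u.companyid
-- appear in the output.
SELECT *
FROM companies AS c
JOIN users AS u ON u.companyid = c.companyid;
```

So the two forms are interchangeable for matching rows, but a client that reads columns by position may notice the difference.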
Answered by benjismith
I can't answer the bit about the USING syntax. That's weird. I've never seen it before, having always used an ON clause instead.
But what I can tell you is that the order of JOIN operations is determined dynamically by the query optimizer when it constructs its query plan, based on a system of optimization heuristics, some of which are:
- Is the JOIN performed on a primary key field? If so, this gets high priority in the query plan.
- Is the JOIN performed on a foreign key field? This also gets high priority.
- Does an index exist on the joined field? If so, bump the priority.
- Is a JOIN operation performed on a field in the WHERE clause? Can the WHERE clause expression be evaluated by examining the index (rather than by performing a table scan)? This is a major optimization opportunity, so it gets a major priority bump.
- What is the cardinality of the joined column? Columns with high cardinality give the optimizer more opportunities to discriminate against false matches (those that don't satisfy the WHERE clause or the ON clause), so high-cardinality joins are usually processed before low-cardinality joins.
- How many actual rows are in the joined table? Joining against a table with only 100 rows is going to create less of a data explosion than joining against a table with ten million rows.
Anyhow... the point is... there are a LOT of variables that go into the query execution plan. If you want to see how MySQL optimizes its queries, use the EXPLAIN syntax.
And here's a good article to read:
http://www.informit.com/articles/article.aspx?p=377652
ON EDIT:
To answer your 4th question: You aren't querying the "companies" table. You're querying the joined cross-product of ALL four tables in your FROM and USING clauses.
The "j.jobid" alias is just the fully-qualified name of one of the columns in that joined collection of tables.
Answered by NickZoic
In MySQL, it's often interesting to ask the query optimizer what it plans to do, with:
EXPLAIN SELECT [...]
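For instance, applied to the query from the question, it might look like the sketch below. The table and column names are taken from the question; whether they exist in your schema is an assumption.

```sql
-- Ask the optimizer for its plan instead of running the query.
EXPLAIN
SELECT c.*
FROM companies AS c
JOIN users AS u USING (companyid)
JOIN jobs AS j USING (userid)
JOIN useraccounts AS us USING (userid)
WHERE j.jobid = 123;
```

The output has one row per table, listed in the order MySQL would read them. If the row for j comes first, the optimizer has chosen to start from jobs rather than from companies, regardless of the order written in the FROM clause.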
Answered by William Entriken
Here is a more detailed answer on JOIN precedence. In your case, the JOINs are all commutative. Let's try one where they aren't.
Build schema:
CREATE TABLE users (
name text
);
CREATE TABLE orders (
order_id text,
user_name text
);
CREATE TABLE shipments (
order_id text,
fulfiller text
);
Add data:
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
Run query:
SELECT *
FROM users
LEFT OUTER JOIN orders
ON orders.user_name = users.name
JOIN shipments
ON shipments.order_id = orders.order_id
Result:
Only the Bob row is returned.
Analysis:
In this query the LEFT OUTER JOIN was evaluated first, and the JOIN was evaluated on the composite result of the LEFT OUTER JOIN.
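Written with explicit parentheses, the default left-to-right grouping of that first query can be sketched as follows. It uses the same users, orders and shipments tables built above; the parentheses are added only for illustration.

```sql
-- Same as the first query, with the default grouping spelled out:
-- the LEFT OUTER JOIN is grouped first, then the inner JOIN is applied
-- to its result.
SELECT *
FROM (users
      LEFT OUTER JOIN orders
        ON orders.user_name = users.name)
JOIN shipments
  ON shipments.order_id = orders.order_id;
```

Mary's row survives the LEFT OUTER JOIN with NULLs in the orders columns, but the inner JOIN then drops it because a NULL order_id matches no shipment.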
Second query:
SELECT *
FROM users
LEFT OUTER JOIN (
orders
JOIN shipments
ON shipments.order_id = orders.order_id)
ON orders.user_name = users.name
Result:
One row for Bob (with the fulfillment data) and one row for Mary with NULLs for the fulfillment data.
Analysis:
The parentheses changed the evaluation order.
Further MySQL documentation is at https://dev.mysql.com/doc/refman/5.5/en/nested-join-optimization.html
Answered by luke
I'm not sure about the ON vs. USING part (though this website says they are the same).
As for the ordering question, it's entirely implementation (and probably query) specific. MySQL most likely picks an order when compiling the request. If you do want to enforce a particular order, you would have to 'nest' your queries:
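SELECT c.*
FROM companies AS c
JOIN (
    SELECT u.companyid
    FROM users AS u
    JOIN (
        -- innermost: jobs filtered by jobid, joined to useraccounts
        SELECT j.userid
        FROM jobs AS j
        JOIN useraccounts AS us USING (userid)
        WHERE j.jobid = 123
    ) AS ju USING (userid)
) AS cu USING (companyid);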
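As far as I know, nesting is not a hard guarantee, because newer MySQL versions can merge derived tables back into the outer query. An alternative that is not part of the original answer is the STRAIGHT_JOIN modifier, which asks the optimizer to join the tables in the order they are listed. A minimal sketch, assuming the column layout implied by the question:

```sql
-- STRAIGHT_JOIN: read the tables in the order written in the FROM clause,
-- here starting from jobs, which the WHERE clause filters.
SELECT STRAIGHT_JOIN c.*
FROM jobs AS j
JOIN users AS u ON u.userid = j.userid
JOIN useraccounts AS us ON us.userid = u.userid
JOIN companies AS c ON c.companyid = u.companyid
WHERE j.jobid = 123;
```

It is usually better to let the optimizer decide and to check its choice with EXPLAIN before forcing an order.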
As for part 4: the WHERE clause limits which rows from the jobs table are eligible to be JOINed on. So if there are rows that would join because of matching userids but don't have the correct jobid, they will be omitted.
Answered by Dave K
1) USING is not exactly the same as ON, but it is shorthand for the case where both tables have a column with the same name that you are joining on... see: http://www.java2s.com/Tutorial/MySQL/0100__Table-Join/ThekeywordUSINGcanbeusedasareplacementfortheONkeywordduringthetableJoins.htm
It is more difficult to read in my opinion, so I'd go spelling out the joins.
3) It is not clear from this query, but I would guess it does not.
2) Assuming you are joining through the other tables (not all directly on companies), the order in this query does matter... see the comparisons below:
Original:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
What I think it is likely suggesting:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = u.userid
JOIN useraccounts AS us on us.userid = u.userid
WHERE j.jobid = 123
You could switch the lines joining jobs and useraccounts here.
What it would look like if everything joined on company:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = c.userid
JOIN useraccounts AS us on us.userid = c.userid
WHERE j.jobid = 123
This doesn't really make logical sense... unless each user has their own company.
4) The magic of SQL is that you can show only certain columns, but all of them are there for sorting and filtering...
If you returned
SELECT c.*, j.jobid....
you could clearly see what it was filtering on, but the database server doesn't care whether you output a column or not in order to filter on it.
Answered by micahwittman
See http://dev.mysql.com/doc/refman/5.0/en/join.html
and start reading here:
Join Processing Changes in MySQL 5.0.12
Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
Resolution of column names in NATURAL or USING joins.
Transformation of NATURAL or USING joins into JOIN ... ON.
Resolution of column names in the ON condition of a JOIN ... ON.
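One consequence of the last item bears directly on evaluation order: since 5.0.12, JOIN binds more tightly than the comma operator. The sketch below uses hypothetical tables t1(i1), t2(i2) and t3(i3), which are not part of the question; the first statement is expected to fail.

```sql
-- Hypothetical tables: t1(i1), t2(i2), t3(i3).
-- Fails with "Unknown column 't1.i1' in 'on clause'": JOIN binds tighter
-- than the comma, so this is read as t1, (t2 JOIN t3 ON ...), and t1 is
-- not visible inside that ON condition.
SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);

-- Works: the parentheses group t1 and t2 before the JOIN is applied.
SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);
```

Writing every join with explicit JOIN ... ON (or USING) instead of commas avoids the problem altogether.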