MySQL INNER JOIN ON vs WHERE 子句
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1018822/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
INNER JOIN ON vs WHERE clause
提问by JCCyC
For simplicity, assume all relevant fields are NOT NULL
.
为简单起见,假设所有相关字段都是NOT NULL
。
You can do:
你可以做:
SELECT
table1.this, table2.that, table2.somethingelse
FROM
table1, table2
WHERE
table1.foreignkey = table2.primarykey
AND (some other conditions)
Or else:
要不然:
SELECT
table1.this, table2.that, table2.somethingelse
FROM
table1 INNER JOIN table2
ON table1.foreignkey = table2.primarykey
WHERE
(some other conditions)
Do these two work on the same way in MySQL
?
这两个在 中以相同的方式工作MySQL
吗?
采纳答案by Quassnoi
INNER JOIN
is ANSI syntax which you should use.
INNER JOIN
是您应该使用的 ANSI 语法。
It is generally considered more readable, especially when you join lots of tables.
它通常被认为更具可读性,尤其是当您加入大量表时。
It can also be easily replaced with an OUTER JOIN
whenever a need arises.
它也可以OUTER JOIN
在需要时轻松更换。
The WHERE
syntax is more relational model oriented.
该WHERE
语法更关系模型为主。
A result of two tables JOIN
ed is a cartesian product of the tables to which a filter is applied which selects only those rows with joining columns matching.
两个表JOIN
ed 的结果是应用过滤器的表的笛卡尔积,该过滤器仅选择连接列匹配的那些行。
It's easier to see this with the WHERE
syntax.
使用WHERE
语法更容易看到这一点。
As for your example, in MySQL (and in SQL generally) these two queries are synonyms.
至于您的示例,在 MySQL 中(以及通常在 SQL 中),这两个查询是同义词。
Also note that MySQL also has a STRAIGHT_JOIN
clause.
另请注意,MySQL 也有一个STRAIGHT_JOIN
子句。
Using this clause, you can control the JOIN
order: which table is scanned in the outer loop and which one is in the inner loop.
使用该子句,您可以控制JOIN
顺序:在外循环中扫描哪个表,在内循环中扫描哪个表。
You cannot control this in MySQL using WHERE
syntax.
您无法使用WHERE
语法在 MySQL 中控制它。
回答by Carl Manaster
Others have pointed out that INNER JOIN
helps human readability, and that's a top priority, I agree.
Let me try to explain whythe join syntax is more readable.
其他人指出,这INNER JOIN
有助于提高人类可读性,这是重中之重,我同意。
让我尝试解释为什么连接语法更具可读性。
A basic SELECT
query is this:
一个基本的SELECT
查询是这样的:
SELECT stuff
FROM tables
WHERE conditions
The SELECT
clause tells us whatwe're getting back; the FROM
clause tells us wherewe're getting it from, and the WHERE
clause tells us whichones we're getting.
该SELECT
子句告诉我们要返回什么;该FROM
条款告诉我们在那里我们得到它,而WHERE
从句告诉我们,这我们得到的。
JOIN
is a statement about the tables, how they are bound together (conceptually, actually, into a single table).
JOIN
是关于表的声明,它们是如何绑定在一起的(从概念上讲,实际上是在一个表中)。
Any query elements that control the tables - where we're getting stuff from - semantically belong to the FROM
clause (and of course, that's where JOIN
elements go). Putting joining-elements into the WHERE
clause conflates the whichand the where-from, that's why the JOIN
syntax is preferred.
任何控制表的查询元素——我们从哪里获取东西——在语义上都属于FROM
子句(当然,这就是JOIN
元素去的地方)。将连接元素放入WHERE
子句中会将which和where-from混为一谈,这JOIN
就是首选语法的原因。
回答by rafidheen
Applying conditional statements in ON / WHERE
在 ON / WHERE 中应用条件语句
Here I have explained about the logical query processing steps.
这里我已经解释了逻辑查询处理步骤。
Reference : Inside Microsoft? SQL Server? 2005 T-SQL Querying
Publisher: Microsoft Press
Pub Date: March 07, 2006
Print ISBN-10: 0-7356-2313-9
Print ISBN-13: 978-0-7356-2313-2
Pages: 640
参考:微软内部?SQL服务器?2005 T-SQL Querying
Publisher:Microsoft Press
Pub 日期:2006 年 3 月 7 日
打印 ISBN-10:0-7356-2313-9
打印 ISBN-13:978-0-7356-2313-2
页数:640
Inside Microsoft? SQL Server? 2005 T-SQL Querying
(8) SELECT (9) DISTINCT (11) TOP <top_specification> <select_list>
(1) FROM <left_table>
(3) <join_type> JOIN <right_table>
(2) ON <join_condition>
(4) WHERE <where_condition>
(5) GROUP BY <group_by_list>
(6) WITH {CUBE | ROLLUP}
(7) HAVING <having_condition>
(10) ORDER BY <order_by_list>
The first noticeable aspect of SQL that is different than other programming languages is the order in which the code is processed. In most programming languages, the code is processed in the order in which it is written. In SQL, the first clause that is processed is the FROM clause, while the SELECT clause, which appears first, is processed almost last.
SQL 与其他编程语言不同的第一个显着方面是处理代码的顺序。在大多数编程语言中,代码按照编写的顺序进行处理。在 SQL 中,处理的第一个子句是 FROM 子句,而最先出现的 SELECT 子句几乎最后处理。
Each step generates a virtual table that is used as the input to the following step. These virtual tables are not available to the caller (client application or outer query). Only the table generated by the final step is returned to the caller. If a certain clause is not specified in a query, the corresponding step is simply skipped.
每个步骤都会生成一个虚拟表,用作下一步骤的输入。这些虚拟表对调用者(客户端应用程序或外部查询)不可用。只有最后一步生成的表才会返回给调用者。如果查询中没有指定某个子句,则简单地跳过相应的步骤。
Brief Description of Logical Query Processing Phases
逻辑查询处理阶段简述
Don't worry too much if the description of the steps doesn't seem to make much sense for now. These are provided as a reference. Sections that come after the scenario example will cover the steps in much more detail.
如果这些步骤的描述现在似乎没有多大意义,请不要太担心。这些是作为参考提供的。场景示例之后的部分将更详细地介绍这些步骤。
FROM: A Cartesian product (cross join) is performed between the first two tables in the FROM clause, and as a result, virtual table VT1 is generated.
ON: The ON filter is applied to VT1. Only rows for which the
<join_condition>
is TRUE are inserted to VT2.OUTER (join): If an OUTER JOIN is specified (as opposed to a CROSS JOIN or an INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT2 as outer rows, generating VT3. If more than two tables appear in the FROM clause, steps 1 through 3 are applied repeatedly between the result of the last join and the next table in the FROM clause until all tables are processed.
WHERE: The WHERE filter is applied to VT3. Only rows for which the
<where_condition>
is TRUE are inserted to VT4.GROUP BY: The rows from VT4 are arranged in groups based on the column list specified in the GROUP BY clause. VT5 is generated.
CUBE | ROLLUP: Supergroups (groups of groups) are added to the rows from VT5, generating VT6.
HAVING: The HAVING filter is applied to VT6. Only groups for which the
<having_condition>
is TRUE are inserted to VT7.SELECT: The SELECT list is processed, generating VT8.
DISTINCT: Duplicate rows are removed from VT8. VT9 is generated.
ORDER BY: The rows from VT9 are sorted according to the column list specified in the ORDER BY clause. A cursor is generated (VC10).
TOP: The specified number or percentage of rows is selected from the beginning of VC10. Table VT11 is generated and returned to the caller.
FROM:在FROM子句中的前两个表之间进行笛卡尔积(交叉连接),从而生成虚拟表VT1。
ON:ON 滤波器应用于 VT1。只有
<join_condition>
为 TRUE 的行才会插入到 VT2。OUTER (join):如果指定了 OUTER JOIN(与 CROSS JOIN 或 INNER JOIN 相反),则保留表中的行或未找到匹配的表将作为外部行添加到 VT2 的行中,生成VT3。如果在 FROM 子句中出现两个以上的表,则在最后一个连接的结果和 FROM 子句中的下一个表之间重复应用步骤 1 到 3,直到处理完所有表。
WHERE:WHERE 过滤器应用于 VT3。只有
<where_condition>
为 TRUE 的行才会插入到 VT4。GROUP BY:VT4 中的行根据 GROUP BY 子句中指定的列列表按组排列。VT5 生成。
立方体 | ROLLUP:超级组(组的组)添加到来自 VT5 的行,生成 VT6。
HAVING:HAVING 过滤器应用于 VT6。只有
<having_condition>
为 TRUE 的组才会插入到 VT7。SELECT:处理SELECT列表,生成VT8。
DISTINCT:从 VT8 中删除重复的行。VT9 生成。
ORDER BY:VT9 中的行根据 ORDER BY 子句中指定的列列表进行排序。生成游标 (VC10)。
TOP:从VC10的开头选择指定的行数或百分比。生成表 VT11 并返回给调用者。
因此,(INNER JOIN) ON 将在应用 WHERE 子句之前过滤数据(此处 VT 的数据计数会自行减少)。随后的连接条件将使用过滤后的数据执行,从而提高性能。之后,只有 WHERE 条件将应用过滤条件。
(Applying conditional statements in ON / WHERE will not make much difference in few cases. This depends how many tables you have joined and number of rows available in each join tables)
(在少数情况下,在 ON / WHERE 中应用条件语句不会有太大区别。这取决于您加入了多少表以及每个连接表中可用的行数)
回答by Cade Roux
The implicit join ANSI syntax is older, less obvious and not recommended.
隐式连接 ANSI 语法较旧,不太明显,不推荐使用。
In addition, the relational algebra allows interchangeability of the predicates in the WHERE
clause and the INNER JOIN
, so even INNER JOIN
queries with WHERE
clauses can have the predicates rearrranged by the optimizer.
此外,关系代数允许WHERE
子句中的谓词和中的谓词可以互换INNER JOIN
,因此即使INNER JOIN
带有WHERE
子句的查询也可以让优化器重新排列谓词。
I recommend you write the queries in the most readble way possible.
我建议您以最易读的方式编写查询。
Sometimes this includes making the INNER JOIN
relatively "incomplete" and putting some of the criteria in the WHERE
simply to make the lists of filtering criteria more easily maintainable.
有时,这包括使INNER JOIN
相对“不完整”和将一些标准放入WHERE
简单的过滤标准列表中,以便更容易维护。
For example, instead of:
例如,而不是:
SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
AND c.State = 'NY'
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
AND a.Status = 1
Write:
写:
SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
AND a.Status = 1
But it depends, of course.
但这当然取决于。
回答by matt b
Implicit joins (which is what your first query is known as) become much much more confusing, hard to read, and hard to maintain once you need to start adding more tables to your query. Imagine doing that same query and type of join on four or five different tables ... it's a nightmare.
一旦您需要开始向查询添加更多表,隐式连接(这就是您的第一个查询所称的)变得更加混乱、难以阅读和难以维护。想象一下,在四五个不同的表上执行相同的查询和连接类型……这是一场噩梦。
Using an explicit join (your second example) is much more readable and easy to maintain.
使用显式连接(您的第二个示例)更具可读性且易于维护。
回答by HLGEM
I'll also point out that using the older syntax is more subject to error. If you use inner joins without an ON clause, you will get a syntax error. If you use the older syntax and forget one of the join conditions in the where clause, you will get a cross join. The developers often fix this by adding the distinct keyword (rather than fixing the join because they still don't realize the join itself is broken) which may appear to cure the problem, but will slow down the query considerably.
我还要指出,使用旧语法更容易出错。如果使用没有 ON 子句的内部联接,则会出现语法错误。如果您使用旧语法并忘记 where 子句中的连接条件之一,您将获得交叉连接。开发人员通常通过添加 distinct 关键字来解决这个问题(而不是修复连接,因为他们仍然没有意识到连接本身已损坏),这似乎可以解决问题,但会大大减慢查询速度。
Additionally for maintenance if you have a cross join in the old syntax, how will the maintainer know if you meant to have one (there are situations where cross joins are needed) or if it was an accident that should be fixed?
此外,如果您在旧语法中有交叉连接,那么维护人员如何知道您是否打算使用交叉连接(在某些情况下需要交叉连接),或者是否是应该修复的事故?
Let me point you to this question to see why the implicit syntax is bad if you use left joins. Sybase *= to Ansi Standard with 2 different outer tables for same inner table
让我指出这个问题,看看为什么使用左连接时隐式语法不好。 Sybase *= 到 Ansi 标准,同一个内表有 2 个不同的外表
Plus (personal rant here), the standard using the explicit joins is over 20 years old, which means implicit join syntax has been outdated for those 20 years. Would you write application code using syntax that has been outdated for 20 years? Why do you want to write database code that is?
另外(这里是个人咆哮),使用显式连接的标准已有 20 多年的历史,这意味着隐式连接语法已经过时了 20 年。您会使用已经过时 20 年的语法编写应用程序代码吗?那你为什么要写数据库代码呢?
回答by John Gietzen
They have a different human-readable meaning.
它们具有不同的人类可读含义。
However, depending on the query optimizer, they may have the same meaning to the machine.
但是,根据查询优化器的不同,它们对机器可能具有相同的含义。
You should always code to be readable.
您应该始终编写可读的代码。
That is to say, if this is a built-in relationship, use the explicit join. if you are matching on weakly related data, use the where clause.
也就是说,如果这是一个内置关系,则使用显式连接。如果要匹配弱相关数据,请使用 where 子句。
回答by Brent Baisley
The SQL:2003 standard changed some precedence rules so a JOIN statement takes precedence over a "comma" join. This can actually change the results of your query depending on how it is setup. This cause some problems for some people when MySQL 5.0.12 switched to adhering to the standard.
SQL:2003 标准更改了一些优先规则,因此 JOIN 语句优先于“逗号”连接。这实际上可以根据设置的方式更改查询的结果。当 MySQL 5.0.12 切换到坚持标准时,这会给某些人带来一些问题。
So in your example, your queries would work the same. But if you added a third table: SELECT ... FROM table1, table2 JOIN table3 ON ... WHERE ...
因此,在您的示例中,您的查询将起作用。但是如果你添加了第三个表: SELECT ... FROM table1, table2 JOIN table3 ON ... WHERE ...
Prior to MySQL 5.0.12, table1 and table2 would be joined first, then table3. Now (5.0.12 and on), table2 and table3 are joined first, then table1. It doesn't always change the results, but it can and you may not even realize it.
在 MySQL 5.0.12 之前,table1 和 table2 将首先连接,然后 table3。现在(5.0.12 及以上),table2 和 table3 首先连接,然后 table1。它并不总是会改变结果,但它可以,你甚至可能没有意识到。
I never use the "comma" syntax anymore, opting for your second example. It's a lot more readable anyway, the JOIN conditions are with the JOINs, not separated into a separate query section.
我不再使用“逗号”语法,选择您的第二个示例。无论如何,它的可读性要高得多,JOIN 条件与 JOIN 一起使用,而不是分成单独的查询部分。
回答by Jo?o Marcus
I know you're talking about MySQL, but anyway: In Oracle 9 explicit joins and implicit joins would generate different execution plans. AFAIK that has been solved in Oracle 10+: there's no such difference anymore.
我知道您在谈论 MySQL,但无论如何:在 Oracle 9 中,显式连接和隐式连接会生成不同的执行计划。已在 Oracle 10+ 中解决的 AFAIK:不再有这种区别。
回答by Benzo
ANSI join syntax is definitely more portable.
ANSI join 语法肯定更便携。
I'm going through an upgrade of Microsoft SQL Server, and I would also mention that the =* and *= syntax for outer joins in SQL Server is not supported (without compatability mode) for 2005 sql server and later.
我正在升级 Microsoft SQL Server,我还要提到 SQL Server 中外部联接的 =* 和 *= 语法在 2005 sql server 及更高版本中不受支持(没有兼容模式)。