SQL JOIN - WHERE clause vs. ON clause

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/354070/

Date: 2020-09-01 00:30:01. Source: igfitidea

Tags: sql, join, where-clause, on-clause

Asked by BCS

After reading it, this is not a duplicate of Explicit vs Implicit SQL Joins. The answer may be related (or even the same), but the question is different.



What is the difference and what should go in each?

If I understand the theory correctly, the query optimizer should be able to use both interchangeably.

Answered by Joel Coehoorn

They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID 
    AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

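The difference between the two queries can be checked on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module; the second order (12346) and the three order lines are made-up sample data, not part of the original answer.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER PRIMARY KEY);
    CREATE TABLE OrderLines (ID INTEGER PRIMARY KEY, OrderID INTEGER);
    INSERT INTO Orders (ID) VALUES (12345), (12346);
    INSERT INTO OrderLines (ID, OrderID) VALUES (1, 12345), (2, 12345), (3, 12346);
""")

# Filter in WHERE: applied after the join, so other orders are dropped.
where_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.ID
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
    WHERE Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.ID
""").fetchall()

# Filter in ON: every order is kept, but only 12345 can match any lines.
on_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.ID
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
        AND Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.ID
""").fetchall()

print(where_rows)  # [(12345, 1), (12345, 2)]
print(on_rows)     # [(12345, 1), (12345, 2), (12346, None)]
```

Order 12346 has a line in the sample data, yet the second query still reports it with a NULL line because the ON condition rejects every match for it.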

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.

Answered by Sandeep Jindal

  • Does not matter for inner joins
  • Matters for outer joins

    a. WHERE clause: after joining. Records are filtered after the join has taken place.

    b. ON clause: before joining. Records (from the right table) are filtered before joining. The right-table columns may therefore end up as NULL in the result (since it is an OUTER join).

Example: Consider the below tables:

    1. documents:
     | id    | name        |
     |-------|-------------|
     | 1     | Document1   |
     | 2     | Document2   |
     | 3     | Document3   |
     | 4     | Document4   |
     | 5     | Document5   |


    2. downloads:
     | id   | document_id   | username |
     |------|---------------|----------|
     | 1    | 1             | sandeep  |
     | 2    | 1             | simi     |
     | 3    | 2             | sandeep  |
     | 4    | 2             | reya     |
     | 5    | 3             | simi     |

a) Inside WHERE clause:

  SELECT documents.name, downloads.id
    FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
    WHERE username = 'sandeep'

For the above query, the intermediate join table will look like this:

    | id(from documents) | name         | id (from downloads) | document_id | username |
    |--------------------|--------------|---------------------|-------------|----------|
    | 1                  | Document1    | 1                   | 1           | sandeep  |
    | 1                  | Document1    | 2                   | 1           | simi     |
    | 2                  | Document2    | 3                   | 2           | sandeep  |
    | 2                  | Document2    | 4                   | 2           | reya     |
    | 3                  | Document3    | 5                   | 3           | simi     |
    | 4                  | Document4    | NULL                | NULL        | NULL     |
    | 5                  | Document5    | NULL                | NULL        | NULL     |

  After applying the `WHERE` clause and selecting the listed attributes, the result will be: 

   | name         | id |
   |--------------|----|
   | Document1    | 1  |
   | Document2    | 3  | 

b) Inside JOIN clause:

  SELECT documents.name, downloads.id
  FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
        AND username = 'sandeep'

For the above query, the intermediate join table will look like this:

    | id(from documents) | name         | id (from downloads) | document_id | username |
    |--------------------|--------------|---------------------|-------------|----------|
    | 1                  | Document1    | 1                   | 1           | sandeep  |
    | 2                  | Document2    | 3                   | 2           | sandeep  |
    | 3                  | Document3    | NULL                | NULL        | NULL     |
    | 4                  | Document4    | NULL                | NULL        | NULL     |
    | 5                  | Document5    | NULL                | NULL        | NULL     |

Notice how the rows in `documents` that did not match both the conditions are populated with `NULL` values.

After selecting the listed attributes, the result will be:

   | name       | id   |
   |------------|------|
   |  Document1 | 1    |
   |  Document2 | 3    | 
   |  Document3 | NULL |
   |  Document4 | NULL | 
   |  Document5 | NULL | 
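
Both result tables above can be reproduced with a short script. This is a sketch using Python's built-in sqlite3 module and the documents and downloads rows listed in this answer; the ORDER BY clauses are an addition to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE downloads (id INTEGER PRIMARY KEY, document_id INTEGER, username TEXT);
    INSERT INTO documents VALUES
        (1, 'Document1'), (2, 'Document2'), (3, 'Document3'),
        (4, 'Document4'), (5, 'Document5');
    INSERT INTO downloads VALUES
        (1, 1, 'sandeep'), (2, 1, 'simi'), (3, 2, 'sandeep'),
        (4, 2, 'reya'), (5, 3, 'simi');
""")

# a) Condition in WHERE: rows are filtered after the outer join.
where_rows = conn.execute("""
    SELECT documents.name, downloads.id
    FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
    WHERE username = 'sandeep'
    ORDER BY documents.id
""").fetchall()

# b) Condition in ON: unmatched documents survive with NULL download ids.
on_rows = conn.execute("""
    SELECT documents.name, downloads.id
    FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
        AND username = 'sandeep'
    ORDER BY documents.id
""").fetchall()

print(where_rows)  # [('Document1', 1), ('Document2', 3)]
print(on_rows)
# [('Document1', 1), ('Document2', 3), ('Document3', None),
#  ('Document4', None), ('Document5', None)]
```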

Answered by Cade Roux

On INNER JOINs they are interchangeable, and the optimizer will rearrange them at will.

On OUTER JOINs, they are not necessarily interchangeable, depending on which side of the join they depend on.

I put them in either place depending on the readability.

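To illustrate the INNER JOIN case, here is a small sketch with Python's built-in sqlite3 module. It borrows the documents and downloads sample tables from Sandeep Jindal's answer; the choice of filter (username = 'sandeep') is only an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE downloads (id INTEGER PRIMARY KEY, document_id INTEGER, username TEXT);
    INSERT INTO documents VALUES
        (1, 'Document1'), (2, 'Document2'), (3, 'Document3'),
        (4, 'Document4'), (5, 'Document5');
    INSERT INTO downloads VALUES
        (1, 1, 'sandeep'), (2, 1, 'simi'), (3, 2, 'sandeep'),
        (4, 2, 'reya'), (5, 3, 'simi');
""")

# Filter placed in the ON clause of an INNER JOIN.
inner_on = conn.execute("""
    SELECT documents.name, downloads.id
    FROM documents
    INNER JOIN downloads
      ON documents.id = downloads.document_id
        AND downloads.username = 'sandeep'
    ORDER BY downloads.id
""").fetchall()

# Same filter placed in the WHERE clause.
inner_where = conn.execute("""
    SELECT documents.name, downloads.id
    FROM documents
    INNER JOIN downloads
      ON documents.id = downloads.document_id
    WHERE downloads.username = 'sandeep'
    ORDER BY downloads.id
""").fetchall()

print(inner_on == inner_where)  # True
print(inner_on)                 # [('Document1', 1), ('Document2', 3)]
```

With an inner join there are no preserved unmatched rows, so it makes no difference to the result whether the filter runs as part of the join or after it.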

Answered by HLGEM

The way I do it is:

  • Always put the join conditions in the ON clause if you are doing an INNER JOIN. So, do not add any WHERE conditions to the ON clause; put them in the WHERE clause.

  • If you are doing a LEFT JOIN, add any WHERE conditions for the table on the right side of the join to the ON clause. This is a must, because adding a WHERE clause that references the right side of the join will convert the join to an INNER JOIN.

    The exception is when you are looking for the records that are not in a particular table. You would add the reference to a unique identifier (that is never NULL) in the right-side table of the join to the WHERE clause this way: WHERE t2.idfield IS NULL. So, the only time you should reference a table on the right side of the join in the WHERE clause is to find those records which are not in the table.

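The rules above can be tried out with a minimal sketch in Python's built-in sqlite3 module. Only t2.idfield comes from the answer; the tables t1 and t2, the t1_id and status columns, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: t2.t1_id points at t1.idfield.
    CREATE TABLE t1 (idfield INTEGER PRIMARY KEY);
    CREATE TABLE t2 (idfield INTEGER PRIMARY KEY, t1_id INTEGER, status TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (10, 1, 'open'), (11, 2, 'closed');
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Right-side filter in WHERE: the LEFT JOIN behaves like an INNER JOIN.
left_where = run("""
    SELECT t1.idfield, t2.idfield FROM t1
    LEFT JOIN t2 ON t2.t1_id = t1.idfield
    WHERE t2.status = 'open' ORDER BY t1.idfield
""")
inner = run("""
    SELECT t1.idfield, t2.idfield FROM t1
    INNER JOIN t2 ON t2.t1_id = t1.idfield
    WHERE t2.status = 'open' ORDER BY t1.idfield
""")

# Right-side filter in ON: all t1 rows are preserved.
left_on = run("""
    SELECT t1.idfield, t2.idfield FROM t1
    LEFT JOIN t2 ON t2.t1_id = t1.idfield AND t2.status = 'open'
    ORDER BY t1.idfield
""")

# The exception: find t1 rows that have no row in t2 at all.
missing = run("""
    SELECT t1.idfield FROM t1
    LEFT JOIN t2 ON t2.t1_id = t1.idfield
    WHERE t2.idfield IS NULL ORDER BY t1.idfield
""")

print(left_where, inner)  # [(1, 10)] [(1, 10)]
print(left_on)            # [(1, 10), (2, None), (3, None)]
print(missing)            # [(3,)]
```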

Answered by matt b

On an inner join, they mean the same thing. However, you will get different results in an outer join depending on whether you put the join condition in the WHERE or the ON clause. Take a look at this related question and this answer (by me).

I think it makes the most sense to be in the habit of always putting the join condition in the ON clause (unless it is an outer join and you actually do want it in the where clause) as it makes it clearer to anyone reading your query what conditions the tables are being joined on, and also it helps prevent the WHERE clause from being dozens of lines long.

Answered by Vlad Mihalcea

This is a very common question, so this answer is based on this article I wrote.

Table relationship

Considering we have the following post and post_comment tables:

[Figure: the post and post_comment tables]

The post table has the following records:

| id | title     |
|----|-----------|
| 1  | Java      |
| 2  | Hibernate |
| 3  | JPA       |

and the post_comment table has the following three rows:

| id | review    | post_id |
|----|-----------|---------|
| 1  | Good      | 1       |
| 2  | Excellent | 1       |
| 3  | Awesome   | 2       |

SQL INNER JOIN

The SQL JOIN clause allows you to associate rows that belong to different tables. For instance, a CROSS JOIN will create a Cartesian Product containing all possible combinations of rows between the two joining tables.

While the CROSS JOIN is useful in certain scenarios, most of the time, you want to join tables based on a specific condition. And, that's where INNER JOIN comes into play.

The SQL INNER JOIN allows us to filter the Cartesian Product of joining two tables based on a condition that is specified via the ON clause.

SQL INNER JOIN - ON "always true" condition

If you provide an "always true" condition, the INNER JOIN will not filter the joined records, and the result set will contain the Cartesian Product of the two joining tables.

For instance, if we execute the following SQL INNER JOIN query:

SELECT
   p.id AS "p.id",
   pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 1

We will get all combinations of post and post_comment records:

| p.id    | pc.id      |
|---------|------------|
| 1       | 1          |
| 1       | 2          |
| 1       | 3          |
| 2       | 1          |
| 2       | 2          |
| 2       | 3          |
| 3       | 1          |
| 3       | 2          |
| 3       | 3          |

So, if the ON clause condition is "always true", the INNER JOIN is simply equivalent to a CROSS JOIN query:

SELECT
   p.id AS "p.id",
   pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 1
ORDER BY p.id, pc.id

SQL INNER JOIN - ON "always false" condition

On the other hand, if the ON clause condition is "always false", then all the joined records are going to be filtered out and the result set will be empty.

So, if we execute the following SQL INNER JOIN query:

SELECT
   p.id AS "p.id",
   pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 0
ORDER BY p.id, pc.id

We won't get any result back:

| p.id    | pc.id      |
|---------|------------|

That's because the query above is equivalent to the following CROSS JOIN query:

SELECT
   p.id AS "p.id",
   pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 0
ORDER BY p.id, pc.id
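
The "always true" and "always false" cases can be verified with a short script. This is a sketch using Python's built-in sqlite3 module and the post and post_comment rows shown earlier; other database engines are expected to behave the same way for these queries, but only SQLite is exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_comment (id INTEGER PRIMARY KEY, review TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'Java'), (2, 'Hibernate'), (3, 'JPA');
    INSERT INTO post_comment VALUES
        (1, 'Good', 1), (2, 'Excellent', 1), (3, 'Awesome', 2);
""")

def run(join_clause):
    # Same SELECT list and ordering for every variant; only the join differs.
    return conn.execute(f"""
        SELECT p.id, pc.id
        FROM post p
        {join_clause}
        ORDER BY p.id, pc.id
    """).fetchall()

always_true = run("INNER JOIN post_comment pc ON 1 = 1")
cross = run("CROSS JOIN post_comment pc")
always_false = run("INNER JOIN post_comment pc ON 1 = 0")

print(len(always_true))      # 9
print(always_true == cross)  # True
print(always_false)          # []
```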

SQL INNER JOIN - ON clause using the Foreign Key and Primary Key columns

The most common ON clause condition is the one that matches the Foreign Key column in the child table with the Primary Key column in the parent table, as illustrated by the following query:

SELECT
   p.id AS "p.id",
   pc.post_id AS "pc.post_id",
   pc.id AS "pc.id",
   p.title AS "p.title",
   pc.review  AS "pc.review"
FROM post p
INNER JOIN post_comment pc ON pc.post_id = p.id
ORDER BY p.id, pc.id

When executing the above SQL INNER JOIN query, we get the following result set:

| p.id    | pc.post_id | pc.id      | p.title    | pc.review |
|---------|------------|------------|------------|-----------|
| 1       | 1          | 1          | Java       | Good      |
| 1       | 1          | 2          | Java       | Excellent |
| 2       | 2          | 3          | Hibernate  | Awesome   |

So, only the records that match the ON clause condition are included in the query result set. In our case, the result set contains all the post rows that have comments, along with their post_comment records. The post rows that have no associated post_comment are excluded since they cannot satisfy the ON clause condition.

Again, the above SQL INNER JOIN query is equivalent to the following CROSS JOIN query:

SELECT
   p.id AS "p.id",
   pc.post_id AS "pc.post_id",
   pc.id AS "pc.id",
   p.title AS "p.title",
   pc.review  AS "pc.review"
FROM post p, post_comment pc
WHERE pc.post_id = p.id

Of the Cartesian Product rows below, only those where pc.post_id equals p.id satisfy the WHERE clause (the other rows are struck out in the original), and only these records are going to be included in the result set. That's the best way to visualize how the INNER JOIN clause works.

| p.id | pc.post_id | pc.id | p.title   | pc.review |
|------|------------|-------|-----------|-----------|
| 1    | 1          | 1     | Java      | Good      |
| 1    | 1          | 2     | Java      | Excellent |
| 1    | 2          | 3     | Java      | Awesome   |
| 2    | 1          | 1     | Hibernate | Good      |
| 2    | 1          | 2     | Hibernate | Excellent |
| 2    | 2          | 3     | Hibernate | Awesome   |
| 3    | 1          | 1     | JPA       | Good      |
| 3    | 1          | 2     | JPA       | Excellent |
| 3    | 2          | 3     | JPA       | Awesome   |

Conclusion

An INNER JOIN statement can be rewritten as a CROSS JOIN with a WHERE clause matching the same condition you used in the ON clause of the INNER JOIN query.

Note that this only applies to INNER JOIN, not to OUTER JOIN.

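The conclusion can be checked with a small sketch in Python's built-in sqlite3 module, using the post and post_comment rows from this answer. The LEFT JOIN query at the end is an addition, included only to show why the rewrite does not carry over to outer joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_comment (id INTEGER PRIMARY KEY, review TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'Java'), (2, 'Hibernate'), (3, 'JPA');
    INSERT INTO post_comment VALUES
        (1, 'Good', 1), (2, 'Excellent', 1), (3, 'Awesome', 2);
""")

columns = "p.id, pc.id, p.title, pc.review"
order = "ORDER BY p.id, pc.id"

inner = conn.execute(f"""
    SELECT {columns} FROM post p
    INNER JOIN post_comment pc ON pc.post_id = p.id {order}
""").fetchall()

cross_where = conn.execute(f"""
    SELECT {columns} FROM post p
    CROSS JOIN post_comment pc
    WHERE pc.post_id = p.id {order}
""").fetchall()

# An outer join keeps the unmatched 'JPA' post, so it is not equivalent.
left = conn.execute(f"""
    SELECT {columns} FROM post p
    LEFT JOIN post_comment pc ON pc.post_id = p.id {order}
""").fetchall()

print(inner == cross_where)  # True
print(inner)
# [(1, 1, 'Java', 'Good'), (1, 2, 'Java', 'Excellent'), (2, 3, 'Hibernate', 'Awesome')]
print(left[-1])              # (3, None, 'JPA', None)
```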

Answered by Hrishikesh Mishra

There is a great difference between the WHERE clause and the ON clause when it comes to a left join.

Here is an example:

mysql> desc t1; 
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| fid   | int(11)     | NO   |     | NULL    |       |
| v     | varchar(20) | NO   |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

Here, fid is the id of table t2.

mysql> desc t2;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| v     | varchar(10) | NO   |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Query with the condition in the "on clause":

mysql> SELECT * FROM `t1` left join t2 on fid = t2.id AND t1.v = 'K' 
    -> ;
+----+-----+---+------+------+
| id | fid | v | id   | v    |
+----+-----+---+------+------+
|  1 |   1 | H | NULL | NULL |
|  2 |   1 | B | NULL | NULL |
|  3 |   2 | H | NULL | NULL |
|  4 |   7 | K | NULL | NULL |
|  5 |   5 | L | NULL | NULL |
+----+-----+---+------+------+
5 rows in set (0.00 sec)

Query with the condition in the "where clause":

mysql> SELECT * FROM `t1` left join t2 on fid = t2.id where t1.v = 'K';
+----+-----+---+------+------+
| id | fid | v | id   | v    |
+----+-----+---+------+------+
|  4 |   7 | K | NULL | NULL |
+----+-----+---+------+------+
1 row in set (0.00 sec)

As the outputs show, the first query (condition in the ON clause) returns every row from t1, but only the rows with t1.v = 'K' can have an associated row from t2.

The second query (condition in the WHERE clause) returns only the t1 rows with t1.v = 'K', together with their dependent row from t2, if any.

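The two outputs can be reproduced with a sketch in Python's built-in sqlite3 module instead of the mysql client. The t1 rows are taken from the output above; the contents of t2 are not shown in the answer, so the two t2 rows here are an assumption (any t2 without id 7 gives the same results).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER NOT NULL, fid INTEGER NOT NULL, v TEXT NOT NULL);
    CREATE TABLE t2 (id INTEGER NOT NULL, v TEXT NOT NULL);
    INSERT INTO t1 VALUES (1, 1, 'H'), (2, 1, 'B'), (3, 2, 'H'), (4, 7, 'K'), (5, 5, 'L');
    -- Assumed contents: the original answer does not list the rows of t2.
    INSERT INTO t2 VALUES (1, 'X'), (2, 'Y');
""")

# Condition in ON: all t1 rows come back; only v = 'K' rows may match t2.
on_rows = conn.execute("""
    SELECT t1.id, t1.fid, t1.v, t2.id, t2.v
    FROM t1 LEFT JOIN t2 ON fid = t2.id AND t1.v = 'K'
    ORDER BY t1.id
""").fetchall()

# Condition in WHERE: only the v = 'K' row of t1 comes back.
where_rows = conn.execute("""
    SELECT t1.id, t1.fid, t1.v, t2.id, t2.v
    FROM t1 LEFT JOIN t2 ON fid = t2.id
    WHERE t1.v = 'K'
    ORDER BY t1.id
""").fetchall()

print(len(on_rows))  # 5
print(where_rows)    # [(4, 7, 'K', None, None)]
```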

Answered by Grant Limberg

In terms of the optimizer, it shouldn't make a difference whether you define your join clauses with ON or WHERE.

However, IMHO, I think it's much clearer to use the ON clause when performing joins. That way you have a specific section of your query that dictates how the join is handled, rather than having it intermixed with the rest of the WHERE clauses.

Answered by Cid

Let's consider these tables:

A

id | SomeData

B

id | id_A | SomeOtherData

id_A being a foreign key to table A.

Writing this query:

SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A;

will provide this result:

/ : part of the result
                                       B
                      +---------------------------------+
            A         |                                 |
+---------------------+-------+                         |
|/////////////////////|///////|                         |
|/////////////////////|///////|                         |
|/////////////////////|///////|                         |
|/////////////////////|///////|                         |
|/////////////////////+-------+-------------------------+
|/////////////////////////////|
+-----------------------------+

For rows that are in A but have no match in B, the B columns are NULL.



Now, let's consider a specific value of B.id_A, and highlight it in the previous result:

/ : part of the result
* : part of the result with the specific B.id_A
                                       B
                      +---------------------------------+
            A         |                                 |
+---------------------+-------+                         |
|/////////////////////|///////|                         |
|/////////////////////|///////|                         |
|/////////////////////+---+///|                         |
|/////////////////////|***|///|                         |
|/////////////////////+---+---+-------------------------+
|/////////////////////////////|
+-----------------------------+


Writing this query:

SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A
AND B.id_A = SpecificPart;

will provide this result:

/ : part of the result
* : part of the result with the specific B.id_A
                                       B
                      +---------------------------------+
            A         |                                 |
+---------------------+-------+                         |
|/////////////////////|       |                         |
|/////////////////////|       |                         |
|/////////////////////+---+   |                         |
|/////////////////////|***|   |                         |
|/////////////////////+---+---+-------------------------+
|/////////////////////////////|
+-----------------------------+

Because the extra ON condition removes, from the matched (inner join) part, the rows that don't satisfy B.id_A = SpecificPart, while every row of A is still returned.



Now, let's change the query to this :

SELECT *
FROM A
LEFT JOIN B
ON A.id = B.id_A
WHERE B.id_A = SpecificPart;

The result is now:

/ : part of the result
* : part of the result with the specific B.id_A
                                       B
                      +---------------------------------+
            A         |                                 |
+---------------------+-------+                         |
|                     |       |                         |
|                     |       |                         |
|                     +---+   |                         |
|                     |***|   |                         |
|                     +---+---+-------------------------+
|                             |
+-----------------------------+

Because the whole result is filtered against B.id_A = SpecificPart, which also removes the rows where B.id_A is NULL, that is, the rows of A that have no match in B.

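The three diagrams can be mirrored with concrete rows. This is a rough sketch using Python's built-in sqlite3 module; the rows in A and B and the value chosen for SpecificPart (bound here as a query parameter) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, SomeData TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, id_A INTEGER REFERENCES A(id), SomeOtherData TEXT);
    -- Hypothetical rows: A.id = 3 has no matching row in B.
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (10, 1, 'b10'), (11, 1, 'b11'), (12, 2, 'b12');
""")
specific_part = 1  # stands in for SpecificPart

plain = conn.execute("""
    SELECT A.id, B.id FROM A
    LEFT JOIN B ON A.id = B.id_A
    ORDER BY A.id, B.id
""").fetchall()

on_filtered = conn.execute("""
    SELECT A.id, B.id FROM A
    LEFT JOIN B ON A.id = B.id_A AND B.id_A = ?
    ORDER BY A.id, B.id
""", (specific_part,)).fetchall()

where_filtered = conn.execute("""
    SELECT A.id, B.id FROM A
    LEFT JOIN B ON A.id = B.id_A
    WHERE B.id_A = ?
    ORDER BY A.id, B.id
""", (specific_part,)).fetchall()

print(plain)           # [(1, 10), (1, 11), (2, 12), (3, None)]
print(on_filtered)     # [(1, 10), (1, 11), (2, None), (3, None)]
print(where_filtered)  # [(1, 10), (1, 11)]
```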

Answered by matthew david

Are you trying to join data or filter data?

For readability it makes the most sense to isolate these use cases to ON and WHERE respectively.

  • join data in ON
  • filter data in WHERE

It can become very difficult to read a query where the JOIN condition and a filtering condition exist in the WHERE clause.

Performance-wise you should not see a difference, though different SQL dialects sometimes handle query planning differently, so it can be worth trying both ¯\_(ツ)_/¯ (do be aware of caching affecting the query speed).

Also, as others have noted, if you use an outer join you will get different results if you place the filter condition in the ON clause, because it only affects one of the tables.

I wrote a more in depth post about this here: https://dataschool.com/learn/difference-between-where-and-on-in-sql
