MySQL 在连接条件上使用 IS NULL 或 IS NOT NULL - 理论问题
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/6630887/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Using IS NULL or IS NOT NULL on join conditions - Theory question
提问by JoshG
Theory question here:
理论问题在这里:
Why does specifying table.field IS NULL or table.field IS NOT NULL not work on a join condition (left or right join for instance) but only in the where condition?
为什么指定 table.field IS NULL 或 table.field IS NOT NULL 不适用于连接条件(例如左连接或右连接),而仅适用于 where 条件?
Non working Example:
非工作示例:
-this should return all shipments with any returns (non null values) filtered out. However, this returns all shipments regardless if anything meets the [r.id is null] statement.
- 这应该返回过滤掉任何退货(非空值)的所有货件。但是,无论是否有任何内容符合 [r.id is null] 语句,这都会返回所有货物。
SELECT
*
FROM
shipments s
LEFT OUTER JOIN returns r
ON s.id = r.id
AND r.id is null
WHERE
s.day >= CURDATE() - INTERVAL 10 DAY
Working example:
工作示例:
-This returns the correct amount of rows which is total shipments, less any related to a returns (non null values).
- 这将返回正确的行数,即总发货量,减去与退货相关的任何行数(非空值)。
SELECT
*
FROM
shipments s
LEFT OUTER JOIN returns r
ON s.id = r.id
WHERE
s.day >= CURDATE() - INTERVAL 10 DAY
AND r.id is null
Why is this the case? All other filter conditions between two tables being joined work just fine, but for some reason IS NULL and IS NOT NULL filters do not work unless in the where statement.
为什么会这样?连接的两个表之间的所有其他过滤条件都可以正常工作,但由于某种原因,除非在 where 语句中,否则 IS NULL 和 IS NOT NULL 过滤器不起作用。
What is the reason for this?
这是什么原因?
回答by ypercube??
Example with tables A and B:
表 A 和 B 的示例:
A (parent) B (child)
============ =============
id | name pid | name
------------ -------------
1 | Alex 1 | Kate
2 | Bill 1 | Lia
3 | Cath 3 | Mary
4 | Dale NULL | Pan
5 | Evan
If you want to find parents and their kids, you do an INNER JOIN
:
如果你想找到父母和他们的孩子,你可以INNER JOIN
:
SELECT id, parent.name AS parent
, pid, child.name AS child
FROM
parent INNER JOIN child
ON parent.id = child.pid
Result is that every match of a parent
's id
from the left table and a child
's pid
from the second table will show as a row in the result:
结果是一个每场比赛parent
的id
左表和child
S'pid
从第二表将显示在结果中的一行:
+----+--------+------+-------+
| id | parent | pid | child |
+----+--------+------+-------+
| 1 | Alex | 1 | Kate |
| 1 | Alex | 1 | Lia |
| 3 | Cath | 3 | Mary |
+----+--------+------+-------+
Now, the above does not show parents without kids (because their ids do not have a match in child's ids, so what do you do? You do an outer join instead. There are three types of outer joins, the left, the right and the full outer join. We need the left one as we want the "extra" rows from the left table (parent):
现在,上面没有显示没有孩子的父母(因为他们的 id 在孩子的 id 中没有匹配,所以你怎么办?你做一个外连接。外连接有三种类型,左,右和完整的外连接。我们需要左边的,因为我们想要左表(父)中的“额外”行:
SELECT id, parent.name AS parent
, pid, child.name AS child
FROM
parent LEFT JOIN child
ON parent.id = child.pid
Result is that besides previous matches, all parents that do not have a match (read: do not have a kid) are shown too:
结果是,除了之前的匹配项之外,所有没有匹配项(阅读:没有孩子)的父母也会显示出来:
+----+--------+------+-------+
| id | parent | pid | child |
+----+--------+------+-------+
| 1 | Alex | 1 | Kate |
| 1 | Alex | 1 | Lia |
| 3 | Cath | 3 | Mary |
| 2 | Bill | NULL | NULL |
| 4 | Dale | NULL | NULL |
| 5 | Evan | NULL | NULL |
+----+--------+------+-------+
Where did all those NULL
come from? Well, MySQL (or any other RDBMS you may use) will not know what to put there as these parents have no match (kid), so there is no pid
nor child.name
to match with those parents. So, it puts this special non-value called NULL
.
所有这些NULL
都是从哪里来的?好吧,MySQL(或您可能使用的任何其他 RDBMS)将不知道该放什么,因为这些父母没有匹配(孩子),所以没有pid
也没有child.name
匹配这些父母。所以,它把这个特殊的非值称为NULL
.
My point is that these NULLs
are created (in the result set) during the LEFT OUTER JOIN
.
我的观点是这些NULLs
是在LEFT OUTER JOIN
.
So, if we want to show only the parents that do NOT have a kid, we can add a WHERE child.pid IS NULL
to the LEFT JOIN
above. The WHERE
clause is evaluated (checked) after the JOIN
is done. So, it's clear from the above result that only the last three rows where the pid
is NULL will be shown:
所以,如果我们只想显示没有孩子的父母,我们可以WHERE child.pid IS NULL
在LEFT JOIN
上面添加一个。在WHERE
JOIN
完成后评估(检查)该子句。所以,从上面的结果可以清楚地看出,只有最后三行pid
是 NULL 才会显示:
SELECT id, parent.name AS parent
, pid, child.name AS child
FROM
parent LEFT JOIN child
ON parent.id = child.pid
WHERE child.pid IS NULL
Result:
结果:
+----+--------+------+-------+
| id | parent | pid | child |
+----+--------+------+-------+
| 2 | Bill | NULL | NULL |
| 4 | Dale | NULL | NULL |
| 5 | Evan | NULL | NULL |
+----+--------+------+-------+
Now, what happens if we move that IS NULL
check from the WHERE
to the joining ON
clause?
现在,如果我们将该IS NULL
检查从 the移动WHERE
到 joinON
子句会发生什么?
SELECT id, parent.name AS parent
, pid, child.name AS child
FROM
parent LEFT JOIN child
ON parent.id = child.pid
AND child.pid IS NULL
In this case the database tries to find rows from the two tables that match these conditions. That is, rows where parent.id = child.pid
ANDchild.pid IN NULL
. But it can find no such matchbecause no child.pid
can be equal to something (1, 2, 3, 4 or 5) and be NULL at the same time!
在这种情况下,数据库尝试从两个表中查找与这些条件匹配的行。也就是说,parent.id = child.pid
AND所在的行child.pid IN NULL
。但是它找不到这样的匹配项,因为 nochild.pid
可以等于某物(1、2、3、4 或 5)并且同时为 NULL!
So, the condition:
所以,条件:
ON parent.id = child.pid
AND child.pid IS NULL
is equivalent to:
相当于:
ON 1 = 0
which is always False
.
这总是False
.
So, why does it return ALL rows from the left table? Because it's a LEFT JOIN!And left joins return rows that match (none in this case)and also rows from the left table that do not matchthe check (all in this case):
那么,为什么它返回左表中的所有行?因为它是一个LEFT JOIN!左连接返回匹配的行(在这种情况下没有)以及左表中与检查不匹配的行(在这种情况下都是):
+----+--------+------+-------+
| id | parent | pid | child |
+----+--------+------+-------+
| 1 | Alex | NULL | NULL |
| 2 | Bill | NULL | NULL |
| 3 | Cath | NULL | NULL |
| 4 | Dale | NULL | NULL |
| 5 | Evan | NULL | NULL |
+----+--------+------+-------+
I hope the above explanation is clear.
我希望上面的解释是清楚的。
Sidenote (not directly related to your question): Why on earth doesn't Pan
show up in none of our JOINs? Because his pid
is NULL
and NULL in the (not common) logic of SQL is not equal to anything so it can't match with any of the parent ids (which are 1,2,3,4 and 5). Even if there was a NULL there, it still wouldn't match because NULL
does not equal anything, not even NULL
itself (it's a very strange logic, indeed!). That's why we use the special check IS NULL
and not a = NULL
check.
旁注(与您的问题没有直接关系):为什么Pan
我们的 JOIN 中都没有出现?因为他的pid
isNULL
和 NULL 在 SQL 的(不常见)逻辑中不等于任何东西,所以它不能与任何父 ID(1、2、3、4 和 5)匹配。即使那里有一个 NULL,它仍然不会匹配,因为NULL
它不等于任何东西,甚至不等于NULL
它本身(这是一个非常奇怪的逻辑,确实!)。这就是我们使用特殊支票IS NULL
而不是= NULL
支票的原因。
So, will Pan
show up if we do a RIGHT JOIN
? Yes, it will! Because a RIGHT JOIN will show all results that match (the first INNER JOIN we did) plus all rows from the RIGHT table that don't match (which in our case is one, the (NULL, 'Pan')
row.
那么,Pan
如果我们做 a会出现RIGHT JOIN
吗?是的,它会!因为 RIGHT JOIN 将显示所有匹配的结果(我们做的第一个 INNER JOIN)加上 RIGHT 表中不匹配的所有行(在我们的例子中是一个,(NULL, 'Pan')
行。
SELECT id, parent.name AS parent
, pid, child.name AS child
FROM
parent RIGHT JOIN child
ON parent.id = child.pid
Result:
结果:
+------+--------+------+-------+
| id | parent | pid | child |
+---------------+------+-------+
| 1 | Alex | 1 | Kate |
| 1 | Alex | 1 | Lia |
| 3 | Cath | 3 | Mary |
| NULL | NULL | NULL | Pan |
+------+--------+------+-------+
Unfortunately, MySQL does not have FULL JOIN
. You can try it in other RDBMSs, and it will show:
不幸的是,MySQL 没有FULL JOIN
. 您可以在其他 RDBMS 中尝试它,它会显示:
+------+--------+------+-------+
| id | parent | pid | child |
+------+--------+------+-------+
| 1 | Alex | 1 | Kate |
| 1 | Alex | 1 | Lia |
| 3 | Cath | 3 | Mary |
| 2 | Bill | NULL | NULL |
| 4 | Dale | NULL | NULL |
| 5 | Evan | NULL | NULL |
| NULL | NULL | NULL | Pan |
+------+--------+------+-------+
回答by Sabeen Malik
The NULL
part is calculated AFTER the actual join, so that is why it needs to be in the where clause.
该NULL
部分是在实际连接之后计算的,这就是为什么它需要在 where 子句中。
回答by Muhammad Usama
Actually NULL filter is not being ignored. Thing is this is how joining two tables work.
实际上 NULL 过滤器并没有被忽略。事情是这就是连接两个表的工作方式。
I will try to walk down with the steps performed by database server to make it understand.
For example when you execute the query which you said is ignoring the NULL condition.
SELECT
*
FROM
shipments s
LEFT OUTER JOIN returns r
ON s.id = r.id
AND r.id is null
WHERE
s.day >= CURDATE() - INTERVAL 10 DAY
我将尝试逐步介绍数据库服务器执行的步骤以使其理解。例如,当您执行您所说的忽略 NULL 条件的查询时。SELECT * FROM
shippings s LEFT OUTER JOIN 返回 r ON s.id = r.id AND r.id 为 null WHERE s.day >= CURDATE() - INTERVAL 10 DAY
1st thing happened is all the rows from table SHIPMENTS get selected
发生的第一件事是表 SHIPMENTS 中的所有行都被选中
on next step database server will start selecting one by one record from 2nd(RETURNS) table.
在下一步数据库服务器将开始从 2nd(RETURNS) 表中选择一条记录。
on third step the record from RETURNS table will be qualified against the join conditions you have provided in the query which in this case is (s.id = r.id and r.id is NULL)
在第三步中,RETURNS 表中的记录将根据您在查询中提供的连接条件进行限定,在本例中为 (s.id = r.id 且 r.id 为 NULL)
note that this qualification applied on third step only decides if server should accept or reject the current record of RETURNS table to append with the selected row of SHIPMENT table. It can in no way effect the selection of record from SHIPMENT table.
请注意,在第三步中应用的此限定仅决定服务器是否应接受或拒绝 RETURNS 表的当前记录以附加到 SHIPMENT 表的选定行。它决不会影响从 SHIPMENT 表中选择记录。
And once server is done with joining two tables which contains all the rows of SHIPMENT table and selected rows of RETURNS table it applies the where clause on the intermediate result. so when you put (r.id is NULL) condition in where clause than all the records from the intermediate result with r.id = null gets filtered out.
一旦服务器完成连接两个表,其中包含 SHIPMENT 表的所有行和 RETURNS 表的选定行,它会在中间结果上应用 where 子句。因此,当您在 where 子句中放置 (r.id is NULL) 条件时,来自具有 r.id = null 的中间结果的所有记录都会被过滤掉。
回答by Suroot
You're doing a LEFT OUTTER JOIN
which indicates that you want every tuple from the table on the LEFT of the statement regardless of it has a matching record in the RIGHT table. This being the case, your results are being pruned from the RIGHT table but you're ending up with the same results as if you didn't include the AND at all within the ON clause.
您正在执行 aLEFT OUTTER JOIN
表示您想要语句左侧表中的每个元组,而不管它在 RIGHT 表中有匹配的记录。在这种情况下,您的结果正在从 RIGHT 表中删除,但您最终得到的结果与在 ON 子句中根本没有包含 AND 的结果相同。
Performing the AND in the WHERE clause causes the prune to happen after the LEFT JOIN takes place.
在 WHERE 子句中执行 AND 会导致在 LEFT JOIN 发生后发生修剪。
回答by Joe Stefanelli
The WHERE
clause is evaluated after the JOIN
conditions have been processed.
在WHERE
后子句评价JOIN
条件都已经被处理。
回答by Paul Sonier
Your execution plan should make this clear; the JOIN takes precedence, after which the results are filtered.
你的执行计划应该清楚这一点;JOIN 优先,然后过滤结果。