SQL NOT IN 子句中的 NULL 值

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/129077/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 23:36:52  来源:igfitidea点击:

NULL values inside NOT IN clause

sqlsql-servertsqlnullnotin

提问by Jamie Ide

This issue came up when I got different records counts for what I thought were identical queries one using a not inwhereconstraint and the other a left join. The table in the not inconstraint had one null value (bad data) which caused that query to return a count of 0 records. I sort of understand why but I could use some help fully grasping the concept.

当我为我认为相同的查询获得不同的记录计数时出现了这个问题,一个使用not inwhere约束,另一个使用left join. not in约束中的表有一个空值(坏数据),这导致该查询返回 0 条记录。我有点理解为什么,但我可以使用一些帮助来完全掌握这个概念。

To state it simply, why does query A return a result but B doesn't?

简单地说,为什么查询 A 返回结果而 B 没有?

A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)

This was on SQL Server 2005. I also found that calling set ansi_nulls offcauses B to return a result.

这是在 SQL Server 2005 上。我还发现调用set ansi_nulls off会导致 B 返回结果。

回答by Brannon

Query A is the same as:

查询 A 与以下内容相同:

select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

Since 3 = 3is true, you get a result.

既然3 = 3是真的,你就会得到一个结果。

Query B is the same as:

查询 B 与以下内容相同:

select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

When ansi_nullsis on, 3 <> nullis UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.

ansi_nulls打开时,3 <> null是 UNKNOWN,所以谓词评估为 UNKNOWN,并且您不会得到任何行。

When ansi_nullsis off, 3 <> nullis true, so the predicate evaluates to true, and you get a row.

ansi_nulls关闭时,3 <> null为真,因此谓词评估为真,并且您得到一行。

回答by kristof

Whenever you use NULL you are really dealing with a Three-Valued logic.

每当您使用 NULL 时,您实际上是在处理三值逻辑。

Your first query returns results as the WHERE clause evaluates to:

您的第一个查询返回结果,因为 WHERE 子句的计算结果为:

    3 = 1 or 3 = 2 or 3 = 3 or 3 = null
which is:
    FALSE or FALSE or TRUE or UNKNOWN
which evaluates to 
    TRUE

The second one:

第二个:

    3 <> 1 and 3 <> 2 and 3 <> null
which evaluates to:
    TRUE and TRUE and UNKNOWN
which evaluates to:
    UNKNOWN

The UNKNOWN is not the same as FALSE you can easily test it by calling:

UNKNOWN 与 FALSE 不同,您可以通过调用轻松测试它:

select 'true' where 3 <> null
select 'true' where not (3 <> null)

Both queries will give you no results

这两个查询都不会给你任何结果

If the UNKNOWN was the same as FALSE then assuming that the first query would give you FALSE the second would have to evaluate to TRUE as it would have been the same as NOT(FALSE).
That is not the case.

如果 UNKNOWN 与 FALSE 相同,那么假设第一个查询会给你 FALSE,第二个将必须评估为 TRUE,因为它与 NOT(FALSE) 相同。
事实并非如此。

There is a very good article on this subject on SqlServerCentral.

SqlServerCentral 上一篇关于这个主题的非常好的文章

The whole issue of NULLs and Three-Valued Logic can be a bit confusing at first but it is essential to understand in order to write correct queries in TSQL

NULL 和三值逻辑的整个问题一开始可能有点令人困惑,但为了在 TSQL 中编写正确的查询,理解它是必不可少的

Another article I would recommend is SQL Aggregate Functions and NULL.

我推荐的另一篇文章是SQL 聚合函数和 NULL

回答by YonahW

NOT INreturns 0 records when compared against an unknown value

NOT IN与未知值比较时返回 0 条记录

Since NULLis an unknown, a NOT INquery containing a NULLor NULLs in the list of possible values will always return 0records since there is no way to be sure that the NULLvalue is not the value being tested.

由于NULL是未知数,因此在可能值列表中NOT IN包含 aNULLNULLs的查询将始终返回0记录,因为无法确定该NULL值不是正在测试的值。

回答by Sunny Milenov

Compare to null is undefined, unless you use IS NULL.

与 null 比较未定义,除非您使用 IS NULL。

So, when comparing 3 to NULL (query A), it returns undefined.

因此,当将 3 与 NULL(查询 A)进行比较时,它返回 undefined。

I.e. SELECT 'true' where 3 in (1,2,null) and SELECT 'true' where 3 not in (1,2,null)

即 SELECT 'true' where 3 in (1,2,null) 和 SELECT 'true' where 3 not in (1,2,null)

will produce the same result, as NOT (UNDEFINED) is still undefined, but not TRUE

将产生相同的结果,因为 NOT (UNDEFINED) 仍然未定义,但不是 TRUE

回答by onedaywhen

The title of this question at the time of writing is

在撰写本文时,这个问题的标题是

SQL NOT IN constraint and NULL values

SQL NOT IN 约束和 NULL 值

From the text of the question it appears that the problem was occurring in a SQL DML SELECTquery, rather than a SQL DDL CONSTRAINT.

从问题的文本看来,问题出在 SQL DMLSELECT查询中,而不是 SQL DDL 中CONSTRAINT

However, especially given the wording of the title, I want to point out that some statements made here are potentially misleading statements, those along the lines of (paraphrasing)

然而,特别是考虑到标题的措辞,我想指出这里所做的一些陈述可能具有误导性,那些与(释义)

When the predicate evaluates to UNKNOWN you don't get any rows.

当谓词评估为 UNKNOWN 时,您不会得到任何行。

Although this is the case for SQL DML, when considering constraints the effect is different.

尽管 SQL DML 就是这种情况,但在考虑约束时,效果是不同的。

Consider this very simple table with two constraints taken directly from the predicates in the question (and addressed in an excellent answer by @Brannon):

考虑这个非常简单的表,其中有两个约束直接取自问题中的谓词(并在@Brannon 的出色回答中解决):

DECLARE @T TABLE 
(
 true CHAR(4) DEFAULT 'true' NOT NULL, 
 CHECK ( 3 IN (1, 2, 3, NULL )), 
 CHECK ( 3 NOT IN (1, 2, NULL ))
);

INSERT INTO @T VALUES ('true');

SELECT COUNT(*) AS tally FROM @T;

As per @Brannon's answer, the first constraint (using IN) evaluates to TRUE and the second constraint (using NOT IN) evaluates to UNKNOWN. However, the insert succeeds! Therefore, in this case it is not strictly correct to say, "you don't get any rows" because we have indeed got a row inserted as a result.

根据@Brannon 的回答,第一个约束(使用IN)计算为 TRUE,第二个约束(使用NOT IN)计算为 UNKNOWN。但是,插入成功!因此,在这种情况下,严格地说“你没有得到任何行”是不正确的,因为我们确实因此插入了一行。

The above effect is indeed the correct one as regards the SQL-92 Standard. Compare and contrast the following section from the SQL-92 spec

对于 SQL-92 标准,上述效果确实是正确的。比较和对比 SQL-92 规范中的以下部分

7.6 where clause

The result of the is a table of those rows of T for which the result of the search condition is true.

4.10 Integrity constraints

A table check constraint is satisfied if and only if the specified search condition is not false for any row of a table.

7.6 where子句

的结果是 T 中搜索条件结果为真的那些行的表。

4.10 完整性约束

当且仅当表的任何行的指定搜索条件不为假时,才满足表检查约束。

In other words:

换句话说:

In SQL DML, rows are removed from the result when the WHEREevaluates to UNKNOWN because it does notsatisfy the condition "is true".

在SQL DML,行从结果中删除时,WHERE计算结果为未知的,因为它符合条件“是真的”。

In SQL DDL (i.e. constraints), rows are not removed from the result when they evaluate to UNKNOWN because it doessatisfy the condition "is not false".

在 SQL DDL(即约束)中,当行计算为 UNKNOWN 时,不会从结果中删除行,因为它 确实满足条件“不为假”。

Although the effects in SQL DML and SQL DDL respectively may seem contradictory, there is practical reason for giving UNKNOWN results the 'benefit of the doubt' by allowing them to satisfy a constraint (more correctly, allowing them to not fail to satisfy a constraint): without this behaviour, every constraints would have to explicitly handle nulls and that would be very unsatisfactory from a language design perspective (not to mention, a right pain for coders!)

尽管分别在 SQL DML 和 SQL DDL 中的效果可能看起来相互矛盾,但通过允许它们满足约束(更准确地说,允许它们不会不满足约束)来赋予 UNKNOWN 结果“怀疑的好处”是有实际原因的。 :如果没有这种行为,每个约束都必须明确处理空值,从语言设计的角度来看,这将非常令人不满意(更不用说,对编码人员来说是一种痛苦!)

p.s. if you are finding it as challenging to follow such logic as "unknown does not fail to satisfy a constraint" as I am to write it, then consider you can dispense with all this simply by avoiding nullable columns in SQL DDL and anything in SQL DML that produces nulls (e.g. outer joins)!

ps,如果您发现遵循诸如“未知不会满足约束”这样的逻辑就像我编写它一样具有挑战性,那么请考虑您可以通过避免 SQL DDL 中的可为空列和 SQL 中的任何内容来免除所有这些产生空值的 DML(例如外连接)!

回答by onedaywhen

It may be concluded from answers here that NOT IN (subquery)doesn't handle nulls correctly and should be avoided in favour of NOT EXISTS. However, such a conclusion may be premature. In the following scenario, credited to Chris Date (Database Programming and Design, Vol 2 No 9, September 1989), it is NOT INthat handles nulls correctly and returns the correct result, rather than NOT EXISTS.

可以从这里的答案中得出结论,这些答案NOT IN (subquery)没有正确处理空值,应该避免使用NOT EXISTS. 然而,这样的结论可能为时过早。在以下场景中,归功于 Chris Date(数据库编程和设计,第 2 卷第 9 期,1989 年 9 月),它NOT IN正确处理空值并返回正确的结果,而不是NOT EXISTS

Consider a table spto represent suppliers (sno) who are known to supply parts (pno) in quantity (qty). The table currently holds the following values:

考虑一个表sp来表示sno已知供应pno数量 ( qty) 的零件 ( ) 的供应商( )。该表当前包含以下值:

      VALUES ('S1', 'P1', NULL), 
             ('S2', 'P1', 200),
             ('S3', 'P1', 1000)

Note that quantity is nullable i.e. to be able to record the fact a supplier is known to supply parts even if it is not known in what quantity.

请注意,数量可以为空,即即使不知道供应商的数量是多少,也可以记录供应商已知供应零件的事实。

The task is to find the suppliers who are known supply part number 'P1' but not in quantities of 1000.

任务是找到已知供应零件编号“P1”但数量不超过 1000 的供应商。

The following uses NOT INto correctly identify supplier 'S2' only:

以下仅用于NOT IN正确识别供应商“S2”:

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND 1000 NOT IN (
                        SELECT spy.qty
                          FROM sp spy
                         WHERE spy.sno = spx.sno
                               AND spy.pno = 'P1'
                       );

However, the below query uses the same general structure but with NOT EXISTSbut incorrectly includes supplier 'S1' in the result (i.e. for which the quantity is null):

但是,下面的查询使用相同的一般结构,NOT EXISTS但在结果中错误地包含了供应商“S1”(即数量为空):

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1', NULL ), 
                       ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT DISTINCT spx.sno
  FROM sp spx
 WHERE spx.pno = 'P1'
       AND NOT EXISTS (
                       SELECT *
                         FROM sp spy
                        WHERE spy.sno = spx.sno
                              AND spy.pno = 'P1'
                              AND spy.qty = 1000
                      );

So NOT EXISTSis not the silver bullet it may have appeared!

所以NOT EXISTS不是它可能出现的银弹!

Of course, source of the problem is the presence of nulls, therefore the 'real' solution is to eliminate those nulls.

当然,问题的根源是空值的存在,因此“真正”的解决方案是消除这些空值。

This can be achieved (among other possible designs) using two tables:

这可以使用两个表来实现(在其他可能的设计中):

  • spsuppliers known to supply parts
  • spqsuppliers known to supply parts in known quantities
  • sp已知供应零件的供应商
  • spq已知供应已知数量零件的供应商

noting there should probably be a foreign key constraint where spqreferences sp.

注意到可能应该有一个外键约束,其中spq引用sp

The result can then be obtained using the 'minus' relational operator (being the EXCEPTkeyword in Standard SQL) e.g.

然后可以使用“减”关系运算符(EXCEPT标准 SQL 中的关键字)获得结果,例如

WITH sp AS 
     ( SELECT * 
         FROM ( VALUES ( 'S1', 'P1' ), 
                       ( 'S2', 'P1' ),
                       ( 'S3', 'P1' ) )
              AS T ( sno, pno )
     ),
     spq AS 
     ( SELECT * 
         FROM ( VALUES ( 'S2', 'P1', 200 ),
                       ( 'S3', 'P1', 1000 ) )
              AS T ( sno, pno, qty )
     )
SELECT sno
  FROM spq
 WHERE pno = 'P1'
EXCEPT 
SELECT sno
  FROM spq
 WHERE pno = 'P1'
       AND qty = 1000;

回答by Dave Costa

In A, 3 is tested for equality against each member of the set, yielding (FALSE, FALSE, TRUE, UNKNOWN). Since one of the elements is TRUE, the condition is TRUE. (It's also possible that some short-circuiting takes place here, so it actually stops as soon as it hits the first TRUE and never evaluates 3=NULL.)

在 A 中,针对集合中的每个成员测试 3 是否相等,产生 (FALSE, FALSE, TRUE, UNKNOWN)。由于元素之一为 TRUE,因此条件为 TRUE。(也有可能这里发生了一些短路,所以它实际上在遇到第一个 TRUE 时立即停止并且永远不会评估 3=NULL。)

In B, I think it is evaluating the condition as NOT (3 in (1,2,null)). Testing 3 for equality against the set yields (FALSE, FALSE, UNKNOWN), which is aggregated to UNKNOWN. NOT ( UNKNOWN ) yields UNKNOWN. So overall the truth of the condition is unknown, which at the end is essentially treated as FALSE.

在 B 中,我认为它将条件评估为 NOT (3 in (1,2,null))。测试 3 是否与集合收益率(FALSE、FALSE、UNKNOWN)相等,该集合被聚合到 UNKNOWN。NOT ( UNKNOWN ) 产生 UNKNOWN。所以总的来说,情况的真相是未知的,最后基本上被视为 FALSE。

回答by Cruachan

Null signifies and absence of data, that is it is unknown, not a data value of nothing. It's very easy for people from a programming background to confuse this because in C type languages when using pointers null is indeed nothing.

Null 表示没有数据,也就是说它是未知的,而不是什么都没有的数据值。有编程背景的人很容易混淆这一点,因为在 C 类型语言中,当使用指针时 null 确实什么都不是。

Hence in the first case 3 is indeed in the set of (1,2,3,null) so true is returned

因此在第一种情况下 3 确实在 (1,2,3,null) 的集合中,所以返回 true

In the second however you can reduce it to

但是,在第二个中,您可以将其减少到

select 'true' where 3 not in (null)

选择“真”,其中 3 不在(空)

So nothing is returned because the parser knows nothing about the set to which you are comparing it - it's not an empty set but an unknown set. Using (1, 2, null) doesn't help because the (1,2) set is obviously false, but then you're and'ing that against unknown, which is unknown.

因此没有返回任何内容,因为解析器对您要比较的集合一无所知 - 它不是空集而是未知集。使用 (1, 2, null) 无济于事,因为 (1,2) 集合显然是错误的,但是您正在和未知的未知对象相反。

回答by Mihai

IF you want to filter with NOT IN for a subquery containg NULLs justcheck for not null

如果您想使用 NOT IN 过滤包含 NULL 的子查询,请检查 not null

SELECT blah FROM t WHERE blah NOT IN
        (SELECT someotherBlah FROM t2 WHERE someotherBlah IS NOT NULL )

回答by Salman A

SQL uses three-valued logic for truth values. The INquery produces the expected result:

SQL 对真值使用三值逻辑。该IN查询产生预期的结果:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE col IN (NULL, 1)
-- returns first row

But adding a NOTdoes not invert the results:

但是添加 aNOT不会反转结果:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT col IN (NULL, 1)
-- returns zero rows

This is because the above query is equivalent of the following:

这是因为上述查询等效于以下内容:

SELECT * FROM (VALUES (1), (2)) AS tbl(col) WHERE NOT (col = NULL OR col = 1)

Here is how the where clause is evaluated:

以下是 where 子句的评估方式:

| col | col = NULL (1) | col = 1 | col = NULL OR col = 1 | NOT (col = NULL OR col = 1) |
|-----|----------------|---------|-----------------------|-----------------------------|
| 1   | UNKNOWN        | TRUE    | TRUE                  | FALSE                       |
| 2   | UNKNOWN        | FALSE   | UNKNOWN (2)           | UNKNOWN (3)                 |

Notice that:

请注意:

  1. The comparison involving NULLyields UNKNOWN
  2. The ORexpression where none of the operands are TRUEand at least one operand is UNKNOWNyields UNKNOWN(ref)
  3. The NOTof UNKNOWNyields UNKNOWN(ref)
  1. 涉及NULL产量的比较UNKNOWN
  2. OR没有操作数TRUE且至少有一个操作数为UNKNOWNyields UNKNOWN( ref)的表达式
  3. 所述NOTUNKNOWN产率UNKNOWNREF

You can extend the above example to more than two values (e.g. NULL, 1 and 2) but the result will be same: if one of the values is NULLthen no row will match.

您可以将上面的示例扩展到两个以上的值(例如 NULL、1 和 2),但结果将相同:如果其中一个值是,NULL则没有行将匹配。