SQL EXISTS 与 JOIN 以及 EXISTS 子句的使用

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7082449/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 11:42:46  来源:igfitidea点击:

EXISTS vs JOIN and use of EXISTS clause

sqlsql-serversql-server-2005sql-server-2008

提问by satyajit

Below is the code sample:

下面是代码示例:

CREATE TABLE #titles(
    title_id       varchar(20),
    title          varchar(80)       NOT NULL,
    type           char(12)          NOT NULL,
    pub_id         char(4)               NULL,
    price          money                 NULL,
    advance        money                 NULL,
    royalty        int                   NULL,
    ytd_sales      int                   NULL,
    notes          varchar(200)          NULL,
    pubdate        datetime          NOT NULL
 )
 GO

 insert #titles values ('1', 'Secrets',   'popular_comp', '1389', .00, 00.00, 10, 4095,'Note 1','06/12/94')
 insert #titles values ('2', 'The',       'business',     '1389', .99, 00.00, 10, 4095,'Note 2','06/12/91')
 insert #titles values ('3', 'Emotional', 'psychology',   '0736', .99,  00.00, 10, 3336,'Note 3','06/12/91')
 insert #titles values ('4', 'Prolonged', 'psychology',   '0736', .99, 00.00, 10, 4072,'Note 4','06/12/91')
 insert #titles values ('5', 'With',      'business',     '1389', .95, 00.00, 10, 3876,'Note 5','06/09/91')
 insert #titles values ('6', 'Valley',    'mod_cook',     '0877', .99, 
SELECT * 
  FROM TABLE a
 WHERE a.val IN (1,2,3)
   AND NOT EXISTS(SELECT NULL
                    FROM TABLE b
                   WHERE b.id = a.id
                     AND b.val NOT IN (1, 2, 3))
.00, 12, 2032,'Note 6','06/09/91') insert #titles values ('7', 'Any?', 'trad_cook', '0877', .99, 00.00, 10, 4095,'Note 7','06/12/91') insert #titles values ('8', 'Fifty', 'trad_cook', '0877', .95, 00.00, 14, 1509,'Note 8','06/12/91') GO CREATE TABLE #sales( stor_id char(4) NOT NULL, ord_num varchar(20) NOT NULL, ord_date datetime NOT NULL, qty smallint NOT NULL, payterms varchar(12) NOT NULL, title_id varchar(80) ) GO insert #sales values('1', 'QA7442.3', '09/13/94', 75, 'ON Billing','1') insert #sales values('2', 'D4482', '09/14/94', 10, 'Net 60', '1') insert #sales values('3', 'N914008', '09/14/94', 20, 'Net 30', '2') insert #sales values('4', 'N914014', '09/14/94', 25, 'Net 30', '3') insert #sales values('5', '423LL922', '09/14/94', 15, 'ON Billing','3') insert #sales values('6', '423LL930', '09/14/94', 10, 'ON Billing','2') SELECT title, price FROM #titles WHERE EXISTS (SELECT * FROM #sales WHERE #sales.title_id = #titles.title_id AND qty >30) SELECT t.title, t.price FROM #titles t inner join #sales s on t.title_id = s.title_id where s.qty >30

I want to know what is the difference between the above 2 queries which gives the same result.Also want to know the purpose of EXISTS keyword and where exactly to use?

我想知道上面给出相同结果的 2 个查询之间有什么区别。还想知道 EXISTS 关键字的用途以及确切使用的位置?

回答by JNK

EXISTSis used to return a boolean value, JOINreturns a whole other table

EXISTS用于返回布尔值,JOIN返回整个其他表

EXISTSis only used to test if a subquery returns results, and short circuits as soon as it does. JOINis used to extend a result set by combining it with additional fields from another table to which there is a relation.

EXISTS仅用于测试子查询是否返回结果,并在返回后立即短路。 JOIN用于通过将结果集与来自另一个有关系的表的附加字段组合来扩展结果集。

In your example, the queries are semantically equivalent.

在您的示例中,查询在语义上是等效的。

In general, use EXISTSwhen:

通常,EXISTS在以下情况下使用:

  • You don't need to return data from the related table
  • You have dupes in the related table (JOINcan cause duplicate rows if values are repeated)
  • You want to check existence (use instead of LEFT OUTER JOIN...NULLcondition)
  • 您不需要从相关表中返回数据
  • 您在相关表中JOIN有重复(如果值重复,可能会导致重复行)
  • 你想检查存在(使用而不是LEFT OUTER JOIN...NULL条件)

If you have proper indexes, most of the time the EXISTSwill perform identically to the JOIN. The exception is on very complicated subqueries, where it is normally quicker to use EXISTS.

如果您有适当的索引,大多数情况下EXISTS将与JOIN. 例外是在非常复杂的子查询中,通常使用EXISTS.

If your JOINkey is not indexed, it may be quicker to use EXISTSbut you will need to test for your specific circumstance.

如果您的JOIN密钥未编入索引,使用起来可能会更快,EXISTS但您需要针对您的特定情况进行测试。

JOINsyntax is easier to read and clearer normally as well.

JOIN语法通常也更容易阅读和清晰。

回答by gbn

  • EXISTS is a semi-join
  • JOIN is a join
  • EXISTS 是半连接
  • JOIN 是一个连接

So with 3 rows and 5 rows matching

所以有 3 行和 5 行匹配

  • JOIN gives 15 rows
  • EXISTS gives 3 rows
  • JOIN 给出 15 行
  • EXISTS 给出 3 行

The result is the "short circuit" effect mentioned by others and no need to use DISTINCT with a JOIN. EXISTS is almost always quicker when looking for existence of rows on the n side of a 1:n relationship.

结果是其他人提到的“短路”效应,不需要使用 DISTINCT 和 JOIN。在 1:n 关系的 n 侧查找行的存在时,EXISTS 几乎总是更快。

回答by Aaron Bertrand

EXISTSis primarily used to shortcut. Essentially the optimizer will bail out as soon as the condition is true, so it may not need to scan the entire table (in modern versions of SQL Server this optimization can occur for IN()as well, though this was not always true). This behavior can vary from query to query, and in some cases the join may actually give the optimizer more opportunity to do its job. So I think it's hard to say "this is when you should use EXISTS, and this is when you shouldn't" because, like a lot of things, "it depends."

EXISTS主要用于快捷方式。本质上,只要条件为真,优化器就会退出,因此它可能不需要扫描整个表(在现代版本的 SQL Server 中也可以进行这种优化IN(),尽管并非总是如此)。这种行为可能因查询而异,在某些情况下,联接实际上可能会给优化器更多的机会来完成它的工作。所以我认为很难说“这是你应该使用的时候EXISTS,这是你不应该使用的时候”,因为像很多事情一样,“这取决于”。

That said, in this case, since you have essentially a 1:1 match between the tables, you are unlikely to see any performance difference and the optimizer will likely produce a similar or even identical plan. You may see something different if you compare join/exists on the sales table when you add 50,000 rows for each title (never mind that you will need to change your join query to remove duplicates, aggregate, what have you).

也就是说,在这种情况下,由于表之间基本上是 1:1 匹配,因此您不太可能看到任何性能差异,优化器可能会生成类似甚至相同的计划。如果在为每个标题添加 50,000 行时比较 sales 表上的 join/exists,您可能会看到不同的东西(不要介意您需要更改 join 查询以删除重复项、聚合、您有什么)。

回答by corsiKa

I find exists to be most useful when I have rows I would like to exclude based on how they interact with other rows.

我发现当我想根据行与其他行的交互方式排除行时,exists 最有用。

For example,

例如,

##代码##

In this case, I'm excluding a row in my aquery based on having a brecord with the same id but being invalid.

在这种情况下,我a根据具有b相同 ID 但无效的记录排除查询中的一行。

This actually came from a production problem I had at work. The query moved most of the exclusion logic in the query instead of in the application, taking load time from over 24 seconds to under 2 seconds. :-)

这实际上来自我在工作中遇到的生产问题。查询移动了查询中而不是应用程序中的大部分排除逻辑,将加载时间从超过 24 秒缩短到不到 2 秒。:-)