How do SQL EXISTS statements work?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5846882/

Time: 2020-09-01 10:21:45  Source: igfitidea

sql

Asked by Dan

I'm trying to learn SQL and am having a hard time understanding EXISTS statements. I came across this quote about "exists" and don't understand something:

Using the exists operator, your subquery can return zero, one, or many rows, and the condition simply checks whether the subquery returned any rows. If you look at the select clause of the subquery, you will see that it consists of a single literal (1); since the condition in the containing query only needs to know how many rows have been returned, the actual data the subquery returned is irrelevant.

What I don't understand is how does the outer query know which row the subquery is checking? For example:

SELECT *
  FROM suppliers
 WHERE EXISTS (select *
                 from orders
                where suppliers.supplier_id = orders.supplier_id);

I understand that if the id from the suppliers and orders tables match, the subquery will return true and all the columns from the matching row in the suppliers table will be output. What I don't get is how the subquery communicates which specific row (let's say the row with supplier id 25) should be printed if only a true or false is being returned.

It appears to me that there is no relationship between the outer query and the subquery.

Answered by sojin

Think of it this way:

For 'each' row from Suppliers, check if there 'exists' a row in the Orders table that meets the condition Suppliers.supplier_id (this comes from the outer query's current 'row') = Orders.supplier_id. When you find the first matching row, stop right there: the WHERE EXISTS has been satisfied.

The magic link between the outer query and the subquery lies in the fact that supplier_id gets passed from the outer query to the subquery for each row evaluated.

Or, to put it another way, the subquery is executed for each table row of the outer query.

It is NOT like the subquery is executed on the whole and gets the 'true/false' and then tries to match this 'true/false' condition with outer query.
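
As a mental model only, the row-by-row evaluation described above can be sketched in plain Python. The sample rows and the helper name `suppliers_with_orders` are made up for illustration, and a real database engine is free to execute the query differently as long as the result is the same.

```python
# Hypothetical sample data; table and column names follow the question.
suppliers = [
    {"supplier_id": 24, "name": "Acme"},
    {"supplier_id": 25, "name": "Globex"},
    {"supplier_id": 26, "name": "Initech"},
]
orders = [
    {"order_id": 1, "supplier_id": 25},
    {"order_id": 2, "supplier_id": 25},
    {"order_id": 3, "supplier_id": 26},
]


def suppliers_with_orders(suppliers, orders):
    """Emulate: SELECT * FROM suppliers WHERE EXISTS (correlated subquery)."""
    result = []
    for supplier in suppliers:  # the outer query's current row
        # any() stops at the first matching order, much like EXISTS does
        has_order = any(
            order["supplier_id"] == supplier["supplier_id"] for order in orders
        )
        if has_order:  # the true/false only decides the fate of this one row
            result.append(supplier)
    return result


matched = suppliers_with_orders(suppliers, orders)
print([s["supplier_id"] for s in matched])  # [25, 26]
```

The true/false value never has to "say" which row it belongs to, because it is computed while the outer loop is already standing on that row.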

Answered by OMG Ponies

It appears to me that there is no relationship between the outer query and the subquery.

What do you think the WHERE clause inside the EXISTS example is doing? How do you come to that conclusion when the SUPPLIERS reference isn't in the FROM or JOIN clauses within the EXISTS clause?

EXISTS evaluates to TRUE/FALSE, and exits as TRUE on the first match of the criteria -- this is why it can be faster than IN. Also be aware that the SELECT clause in an EXISTS is ignored, for example:

SELECT s.*
  FROM SUPPLIERS s
 WHERE EXISTS (SELECT 1/0
                 FROM ORDERS o
                WHERE o.supplier_id = s.supplier_id)

...should hit a division by zero error, but it won't. The WHERE clause is the most important piece of an EXISTS clause.
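
Here is a small sketch, using Python's built-in sqlite3 module, of the select list not affecting the outcome. The sample rows and the helper name `ids_with_select_list` are assumptions for illustration. SQLite returns NULL for 1/0 instead of raising an error, so it cannot reproduce the division-by-zero case; the sketch varies the select list (`*`, `1`, `NULL`) instead.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (24, 'Acme'), (25, 'Globex'), (26, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 26);
""")


def ids_with_select_list(select_list):
    """Run the EXISTS query with the given select list in the subquery."""
    sql = f"""
        SELECT s.supplier_id
          FROM suppliers s
         WHERE EXISTS (SELECT {select_list}
                         FROM orders o
                        WHERE o.supplier_id = s.supplier_id)
         ORDER BY s.supplier_id
    """
    return [row[0] for row in conn.execute(sql)]


# Only the presence of rows matters, not what the subquery selects.
results = {sl: ids_with_select_list(sl) for sl in ("*", "1", "NULL")}
print(results)
```

All three variants return suppliers 25 and 26, even the one whose subquery selects nothing but NULL.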

Also be aware that a JOIN is not a direct replacement for EXISTS, because there will be duplicate parent records if there's more than one child record associated to the parent.
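
To see the duplicate problem concretely, here is a minimal sqlite3 sketch with made-up sample data in which supplier 25 has two orders. It only illustrates the point above and is not tied to any particular database.

```python
import sqlite3

# In-memory database; supplier 25 has two orders, supplier 24 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (24, 'Acme'), (25, 'Globex'), (26, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 26);
""")

# A plain join yields one output row per matching order.
join_ids = [row[0] for row in conn.execute("""
    SELECT s.supplier_id
      FROM suppliers s
      JOIN orders o ON o.supplier_id = s.supplier_id
     ORDER BY s.supplier_id
""")]

# EXISTS yields each supplier at most once.
exists_ids = [row[0] for row in conn.execute("""
    SELECT s.supplier_id
      FROM suppliers s
     WHERE EXISTS (SELECT 1 FROM orders o WHERE o.supplier_id = s.supplier_id)
     ORDER BY s.supplier_id
""")]

print(join_ids)    # [25, 25, 26]
print(exists_ids)  # [25, 26]
```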

Answered by Anthony Faull

You can produce identical results using either JOIN, EXISTS, IN, or INTERSECT:

SELECT s.supplier_id
FROM suppliers s
INNER JOIN (SELECT DISTINCT o.supplier_id FROM orders o) o
    ON o.supplier_id = s.supplier_id

SELECT s.supplier_id
FROM suppliers s
WHERE EXISTS (SELECT * FROM orders o WHERE o.supplier_id = s.supplier_id)

SELECT s.supplier_id 
FROM suppliers s 
WHERE s.supplier_id IN (SELECT o.supplier_id FROM orders o)

SELECT s.supplier_id
FROM suppliers s
INTERSECT
SELECT o.supplier_id
FROM orders o
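
A quick way to check the four queries against each other is to run them on the same data. The sketch below does that with Python's sqlite3 module and hypothetical sample rows. It assumes orders.supplier_id contains no NULLs; on other data or other engines the four forms should be compared again rather than taken as interchangeable.

```python
import sqlite3

# In-memory database with hypothetical sample rows (no NULL supplier_id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (24, 'Acme'), (25, 'Globex'), (26, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 26);
""")

# The four queries from the answer, keyed by technique.
queries = {
    "join": """
        SELECT s.supplier_id
        FROM suppliers s
        INNER JOIN (SELECT DISTINCT o.supplier_id FROM orders o) o
            ON o.supplier_id = s.supplier_id
    """,
    "exists": """
        SELECT s.supplier_id
        FROM suppliers s
        WHERE EXISTS (SELECT * FROM orders o WHERE o.supplier_id = s.supplier_id)
    """,
    "in": """
        SELECT s.supplier_id
        FROM suppliers s
        WHERE s.supplier_id IN (SELECT o.supplier_id FROM orders o)
    """,
    "intersect": """
        SELECT s.supplier_id
        FROM suppliers s
        INTERSECT
        SELECT o.supplier_id
        FROM orders o
    """,
}

# Sort in Python because none of the queries specifies an ORDER BY.
results = {
    name: sorted(row[0] for row in conn.execute(sql))
    for name, sql in queries.items()
}
print(results)
```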

Answered by Menahem

If you had a where clause that looked like this:

WHERE id in (25,26,27) -- and so on

you can easily understand why some rows are returned and some are not.

When the where clause is like this:

WHERE EXISTS (select * from orders where suppliers.supplier_id = orders.supplier_id);

it just means: return the rows that have an existing record in the orders table with the same id.

Answered by Vlad Mihalcea

This is a very good question, so I decided to write a very detailed article about this topic on my blog.

Database table model

Let's assume we have the following two tables in our database, that form a one-to-many table relationship.

[Figure: SQL EXISTS tables]

The student table is the parent, and the student_grade table is the child since it has a student_id Foreign Key column referencing the id Primary Key column in the student table.

The student table contains the following two records:

| id | first_name | last_name | admission_score |
|----|------------|-----------|-----------------|
| 1  | Alice      | Smith     | 8.95            |
| 2  | Bob        | Johnson   | 8.75            |

And the student_grade table stores the grades the students received:

| id | class_name | grade | student_id |
|----|------------|-------|------------|
| 1  | Math       | 10    | 1          |
| 2  | Math       | 9.5   | 1          |
| 3  | Math       | 9.75  | 1          |
| 4  | Science    | 9.5   | 1          |
| 5  | Science    | 9     | 1          |
| 6  | Science    | 9.25  | 1          |
| 7  | Math       | 8.5   | 2          |
| 8  | Math       | 9.5   | 2          |
| 9  | Math       | 9     | 2          |
| 10 | Science    | 10    | 2          |
| 11 | Science    | 9.4   | 2          |

SQL EXISTS

Let's say we want to get all students that have received a grade of 10 in Math class.

If we are only interested in the student identifier, then we can run a query like this one:

SELECT
    student_grade.student_id
FROM
    student_grade
WHERE
    student_grade.grade = 10 AND
    student_grade.class_name = 'Math'
ORDER BY
    student_grade.student_id

But the application is interested in displaying the full name of a student, not just the identifier, so we need info from the student table as well.

In order to filter the student records that have a grade of 10 in Math, we can use the EXISTS SQL operator, like this:

SELECT
    id, first_name, last_name
FROM
    student
WHERE EXISTS (
    SELECT 1
    FROM
        student_grade
    WHERE
        student_grade.student_id = student.id AND
        student_grade.grade = 10 AND
        student_grade.class_name = 'Math'
)
ORDER BY id

When running the query above, we can see that only the Alice row is selected:

| id | first_name | last_name |
|----|------------|-----------|
| 1  | Alice      | Smith     |

The outer query selects the student row columns we are interested in returning to the client. However, the WHERE clause is using the EXISTS operator with an associated inner subquery.

The EXISTS operator returns true if the subquery returns at least one record and false if no row is selected. The database engine does not have to run the subquery entirely. If a single record is matched, the EXISTS operator returns true, and the associated outer query row is selected.

The inner subquery is correlated because the student_id column of the student_grade table is matched against the id column of the outer student table.
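
For anyone who wants to try the example, here is a minimal reproduction using Python's built-in sqlite3 module. The column types (`REAL` for scores and grades) are an assumption, since the original schema is only shown as a figure; the rows are the ones listed in the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; the source only shows the table contents.
conn.executescript("""
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        admission_score REAL
    );
    CREATE TABLE student_grade (
        id INTEGER PRIMARY KEY,
        class_name TEXT,
        grade REAL,
        student_id INTEGER REFERENCES student (id)
    );
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Alice", "Smith", 8.95), (2, "Bob", "Johnson", 8.75)],
)
conn.executemany(
    "INSERT INTO student_grade VALUES (?, ?, ?, ?)",
    [
        (1, "Math", 10, 1), (2, "Math", 9.5, 1), (3, "Math", 9.75, 1),
        (4, "Science", 9.5, 1), (5, "Science", 9, 1), (6, "Science", 9.25, 1),
        (7, "Math", 8.5, 2), (8, "Math", 9.5, 2), (9, "Math", 9, 2),
        (10, "Science", 10, 2), (11, "Science", 9.4, 2),
    ],
)

# The correlated EXISTS query from the answer.
rows = conn.execute("""
    SELECT id, first_name, last_name
    FROM student
    WHERE EXISTS (
        SELECT 1
        FROM student_grade
        WHERE student_grade.student_id = student.id
          AND student_grade.grade = 10
          AND student_grade.class_name = 'Math'
    )
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 'Alice', 'Smith')]
```

Bob does have a 10, but in Science, so the subquery finds no row for him and his student row is filtered out.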

Answered by David Fells

EXISTS means that the subquery returns at least one row; that's really it. In this case, it's a correlated subquery because it compares the supplier_id of the outer table with the supplier_id of the inner table. This query says, in effect:

SELECT all suppliers
For each supplier ID, see if an order exists for this supplier
If the supplier is not present in the orders table, remove the supplier from the results
RETURN all suppliers who have corresponding rows in the orders table

You could do the same thing in this case with an INNER JOIN.

SELECT suppliers.* 
  FROM suppliers 
 INNER 
  JOIN orders 
    ON suppliers.supplier_id = orders.supplier_id;

OMG Ponies' comment is correct. You'd need to do grouping with that join, or select distinct, depending on the data you need.
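
A rough sketch of both fixes, run through Python's sqlite3 module on made-up sample data. Note that selecting suppliers.* while grouping only by supplier_id is accepted by SQLite but rejected by some other databases, so treat the GROUP BY variant as engine-dependent.

```python
import sqlite3

# In-memory database; supplier 25 has two orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (24, 'Acme'), (25, 'Globex'), (26, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 26);
""")

# Fix 1: collapse the duplicated join rows with DISTINCT.
distinct_rows = conn.execute("""
    SELECT DISTINCT suppliers.*
      FROM suppliers
     INNER JOIN orders ON suppliers.supplier_id = orders.supplier_id
     ORDER BY suppliers.supplier_id
""").fetchall()

# Fix 2: group by the supplier's primary key.
grouped_rows = conn.execute("""
    SELECT suppliers.*
      FROM suppliers
     INNER JOIN orders ON suppliers.supplier_id = orders.supplier_id
     GROUP BY suppliers.supplier_id
     ORDER BY suppliers.supplier_id
""").fetchall()

print(distinct_rows)  # [(25, 'Globex'), (26, 'Initech')]
print(grouped_rows)   # [(25, 'Globex'), (26, 'Initech')]
```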

Answered by Wouter van Nifterick

What you describe is a so-called query with a correlated subquery.

(In general) it's something that you should try to avoid by rewriting the query with a join instead:

SELECT suppliers.* 
FROM suppliers 
JOIN orders USING (supplier_id)
GROUP BY suppliers.supplier_id

Because otherwise, the subquery will be executed for each row in the outer query.
