Original question: http://stackoverflow.com/questions/894490/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
SQL left join vs multiple tables on FROM line?
Asked by jmucchiello
Most SQL dialects accept both the following queries:
SELECT a.foo, b.foo
FROM a, b
WHERE a.x = b.x

SELECT a.foo, b.foo
FROM a
LEFT JOIN b ON a.x = b.x
Now obviously when you need an outer join, the second syntax is required. But when doing an inner join why should I prefer the second syntax to the first (or vice versa)?
Answered by Lasse V. Karlsen
The old syntax, which just lists the tables and uses the WHERE clause to specify the join criteria, is being deprecated in most modern databases.
It's not just for show: the old syntax can be ambiguous when you use both INNER and OUTER joins in the same query.
Let me give you an example.
Let's suppose you have 3 tables in your system:
Company
Department
Employee
Each table contains numerous rows, linked together. You have multiple companies, each company can have multiple departments, and each department can have multiple employees.
Ok, so now you want to do the following:
List all the companies, and include all their departments, and all their employees. Note that some companies don't have any departments yet, but make sure you include them as well. Make sure you only retrieve departments that have employees, but always list all companies.
So you do this:
SELECT * -- for simplicity
FROM Company, Department, Employee
WHERE Company.ID *= Department.CompanyID
AND Department.ID = Employee.DepartmentID
Note that the last condition there is an inner join, in order to fulfill the requirement that you only want departments with people.
OK, so what happens now? Well, the problem is that it depends on the database engine, the query optimizer, indexes, and table statistics. Let me explain.
If the query optimizer determines that the way to do this is to first take a company, then find the departments, and then do an inner join with employees, you're not going to get any companies that don't have departments.
The reason for this is that the WHERE clause determines which rows end up in the final result, not individual parts of the rows.
And in this case, due to the left join, the Department.ID column will be NULL, and thus when it comes to the INNER JOIN to Employee, there's no way to fulfill that constraint for the Employee row, and so it won't appear.
On the other hand, if the query optimizer decides to tackle the department-employee join first, and then do a left join with the companies, you will see them.
So the old syntax is ambiguous. There's no way to specify what you want, without dealing with query hints, and some databases have no way at all.
Enter the new syntax: with it, you can choose.
For instance, if you want all companies, as the problem description stated, this is what you would write:
SELECT *
FROM Company
LEFT JOIN (
Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
) ON Company.ID = Department.CompanyID
Here you specify that you want the department-employee join to be done as one join, and then left join the results of that with the companies.
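To see the difference on actual rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample data (Acme, Initech, Sales, Lab, Ann) and the Name columns are invented for illustration and are not part of the answer; only the join shapes are. The first query applies the inner join after the left join, which is one of the two readings of the old-style query; the second is the parenthesized form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company    (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, CompanyID INTEGER, Name TEXT);
    CREATE TABLE Employee   (ID INTEGER PRIMARY KEY, DepartmentID INTEGER, Name TEXT);

    -- Hypothetical data: Acme has one staffed and one empty department,
    -- Initech has no departments at all.
    INSERT INTO Company    VALUES (1, 'Acme'), (2, 'Initech');
    INSERT INTO Department VALUES (10, 1, 'Sales'), (11, 1, 'Lab');
    INSERT INTO Employee   VALUES (100, 10, 'Ann');
""")

# Left join first, inner join second: the inner join discards every row
# whose Department.ID is NULL, so Initech disappears.
flat = conn.execute("""
    SELECT Company.Name, Department.Name, Employee.Name
    FROM Company
    LEFT JOIN Department ON Company.ID = Department.CompanyID
    INNER JOIN Employee  ON Department.ID = Employee.DepartmentID
    ORDER BY Company.ID
""").fetchall()

# Department-employee join first, then left join the result to Company.
nested = conn.execute("""
    SELECT Company.Name, Department.Name, Employee.Name
    FROM Company
    LEFT JOIN (
        Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
    ) ON Company.ID = Department.CompanyID
    ORDER BY Company.ID
""").fetchall()

print(flat)    # [('Acme', 'Sales', 'Ann')]
print(nested)  # [('Acme', 'Sales', 'Ann'), ('Initech', None, None)]
```

With the explicit syntax the grouping is written into the query, so the result no longer depends on which order the engine happens to pick.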
Additionally, let's say you only want departments that contain the letter X in their name. Again, with old-style joins, you risk losing the company as well if it doesn't have any departments with an X in their name, but with the new syntax, you can do this:
SELECT *
FROM Company
LEFT JOIN (
Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
) ON Company.ID = Department.CompanyID AND Department.Name LIKE '%X%'
This extra clause is used for the joining, but is not a filter for the entire row. So the row might appear with company information, but might have NULLs in all the department and employee columns for that row, because there is no department with an X in its name for that company. This is hard with the old syntax.
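The distinction between a join condition and a row filter can be checked with a small sketch, again using Python's sqlite3 module. It is reduced to two tables to keep it short, and the company and department names are made up for illustration. The same LIKE condition is placed once in the ON clause and once in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company    (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, CompanyID INTEGER, Name TEXT);

    -- Hypothetical data: only Acme has a department with an X in its name.
    INSERT INTO Company    VALUES (1, 'Acme'), (2, 'Initech');
    INSERT INTO Department VALUES (10, 1, 'Xerography'), (20, 2, 'Sales');
""")

# Condition in ON: it only decides which departments get attached,
# so every company is still listed.
in_on = conn.execute("""
    SELECT Company.Name, Department.Name
    FROM Company
    LEFT JOIN Department
           ON Company.ID = Department.CompanyID
          AND Department.Name LIKE '%X%'
    ORDER BY Company.ID
""").fetchall()

# Condition in WHERE: it filters whole rows after the join,
# so companies without a matching department are dropped.
in_where = conn.execute("""
    SELECT Company.Name, Department.Name
    FROM Company
    LEFT JOIN Department ON Company.ID = Department.CompanyID
    WHERE Department.Name LIKE '%X%'
    ORDER BY Company.ID
""").fetchall()

print(in_on)     # [('Acme', 'Xerography'), ('Initech', None)]
print(in_where)  # [('Acme', 'Xerography')]
```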
This is why, amongst other vendors, Microsoft has deprecated the old outer join syntax, but not the old inner join syntax, since SQL Server 2005 and upwards. The only way to talk to a database running on Microsoft SQL Server 2005 or 2008, using the old style outer join syntax, is to set that database in 8.0 compatibility mode (aka SQL Server 2000).
Additionally, the old way, by throwing a bunch of tables at the query optimizer with a bunch of WHERE clauses, was akin to saying "here you are, do the best you can". With the new syntax, the query optimizer has less work to do in order to figure out which parts go together.
So there you have it.
LEFT and INNER JOIN are the wave of the future.
Answered by Andomar
The JOIN syntax keeps conditions near the table they apply to. This is especially useful when you join a large number of tables.
By the way, you can do an outer join with the first syntax too:
WHERE a.x = b.x(+)
Or
WHERE a.x *= b.x
Or
WHERE a.x = b.x or a.x not in (select x from b)
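A caveat on the third form: in a quick check it does not appear to behave like a true outer join. The sketch below uses Python's sqlite3 module with made-up rows and compares it with LEFT JOIN (the vendor-specific (+) and *= operators are not supported by SQLite, so they are not exercised here). The tables a and b with columns x and foo follow the question; the row values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER, foo TEXT);
    CREATE TABLE b (x INTEGER, foo TEXT);

    -- Hypothetical data: a.x = 2 has no partner in b.
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (1, 'b1');
""")

left = conn.execute("""
    SELECT a.foo, b.foo
    FROM a
    LEFT JOIN b ON a.x = b.x
    ORDER BY a.x
""").fetchall()

# The unmatched row of a is paired with every row of b instead of NULLs.
emulated = conn.execute("""
    SELECT a.foo, b.foo
    FROM a, b
    WHERE a.x = b.x OR a.x NOT IN (SELECT x FROM b)
    ORDER BY a.x
""").fetchall()

print(left)      # [('a1', 'b1'), ('a2', None)]
print(emulated)  # [('a1', 'b1'), ('a2', 'b1')]
```

So, at least on this data, the unmatched row comes back carrying values from an unrelated b row rather than NULLs, and it would not come back at all if b were empty.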
Answered by Peter Perháč
Basically, when your FROM clause lists tables like so:
SELECT * FROM
tableA, tableB, tableC
the result is a cross product of all the rows in tables A, B, C. Then you apply the restriction WHERE tableA.id = tableB.a_id, which will throw away a huge number of rows, then further ... AND tableB.id = tableC.b_id, and you should then get only those rows you are really interested in.
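The narrowing described here can be watched step by step. This is a small sketch with Python's sqlite3 module; the table and column names follow the answer, while the row values are invented for illustration. It describes the logical result only, not how an engine actually executes the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER);
    CREATE TABLE tableB (id INTEGER, a_id INTEGER);
    CREATE TABLE tableC (id INTEGER, b_id INTEGER);

    -- Hypothetical data: 2 x 3 x 2 rows.
    INSERT INTO tableA VALUES (1), (2);
    INSERT INTO tableB VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO tableC VALUES (100, 10), (101, 12);
""")

# No restriction: every combination of rows, 2 * 3 * 2.
cross = conn.execute(
    "SELECT COUNT(*) FROM tableA, tableB, tableC"
).fetchone()[0]

# First restriction: each tableB row keeps only its own tableA row.
after_first = conn.execute("""
    SELECT COUNT(*) FROM tableA, tableB, tableC
    WHERE tableA.id = tableB.a_id
""").fetchone()[0]

# Both restrictions: only the rows that really belong together remain.
final = conn.execute("""
    SELECT tableA.id, tableB.id, tableC.id
    FROM tableA, tableB, tableC
    WHERE tableA.id = tableB.a_id
      AND tableB.id = tableC.b_id
    ORDER BY tableC.id
""").fetchall()

print(cross)        # 12
print(after_first)  # 6
print(final)        # [(1, 10, 100), (2, 12, 101)]
```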
DBMSs know how to optimise this SQL so that the performance difference compared to writing it with JOINs is negligible (if any). Using the JOIN notation makes the SQL statement more readable (IMHO, not using joins turns the statement into a mess). Using the cross product, you need to provide join criteria in the WHERE clause, and that's the problem with the notation. You are crowding your WHERE clause with stuff like
tableA.id = tableB.a_id
AND tableB.id = tableC.b_id
which is only used to restrict the cross product. The WHERE clause should only contain RESTRICTIONS on the resultset. If you mix table join criteria with resultset restrictions, you (and others) will find your query harder to read. You should definitely use JOINs and keep the FROM clause a FROM clause, and the WHERE clause a WHERE clause.
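As an illustration of that separation, here is the same inner-join query written both ways, run through Python's sqlite3 module. The name and status columns and all row values are hypothetical, added only so there is a genuine resultset restriction to show. Both forms return the same rows; the difference is purely in where each condition lives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER, name TEXT);
    CREATE TABLE tableB (id INTEGER, a_id INTEGER, status TEXT);

    -- Hypothetical data.
    INSERT INTO tableA VALUES (1, 'one'), (2, 'two');
    INSERT INTO tableB VALUES (10, 1, 'open'), (11, 1, 'closed'), (12, 2, 'open');
""")

# Old style: join criterion and restriction share the WHERE clause.
comma_style = conn.execute("""
    SELECT tableA.name, tableB.id
    FROM tableA, tableB
    WHERE tableA.id = tableB.a_id
      AND tableB.status = 'open'
    ORDER BY tableB.id
""").fetchall()

# JOIN style: ON says how the tables relate, WHERE says which rows we want.
join_style = conn.execute("""
    SELECT tableA.name, tableB.id
    FROM tableA
    INNER JOIN tableB ON tableA.id = tableB.a_id
    WHERE tableB.status = 'open'
    ORDER BY tableB.id
""").fetchall()

print(comma_style)  # [('one', 10), ('two', 12)]
print(join_style)   # [('one', 10), ('two', 12)]
```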
Answered by Dwight T
The first way is the older standard. The second method was introduced in SQL-92, http://en.wikipedia.org/wiki/SQL. The complete standard can be viewed at http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt.
It took many years before database companies adopted the SQL-92 standard.
So the reason the second method is preferred is that it is the SQL standard according to the ANSI and ISO standards committees.
Answered by HLGEM
The second is preferred because it is far less likely to result in an accidental cross join caused by forgetting to put in the where clause. A join with no on clause will fail the syntax check; an old-style join with no where clause will not fail, it will do a cross join.
Additionally, when you later have to do a left join, it is helpful for maintenance that they all be in the same structure. And the old syntax has been out of date since 1992; it is well past time to stop using it.
Plus, I have found that many people who exclusively use the first syntax don't really understand joins, and understanding joins is critical to getting correct results when querying.
Answered by Alan G
I think there are some good reasons on this page to adopt the second method, using explicit JOINs. The clincher, though, is that when the JOIN criteria are removed from the WHERE clause, it becomes much easier to see the remaining selection criteria in the WHERE clause.
In really complex SELECT statements it becomes much easier for a reader to understand what is going on.
Answered by Euro Micelli
The SELECT * FROM table1, table2, ... syntax is OK for a couple of tables, but it becomes exponentially (not necessarily a mathematically accurate statement) harder and harder to read as the number of tables increases.
The JOIN syntax is harder to write (at the beginning), but it makes explicit which criteria affect which tables. This makes it much harder to make a mistake.
Also, if all the joins are INNER, then both versions are equivalent. However, the moment you have an OUTER join anywhere in the statement, things get much more complicated and it's virtually guaranteed that what you write won't be querying what you think you wrote.
Answered by Pablo Santa Cruz
When you need an outer join, the second syntax is not always required:
Oracle:
SELECT a.foo, b.foo
FROM a, b
WHERE a.x = b.x(+)
MSSQLServer (although it's been deprecated in the 2000 version)/Sybase:
SELECT a.foo, b.foo
FROM a, b
WHERE a.x *= b.x
But returning to your question: I don't know the answer, but it is probably related to the fact that a join is more natural (syntactically, at least) than adding an expression to a where clause when you are doing exactly that: joining.
Answered by Gavin H
Well, the first and second queries may yield different results, because a LEFT JOIN includes all records from the first table, even if there are no corresponding records in the right table.
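A minimal sketch of that difference, running the two queries from the question through Python's sqlite3 module. The tables a and b and their columns come from the question; the row values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER, foo TEXT);
    CREATE TABLE b (x INTEGER, foo TEXT);

    -- Hypothetical data: only x = 1 exists in both tables.
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (1, 'b1'), (3, 'b3');
""")

# First query from the question: comma join, which is an inner join.
comma = conn.execute("""
    SELECT a.foo, b.foo
    FROM a, b
    WHERE a.x = b.x
    ORDER BY a.x
""").fetchall()

# Second query from the question: keeps rows of a that have no match in b.
left = conn.execute("""
    SELECT a.foo, b.foo
    FROM a
    LEFT JOIN b ON a.x = b.x
    ORDER BY a.x
""").fetchall()

print(comma)  # [('a1', 'b1')]
print(left)   # [('a1', 'b1'), ('a2', None)]
```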
Answered by kemiller2002
I hear a lot of people complain that the first one is too difficult to understand and that it is unclear. I don't see a problem with it, but after having that discussion, I use the second one even on INNER JOINs for clarity.