SQL: Performance of inner join compared to cross join

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/670980/

Performance of inner join compared to cross join

Tags: sql, performance

Asked by soulmerge

The effect of issuing an inner join is the same as stating a cross join with the join condition in the WHERE clause. I noticed that many people in my company use cross joins where I would use inner joins. I didn't notice any significant performance gain after changing some of these queries, and I was wondering whether that was just a coincidence or whether the DBMS (MySQL in our case) optimizes such queries transparently. Here is a concrete example for discussion:

SELECT User.*
FROM User, Address
WHERE User.addressId = Address.id;

SELECT User.*
FROM User
INNER JOIN Address ON (User.addressId = Address.id);

Accepted answer by karim79

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

If a WHERE clause is added to a cross join, it behaves as an inner join as the WHERE imposes a limiting factor.
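
To make the row counts above concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the tables `a` and `b` are hypothetical, invented only for this illustration. It builds a 6-row and a 3-row table, then compares a bare cross join, a cross join with a WHERE condition, and an inner join.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY);
""")
# 6 rows in a, each pointing at exactly one of the 3 rows in b
conn.executemany("INSERT INTO a (id, b_id) VALUES (?, ?)",
                 [(i, (i % 3) + 1) for i in range(1, 7)])
conn.executemany("INSERT INTO b (id) VALUES (?)", [(i,) for i in range(1, 4)])

# Every combination of rows: 6 * 3
cross = conn.execute("SELECT a.id, b.id FROM a CROSS JOIN b").fetchall()

# Cross join restricted by WHERE versus an explicit inner join
cross_where = conn.execute(
    "SELECT a.id, b.id FROM a CROSS JOIN b "
    "WHERE a.b_id = b.id ORDER BY a.id").fetchall()
inner = conn.execute(
    "SELECT a.id, b.id FROM a INNER JOIN b "
    "ON a.b_id = b.id ORDER BY a.id").fetchall()

print(len(cross))            # 18
print(len(inner))            # 6
print(cross_where == inner)  # True
```

The result sets are identical; whether the engine also executes the two forms identically is a separate question that the query plan answers.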

As long as your queries abide by common sense and vendor-specific performance guidelines, I like to think of the choice of join type as a simple matter of taste.

Answer by Otávio Décio

There is no difference, other than that the inner join is a lot clearer because it defines the join, leaving the WHERE clause to be the actual limiting condition.

Answer by Emil H

Use EXPLAIN to view the query plan for both queries, and see if there's any difference. Quite possibly MySQL will use the same execution plan in both cases. I use the INNER JOIN syntax mainly because it's a lot clearer.
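
The same kind of plan comparison can be scripted. The sketch below is an illustration under a loud assumption: it uses Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`, not MySQL's `EXPLAIN`, so the plan text will look different from what MySQL prints. The column definitions for `User` and `Address` are also guesses, since the question only shows `id` and `addressId`.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table layouts are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT, addressId INTEGER);
""")

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    # Each plan row has four columns; the last one holds the description.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

comma_plan = plan(
    "SELECT User.* FROM User, Address WHERE User.addressId = Address.id")
join_plan = plan(
    "SELECT User.* FROM User INNER JOIN Address ON (User.addressId = Address.id)")

for step in comma_plan:
    print(step)
# Reports whether the two syntaxes were planned identically
print(comma_plan == join_plan)
```

On MySQL itself, the equivalent check is simply to prefix each of the two queries with `EXPLAIN` and compare the output by eye.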

Answer by Jonathan

I find that workplaces that allow the first syntax (comma-separated tables) tend to spend significant time debugging cases where more rows are returned than intended. Unintentional cross joins are the bane of a system, and can bring even the most well-tuned database to its knees. They have brought our pre-prod system to a screeching halt on at least two occasions in the last year.

The second syntax (join syntax) forces the writer to think about how the tables are joined together first, and then only return the interesting rows. It is impossible to accidentally do a cross join using this syntax, and thus the danger of accidental poorly performing queries is reduced.
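
As an illustration of how such an accident happens with the comma syntax, here is a small sketch. It assumes Python's built-in sqlite3 module in place of MySQL, and the `Orders` table and all sample rows are hypothetical additions to the `User` and `Address` tables from the question. Dropping a single condition from the WHERE clause silently turns one of the joins into a cross join.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Orders and the sample data are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT, addressId INTEGER);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, userId INTEGER);
    INSERT INTO Address VALUES (1, 'Oslo'), (2, 'Lima'), (3, 'Kyoto');
    INSERT INTO User VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'cy', 3);
    INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Intended query: every table is tied to another one by a condition
intended = conn.execute("""
    SELECT Orders.id FROM Orders, User, Address
    WHERE Orders.userId = User.id AND User.addressId = Address.id
""").fetchall()

# Same query with the Address condition forgotten:
# each order is now paired with every address
accidental = conn.execute("""
    SELECT Orders.id FROM Orders, User, Address
    WHERE Orders.userId = User.id
""").fetchall()

print(len(intended), len(accidental))  # 4 12
```

With three addresses the result is only three times too large; on production-sized tables the same omission multiplies the row count by the size of the unconstrained table.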

However, that issue aside, I have never noticed any speed difference between the two syntaxes in any system I have worked with.

Answer by HLGEM

The first example is functionally the same as the second example. However, this syntax should be avoided for several reasons. First, it is much easier to accidentally get a cross join when using this syntax, especially when there are multiple joins in the query. If you see a lot of this type of query with the keyword DISTINCT, you probably have someone who is trying to fix the cross joins.

Next, the left and right join syntax using the older style is deprecated and will no longer be supported. Further, it doesn't work correctly now anyway: sometimes it misinterprets the outer join and sends back the wrong result set. So any queries you have using *= or =* in the WHERE clause should be replaced immediately.

Third, ANSI standard joins are easier to understand and maintain. An understanding of joins is one of the most critical basic skills that anyone querying any relational database needs to have. It has been my experience that some people who use the older style don't really understand joins and how they work and thus write queries that do not actually do what they intended.

Answer by Ozgur Ozturk

One additional benefit of the first syntax is you can be more general in your limiting condition. Not just equality.

But if you are using equality, why trust the optimizer? Make sure it won't first generate the cross join and then eliminate rows. Use the second one.
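
As a side note on non-equality conditions, the sketch below shows a range condition written in both styles. It assumes Python's built-in sqlite3 module rather than MySQL, and the `Price` and `Band` tables are hypothetical. In SQLite at least, the ON clause accepts the same kind of general condition that the WHERE clause does, and both forms return the same rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Price and Band are made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Price (amount INTEGER);
    CREATE TABLE Band (name TEXT, low INTEGER, high INTEGER);
    INSERT INTO Price VALUES (5), (15), (25);
    INSERT INTO Band VALUES ('cheap', 0, 9), ('mid', 10, 19), ('high', 20, 99);
""")

# Comma syntax with a range (non-equality) condition in WHERE
comma = conn.execute("""
    SELECT Price.amount, Band.name FROM Price, Band
    WHERE Price.amount BETWEEN Band.low AND Band.high
    ORDER BY Price.amount
""").fetchall()

# Join syntax with the same range condition in ON
joined = conn.execute("""
    SELECT Price.amount, Band.name FROM Price
    INNER JOIN Band ON Price.amount BETWEEN Band.low AND Band.high
    ORDER BY Price.amount
""").fetchall()

print(comma)            # [(5, 'cheap'), (15, 'mid'), (25, 'high')]
print(comma == joined)  # True
```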

Answer by Brimstedt

The order in which you join tables or place your ON / WHERE conditions should not matter.

The query optimizer should optimize and use the best order anyway (choosing how best to filter the data, where to start, etc.).

Like many others, though, I suggest using the INNER JOIN syntax, since it makes things much more readable and is more consistent with the syntax of LEFT or FULL joins as well.

There's a somewhat more in-depth text about it here: http://linus.brimstedt.se/?/article/articleview/SQLSyntax

/B

Answer by nisiumi

The SQL Server documentation describes "When a WHERE turns a Cross Join into an Inner Join", so there is no difference. http://msdn.microsoft.com/en-us/library/ms190690.aspx

I checked the SQL Server execution plan for both queries, and the performance is the same.

Answer by minhas23

Running EXPLAIN on both queries gives the same output:

mysql> explain select * from t T1, t T2 where T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from t T1  join t T2 on T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

But using the inner join syntax is preferable, as it is clearer and more precise. MySQL may internally tune LEFT and RIGHT join queries to select less data compared to a cross join.

Answer by Tegiri Nenashi

Since the beginning of time, optimizers have been built around the classic restrict-project-cartesian product syntax. Virtually all the vendors copied the design pioneered by System R. Then, grudgingly, vendors adopted "the latest-and-greatest" ANSI syntax and retrofitted their SQL execution engines. Contrary to what a marketing brochure may tell you ("use the latest syntax"), not much has changed at the physical implementation level: it is still [indexed] nested loops, or hash or sort-merge join. Therefore, there is no basis to assume superiority of one syntax over the other.

To my personal taste, the new syntax is redundant, noisy, and inconsistent. As to being sanctioned by the committee, "walk into any park in every city and you'll find no statue of committee".
