Original question: http://stackoverflow.com/questions/16360860/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Does SQL JOIN order affect performance?
Asked by
I was just tidying up some SQL when I came across this query:
SELECT
    jm.IMEI ,
    jm.MaxSpeedKM ,
    jm.MaxAccel ,
    jm.MaxDeccel ,
    jm.JourneyMaxLeft ,
    jm.JourneyMaxRight ,
    jm.DistanceKM ,
    jm.IdleTimeSeconds ,
    jm.WebUserJourneyId ,
    jm.lifetime_odo_metres ,
    jm.[Descriptor]
FROM dbo.Reporting_WebUsers AS wu WITH (NOLOCK)
    INNER JOIN dbo.Reporting_JourneyMaster90 AS jm WITH (NOLOCK) ON wu.WebUsersId = jm.WebUsersId
    INNER JOIN dbo.Reporting_Journeys AS j WITH (NOLOCK) ON jm.WebUserJourneyId = j.WebUserJourneyId
WHERE ( wu.isActive = 1 )
    AND ( j.JourneyDuration > 2 )
    AND ( j.JourneyDuration < 1000 )
    AND ( j.JourneyDistance > 0 )
My question is: does the order of the joins make any performance difference? For the above query I would have written
FROM dbo.Reporting_JourneyMaster90 AS jm
and then joined the other two tables to that one.
Accepted answer by Mike M.
No, the JOIN order is changed during optimization.
The only caveat is OPTION (FORCE ORDER), which forces the joins to happen in the exact order in which you have specified them.
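As a rough sketch, this is how the hint would look on the query from the question. The column list is shortened and the NOLOCK hints are left out for brevity; nothing here has been run against the asker's schema.

```sql
-- Sketch only: the question's query with the FORCE ORDER query hint appended.
SELECT jm.IMEI, jm.MaxSpeedKM, jm.DistanceKM
FROM dbo.Reporting_WebUsers AS wu
    INNER JOIN dbo.Reporting_JourneyMaster90 AS jm ON wu.WebUsersId = jm.WebUsersId
    INNER JOIN dbo.Reporting_Journeys AS j ON jm.WebUserJourneyId = j.WebUserJourneyId
WHERE wu.isActive = 1
    AND j.JourneyDuration > 2
    AND j.JourneyDuration < 1000
    AND j.JourneyDistance > 0
OPTION (FORCE ORDER);  -- join in the written order: wu, then jm, then j
```

Without the OPTION clause the optimizer is free to pick a different join order than the one written.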
Answer by Kitster
Join order in SQL Server 2008 R2 unquestionably affects query performance, particularly in queries with a large number of table joins and WHERE clauses applied against multiple tables.
Although the join order is changed during optimisation, the optimiser doesn't try all possible join orders. It stops when it finds what it considers a workable solution, because the very act of optimisation uses precious resources.
We have seen queries that were performing like dogs (execution times of over a minute) come down to sub-second performance just by changing the order of the join expressions. Please note, however, that these are queries with 12 to 20 joins and WHERE clauses on several of the tables.
The trick is to set your order to help the query optimiser figure out what makes sense. You can use FORCE ORDER, but that can be too rigid. Try to make sure that your join order starts with the tables whose WHERE clauses will reduce the data the most.
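Applied to the query from the question, that advice might look like the sketch below. It assumes that the filters on Reporting_Journeys remove the most rows, which is a guess; the real selectivity would have to be checked against the data. The column list is shortened.

```sql
-- Sketch only: same query, rewritten to start from the table that carries
-- most of the WHERE filters (assumed here to be the most selective one).
SELECT jm.IMEI, jm.MaxSpeedKM, jm.DistanceKM
FROM dbo.Reporting_Journeys AS j
    INNER JOIN dbo.Reporting_JourneyMaster90 AS jm ON jm.WebUserJourneyId = j.WebUserJourneyId
    INNER JOIN dbo.Reporting_WebUsers AS wu ON wu.WebUsersId = jm.WebUsersId
WHERE j.JourneyDuration > 2
    AND j.JourneyDuration < 1000
    AND j.JourneyDistance > 0
    AND wu.isActive = 1;
```

Because these are all inner joins, the rewritten query returns the same rows; only the starting point offered to the optimiser changes.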
Answer by Dave
I have a clear example of inner join order affecting performance. It is a simple join between two tables. One has 50+ million records, the other has 2,000. If I select from the smaller table and join the larger, it takes 5+ minutes.
If I select from the larger table and join the smaller, it takes 2 min 30 seconds.
This is with SQL Server 2012.
To me this is counterintuitive, since I am using the largest dataset for the initial query.
Answer by Yaroslav
JOIN order doesn't matter; the query engine will reorganize the joins based on statistics for indexes and other factors.
To test this, do the following:
- select "show actual execution plan" and run the first query
- change the JOIN order and run the query again
- compare the execution plans
They should be identical, as the query engine will reorganize them according to other factors.
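The same comparison can be scripted instead of clicked through. The sketch below uses the tables from the question with a shortened column list; the SET STATISTICS options are standard T-SQL, but the queries themselves are untested against the asker's schema.

```sql
-- Sketch only: run both join orders and compare plans, I/O and timings.
SET STATISTICS IO ON;    -- logical/physical reads per table
SET STATISTICS TIME ON;  -- parse, compile and execution times
SET STATISTICS XML ON;   -- returns the actual execution plan with each result

-- Variant A: join order as written in the question
SELECT jm.IMEI
FROM dbo.Reporting_WebUsers AS wu
    INNER JOIN dbo.Reporting_JourneyMaster90 AS jm ON wu.WebUsersId = jm.WebUsersId
    INNER JOIN dbo.Reporting_Journeys AS j ON jm.WebUserJourneyId = j.WebUserJourneyId
WHERE wu.isActive = 1 AND j.JourneyDistance > 0;

-- Variant B: starting from Reporting_JourneyMaster90, as the asker would have done
SELECT jm.IMEI
FROM dbo.Reporting_JourneyMaster90 AS jm
    INNER JOIN dbo.Reporting_WebUsers AS wu ON wu.WebUsersId = jm.WebUsersId
    INNER JOIN dbo.Reporting_Journeys AS j ON jm.WebUserJourneyId = j.WebUserJourneyId
WHERE wu.isActive = 1 AND j.JourneyDistance > 0;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

If the two plans differ, the optimizer did not normalize the join order for this query, and the timings show which variant it handled better.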
As mentioned in another answer, you could use OPTION (FORCE ORDER) to get exactly the order you want, but it may not be the most efficient one.
As a general rule of thumb, put the table with the fewest records first and the table with the most records last. In some DBMS engines the order can make a difference, as it does when the FORCE ORDER hint is used, and this ordering helps limit the results.
Answer by Denis de Bernardy
Usually not. I'm not 100% sure this applies verbatim to SQL Server, but in Postgres the query planner reserves the right to reorder the inner joins as it sees fit. The exception is when you reach a threshold beyond which it's too expensive to investigate changing their order.
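In PostgreSQL that threshold is, as far as I know, exposed as the join_collapse_limit setting (8 by default). A minimal sketch of inspecting and changing it, using hypothetical tables a, b and c that are not from the question:

```sql
-- PostgreSQL sketch; tables a, b, c and their columns are hypothetical.
SHOW join_collapse_limit;     -- explicit JOINs are reordered only up to this many items

SET join_collapse_limit = 1;  -- session-level: keep explicit JOINs in the written order

EXPLAIN
SELECT a.id
FROM a
    JOIN b ON b.a_id = a.id
    JOIN c ON c.b_id = b.id;

RESET join_collapse_limit;    -- back to the configured default
```

Comparing the EXPLAIN output before and after the SET shows whether the planner would otherwise have chosen a different join order.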
Answer by SQL guy
Wrong. In SQL Server 2005 it definitely matters, since you are limiting the dataset from the beginning of the FROM clause. If you start with 2,000 records instead of 2 million, it makes your query faster.