SQL filter criteria in the join criteria or the WHERE clause: which is more efficient?
Original question: http://stackoverflow.com/questions/1401889/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by Gratzy
I have a relatively simple query joining two tables. The "Where" criteria can be expressed either in the join criteria or as a where clause. I'm wondering which is more efficient.
The query finds the maximum sales for each salesman from the beginning of time until they were promoted.
Case 1
select salesman.salesmanid, max(sales.quantity)
from salesman
inner join sales on salesman.salesmanid = sales.salesmanid
    and sales.salesdate < salesman.promotiondate
group by salesman.salesmanid
Case 2
select salesman.salesmanid, max(sales.quantity)
from salesman
inner join sales on salesman.salesmanid = sales.salesmanid
where sales.salesdate < salesman.promotiondate
group by salesman.salesmanid
Note that Case 1 lacks a WHERE clause altogether.
The RDBMS is SQL Server 2005.
EDIT: If the second piece of the join criteria or the WHERE clause were sales.salesdate < some fixed date, so that it is not actually a criterion for joining the two tables, does that change the answer?
Accepted answer by marc_s
I wouldn't use performance as the deciding factor here - and quite honestly, I don't think there's any measurable performance difference between those two cases, really.
I would always use case #2 - why? Because in my opinion, you should only put the actual criteria that establish the JOIN between the two tables into the JOIN clause - everything else belongs in the WHERE clause.
It's just a matter of keeping things clean and putting things where they belong, IMO.
Obviously, there are cases with LEFT OUTER JOINs where the placement of the criteria does make a difference in terms of what results get returned - those cases would be excluded from my recommendation, of course.
Marc
Answer by Kevin LaBranche
You can run the execution plan estimator and sql profiler to see how they stack up against each other.
However, they are semantically the same under the hood according to this SQL Server MVP:
http://www.eggheadcafe.com/conversation.aspx?messageid=29145383&threadid=29145379
Answer by Spencer Evans
I prefer to have any hard-coded criteria in the join. It makes the SQL much more readable and portable.
Readability: You can see exactly what data you're going to get because all the table criteria are written right there in the join. In large statements, the criteria may be buried among 50 other expressions and are easily missed.
Portability: You can just copy a chunk out of the FROM clause and paste it somewhere else. That gives you the join and any criteria that go with it. If you always use those criteria when joining those two tables, then putting them in the join is the most logical choice.
For example:
FROM
    table1 t1
    JOIN table2 t2_ABC ON
        t1.c1 = t2_ABC.c1 AND
        t2_ABC.c2 = 'ABC'
If you need to get a second column out of table 2, you just copy that block into Notepad, search/replace "ABC", and presto: an entire new block of code ready to paste back in.
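The copy-and-replace idea above can be sketched as follows. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than SQL Server, and the `val` column, the 'XYZ' value, and the sample rows are hypothetical and not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for the real tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (c1 INTEGER PRIMARY KEY);
    CREATE TABLE table2 (c1 INTEGER, c2 TEXT, val TEXT);  -- val is hypothetical
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES
        (1, 'ABC', 'abc-for-1'), (1, 'XYZ', 'xyz-for-1'),
        (2, 'ABC', 'abc-for-2'), (2, 'XYZ', 'xyz-for-2');
""")

# The second JOIN block is the first one with "ABC" replaced by "XYZ".
# Each block carries its own hard-coded criterion, so nothing has to be
# added to a WHERE clause when the block is copied.
rows = conn.execute("""
    SELECT t1.c1, t2_ABC.val, t2_XYZ.val
    FROM
        table1 t1
        JOIN table2 t2_ABC ON
            t1.c1 = t2_ABC.c1 AND
            t2_ABC.c2 = 'ABC'
        JOIN table2 t2_XYZ ON
            t1.c1 = t2_XYZ.c1 AND
            t2_XYZ.c2 = 'XYZ'
    ORDER BY t1.c1
""").fetchall()

print(rows)
```

Because each alias keeps its filter next to its join condition, the two copies of table2 cannot be confused with each other.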
Additional: It's also easier to change between an inner and outer join without having to worry about any criteria that may be floating around in the WHERE clause.
I reserve the WHERE clause strictly for run-time criteria where possible.
As for efficiency: If you're referring to execution speed, then, as everyone else has stated, it makes no difference. If you're referring to easier debugging and reuse, then I prefer option 1.
Answer by Michael Rajesh
There is one thing I noticed that I want to mention. Before that, though: both ways may give the same performance, or putting the criteria in the WHERE clause may be a little faster, as some answers suggest.
But I identified one difference that you can use for your logical needs:
Putting the criteria in the ON clause of an outer join does not filter out rows; instead, the joined table's columns are NULL for the rows that fail the condition.
Putting the criteria in the WHERE clause filters rows out of the entire result.
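A minimal sketch of that difference, assuming SQLite through Python's sqlite3 module as a stand-in for SQL Server. The table layout follows the question, but the sample rows and dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesman (salesmanid INTEGER PRIMARY KEY, promotiondate TEXT);
    CREATE TABLE sales (salesmanid INTEGER, salesdate TEXT, quantity INTEGER);
    -- Hypothetical data: salesman 2 only has a sale AFTER the promotion date.
    INSERT INTO salesman VALUES (1, '2009-06-01'), (2, '2009-06-01');
    INSERT INTO sales VALUES (1, '2009-01-15', 10), (2, '2009-07-01', 99);
""")

# Date filter in the ON clause: salesman 2 is kept, with NULL for the sales side.
on_rows = conn.execute("""
    SELECT salesman.salesmanid, MAX(sales.quantity)
    FROM salesman
    LEFT OUTER JOIN sales ON salesman.salesmanid = sales.salesmanid
        AND sales.salesdate < salesman.promotiondate
    GROUP BY salesman.salesmanid
    ORDER BY salesman.salesmanid
""").fetchall()

# Date filter in the WHERE clause: salesman 2's only row is filtered out entirely.
where_rows = conn.execute("""
    SELECT salesman.salesmanid, MAX(sales.quantity)
    FROM salesman
    LEFT OUTER JOIN sales ON salesman.salesmanid = sales.salesmanid
    WHERE sales.salesdate < salesman.promotiondate
    GROUP BY salesman.salesmanid
    ORDER BY salesman.salesmanid
""").fetchall()

print(on_rows)     # [(1, 10), (2, None)]
print(where_rows)  # [(1, 10)]
```

With an inner join the two placements return the same rows; the difference only shows up once the join is an outer join.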
Answer by Scott Ivey
I don't think you'll find a definitive answer for this one that applies to all cases. The two are not always interchangeable, since for some queries (some left joins) you will get different results depending on whether you place the criteria in the WHERE clause or in the FROM clause.
In your case, you should evaluate both of these queries. In SSMS, you can view the estimated and actual execution plans of both queries; that would be a good first step in determining which is more efficient. You could also view the time and I/O for each (SET STATISTICS TIME ON, SET STATISTICS IO ON), which will also give you information to make your decision.
In the case of the queries in your question - I'd bet that they'll both come out with the same query plan - so in this case it may not matter, but in others it could potentially produce different plans.
Try this to see the difference between the two:
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Case 1: date filter in the join criteria
select salesman.salesmanid,
    max(sales.quantity)
from salesman
inner join sales on salesman.salesmanid = sales.salesmanid
    and sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

-- Case 2: date filter in the WHERE clause
select salesman.salesmanid,
    max(sales.quantity)
from salesman
inner join sales on salesman.salesmanid = sales.salesmanid
where sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
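If no SQL Server instance is at hand, the same comparison can be roughed out elsewhere. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module, SQLite's EXPLAIN QUERY PLAN instead of the SSMS execution plan, and invented sample rows. It says nothing about what SQL Server's optimizer does; it only shows the shape of the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesman (salesmanid INTEGER PRIMARY KEY, promotiondate TEXT);
    CREATE TABLE sales (salesmanid INTEGER, salesdate TEXT, quantity INTEGER);
    -- Hypothetical sample data.
    INSERT INTO salesman VALUES (1, '2009-06-01'), (2, '2009-02-01');
    INSERT INTO sales VALUES
        (1, '2009-01-15', 10), (1, '2009-03-01', 25), (1, '2009-08-01', 40),
        (2, '2009-01-10', 7),  (2, '2009-05-01', 50);
""")

CASE_1 = """
    SELECT salesman.salesmanid, MAX(sales.quantity)
    FROM salesman
    INNER JOIN sales ON salesman.salesmanid = sales.salesmanid
        AND sales.salesdate < salesman.promotiondate
    GROUP BY salesman.salesmanid
    ORDER BY salesman.salesmanid
"""

CASE_2 = """
    SELECT salesman.salesmanid, MAX(sales.quantity)
    FROM salesman
    INNER JOIN sales ON salesman.salesmanid = sales.salesmanid
    WHERE sales.salesdate < salesman.promotiondate
    GROUP BY salesman.salesmanid
    ORDER BY salesman.salesmanid
"""

def plan(sql):
    """Return the human-readable steps of SQLite's query plan (last column)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

result_1 = conn.execute(CASE_1).fetchall()
result_2 = conn.execute(CASE_2).fetchall()
plan_1, plan_2 = plan(CASE_1), plan(CASE_2)

print("same rows:", result_1 == result_2)
print("case 1 plan:", plan_1)
print("case 2 plan:", plan_2)
```

Comparing the two printed plans side by side is the SQLite analogue of comparing the two execution plans in SSMS.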
Answer by Pete Alvin
Case 1 (criteria in the JOIN) is better for encapsulation, and increased encapsulation is usually a good thing: fewer omissions when copying and pasting into another query, fewer bugs if the join is later converted to a LEFT JOIN, and better readability (related things stay together, with less "noise" in the WHERE clause). In this case, the WHERE clause only captures principal table criteria or criteria that span multiple tables.
Answer by Austin Salonen
Become familiar with the Estimated Execution Plan in SQL Server Management Studio!! As others have said, you're at the mercy of the analyzer no matter what you do, so trust its estimates. I would guess the two you provided would produce the exact same plan.
If it's an attempt to change a development culture, pick the one that gives you a better plan; for the ones that are identical, follow the culture.
I've made this comment on other "efficiency" posts like this one (it's both sincere and sarcastic): if this is where your bottlenecks reside, then high-five to you and your team.
Answer by Bryan S.
Neither is more efficient. Using the WHERE method is considered the old way of doing it (http://msdn.microsoft.com/en-us/library/ms190014.aspx). You can look at the execution plan and see that they do the same thing.
Answer by Craig Shearer
It may seem flippant, but the answer is whichever query the query analyzer produces the most efficient plan for.
To my mind, they seem to be equivalent, so the query analyzer may well produce identical plans, but you'd have to test.