.net 为什么 LINQ JOIN 比用 WHERE 链接快这么多?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/5551264/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Why is LINQ JOIN so much faster than linking with WHERE?
提问by Tim Schmelter
I've recently upgraded to VS 2010 and am playing around with LINQ to Dataset. I have a strong typed dataset for Authorization that is in HttpCache of an ASP.NET WebApplication.
我最近升级到 VS 2010 并且正在使用 LINQ to Dataset。我有一个用于授权的强类型数据集,它位于 ASP.NET WebApplication 的 HttpCache 中。
So i wanted to know what actually is the fastest way to check if a user is authorized to do something. Hereis my datamodel and some other informations if somebody is interested.
所以我想知道检查用户是否有权做某事的最快方法实际上是什么。如果有人感兴趣,这是我的数据模型和其他一些信息。
I have checked 3 ways:
我检查了 3 种方式:
- direct database
- LINQ query with Whereconditions as "Join" - Syntax
- LINQ query with Join- Syntax
- 直接数据库
- 将Where条件设为“Join”的LINQ 查询- 语法
- 使用Join 的LINQ 查询- 语法
These are the results with 1000 calls on each function:
这些是对每个函数调用 1000 次的结果:
1.Iteration:
1.迭代:
- 4,2841519 sec.
- 115,7796925 sec.
- 2,024749 sec.
- 4,2841519 秒
- 115,7796925 秒。
- 2,024749 秒
2.Iteration:
2.迭代:
- 3,1954857 sec.
- 84,97047 sec.
- 1,5783397 sec.
- 3,1954857 秒。
- 84,97047 秒。
- 1,5783397 秒。
3.Iteration:
3.迭代:
- 2,7922143 sec.
- 97,8713267 sec.
- 1,8432163 sec.
- 2,7922143 秒。
- 97,8713267 秒。
- 1,8432163 秒。
Average:
平均数:
- Database: 3,4239506333 sec.
- Where: 99,5404964 sec.
- Join: 1,815435 sec.
- 数据库:3,4239506333 秒。
- 其中:99,5404964 秒。
- 加入:1,815435 秒。
Why is the Join-version so much faster than the where-syntax which makes it useless although as a LINQ newbie it seems to be the most legible. Or have i missed something in my queries?
为什么 Join-version 比 where-syntax 快得多,这使得它毫无用处,尽管作为 LINQ 新手它似乎是最清晰的。或者我在查询中遗漏了什么?
Here are the LINQ queries, i skip the database:
这是 LINQ 查询,我跳过了数据库:
Where:
哪里:
Public Function hasAccessDS_Where(ByVal accessRule As String) As Boolean
Dim userID As Guid = DirectCast(Membership.GetUser.ProviderUserKey, Guid)
Dim query = From accRule In Authorization.dsAuth.aspnet_AccessRule, _
roleAccRule In Authorization.dsAuth.aspnet_RoleAccessRule, _
role In Authorization.dsAuth.aspnet_Roles, _
userRole In Authorization.dsAuth.aspnet_UsersInRoles _
Where accRule.idAccessRule = roleAccRule.fiAccessRule _
And roleAccRule.fiRole = role.RoleId _
And userRole.RoleId = role.RoleId _
And userRole.UserId = userID And accRule.RuleName.Contains(accessRule)
Select accRule.idAccessRule
Return query.Any
End Function
Join:
加入:
Public Function hasAccessDS_Join(ByVal accessRule As String) As Boolean
Dim userID As Guid = DirectCast(Membership.GetUser.ProviderUserKey, Guid)
Dim query = From accRule In Authorization.dsAuth.aspnet_AccessRule _
Join roleAccRule In Authorization.dsAuth.aspnet_RoleAccessRule _
On accRule.idAccessRule Equals roleAccRule.fiAccessRule _
Join role In Authorization.dsAuth.aspnet_Roles _
On role.RoleId Equals roleAccRule.fiRole _
Join userRole In Authorization.dsAuth.aspnet_UsersInRoles _
On userRole.RoleId Equals role.RoleId _
Where userRole.UserId = userID And accRule.RuleName.Contains(accessRule)
Select accRule.idAccessRule
Return query.Any
End Function
Thank you in advance.
先感谢您。
Edit: after some improvements on both queries to get more meaningful perfomance-values, the advantage of the JOIN is even many times greater than before:
编辑:在对两个查询进行一些改进以获得更有意义的性能值之后,JOIN 的优势甚至比以前大很多倍:
Join:
加入:
Public Overloads Shared Function hasAccessDS_Join(ByVal userID As Guid, ByVal idAccessRule As Int32) As Boolean
Dim query = From accRule In Authorization.dsAuth.aspnet_AccessRule _
Join roleAccRule In Authorization.dsAuth.aspnet_RoleAccessRule _
On accRule.idAccessRule Equals roleAccRule.fiAccessRule _
Join role In Authorization.dsAuth.aspnet_Roles _
On role.RoleId Equals roleAccRule.fiRole _
Join userRole In Authorization.dsAuth.aspnet_UsersInRoles _
On userRole.RoleId Equals role.RoleId _
Where accRule.idAccessRule = idAccessRule And userRole.UserId = userID
Select role.RoleId
Return query.Any
End Function
Where:
哪里:
Public Overloads Shared Function hasAccessDS_Where(ByVal userID As Guid, ByVal idAccessRule As Int32) As Boolean
Dim query = From accRule In Authorization.dsAuth.aspnet_AccessRule, _
roleAccRule In Authorization.dsAuth.aspnet_RoleAccessRule, _
role In Authorization.dsAuth.aspnet_Roles, _
userRole In Authorization.dsAuth.aspnet_UsersInRoles _
Where accRule.idAccessRule = roleAccRule.fiAccessRule _
And roleAccRule.fiRole = role.RoleId _
And userRole.RoleId = role.RoleId _
And accRule.idAccessRule = idAccessRule And userRole.UserId = userID
Select role.RoleId
Return query.Any
End Function
Result for 1000 calls (on a faster computer)
1000 次调用的结果(在更快的计算机上)
- Join | 2. Where
- 加入 | 2. 哪里
1.Iteration:
1.迭代:
- 0,0713669 sec.
- 12,7395299 sec.
- 0,0713669 秒
- 12,7395299 秒。
2.Iteration:
2.迭代:
- 0,0492458 sec.
- 12,3885925 sec.
- 0,0492458 秒
- 12,3885925 秒。
3.Iteration:
3.迭代:
- 0,0501982 sec.
- 13,3474216 sec.
- 0,0501982 秒
- 13,3474216 秒。
Average:
平均数:
- Join: 0,0569367 sec.
- Where: 12,8251813 sec.
- 加入:0,0569367 秒。
- 其中:12,8251813 秒。
Join is 225 times faster
加入速度提高 225 倍
Conclusion:avoid WHERE to specify relations and use JOIN whenever possible(definitely in LINQ to DataSetand Linq-To-Objectsin general).
结论:尽可能避免在 WHERE 指定关系并使用 JOIN(绝对在LINQ to DataSet和Linq-To-Objects一般情况下)。
采纳答案by Thomas Levesque
Your first approach (SQL query in the DB) is quite efficient because the DB knows how to perform a join. But it doesn't really make sense to compare it with the other approaches, since they work directly in memory (Linq to DataSet)
The query with multiple tables and a
Wherecondition actually performs a cartesian productof all the tables, thenfilters the rows that satisfy the condition. This means theWherecondition is evaluated for each combination of rows (n1 * n2 * n3 * n4)The
Joinoperator takes the rows from the first tables, then takes only the rows with a matching key from the second table, then only the rows with a matching key from the third table, and so on. This is much more efficient, because it doesn't need to perform as many operations
您的第一种方法(数据库中的 SQL 查询)非常有效,因为数据库知道如何执行连接。但是将其与其他方法进行比较并没有什么意义,因为它们直接在内存中工作(Linq to DataSet)
带有多个表和一个
Where条件的查询实际上是对所有表执行笛卡尔积,然后过滤满足条件的行。这意味着Where对每个行组合 (n1 * n2 * n3 * n4) 评估条件该
Join运算符从第一个表中获取行,然后只从第二个表中获取具有匹配键的行,然后只从第三个表中获取具有匹配键的行,依此类推。这样效率更高,因为它不需要执行尽可能多的操作
回答by Guffa
The Joinis much faster, because the method knows how to combine the tables to reduce the result to the relevant combinations. When you use Whereto specify the relation, it has to create every possible combination, and then test the condition to see which combinations are relevant.
的Join要快得多,因为方法知道如何将表结合起来,以减少结果的相关组合。当您使用Where指定关系时,它必须创建每个可能的组合,然后测试条件以查看哪些组合是相关的。
The Joinmethod can set up a hash table to use as an index to quicky zip two tables together, while the Wheremethod runs after all the combinations are already created, so it can't use any tricks to reduce the combinations beforehand.
该Join方法可以设置一个哈希表作为索引,将两个表快速压缩在一起,而该Where方法在所有组合都已经创建后运行,因此它不能使用任何技巧来预先减少组合。
回答by phillip
what you really need to know is the sql that was created for the two statements. There are a few ways of getting to it but the simplest is to use LinqPad. There are several buttons right above the query results that will change to the sql. That will give you a lot more information than anything else.
您真正需要知道的是为这两个语句创建的 sql。有几种方法可以实现,但最简单的是使用 LinqPad。查询结果的正上方有几个按钮会变成sql。这将为您提供比其他任何东西都多的信息。
Great information you shared there though.
你在那里分享了很好的信息。

