C#: .Include() vs .Load() performance in Entity Framework

Disclaimer: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/19319116/

Time: 2020-08-10 14:44:44  Source: igfitidea

.Include() vs .Load() performance in EntityFramework

Tags: c#, .net, entity-framework

Asked by Staeff

When querying a large table whose navigation properties you need to access later in code (I explicitly don't want to use lazy loading), which will perform better: .Include() or .Load()? Or why would you use one over the other?

In this example the included tables each have only about 10 entries and Employees has about 200, and it can happen that most of those will be loaded anyway with Include() because they match the Where() clause.

Context.Measurements.Include(m => m.Product)
                    .Include(m => m.ProductVersion)
                    .Include(m => m.Line)
                    .Include(m => m.MeasureEmployee)
                    .Include(m => m.MeasurementType)
                    .Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))
                    .ToList();

or

Context.Products.Load();
Context.ProductVersions.Load();
Context.Lines.Load();
Context.Employees.Load();
Context.MeasurementType.Load();

Context.Measurements.Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))
                    .ToList();

Accepted answer by Michael Edenfield

It depends, try both

When using Include(), you get the benefit of loading all of your data in a single call to the underlying data store. If this is a remote SQL Server, for example, that can be a major performance boost.

The downside is that Include() queries tend to get really complicated, especially if you have any filters (Where() calls, for example) or try to do any grouping. EF will generate very heavily nested queries using sub-SELECT and APPLY statements to get the data you want. It is also much less efficient: each row you get back contains every possible child-object column, so data for your top-level objects will be repeated many times. (For example, a single parent object with 10 children will produce 10 rows, each with the same data for the parent object's columns.) I've had single EF queries get so complex they caused deadlocks when running at the same time as EF update logic.
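
The row repetition described above can be illustrated without a database. The following is only an in-memory sketch using LINQ to Objects, not the SQL that EF actually generates; the parent/child data and all names in it are made up for illustration.

```csharp
using System;
using System.Linq;

// In-memory stand-ins for a parent table and a child table (hypothetical data).
var parents = new[] { new { Id = 1, Name = "Parent A" } };
var children = Enumerable.Range(1, 10)
    .Select(i => new { Id = i, ParentId = 1, Label = $"Child {i}" })
    .ToArray();

// A flat join, shaped like the result set of a single JOIN query:
// one row per child, each carrying a full copy of the parent's columns.
var rows = parents
    .Join(children,
          p => p.Id,
          c => c.ParentId,
          (p, c) => new { ParentId = p.Id, ParentName = p.Name, ChildId = c.Id, c.Label })
    .ToList();

int rowCount = rows.Count;
int distinctParents = rows.Select(r => r.ParentId).Distinct().Count();

Console.WriteLine($"{rowCount} rows for {distinctParents} parent");
// prints "10 rows for 1 parent"
```

With several included collections on the same parent, the repeated parent columns multiply accordingly, which is why wide Include() chains can transfer far more data than the entities themselves contain.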

The Load() method is much simpler. Each query is a single, easy, straightforward SELECT statement against a single table. These are much easier in every possible way, except that you have to do many of them (possibly many times more). If you have nested collections of collections, you may even need to loop through your top-level objects and Load their sub-objects. It can get out of hand.

Quick rule-of-thumb

Try to avoid having any more than three Include calls in a single query. I find that EF's queries get too ugly to recognize beyond that; it also matches my rule of thumb for SQL Server queries, that up to four JOIN statements in a single query work very well, but after that it's time to consider refactoring.

However, all of that is only a starting point.

It depends on your schema, your environment, your data, and many other factors.

In the end, you will just need to try it out each way.

Pick a reasonable "default" pattern to use, see if it's good enough, and if not, optimize to taste.

Answer by CodeCaster

Include() will be written to SQL as JOIN: one database roundtrip.

Each Load() instruction is "explicitly loading" the requested entities, so there is one database roundtrip per call.

Thus Include() will most probably be the more sensible choice in this case, but it depends on the database layout, how often this code is called and how long your DbContext lives. Why don't you try both ways, profile the queries and compare the timings?
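
One way to compare the timings is a small Stopwatch harness. This is only a sketch: the two actions below are empty placeholders (assumptions, not code from the question) that you would replace with the real Include() query and the real Load() sequence, ideally each on a fresh DbContext so that already-tracked entities do not skew the result.

```csharp
using System;
using System.Diagnostics;

// Runs an action several times and returns the average wall-clock time in ms.
// The first run is not timed, so one-off costs (query compilation,
// connection setup) do not distort the average.
double AverageMs(Action action, int runs)
{
    action(); // warm-up, not timed
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < runs; i++) action();
    sw.Stop();
    return sw.Elapsed.TotalMilliseconds / runs;
}

// Placeholders: put the Include() query and the Load() sequence here.
int includeCalls = 0, loadCalls = 0;
Action includeVariant = () => { includeCalls++; /* Include(...) query here */ };
Action loadVariant = () => { loadCalls++; /* Load() calls + query here */ };

double includeMs = AverageMs(includeVariant, 5);
double loadMs = AverageMs(loadVariant, 5);

Console.WriteLine($"Include: {includeMs:F3} ms, Load: {loadMs:F3} ms");
```

Wall-clock time on the client includes network latency and materialization, so it complements rather than replaces looking at the generated SQL in a database profiler.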

See Loading Related Entities.

Answer by Henk Jansen

Include is an example of eager loading, where you load not only the entities you are querying for but also the related entities you specify.

Load is a manual override of the lazy-loading setting (LazyLoadingEnabled). If that setting is false, you can still load the entity you asked for on demand with .Load().

Answer by MaxSC

It's always hard to decide whether to go with Eager, Explicit or even Lazy Loading.
What I would recommend in any case is to always do some profiling. That's the only way to be sure whether your request will perform well.
There are a lot of tools that will help you out. Have a look at this article from Julie Lerman where she lists several different ways to do profiling. One simple solution is to start profiling in your SQL Server Management Studio.
Do not hesitate to talk with a DBA (if you have one near you) who can help you understand the execution plan.
You could also have a look at this presentation where I wrote a section about loading data and performance.

Answer by Scott Munro

I agree with @MichaelEdenfield in his answer but I did want to comment on the nested collections scenario. You can get around having to do nested loops (and the many resulting calls to the database) by turning the query inside out.

Rather than looping down through a Customer's Orders collection and then performing another nested loop through each Order's OrderItems collection, say, you can query the OrderItems directly with a filter such as the following.

context.OrderItems.Where(x => x.Order.CustomerId == customerId);

You will get the same resulting data as the Loads within nested loops but with just a single call to the database.
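
To see that both shapes return the same items, here is an in-memory sketch using LINQ to Objects. The data, the flat OrderId/CustomerId keys and the lookup counter are assumptions for illustration; against a real database each inner lookup in the top-down version would be its own query, while the inside-out version is a single one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory data standing in for the Orders and OrderItems tables.
var orders = new[]
{
    new { Id = 1, CustomerId = 7 },
    new { Id = 2, CustomerId = 7 },
    new { Id = 3, CustomerId = 8 },
};
var orderItems = new[]
{
    new { Id = 10, OrderId = 1 },
    new { Id = 11, OrderId = 1 },
    new { Id = 12, OrderId = 2 },
    new { Id = 13, OrderId = 3 },
};
int customerId = 7;

// Top-down: walk the customer's orders, then fetch each order's items.
var nested = new List<int>();
int lookups = 0;
foreach (var order in orders.Where(o => o.CustomerId == customerId))
{
    lookups++; // one item lookup per order
    nested.AddRange(orderItems.Where(i => i.OrderId == order.Id).Select(i => i.Id));
}

// Inside out: start from the items and filter on the owning order's customer.
var direct = orderItems
    .Where(i => orders.Any(o => o.Id == i.OrderId && o.CustomerId == customerId))
    .Select(i => i.Id)
    .ToList();

bool same = nested.OrderBy(x => x).SequenceEqual(direct.OrderBy(x => x));
Console.WriteLine($"nested lookups: {lookups}, same items: {same}");
// prints "nested lookups: 2, same items: True"
```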

Also, there is a special case that should be considered with Includes. If the relationship between the parent and the child is one-to-one, then the parent data is not returned multiple times, so that problem does not arise.

I am not sure what the effect would be if, in the majority of cases, no child existed (lots of nulls?). Sparse children in a one-to-one relationship might be better suited to the direct query technique that I outlined above.

Answer by PernerOl

One more thing to add to this thread: it depends on what server you use. If you are working on SQL Server it's OK to use eager loading, but for SQLite you will have to use .Load() to avoid a crossloading exception, because SQLite cannot deal with some Include statements that go deeper than one dependency level.
