C# 实体框架如何处理大量记录?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19311491/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-10 14:42:54  来源:igfitidea点击:

How entity framework works for large number of records?

c#entity-frameworkdatabase-designentity-framework-6

提问by Abhijeet

I see already a un-answered question here on.

在这里已经看到一个未回答的问题。

My question is -

我的问题是——

Is EF really production ready for large application?

EF 真的为大型应用做好了生产准备吗?

The question originated from these underlying questions -

这个问题源于这些潜在的问题——

  1. EF pulls all the records into memory then performs the query operation. How EF would behave when table has around ~1000 records?
  2. For simple edit I have to pull the record edit it and then push to db using SaveChanges()
  1. EF 将所有记录拉入内存,然后执行查询操作。当表有大约 1000 条记录时,EF 会如何表现?
  2. 对于简单的编辑,我必须拉出记录编辑它,然后使用 SaveChanges()

采纳答案by Simple Fellow

I faced a similar situation where we had a large database with many tables 7- 10 million records each. we used Entity framework to display the data. To get nice performance here's what I learned; My 10 Golden rules for Entity Framework:

我遇到过类似的情况,我们有一个包含许多表的大型数据库,每个表有 7-1000 万条记录。我们使用实体框架来显示数据。为了获得良好的表现,这是我学到的;我的实体框架 10 条黄金法则

  1. Understand that call to database made only when the actual records are required. all the operations are just used to make the query (SQL) so try to fetch only a piece of data rather then requesting a large number of records. Trim the fetch size as much as possible

  2. Yes, (In some cases stored procedures are a better choice, they are not that evil as some make you believe), you should use stored procedures where necessary. Import them into your model and have function imports for them. You can also call them directly ExecuteStoreCommand(), ExecuteStoreQuery<>(). Same goes for functions and views but EF has a really odd way of calling functions "SELECT dbo.blah(@id)".

  3. EF performs slower when it has to populate an Entity with deep hierarchy. be extremely careful with entities with deep hierarchy

  4. Sometimes when you are requesting records and you are not required to modify them you should tell EF not to watch the property changes (AutoDetectChanges). that way record retrieval is much faster

  5. Indexing of database is good but in case of EF it becomes very important. The columns you use for retrieval and sorting should be properly indexed.

  6. When you model is large, VS2010/VS2012 Model designer gets real crazy. so break your model into medium sized models. There is a limitation that the Entities from different models cannot be shared even though they may be pointing to the same table in the database.

  7. When you have to make changes in the same entity at different places, use the same entity, make changes and save it only once. The point is to AVOID retrieving the same record, make changes & save it multiple times. (Real performance gain tip).

  8. When you need the info in only one or two columns try not to fetch the full entity. you can either execute your sql directly or have a mini entity something. You may need to cache some frequently used data in your application also.

  9. Transactions are slow. be careful with them.

  10. SQL Profiler or any query profiler is your friend. Run it when developing your application to see what does EF sends to database. When you perform a join using LINQ or Lambda expression in ur application, EF usually generates a Select-Where-In-Select style query which may not always perform well. If u find any such case, roll up ur sleeves, perform the join on DB and have EF retrieve results. (I forgot this one, the most important one!)

  1. 了解仅在需要实际记录时才调用数据库。所有操作仅用于进行查询(SQL),因此尝试仅获取一条数据,而不是请求大量记录。尽可能地修剪提取大小

  2. 是的(在某些情况下,存储过程是更好的选择,它们并不像某些人认为的那么邪恶),您应该在必要时使用存储过程。将它们导入您的模型并为它们导入函数。也可以直接调用它们 ExecuteStoreCommand()、ExecuteStoreQuery<>()。函数和视图也是如此,但 EF 有一种非常奇怪的调用函数“SELECT dbo.blah(@id)”的方式。

  3. 当 EF 必须填充具有深层层次结构的实体时,它的执行速度较慢。对具有深层层次结构的实体要格外小心

  4. 有时,当您请求记录并且不需要修改它们时,您应该告诉 EF 不要监视属性更改 (AutoDetectChanges)。这样记录检索要快得多

  5. 数据库的索引很好,但在 EF 的情况下它变得非常重要。您用于检索和排序的列应正确编制索引。

  6. 当您的模型很大时,VS2010/VS2012 模型设计师会变得非常疯狂。所以把你的模型分解成中等大小的模型。即使来自不同模型的实体可能指向数据库中的同一个表,也不能共享它们。

  7. 当您必须在不同位置对同一实体进行更改时,请使用同一实体,进行更改并仅保存一次。重点是避免检索相同的记录,进行更改并多次保存。(实际性能增益提示)。

  8. 当您只需要一两列中的信息时,尽量不要获取完整的实体。你可以直接执行你的 sql 或者有一个迷你实体。您可能还需要在应用程序中缓存一些常用数据。

  9. 交易缓慢。小心他们。

  10. SQL Profiler 或任何查询分析器都是您的朋友。在开发应用程序时运行它以查看 EF 发送到数据库的内容。当您在您的应用程序中使用 LINQ 或 Lambda 表达式执行连接时,EF 通常会生成 Select-Where-In-Select 样式查询,该查询可能并不总是表现良好。如果您发现任何此类情况,请卷起袖子,在 DB 上执行连接并让 EF 检索结果。(我忘记了这个,最重要的一个!)

if you keep these things in mind EF should give almost similar performance as plain ADO.NET if not the same.

如果您牢记这些事情,EF 应该提供与普通 ADO.NET 几乎相似的性能,如果不一样的话。

回答by MarcinJuraszek

1. EF pulls all the records into memory then performs the query operation. How EF would behave when table has around ~1000 records?

1. EF 将所有记录拉入内存,然后执行查询操作。当表有大约 1000 条记录时,EF 会如何表现?

That's not true!EF fetches only necessary records and queries are transformed into proper SQL statements. EF can cache objects locally within DataContext(and track all changes made to entities), but as long as you follow the rule to keep context open only when needed, it won't be a problem.

这不是真的!EF 仅获取必要的记录,并将查询转换为正确的 SQL 语句。EF 可以在本地缓存对象DataContext(并跟踪对实体所做的所有更改),但只要您遵循规则以仅在需要时保持上下文打开,就不会有问题。

2. For simple edit I have to pull the record edit it and then push to db using SaveChanges()

2. 对于简单的编辑,我必须拉出记录编辑它,然后使用 SaveChanges() 推送到数据库

It's true, but I would not botherin doing that unless you really see the performance problems. Because 1. is not true, you'll only get one record from DB fetched before it's saved. You can bypass that, by creating the SQL query as a string and sending it as a plain string.

这是真的,但除非你真的看到性能问题,否则我不会费心去做。因为 1. 不正确,所以在保存之前,您只会从数据库中获取一条记录。您可以通过将 SQL 查询创建为字符串并将其作为纯字符串发送来绕过它。

回答by Botz3000

  1. EF translates your LINQ query into an SQL query, so it doesn't pull all records into memory. The generated SQL might not always be the most efficient, but a thousand records won't be a problem at all.
  2. Yes, that's one way of doing it (assuming you only want to edit one record). If you are changing several records, you can get them all using one query and SaveChanges()will persist all of those changes.
  1. EF 将您的 LINQ 查询转换为 SQL 查询,因此它不会将所有记录拉入内存。生成的 SQL 可能并不总是最有效的,但一千条记录根本不是问题。
  2. 是的,这是一种方法(假设您只想编辑一条记录)。如果您要更改多条记录,则可以使用一个查询获取所有记录,SaveChanges()并将保留所有这些更改。

回答by Dannydust

There is no simple answer for your question. The main thing is about what you want to do with your data? And do you need so much data at one time?

你的问题没有简单的答案。最重要的是你想用你的数据做什么?您一次需要这么多数据吗?

EF translated your Queries to SQL so at this time there is no Object in Memory. When you get the data, then the selected records are in memory. If you are selecting a large amount of large objects then it can be a performance killer if you need to manipulate them all.

EF 将您的查询转换为 SQL,因此此时内存中没有对象。当您获得数据时,所选记录就在内存中。如果您要选择大量大型对象,那么如果您需要对它们进行操作,这可能会成为性能杀手。

If you don't need to manipulate them all you can disable change tracking and enable it later for single objects you need to manipulate.

如果您不需要全部操作它们,您可以禁用更改跟踪并在稍后为您需要操作的单个对象启用它。

So you see it depends on your type of application. If you need to manipulate a mass of data efficient, then don't use a OR-Mapper!

所以你看这取决于你的应用程序类型。如果您需要高效地操作大量数据,那么不要使用 OR-Mapper!

Otherwise EF is fine, but consider how many objects you really need at one time and what you want to do with them.

否则 EF 很好,但请考虑一次您真正需要多少个对象以及您想用它们做什么。

回答by tcwicks

EF is not a bad ORM framework. It is a different one with its own characteristics. Compare Microsoft Entity Framework 6, against say NetTiers which is powered by Microsoft Enterprise Library 6.

EF 不是一个糟糕的 ORM 框架。这是一个不同的,有自己的特点。将 Microsoft Entity Framework 6 与由 Microsoft Enterprise Library 6 提供支持的 NetTiers 进行比较。

These are two entirely different beasts. The accepted answer is really good because it goes through the nuances of EF6. Whats key to understand is that each ORM has its own strengths and weaknesses. Compare the project requirements and its data access patterns against the ORM's behavior patterns.

这是两种完全不同的野兽。接受的答案非常好,因为它经历了 EF6 的细微差别。理解的关键是每个 ORM 都有自己的优点和缺点。将项目需求及其数据访问模式与 ORM 的行为模式进行比较。

For Example: NetTiers will always give you higher raw performance than EF6. However that is primarily because it is not a point and click ORM and as part and parcel of generating the ORM you will be optimizing your data model, adding custom stored procedures where relevant, etc... if you engineered your data model with the same effort for EF6 you could probably get close to the same performance.

例如:NetTiers 将始终为您提供比 EF6 更高的原始性能。然而,这主要是因为它不是一个点击式 ORM,并且作为生成 ORM 的一部分,您将优化您的数据模型,在相关的地方添加自定义存储过程等......如果您使用相同的方式设计数据模型EF6 的努力,您可能会接近相同的性能。

Also consider can you modify the ORM? for example with NetTiers you can add extensions to the codesmith templates to include your own design patterns over and above what is generated by the base ORM library.

还考虑可以修改ORM吗?例如,使用 NetTiers,您可以向 codesmith 模板添加扩展,以在基本 ORM 库生成的内容之上包含您自己的设计模式。

Also consider EF6 makes significant use of reflection whereas NetTiers or any library powered by Microsoft Enterprise Library will make heavy use of Generics instead. These are two entirely different approaches. Why so? Because EF6 is based on dynamic reflection whereas NetTiers is based on static reflection. Which is faster and which is better entirely depends on the usage patterns that will be required of the ORM.

还要考虑 EF6 大量使用反射,而 NetTiers 或任何由 Microsoft Enterprise Library 提供支持的库将大量使用泛型。这是两种完全不同的方法。为什么这样?因为 EF6 基于动态反射,而 NetTiers 基于静态反射。哪个更快,哪个更好完全取决于 ORM 所需的使用模式。

Sometimes a hybrid approach works better: Consider for example EF6 for Web API OData endpoints, A few large tables wrapped with NetTiers & Microsoft Enterprise Library with custom stored procedures, and a few large masterdata tables wrapped with a custom built write through object cache where on initial load the record set is streamed into the memory cache using an ADO data reader.

有时,混合方法效果更好:例如,考虑用于 Web API OData 端点的 EF6、使用 NetTiers 和 Microsoft 企业库与自定义存储过程包装的一些大型表,以及使用自定义构建的直写对象缓存包装的一些大型主数据表。初始加载使用 ADO 数据读取器将记录集流式传输到内存缓存中。

These are all different and they all have their best fit scenarios: EF6, NetTiers, NHibernate, Wilson OR Mapper, XPO from Dev Express, etc...

这些都是不同的,它们都有最适合的场景:EF6、NetTiers、NHibernate、Wilson OR Mapper、Dev Express 的 XPO 等......