C# Parallel doesn't work with Entity Framework
原文地址: http://stackoverflow.com/questions/12827599/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Parallel doesn't work with Entity Framework
Asked by m0fo
I have a list of IDs, and I need to run several stored procedures on each ID.
When I use a standard foreach loop, it works fine, but with many records it is quite slow.
I wanted to convert the code to run in parallel with EF, but I am getting an exception: "The underlying provider failed on Open".
I am using this code inside the Parallel.ForEach:
using (XmlEntities osContext = new XmlEntities())
{
    // The code
}
But it still throws the exception.
Any idea how I can use Parallel with EF? Do I need to create a new context for every procedure I am running? I have around 10 procedures, so I think it would be very wasteful to create 10 contexts, one for each.
Answered by kevin_fitz
EF contexts are not thread-safe, so you cannot share one across the iterations of a Parallel loop.
Take a look at "Entity Framework and Multi threading" and this article.
Answered by casperOne
The underlying database connections that the Entity Framework uses are not thread-safe. You will need to create a new context for each operation that you perform on another thread.
Your concern about how to parallelize the operation is a valid one; that many contexts are going to be expensive to open and close.
Instead, you might want to invert how you're thinking about parallelizing the code. It seems you're looping over a number of items and then calling the stored procedures serially for each item.
If you can, create a new Task<TResult> (or Task, if you don't need a result) for each procedure, and then in that Task<TResult>, open a single context, loop through all of the items, and execute the stored procedure for each one. This way, you only have as many contexts as there are stored procedures running in parallel.
Let's assume you have a MyDbContext with two stored procedures, DoSomething1 and DoSomething2, both of which take an instance of a class, MyItem.
Implementing the above would look something like:
// You'd probably want to materialize this into an IList<T> to avoid
// warnings about multiple iterations of an IEnumerable<T>.
// You definitely *don't* want this to be an IQueryable<T>
// returned from a context.
IEnumerable<MyItem> items = ...;

// The first stored procedure is called here.
Task t1 = Task.Run(() => {
    // Create the context.
    using (var ctx = new MyDbContext())
    // Cycle through each item.
    foreach (MyItem item in items)
    {
        // Call the first stored procedure.
        // You'd, of course, have to do something with item here.
        ctx.DoSomething1(item);
    }
});

// The second stored procedure is called here.
Task t2 = Task.Run(() => {
    // Create the context.
    using (var ctx = new MyDbContext())
    // Cycle through each item.
    foreach (MyItem item in items)
    {
        // Call the second stored procedure.
        // You'd, of course, have to do something with item here.
        ctx.DoSomething2(item);
    }
});

// Do something when both of the tasks are done.
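One way to fill in that last step, offered as a sketch and not as part of the original answer, is to await Task.WhenAll on the two tasks. The RunProcedure method below is a hypothetical stand-in for the stored-procedure loop inside each task, so the example runs without a database.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitForBothDemo
{
    // Hypothetical stand-in for the per-procedure loop; no database involved.
    public static int RunProcedure(int procedureNumber, int itemCount)
    {
        if (itemCount < 0)
            throw new ArgumentOutOfRangeException(nameof(itemCount));
        return procedureNumber * itemCount;
    }

    public static async Task<int[]> RunBothAsync(int itemCount)
    {
        // One task per procedure, as in the answer above.
        Task<int> t1 = Task.Run(() => RunProcedure(1, itemCount));
        Task<int> t2 = Task.Run(() => RunProcedure(2, itemCount));

        // WhenAll completes when both tasks have finished. Results come back
        // in the order the tasks were passed in. If a task fails, awaiting
        // rethrows one of the exceptions.
        return await Task.WhenAll(t1, t2);
    }

    public static void Main()
    {
        int[] results = RunBothAsync(3).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", results));
    }
}
```

In code that cannot be made async, Task.WaitAll(t1, t2) blocks the calling thread until both tasks finish and reports failures as an AggregateException.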
If you can't execute the stored procedures in parallel (each one depends on being run in a certain order), then you can still parallelize your operations; it's just a little more complex.
You would look at creating custom partitions across your items (using the static Create method on the Partitioner class). This will give you the means to get IEnumerator<T> implementations (note, this is not IEnumerable<T>, so you can't foreach over it).
For each IEnumerator<T> instance you get back, you'd create a new Task<TResult> (if you need a result), and in the Task<TResult> body, you would create the context and then cycle through the items returned by the IEnumerator<T>, calling the stored procedures in order.
That would look like this:
// Get the partitioner.
OrderablePartitioner<MyItem> partitioner = Partitioner.Create(items);

// Get the partitions.
// You'll have to set the parameter for the number of partitions here.
// See the link for creating custom partitions for more
// creation strategies.
IList<IEnumerator<MyItem>> partitions = partitioner.GetPartitions(
    Environment.ProcessorCount);

// Create a task for each partition.
Task[] tasks = partitions.Select(p => Task.Run(() => {
        // Create the context.
        using (var ctx = new MyDbContext())
        // Remember, the IEnumerator<T> implementation
        // might implement IDisposable.
        using (p)
        // While there are items in p.
        while (p.MoveNext())
        {
            // Get the current item.
            MyItem current = p.Current;

            // Call the stored procedures in order to process the item.
            ctx.DoSomething1(current);
            ctx.DoSomething2(current);
        }
    })).
    // ToArray is needed (or something to materialize the list) to
    // avoid deferred execution.
    ToArray();
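A related option, not taken from the answer above, is the Parallel.ForEach overload that carries thread-local state: it also keeps the number of contexts close to the number of workers instead of the number of items. The sketch below assumes a hypothetical FakeContext class standing in for MyDbContext (it only counts how many instances are created), with int items in place of MyItem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an EF context; it talks to no database and
// only counts how many instances have been created.
public sealed class FakeContext : IDisposable
{
    public static int Created;
    public FakeContext() { Interlocked.Increment(ref Created); }
    public void DoSomething1(int item) { }
    public void DoSomething2(int item) { }
    public void Dispose() { }
}

public static class ThreadLocalContextDemo
{
    public static int ProcessAll(IEnumerable<int> items)
    {
        int processed = 0;
        Parallel.ForEach(
            items,
            // localInit: runs once per worker task, so each worker
            // gets a context of its own.
            () => new FakeContext(),
            // body: reuses the worker's context and calls the
            // procedures in order for the current item.
            (item, loopState, ctx) =>
            {
                ctx.DoSomething1(item);
                ctx.DoSomething2(item);
                Interlocked.Increment(ref processed);
                return ctx;
            },
            // localFinally: runs once per worker task to clean up.
            ctx => ctx.Dispose());
        return processed;
    }

    public static void Main()
    {
        int processed = ProcessAll(Enumerable.Range(1, 100));
        Console.WriteLine("Items processed: " + processed);
        Console.WriteLine("Contexts created: " + FakeContext.Created);
    }
}
```

The number of contexts created depends on how many worker tasks the scheduler starts, so it varies between runs; it should never exceed the number of items.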
Answered by to11mtm
It's a bit difficult to troubleshoot this one without knowing what the inner exception is, if any. This could very simply be a problem with the way that the connection string or provider configuration is set up.
In general, you have to be careful with parallel code and EF. What you're doing should work, however. One question comes to mind: is any work being done on another instance of that context before the parallel call? According to your post, you're using a separate context in each thread. That's good. Part of me wonders, however, if there's some constructor contention going on between the multiple contexts. If you aren't using that context anywhere before the parallel call, I would suggest running even a simple query against the context to open it and make sure all of the EF bits are fired up before running the parallel method. I'll admit, I have not tried exactly what you did here, but I've done something close and it worked.
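To see what the inner exception actually says, a small helper that walks the InnerException chain can help. This is only a sketch: the ExceptionChain class and the simulated inner failure are made up for illustration, and a real "failed on Open" error will carry whatever inner exception the provider raised.

```csharp
using System;
using System.Collections.Generic;

public static class ExceptionChain
{
    // Returns one line per exception, from the outermost to the innermost.
    public static List<string> Describe(Exception ex)
    {
        var lines = new List<string>();
        for (Exception current = ex; current != null; current = current.InnerException)
        {
            lines.Add(current.GetType().Name + ": " + current.Message);
        }
        return lines;
    }

    public static void Main()
    {
        // Simulated failure; the inner message is invented for the demo.
        var inner = new InvalidOperationException("simulated inner failure");
        var outer = new Exception("The underlying provider failed on Open.", inner);

        foreach (string line in Describe(outer))
        {
            Console.WriteLine(line);
        }
    }
}
```

Calling Describe from a catch block around the parallel loop body would log the provider's underlying message alongside the outer one.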
Answered by realstrategos
This is what I use, and it works great. It also collects the exceptions thrown along the way and has a debug mode that runs the items sequentially, which makes it far easier to track things down.
public static ConcurrentQueue<Exception> Parallel<T>(this IEnumerable<T> items, Action<T> action, int? parallelCount = null, bool debugMode = false)
{
    var exceptions = new ConcurrentQueue<Exception>();
    if (debugMode)
    {
        // Debug mode: run the items one at a time on the calling thread.
        foreach (var item in items)
        {
            try
            {
                action(item);
            }
            // Store the exception and continue with the loop.
            catch (Exception e)
            {
                exceptions.Enqueue(e);
            }
        }
    }
    else
    {
        // One task per partition; each task drains its own partition.
        var partitions = Partitioner.Create(items)
            .GetPartitions(parallelCount ?? Environment.ProcessorCount)
            .Select(partition => Task.Factory.StartNew(() =>
            {
                while (partition.MoveNext())
                {
                    try
                    {
                        action(partition.Current);
                    }
                    // Store the exception and continue with the loop.
                    catch (Exception e)
                    {
                        exceptions.Enqueue(e);
                    }
                }
            }));
        Task.WaitAll(partitions.ToArray());
    }
    return exceptions;
}
You use it like the following, where db is the original DbContext and db.CreateInstance() creates a new instance using the same connection string.
var batch = db.Set<SomeListToIterate>().ToList();
var exceptions = batch.Parallel((item) =>
{
    using (var batchDb = db.CreateInstance())
    {
        var batchTime = batchDb.GetDBTime();
        var someData = batchDb.Set<Permission>().Where(x => x.ID == item.ID).ToList();
        // do stuff to someData
        // Note that this record is attached to db, not batchDb, and will
        // only be saved when db.SaveChanges() is called.
        item.WasMigrated = true;
        batchDb.SaveChanges();
    }
});
if (exceptions.Count > 0)
{
    logger.Error("ContactRecordMigration : Content: Error processing one or more records", new AggregateException(exceptions));
    throw new AggregateException(exceptions); // optionally throw an exception
}
db.SaveChanges(); // save the item modifications

