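The ToList method in C# LINQ
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/15027402/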
ToList method in Linq
Asked by user1976469
If I am not mistaken, the ToList() method iterates over each element of the provided collection, adds it to a new List instance, and returns that instance. Consider this example:
    // using LINQ
    list = Students.Where(s => s.Name == "ABC").ToList();

    // traditional way
    foreach (var student in Students)
    {
        if (student.Name == "ABC")
            list.Add(student);
    }
I think the traditional way is faster because it loops only once, whereas the LINQ version above iterates twice: once for the Where method and once for the ToList() method.
The project I am working on uses Lists extensively, and I see a lot of this kind of use of ToList() and other methods that could be improved, as above, if I declared the list variable as IEnumerable, removed .ToList(), and kept using it as an IEnumerable.
Do these things make any impact on performance?
Accepted answer by svick
Do these things make any impact on performance?
That depends on your code. Most of the time, using LINQ does cause a small performance hit. In some cases, this hit can be significant for you, but you should avoid LINQ only when you know that it is too slow for you (i.e. if profiling your code showed that LINQ is the reason why your code is slow).
But you're right that using ToList() too often can cause significant performance problems. You should call ToList() only when you have to. Be aware that there are also cases where adding ToList() can improve performance a lot (e.g. when the collection is loaded from the database every time it's iterated).
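The database case can be imitated without a database. The following is a minimal sketch, assuming C# 9 or later (top-level statements); `LoadStudents` is a hypothetical stand-in for an expensive source and is not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int loads = 0; // how many times the "expensive" source has been enumerated

// Hypothetical stand-in for a source that is fetched again on every
// enumeration, such as a query that hits the database each time.
IEnumerable<string> LoadStudents()
{
    loads++;
    yield return "ABC";
    yield return "DEF";
    yield return "ABC";
}

// Deferred query: each enumeration runs LoadStudents() from the start.
IEnumerable<string> query = LoadStudents().Where(name => name == "ABC");
int firstPass = query.Count();
int secondPass = query.Count();
int loadsWithoutToList = loads;

// Materialized once: later reads use the in-memory list.
loads = 0;
List<string> cached = LoadStudents().Where(name => name == "ABC").ToList();
int thirdPass = cached.Count;
int fourthPass = cached.Count(name => name.Length == 3);
int loadsWithToList = loads;

Console.WriteLine($"without ToList: {loadsWithoutToList} loads");
Console.WriteLine($"with ToList: {loadsWithToList} load(s)");
```

Whether the extra list is worth its memory depends on how costly the source is and how often the result is read, which is why profiling comes first.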
Regarding the number of iterations: it depends on what exactly you mean by "iterates twice". If you count the number of times MoveNext() is called on some collection, then yes, using Where() this way leads to iterating twice. The sequence of operations goes like this (to simplify, I'm going to assume that all items match the condition):
1. Where() is called, no iteration for now, Where() returns a special enumerable.
2. ToList() is called, calling MoveNext() on the enumerable returned from Where().
3. Where() now calls MoveNext() on the original collection and gets the value.
4. Where() calls your predicate, which returns true.
5. MoveNext() called from ToList() returns, ToList() gets the value and adds it to the list.
6. …
What this means is that if all n items in the original collection match the condition, MoveNext() will be called 2n times: n times from Where() and n times from ToList().
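The interleaving in the steps above can be made visible by logging when the source hands out an item and when the predicate runs. This is only a sketch, assuming C# 9 or later (top-level statements); the `Source` iterator and the log messages are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Source that records every item it hands out.
IEnumerable<int> Source()
{
    foreach (int i in new[] { 1, 2, 3 })
    {
        log.Add($"source yields {i}");
        yield return i;
    }
}

List<int> result = Source()
    .Where(x =>
    {
        log.Add($"predicate tests {x}");
        return true; // every item matches, as in the walkthrough
    })
    .ToList();

// The entries alternate: each item is pulled, tested and added before the
// next one is pulled, so there are not two separate full passes.
foreach (string entry in log)
    Console.WriteLine(entry);
```

The calls are nested rather than sequential: the data is walked once, with a second layer of method calls wrapped around each item.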
Answer by Evelie
var list = Students.Where(s=>s.Name == "ABC");
This only creates a query and does not loop over the elements until the query is used. Calling ToList() is what executes the query, and so it loops over your elements only once.
    List<Student> studentList = new List<Student>();
    var list = Students.Where(s => s.Name == "ABC");
    foreach (Student s in list)
    {
        studentList.Add(s);
    }
This example will also iterate only once, because the query is used only once. Keep in mind that list will iterate over all students every time it is enumerated, not just those whose name is ABC, since it is a query.
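A small sketch of that behaviour, assuming C# 9 or later (top-level statements) and using plain strings in place of the Student class; the counter and variable names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<string> { "ABC", "DEF", "ABC" };
int predicateCalls = 0;

// Building the query does not run the predicate.
IEnumerable<string> query = names.Where(n =>
{
    predicateCalls++;
    return n == "ABC";
});
int callsAfterBuilding = predicateCalls;

// First use: every name is tested, not only the matching ones.
int firstCount = query.Count();
int callsAfterFirstUse = predicateCalls;

// The query is re-run on each use, so it also sees later changes.
names.Add("ABC");
int secondCount = query.Count();
int callsAfterSecondUse = predicateCalls;

Console.WriteLine($"predicate calls: {callsAfterBuilding}, {callsAfterFirstUse}, {callsAfterSecondUse}");
Console.WriteLine($"matches: {firstCount}, then {secondCount}");
```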
And for the later discussion I've made a test example. Perhaps it's not the very best implementation of IEnumerable, but it does what it's supposed to do.
First we have our list:
    public class TestList<T> : IEnumerable<T>
    {
        private TestEnumerator<T> _Enumerator;

        public TestList()
        {
            _Enumerator = new TestEnumerator<T>();
        }

        public IEnumerator<T> GetEnumerator()
        {
            // Rewind the shared enumerator so that every enumeration starts at the first item.
            _Enumerator.Reset();
            return _Enumerator;
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }

        internal void Add(T p)
        {
            _Enumerator.Add(p);
        }
    }
And since we want to count how many times MoveNext is called, we have to implement our custom enumerator as well. Note that MoveNext increments a counter that is a static field of our Program class.
    public class TestEnumerator<T> : IEnumerator<T>
    {
        public Item<T> FirstItem = null;
        public Item<T> CurrentItem = null;

        public TestEnumerator()
        {
        }

        public T Current
        {
            get { return CurrentItem.Value; }
        }

        public void Dispose()
        {
        }

        object System.Collections.IEnumerator.Current
        {
            get { throw new NotImplementedException(); }
        }

        public bool MoveNext()
        {
            Program.Counter++; // counts every call, including the last one that returns false
            if (CurrentItem == null)
            {
                CurrentItem = FirstItem;
                return CurrentItem != null; // false when the list is empty
            }
            if (CurrentItem != null && CurrentItem.NextItem != null)
            {
                CurrentItem = CurrentItem.NextItem;
                return true;
            }
            return false;
        }

        public void Reset()
        {
            CurrentItem = null;
        }

        internal void Add(T p)
        {
            if (FirstItem == null)
            {
                FirstItem = new Item<T>(p);
                return;
            }
            // Walk to the end of the linked list and append the new item.
            Item<T> lastItem = FirstItem;
            while (lastItem.NextItem != null)
            {
                lastItem = lastItem.NextItem;
            }
            lastItem.NextItem = new Item<T>(p);
        }
    }
And then we have a custom item that just wraps our value:
    public class Item<T>
    {
        public Item(T item)
        {
            Value = item;
        }

        public T Value;
        public Item<T> NextItem;
    }
To use the actual code, we create a "list" with 3 entries.
    public class Program
    {
        public static int Counter = 0;

        static void Main(string[] args)
        {
            TestList<int> list = new TestList<int>();
            list.Add(1);
            list.Add(2);
            list.Add(3);

            var v1 = list.Where(c => c == 2).ToList(); // will use MoveNext 4 times
            var v2 = list.Where(c => true).ToList();   // will also use MoveNext 4 times

            List<int> tmpList = new List<int>(); // and the loop from the OP's question
            foreach (var i in list)
            {
                tmpList.Add(i);
            } // also 4 times
        }
    }
And the conclusion? How does it affect performance? MoveNext is called n+1 times in this case, regardless of how many items we have. The Where clause does not matter either: it will still run MoveNext 4 times, because we always run our query on our initial list. The only performance hit we take is the LINQ framework itself and its calls. The actual loops made will be the same.
And before anyone asks why it is n+1 times and not n times: it is because MoveNext returns false the last time, when it is out of elements. That makes it the number of elements plus the end of the list.
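The n+1 count can also be seen on an ordinary List<T> by driving the enumerator by hand. This is a rough sketch of what a foreach loop does, with a counter added, assuming C# 9 or later (top-level statements); the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;

var items = new List<int> { 1, 2, 3 };
var copy = new List<int>();
int moveNextCalls = 0;

// Roughly what a foreach loop expands to, plus a call counter.
using (IEnumerator<int> e = items.GetEnumerator())
{
    while (true)
    {
        moveNextCalls++;
        if (!e.MoveNext())
            break; // the extra call that reports "no more items"
        copy.Add(e.Current);
    }
}

Console.WriteLine($"{items.Count} items, {moveNextCalls} MoveNext calls");
```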
Answer by SWeko
First of all, why are you even asking me? Measure for yourself and see.
That said, Where, Select, OrderBy and the other LINQ IEnumerable extension methods are, in general, implemented as lazily as possible (the yield keyword is used often). That means that they do not work on the data unless they have to. From your example:
var list = Students.Where(s => s.Name == "ABC");
won't execute anything. This will return immediately even if Students is a list of 10 million objects. The predicate won't be called at all until the result is actually requested somewhere, and that is practically what ToList() does: it says "Yes, the results - all of them - are required immediately".
There is, however, some initial overhead in calling the LINQ methods, so the traditional way will, in general, be faster, but the composability and ease of use of the LINQ methods, IMHO, more than compensate for that.
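In the spirit of "measure for yourself", here is a crude timing sketch, assuming C# 9 or later (top-level statements) and made-up data of one million strings. A single Stopwatch run is noisy and includes JIT warm-up, so treat the numbers as a rough indication only; a dedicated benchmarking tool would give more trustworthy results.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Made-up data: one million names, every tenth one is "ABC".
List<string> students = Enumerable.Range(0, 1_000_000)
    .Select(i => i % 10 == 0 ? "ABC" : "name" + i)
    .ToList();

// LINQ version.
var sw = Stopwatch.StartNew();
List<string> viaLinq = students.Where(s => s == "ABC").ToList();
sw.Stop();
TimeSpan linqTime = sw.Elapsed;

// Traditional loop.
sw.Restart();
var viaLoop = new List<string>();
foreach (string s in students)
{
    if (s == "ABC")
        viaLoop.Add(s);
}
sw.Stop();
TimeSpan loopTime = sw.Elapsed;

Console.WriteLine($"LINQ: {linqTime.TotalMilliseconds} ms, loop: {loopTime.TotalMilliseconds} ms");
```

Both versions produce the same list, so the only thing being compared is the overhead of the two approaches on this particular machine and runtime.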
If you would like to take a look at how these methods are implemented, they are available for reference from Microsoft Reference Sources.
Answer by Jim Wooley
To answer this completely, it depends on the implementation. If you are talking about LINQ to SQL/EF, there will be only one iteration in this case when .ToList is called, which internally calls .GetEnumerator. The query expression is then parsed into T-SQL and passed to the database. The resulting rows are then iterated over (once) and added to the list.
In the case of LINQ to Objects, there is only one pass through the data as well. The use of yield return in the Where clause sets up a state machine internally which keeps track of where the process is in the iteration. Where does NOT do a full iteration creating a temporary list and then passing those results to the rest of the query. It just determines whether an item meets the criteria and passes on only those that match.
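To illustrate the yield return idea, here is a simplified filter written as an iterator. It is a sketch only, assuming C# 9 or later (top-level statements); `FilterLazily` is a hypothetical helper, and the real Where() implementation is more elaborate than this.

```csharp
using System;
using System.Collections.Generic;

bool bodyStarted = false;

// Hypothetical, simplified filter built on yield return.
IEnumerable<T> FilterLazily<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
    bodyStarted = true;
    foreach (T item in source)
    {
        if (predicate(item))
            yield return item; // hand one match to the caller, then pause here
    }
}

// Calling the iterator only creates the state machine; the body has not run.
IEnumerable<int> evens = FilterLazily(new[] { 1, 2, 3, 4 }, x => x % 2 == 0);
bool startedAfterCall = bodyStarted;

// Enumerating drives the state machine; no temporary list is built inside it.
var collected = new List<int>(evens);
bool startedAfterEnumeration = bodyStarted;

Console.WriteLine(string.Join(", ", collected));
```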