Is it better to call ToList() or ToArray() in LINQ queries?
Source: http://stackoverflow.com/questions/1105990/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Frank Krueger
I often run into the case where I want to evaluate a query right where I declare it. This is usually because I need to iterate over it multiple times and it is expensive to compute. For example:
string raw = "...";
var lines = (from l in raw.Split('\n')
             let ll = l.Trim()
             where !string.IsNullOrEmpty(ll)
             select ll).ToList();
This works fine. But if I am not going to modify the result, then I might as well call ToArray() instead of ToList().
I wonder, however, whether ToArray() is implemented by first calling ToList() and is therefore less memory efficient than just calling ToList().
Am I crazy? Should I just call ToArray(), safe and secure in the knowledge that the memory won't be allocated twice?
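The motivation in the question (materialize once because the query is expensive and is iterated several times) can be made visible with a call counter. This is only a sketch: `Expensive` and `Calls` are hypothetical names standing in for whatever costly step the real query performs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the (hypothetical) expensive step runs.
    public static int Calls;

    public static string Expensive(string s)
    {
        Calls++;
        return s.Trim();
    }

    public static void Main()
    {
        string raw = "a\n b \n\nc"; // four lines, one of them empty

        // Deferred query: nothing runs until it is enumerated.
        IEnumerable<string> query = raw.Split('\n')
            .Select(Expensive)
            .Where(s => !string.IsNullOrEmpty(s));

        query.Count();
        query.Count();
        Console.WriteLine(Calls); // 8: four lines, evaluated twice

        Calls = 0;
        List<string> lines = query.ToList(); // evaluated once, here
        int first = lines.Count;
        int second = lines.Count;
        Console.WriteLine(Calls); // 4
    }
}
```

Either ToList() or ToArray() gives this evaluate-once behaviour; the rest of the discussion is about which of the two to pick.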
Accepted answer by JaredPar
Unless you simply need an array to meet other constraints, you should use ToList. In the majority of scenarios ToArray will allocate more memory than ToList.
Both use arrays for storage, but ToList has a more flexible constraint. It needs the array to be at least as large as the number of elements in the collection. If the array is larger, that is not a problem. However, ToArray needs the array to be sized exactly to the number of elements.
To meet this constraint ToArray often does one more allocation than ToList. Once it has an array that is big enough, it allocates an array of exactly the correct size and copies the elements into that array. The only time it can avoid this is when the grow algorithm for the array just happens to coincide with the number of elements needing to be stored (definitely in the minority).
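The "at least as large" versus "exactly as large" distinction can be observed through List<T>.Capacity. The sketch below fills a list one item at a time (`Fill` is a hypothetical helper, and the exact capacity you see depends on the runtime's growth policy), then compares it with the array produced from it.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: appends 'count' integers one by one,
    // so the list has to grow its backing array as it goes.
    public static List<int> Fill(int count)
    {
        var list = new List<int>();
        for (int i = 0; i < count; i++)
            list.Add(i);
        return list;
    }

    public static void Main()
    {
        List<int> list = Fill(129);
        int[] array = list.ToArray(); // List<T>.ToArray copies exactly Count items

        // Capacity may exceed Count; the array length never does.
        Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}, array.Length={array.Length}");
    }
}
```

The unused slots between Count and Capacity are the slack that the list is allowed to keep and the array is not.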
EDIT
A couple of people have asked me about the consequence of having the extra unused memory in the List<T> value.
This is a valid concern. If the created collection is long lived, is never modified after being created and has a high chance of landing in the Gen2 heap, then you may be better off taking the extra allocation of ToArray up front.
In general, though, I find this to be the rarer case. It's much more common to see a lot of ToArray calls which are immediately passed to other short-lived uses of memory, in which case ToList is demonstrably better.
The key here is to profile, profile and then profile some more.
Answer by mqp
The performance difference will be insignificant, since List<T> is implemented as a dynamically sized array. Calling either ToArray() (which uses an internal Buffer<T> class to grow the array) or ToList() (which calls the List<T>(IEnumerable<T>) constructor) will end up being a matter of putting the elements into an array and growing the array until it fits them all.
If you desire concrete confirmation of this fact, check out the implementation of the methods in question in Reflector; you'll see they boil down to almost identical code.
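The "put them into an array and grow it until they all fit" idea can be sketched as follows. This is a simplified illustration and not the framework's actual code: `ToExactArray` is a hypothetical name, and the starting size of 4 and the doubling step are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class GrowingBufferSketch
{
    // Hypothetical stand-in for the grow-and-copy loop; not the BCL implementation.
    public static T[] ToExactArray<T>(IEnumerable<T> source)
    {
        T[] buffer = new T[4];
        int count = 0;
        foreach (T item in source)
        {
            if (count == buffer.Length)
                Array.Resize(ref buffer, buffer.Length * 2); // allocates a bigger array and copies
            buffer[count++] = item;
        }

        // ToArray-style final step: one more allocation unless the buffer already fits exactly.
        if (count != buffer.Length)
            Array.Resize(ref buffer, count);
        return buffer;
    }

    static IEnumerable<int> Squares(int n)
    {
        for (int i = 0; i < n; i++)
            yield return i * i;
    }

    public static void Main()
    {
        int[] result = ToExactArray(Squares(10));
        Console.WriteLine(string.Join(",", result)); // 0,1,4,9,16,25,36,49,64,81
    }
}
```

A list-style variant would simply stop before the final trim and remember `count` alongside the oversized buffer.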
Answer by Jeppe Stig Nielsen
(seven years later...)
A couple of other (good) answers have concentrated on the microscopic performance differences that will occur.
This post is just a supplement to mention the semantic difference that exists between the IEnumerator<T> produced by an array (T[]) and the one returned by a List<T>.
Best illustrated by example:
IList<int> source = Enumerable.Range(1, 10).ToArray(); // try changing to .ToList()
foreach (var x in source)
{
    if (x == 5)
        source[8] *= 100;
    Console.WriteLine(x);
}
The above code runs without throwing an exception and produces the output:
1 2 3 4 5 6 7 8 900 10
This shows that the IEnumerator<int> returned by an int[] does not keep track of whether the array has been modified since the creation of the enumerator.
Note that I declared the local variable source as an IList<int>. In that way I make sure the C# compiler does not optimize the foreach statement into something equivalent to a for (var idx = 0; idx < source.Length; idx++) { /* ... */ } loop. This is something the C# compiler might do if I use var source = ...; instead. In my current version of the .NET framework the actual enumerator used here is a non-public reference type System.SZArrayHelper+SZGenericArrayEnumerator`1[System.Int32], but of course this is an implementation detail.
Now, if I change .ToArray() into .ToList(), I get only:
1 2 3 4 5
followed by a System.InvalidOperationException blow-up saying:
Collection was modified; enumeration operation may not execute.
The underlying enumerator in this case is the public mutable value type System.Collections.Generic.List`1+Enumerator[System.Int32] (boxed inside an IEnumerator<int> box in this case because I use IList<int>).
In conclusion, the enumerator produced by a List<T> keeps track of whether the list changes during enumeration, while the enumerator produced by T[] does not. So consider this difference when choosing between .ToList() and .ToArray().
People often add one extra .ToArray() or .ToList() to circumvent a collection that keeps track of whether it was modified during the lifetime of an enumerator.
(If anybody wants to know how the List<> keeps track of whether the collection was modified, there is a private field _version in this class which is changed every time the List<> is updated.)
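The "one extra .ToList()" trick mentioned above looks like this in practice. The sketch enumerates a snapshot of the list, so removing from the original does not invalidate the enumerator in use; `RemoveEvens` is a hypothetical helper written only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    public static List<int> RemoveEvens(List<int> numbers)
    {
        // Enumerate a copy; the copy's enumerator never sees the removals
        // made on 'numbers', so no InvalidOperationException is thrown.
        foreach (int n in numbers.ToList())
        {
            if (n % 2 == 0)
                numbers.Remove(n);
        }
        return numbers;
    }

    public static void Main()
    {
        var numbers = Enumerable.Range(1, 10).ToList();
        Console.WriteLine(string.Join(" ", RemoveEvens(numbers))); // 1 3 5 7 9
    }
}
```

Dropping the inner .ToList() makes the loop enumerate `numbers` directly, and the first removal then triggers the "Collection was modified" exception on the next iteration.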
Answer by EMP
I agree with @mquander that the performance difference should be insignificant. However, I wanted to benchmark it to be sure, so I did, and it is insignificant.
Testing with List<T> source:
ToArray time: 1934 ms (0.01934 ms/call), memory used: 4021 bytes/array
ToList time: 1902 ms (0.01902 ms/call), memory used: 4045 bytes/List
Testing with array source:
ToArray time: 1957 ms (0.01957 ms/call), memory used: 4021 bytes/array
ToList time: 2022 ms (0.02022 ms/call), memory used: 4045 bytes/List
Each source array/List had 1000 elements. So you can see that both time and memory differences are negligible.
My conclusion: you might as well use ToList(), since a List<T> provides more functionality than an array, unless a few bytes of memory really matter to you.
Answer by Guffa
The memory will always be allocated twice, or something close to that. As you cannot resize an array, both methods will use some sort of mechanism to gather the data in a growing collection. (Well, the List is a growing collection in itself.)
The List uses an array as internal storage, and doubles the capacity when needed. This means that on average 2/3 of the items have been reallocated at least once, half of those reallocated at least twice, half of those at least thrice, and so on. That means that each item has on average been reallocated 1.3 times, which is not very much overhead.
Remember also that if you are collecting strings, the collection itself only contains the references to the strings; the strings themselves aren't reallocated.
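The cost of doubling can be checked with a small simulation that counts how many element copies occur while appending n items. This is a sketch under stated assumptions: `CountCopies` is a hypothetical helper, and the starting capacity of 4 and the doubling policy are assumed rather than read from any particular runtime.

```csharp
using System;

public static class GrowthCost
{
    // Counts element copies made while appending n items to a buffer that
    // starts at initialCapacity and doubles whenever it is full (assumed policy).
    public static long CountCopies(int n, int initialCapacity = 4)
    {
        long copies = 0;
        int capacity = 0, count = 0;
        for (int i = 0; i < n; i++)
        {
            if (count == capacity)
            {
                copies += count; // every existing item moves to the new array
                capacity = capacity == 0 ? initialCapacity : capacity * 2;
            }
            count++;
        }
        return copies;
    }

    public static void Main()
    {
        foreach (int n in new[] { 128, 129, 1000 })
        {
            long copies = CountCopies(n);
            Console.WriteLine($"n={n}: {copies} copies, {(double)copies / n:F2} per item");
        }
    }
}
```

Under these assumptions 128 items cost 124 copies and 129 items cost 252, so the per-item figure swings between roughly one and two depending on where n falls relative to a power of two. That is the same order of magnitude as the estimate above: a small constant, not something that grows with n.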
Answer by Vitaliy Ulantikov
ToList() is usually preferred if you use it on an IEnumerable<T> (from an ORM, for instance). If the length of the sequence is not known at the beginning, ToArray() creates a dynamic-length collection like List and then converts it to an array, which takes extra time.
Answer by Tyrrrz
It's 2020 outside and everyone is using .NET Core 3.1, so I decided to run some benchmarks with BenchmarkDotNet.
TL;DR: ToArray() is better performance-wise and does a better job conveying intent if you're not planning to mutate the collection.
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
public class Benchmarks
{
    // Number of elements in the source sequence for each benchmark run
    [Params(0, 1, 6, 10, 39, 100, 666, 1000, 1337, 10000)]
    public int Count { get; set; }

    public IEnumerable<int> Items => Enumerable.Range(0, Count);

    [Benchmark(Description = "ToArray()", Baseline = true)]
    public int[] ToArray() => Items.ToArray();

    [Benchmark(Description = "ToList()")]
    public List<int> ToList() => Items.ToList();

    public static void Main() => BenchmarkRunner.Run<Benchmarks>();
}
The results are:
BenchmarkDotNet=v0.12.0, OS=Windows 10.0.14393.3443 (1607/AnniversaryUpdate/Redstone1)
Intel Core i5-4460 CPU 3.20GHz (Haswell), 1 CPU, 4 logical and 4 physical cores
Frequency=3124994 Hz, Resolution=320.0006 ns, Timer=TSC
.NET Core SDK=3.1.100
[Host] : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
DefaultJob : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
| Method | Count | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------- |------ |--------------:|------------:|------------:|--------------:|------:|--------:|--------:|------:|------:|----------:|
| ToArray() | 0 | 7.357 ns | 0.2096 ns | 0.1960 ns | 7.323 ns | 1.00 | 0.00 | - | - | - | - |
| ToList() | 0 | 13.174 ns | 0.2094 ns | 0.1958 ns | 13.084 ns | 1.79 | 0.05 | 0.0102 | - | - | 32 B |
| | | | | | | | | | | | |
| ToArray() | 1 | 23.917 ns | 0.4999 ns | 0.4676 ns | 23.954 ns | 1.00 | 0.00 | 0.0229 | - | - | 72 B |
| ToList() | 1 | 33.867 ns | 0.7350 ns | 0.6876 ns | 34.013 ns | 1.42 | 0.04 | 0.0331 | - | - | 104 B |
| | | | | | | | | | | | |
| ToArray() | 6 | 28.242 ns | 0.5071 ns | 0.4234 ns | 28.196 ns | 1.00 | 0.00 | 0.0280 | - | - | 88 B |
| ToList() | 6 | 43.516 ns | 0.9448 ns | 1.1949 ns | 42.896 ns | 1.56 | 0.06 | 0.0382 | - | - | 120 B |
| | | | | | | | | | | | |
| ToArray() | 10 | 31.636 ns | 0.5408 ns | 0.4516 ns | 31.657 ns | 1.00 | 0.00 | 0.0331 | - | - | 104 B |
| ToList() | 10 | 53.870 ns | 1.2988 ns | 2.2403 ns | 53.415 ns | 1.77 | 0.07 | 0.0433 | - | - | 136 B |
| | | | | | | | | | | | |
| ToArray() | 39 | 58.896 ns | 0.9441 ns | 0.8369 ns | 58.548 ns | 1.00 | 0.00 | 0.0713 | - | - | 224 B |
| ToList() | 39 | 138.054 ns | 2.8185 ns | 3.2458 ns | 138.937 ns | 2.35 | 0.08 | 0.0815 | - | - | 256 B |
| | | | | | | | | | | | |
| ToArray() | 100 | 119.167 ns | 1.6195 ns | 1.4357 ns | 119.120 ns | 1.00 | 0.00 | 0.1478 | - | - | 464 B |
| ToList() | 100 | 274.053 ns | 5.1073 ns | 4.7774 ns | 272.242 ns | 2.30 | 0.06 | 0.1578 | - | - | 496 B |
| | | | | | | | | | | | |
| ToArray() | 666 | 569.920 ns | 11.4496 ns | 11.2450 ns | 571.647 ns | 1.00 | 0.00 | 0.8688 | - | - | 2728 B |
| ToList() | 666 | 1,621.752 ns | 17.1176 ns | 16.0118 ns | 1,623.566 ns | 2.85 | 0.05 | 0.8793 | - | - | 2760 B |
| | | | | | | | | | | | |
| ToArray() | 1000 | 796.705 ns | 16.7091 ns | 19.8910 ns | 796.610 ns | 1.00 | 0.00 | 1.2951 | - | - | 4064 B |
| ToList() | 1000 | 2,453.110 ns | 48.1121 ns | 65.8563 ns | 2,460.190 ns | 3.09 | 0.10 | 1.3046 | - | - | 4096 B |
| | | | | | | | | | | | |
| ToArray() | 1337 | 1,057.983 ns | 20.9810 ns | 41.4145 ns | 1,041.028 ns | 1.00 | 0.00 | 1.7223 | - | - | 5416 B |
| ToList() | 1337 | 3,217.550 ns | 62.3777 ns | 61.2633 ns | 3,203.928 ns | 2.98 | 0.13 | 1.7357 | - | - | 5448 B |
| | | | | | | | | | | | |
| ToArray() | 10000 | 7,309.844 ns | 160.0343 ns | 141.8662 ns | 7,279.387 ns | 1.00 | 0.00 | 12.6572 | - | - | 40064 B |
| ToList() | 10000 | 23,858.032 ns | 389.6592 ns | 364.4874 ns | 23,759.001 ns | 3.26 | 0.08 | 12.6343 | - | - | 40096 B |
// * Hints *
Outliers
Benchmarks.ToArray(): Default -> 2 outliers were removed (35.20 ns, 35.29 ns)
Benchmarks.ToArray(): Default -> 2 outliers were removed (38.51 ns, 38.88 ns)
Benchmarks.ToList(): Default -> 1 outlier was removed (64.69 ns)
Benchmarks.ToArray(): Default -> 1 outlier was removed (67.02 ns)
Benchmarks.ToArray(): Default -> 1 outlier was removed (130.08 ns)
Benchmarks.ToArray(): Default -> 1 outlier was detected (541.82 ns)
Benchmarks.ToArray(): Default -> 1 outlier was removed (7.82 us)
// * Legends *
Count : Value of the 'Count' parameter
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
Median : Value separating the higher half of all measurements (50th percentile)
Ratio : Mean of the ratio distribution ([Current]/[Baseline])
RatioSD : Standard deviation of the ratio distribution ([Current]/[Baseline])
Gen 0 : GC Generation 0 collects per 1000 operations
Gen 1 : GC Generation 1 collects per 1000 operations
Gen 2 : GC Generation 2 collects per 1000 operations
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
1 ns : 1 Nanosecond (0.000000001 sec)
Answer by Scott Rippey
Edit: The last part of this answer is not valid. However, the rest is still useful information, so I'll leave it.
I know this is an old post, but after having the same question and doing some research, I have found something interesting that might be worth sharing.
First, I agree with @mquander and his answer. He is correct in saying that performance-wise, the two are identical.
However, I have been using Reflector to take a look at the extension methods in System.Linq.Enumerable, and I have noticed a very common optimization.
Whenever possible, the IEnumerable<T> source is cast to IList<T> or ICollection<T> to optimize the method. For example, look at ElementAt(int).
Interestingly, Microsoft chose to only optimize for IList<T>, but not IList. It looks like Microsoft prefers to use the IList<T> interface.
System.Array only implements IList, so it will not benefit from any of these extension optimizations.
Therefore, I submit that the best practice is to use the .ToList() method.
If you use any of the extension methods, or pass the list to another method, there is a chance that it might be optimized for an IList<T>.
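The edit note at the top of this answer can be checked directly. The sketch below tests at runtime whether a single-dimensional array is usable as IList<T> and ICollection<T>, which is presumably why the last part of the answer was retracted; `IsGenericList` and `IsGenericCollection` are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayInterfaces
{
    public static bool IsGenericList(object o) => o is IList<int>;
    public static bool IsGenericCollection(object o) => o is ICollection<int>;

    public static void Main()
    {
        int[] array = { 1, 2, 3 };
        Console.WriteLine(IsGenericList(array));                // True
        Console.WriteLine(IsGenericCollection(array));          // True
        Console.WriteLine(IsGenericList(new List<int>(array))); // True
    }
}
```

So code that special-cases IList<T> or ICollection<T> sources treats the result of ToArray() the same way as the result of ToList().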
Answer by StriplingWarrior
I found the other benchmarks people have done here lacking, so here's my crack at it. Let me know if you find something wrong with my methodology.
/* This is a benchmarking template I use in LINQPad when I want to do a
 * quick performance test. Just give it a couple of actions to test and
 * it will give you a pretty good idea of how long they take compared
 * to one another. It's not perfect: You can expect a 3% error margin
 * under ideal circumstances. But if you're not going to improve
 * performance by more than 3%, you probably don't care anyway.*/
void Main()
{
    // Enter setup code here
    var values = Enumerable.Range(1, 100000)
        .Select(i => i.ToString())
        .ToArray()
        .Select(i => i);
    values.GetType().Dump();
    var actions = new[]
    {
        new TimedAction("ToList", () =>
        {
            values.ToList();
        }),
        new TimedAction("ToArray", () =>
        {
            values.ToArray();
        }),
        new TimedAction("Control", () =>
        {
            foreach (var element in values)
            {
                // do nothing
            }
        }),
        // Add tests as desired
    };
    const int TimesToRun = 1000; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}

#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for (int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult { Message = action.Message };
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for (int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}

public class ActionResult
{
    public string Message { get; set; }
    public double DryRun1 { get; set; }
    public double DryRun2 { get; set; }
    public double FullRun1 { get; set; }
    public double FullRun2 { get; set; }
}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message { get; private set; }
    public Action Action { get; private set; }
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion
You can download the LINQPad Script here.
Tweaking the code above, you will discover that:
- The difference is less significant when dealing with smaller arrays.
- The difference is less significant when dealing with ints rather than strings.
- Using large structs instead of strings takes a lot more time generally, but doesn't really change the ratio much.
This agrees with the conclusions of the top-voted answers:
- You're unlikely to notice a performance difference unless your code is frequently producing many large lists of data. (There was only a 200ms difference when creating 1000 lists of 100K strings apiece.)
- ToList() consistently runs faster, and would be a better choice if you're not planning to hang on to the results for a long time.
Update
@JonHanna pointed out that, depending on the implementation of Select, it's possible for a ToList() or ToArray() implementation to predict the resulting collection's size ahead of time. Replacing .Select(i => i) in the code above with Where(i => true) yields very similar results at the moment, and is more likely to do so regardless of the .NET implementation.
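Whether a sequence can report its size without being enumerated can be probed with Enumerable.TryGetNonEnumeratedCount. Note the assumptions: this method only exists in .NET 6 and later (newer than the benchmark above), `Describe` is a hypothetical helper, and what the Select and Where cases print depends on the LINQ implementation in use, so no output is claimed for them here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KnownCountDemo
{
    // Reports whether LINQ can tell the element count without enumerating.
    public static string Describe<T>(IEnumerable<T> source) =>
        source.TryGetNonEnumeratedCount(out int count)
            ? $"count known up front: {count}"
            : "count not known without enumerating";

    public static void Main()
    {
        string[] values = Enumerable.Range(1, 5).Select(i => i.ToString()).ToArray();

        Console.WriteLine(Describe(values));                  // an array always knows its length
        Console.WriteLine(Describe(values.Select(s => s)));   // implementation-dependent
        Console.WriteLine(Describe(values.Where(s => true))); // a filter cannot know without running
    }
}
```

When the count is available up front, ToList() and ToArray() can allocate the right size once, which is why the benchmark uses a source whose size is harder to predict.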
Answer by Erdogan Kurtur
A very late answer, but I think it will be helpful for googlers.
They both suck when created using LINQ. They both use the same code to resize the buffer if necessary. ToArray internally uses a class to convert IEnumerable<> to an array, starting by allocating an array of 4 elements. If that is not enough, it doubles the size by creating a new array twice the size of the current one and copying the current array into it. At the end it allocates a new array sized to the count of your items. If your query returns 129 elements, ToArray will go through six grow-and-copy steps to reach a 256-element array and then allocate another array of 129 to return. So much for memory efficiency.
ToList does the same thing, but it skips the last allocation since you can add items in the future. List does not care whether it is created from a LINQ query or created manually.
For creation, List is better with memory but worse with CPU: since List is a generic solution, every action requires range checks in addition to .NET's internal range checks for arrays.
So if you will iterate through your result set many times, arrays are good since they mean fewer range checks than lists, and compilers generally optimize arrays for sequential access.
List's initial allocation can be better if you specify the capacity parameter when you create it. In this case it will allocate the array only once, assuming you know the result size. LINQ's ToList does not provide an overload that accepts it, so we have to create our own extension method that creates a list with the given capacity and then uses List<>.AddRange.
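A minimal sketch of the extension method described above might look like this. `ToListWithCapacity` is a hypothetical name (LINQ has no such method), and `capacity` is only the caller's estimate; the list still grows if the estimate is too low.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableCapacityExtensions
{
    // Hypothetical helper, not part of LINQ: pre-sizes the list, then fills it.
    public static List<T> ToListWithCapacity<T>(this IEnumerable<T> source, int capacity)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        var list = new List<T>(capacity); // single up-front allocation of the backing array
        list.AddRange(source);            // grows only if 'capacity' was too small
        return list;
    }
}

public static class Program
{
    public static void Main()
    {
        // We happen to know that exactly half of the 200 numbers are even.
        List<int> evens = Enumerable.Range(0, 200)
            .Where(i => i % 2 == 0)
            .ToListWithCapacity(100);

        Console.WriteLine($"{evens.Count} items, capacity {evens.Capacity}");
    }
}
```

This only pays off when the caller has a good size estimate that the query itself cannot provide, as with the filtered sequence here.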
To finish this answer I have to add the following:
- In the end, you can use either ToArray or ToList; performance will not be so different (see the answer of @EMP).
- You are using C#. If you need performance, then do not worry about writing high-performance code, but worry about not writing bad-performance code.
- Always target x64 for high-performance code. AFAIK, the x64 JIT is based on the C++ compiler, and does some funny things like tail recursion optimizations.
- With 4.5 you can also enjoy profile-guided optimization and multi-core JIT.
- Finally, you can use the async/await pattern to process it quicker.

