C# 在一个函数值的 switch vs 字典中,哪个更快,为什么?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11617091/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-09 18:31:39  来源:igfitidea点击:

In a switch vs dictionary for a value of Func, which is faster and why?

c#dictionaryclrswitch-statementcyclomatic-complexity

提问by cubetwo1729

Suppose there is the following code:

假设有以下代码:

private static int DoSwitch(string arg)
{
    switch (arg)
    {
        case "a": return 0;
        case "b": return 1;
        case "c": return 2;
        case "d": return 3;
    }
    return -1;
}

private static Dictionary<string, Func<int>> dict = new Dictionary<string, Func<int>>
    {
        {"a", () => 0 },
        {"b", () => 1 },
        {"c", () => 2 },
        {"d", () => 3 },
    };

private static int DoDictionary(string arg)
{
    return dict[arg]();
}

By iterating over both methods and comparing, I am getting that the dictionary is slightly faster, even when "a", "b", "c", "d" is expanded to include more keys. Why is this so?

通过迭代这两种方法并进行比较,我发现字典稍微快了一点,即使“a”、“b”、“c”、“d”被扩展为包含更多的键。为什么会这样?

Does this have to do with cyclomatic complexity? Is is because the jitter compiles the return statements in the dictionary to native code only once? Is it because the lookup of the dictionary is O(1), which may not be the case for a switch statement? (These are just guesses)

这与圈复杂度有关吗?是不是因为抖动只将字典中的返回语句编译为本机代码一次?是不是因为字典的查找是 O(1),而switch 语句可能不是这种情况?(这些只是猜测)

采纳答案by KeithS

The short answer is that the switch statement executes linearly, while the dictionary executes logarithmically.

简短的回答是 switch 语句是线性执行的,而字典是对数执行的。

At the IL level, a small switch statement is usually implemented as a series of if-elseif statements comparing equality of the switched variable and each case. So, this statement will execute in a time linearly proportional to the number of valid options for myVar; the cases will be compared in the order they appear, and the worst-case scenario is that all the comparisons are tried and either the last one matches or none do. So, with 32 options, the worst case is that it's none of them, and the code will have made 32 comparisons to determine this.

在 IL 级别,一个小的 switch 语句通常被实现为一系列 if-elseif 语句,比较 switch 变量和每个 case 的相等性。因此,此语句的执行时间与 myVar 的有效选项数成线性比例;案例将按照它们出现的顺序进行比较,最坏的情况是尝试所有比较并且最后一个匹配或都不匹配。因此,对于 32 个选项,最坏的情况是它们都不是,代码将进行 32 次比较来确定这一点。

A Dictionary, on the other hand, uses an index-optimized collection to store values. In .NET, a Dictionary is based on a Hashtable, which has effectively constant access time (the disadvantage being extremely poor space efficiency). Other options commonly used for "mapping" collections like Dictionaries include balanced tree structures like red-black trees, which provide logarithmic access (and linear space efficiency). Any of these will allow the code to find the key corresponding to the proper "case" in the collection (or determine it does not exist) much faster than a switch statement can do the same.

另一方面,字典使用索引优化的集合来存储值。在 .NET 中,Dictionary 基于 Hashtable,它具有有效的恒定访问时间(缺点是空间效率极差)。通常用于“映射”集合(如 Dictionaries)的其他选项包括平衡树结构,如红黑树,它提供对数访问(和线性空间效率)。任何这些都将允许代码比 switch 语句更快地找到与集合中正确的“case”相对应的键(或确定它不存在)。

EDIT: Other answers and commenters have touched on this, so in the interest of completeness I will as well. The Microsoft compiler does notalways compile a switch to an if/elseif as I inferred originally. It typically does so with small numbers of cases, and/or with "sparse" cases (non-incremental values, like 1, 200, 4000). With larger sets of adjacent cases, the compiler will convert the switch into a "jump table" using a CIL statement. With large sets of sparse cases, the compiler can implement a binary search to narrow the field, and then "fall through" a small number of sparse cases or implement a jump table for adjacent cases.

编辑:其他答案和评论者已经谈到了这一点,所以为了完整性,我也会这样做。微软编译器并不能总是编译开关的,如果/ elseif的,因为我原来的推断。它通常使用少量案例和/或“稀疏”案例(非增量值,如 1、200、4000)。对于较大的相邻情况集,编译器将使用 CIL 语句将 switch 转换为“跳转表”。对于大量的稀疏案例,编译器可以实现二分查找来缩小范围,然后“落入”少量稀疏案例或实现相邻案例的跳转表。

However, the compiler will typically choose the implementation that is the best compromise of performance and space efficiency, so it will only use a jump table for a large number of densely-packed cases. This is because a jump table requires a space in memory on the order of the range of cases it must cover, which for sparse cases is terribly inefficient memory-wise. By using a Dictionary in source code, you basically force the compiler's hand; it will do it your way, instead of compromising on performance to gain memory efficiency.

但是,编译器通常会选择性能和空间效率最佳折衷的实现,因此它只会在大量密集的情况下使用跳转表。这是因为跳转表需要内存中的空间,其顺序与它必须涵盖的情况范围相同,对于稀疏情况,这在内存方面是非常低效的。通过在源代码中使用字典,您基本上可以强制编译器使用;它会按照您的方式进行操作,而不是为了提高内存效率而牺牲性能。

So, I would expect most cases in which either a switch statement or a Dictionary could be used in source to perform better when using a Dictionary. Large numbers of cases in switch statements are to be avoided anyway, as they are less maintainable.

因此,我希望在大多数情况下,可以在源代码中使用 switch 语句或字典在使用字典时性能更好。无论如何都要避免 switch 语句中的大量 case,因为它们的可维护性较差。

回答by Brian Rasmussen

This is a good example of why micro-benchmarks can be misleading. The C# compiler generates different IL depending on the size of the switch/case. So switching on a string like this

这是一个很好的例子,说明了微基准测试可能具有误导性的原因。C# 编译器根据 switch/case 的大小生成不同的 IL。所以打开这样的字符串

switch (text) 
{
     case "a": Console.WriteLine("A"); break;
     case "b": Console.WriteLine("B"); break;
     case "c": Console.WriteLine("C"); break;
     case "d": Console.WriteLine("D"); break;
     default: Console.WriteLine("def"); break;
}

produce IL that essentially does the following for each case:

生成 IL 基本上对每种情况执行以下操作:

L_0009: ldloc.1 
L_000a: ldstr "a"
L_000f: call bool [mscorlib]System.String::op_Equality(string, string)
L_0014: brtrue.s L_003f

and later

然后

L_003f: ldstr "A"
L_0044: call void [mscorlib]System.Console::WriteLine(string)
L_0049: ret 

I.e. it's a series of comparisons. So run time is linear.

即它是一系列的比较。所以运行时间是线性的。

However, adding additional cases, e.g. to include all the letters from a-z, changes the IL generated to something like this for each:

但是,添加其他案例,例如包含来自 az 的所有字母,会将每个生成的 IL 更改为如下所示:

L_0020: ldstr "a"
L_0025: ldc.i4.0 
L_0026: call instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)

and

L_0176: ldloc.1 
L_0177: ldloca.s CS
L_01f6: ldstr "A"
L_01fb: call void [mscorlib]System.Console::WriteLine(string)
L_0200: ret 
##代码##01 L_0179: call instance bool [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::TryGetValue(!0, !1&) L_017e: brfalse L_0314

and finally

最后

##代码##

I.e. it now uses a dictionary instead of a series of string compares, and thus gets the performance of a dictionary.

即它现在使用字典而不是一系列字符串比较,从而获得字典的性能。

In other words the IL code generated for these are different and this is just at the IL level. The JIT compiler may optimize further.

换句话说,为这些生成的 IL 代码是不同的,这只是在 IL 级别。JIT 编译器可能会进一步优化。

TL;DR: So the morale of the story is to look at real data and profile instead of trying to optimize based on micro-benchmarks.

TL;DR:所以这个故事的士气是查看真实数据和配置文件,而不是尝试基于微基准进行优化。

回答by Brian Rasmussen

By default, a switch on a string is implemented like an if / else / if / else construct. As suggested by Brian, the compiler will convert the switch to a hashtable when it gets bigger. Bart de Smet shows this in this channel9 video, (switch is discussed at 13:50)

默认情况下,字符串上的开关是像 if / else / if / else 结构一样实现的。正如 Brian 所建议的那样,当它变大时,编译器会将 switch 转换为哈希表。Bart de Smet在这个 channel9 视频中展示了这一点,(切换在 13:50 讨论)

The compiler is not doing it for 4 items because it is being conservative, to prevent that the cost of the optimization outweigh the benefits. Building the hashtable costs a little time and memory.

编译器没有为 4 个项目做这件事,因为它是保守的,以防止优化的成本超过收益。构建哈希表需要一点时间和内存。

回答by dthorpe

As with many questions involving compiler codegen decisions, the answer is "it depends".

与许多涉及编译器代码生成决策的问题一样,答案是“视情况而定”。

Building your own hash table will probably run faster than compiler generated code in many cases because the compiler has other cost metrics it is trying to balance that you are not: primarily, memory consumption.

在许多情况下,构建您自己的哈希表可能比编译器生成的代码运行得更快,因为编译器有其他成本指标,它试图平衡您没有的其他成本指标:主要是内存消耗。

A hash table will use more memory than a handful of if-then-else IL instructions. If the compiler spit out a hash table for every switch statement in a program, memory use would explode.

哈希表将使用比少数 if-then-else IL 指令更多的内存。如果编译器为程序中的每个 switch 语句吐出一个哈希表,内存使用量就会激增。

As the number of case blocks in the switch statement grows, you will probably see the compiler produce different code. With more cases, there is greater justification for the compiler to abandon small and simple if-then-else patterns in favor of faster but fatter alternatives.

随着 switch 语句中 case 块数量的增加,您可能会看到编译器生成不同的代码。在更多情况下,编译器更有理由放弃小而简单的 if-then-else 模式,转而采用更快但更丰富的替代方案。

I don't know offhand if the C# or JIT compilers perform this particular optimization, but a common compiler trick for switch statements when the case selectors are many and mostly sequential is to compute a jump vector. This requires more memory (in the form of compiler generated jump tables embedded in the code stream) but executes in constant time. Subtract arg - "a", use result as index into jump table to jump to appropriate case block. Boom, yer done, regardless of whether there are 20 or 2000 cases.

我不知道 C# 或 JIT 编译器是否会执行这种特定的优化,但是当 case 选择器很多且大部分是连续的时,switch 语句的一个常见编译器技巧是计算跳转向量。这需要更多的内存(以编译器生成的嵌入代码流中的跳转表的形式)但在恒定时间内执行。减去 arg - "a",将结果作为跳转表的索引,跳转到相应的 case 块。繁荣,你完成了,无论是 20 还是 2000 个案例。

A compiler is more likely to shift into jump table mode when the switch selector type is char or int or enum andthe values of the case selectors are mostly sequential ("dense"), since these types can be easily subtracted to create an offset or index. String selectors are a little harder.

当 switch 选择器类型是 char 或 int 或 enum并且case 选择器的值大多是连续的(“密集”)时,编译器更有可能转换到跳转表模式,因为这些类型可以很容易地减去以创建偏移量或指数。字符串选择器有点难。

String selectors are "interned" by the C# compiler, meaning the compiler adds the string selectors values to an internal pool of unique strings. The address or token of an interned string can be used as its identity, allowing for int-like optimizations when comparing intern'd strings for identity / byte-wise equality. With sufficient case selectors, the C# compiler will produce IL code that looks up the intern'd equivalent of the arg string (hash table lookup), then compares (or jump tables) the interned token with the precomputed case selector tokens.

字符串选择器由 C# 编译器“嵌入”,这意味着编译器将字符串选择器值添加到唯一字符串的内部池中。实习字符串的地址或标记可用作其标识,在比较实习字符串的标识/字节相等时允许进行类似 int 的优化。有了足够多的 case 选择器,C# 编译器将生成 IL 代码,用于查找 arg 字符串的 intern 等效项(哈希表查找),然后将 interned 标记与预先计算的 case 选择器标记进行比较(或跳转表)。

If you can coax the compiler into producing jump-table code in the char/int/enum selector case, this can execute faster than using your own hash table.

如果您可以诱使编译器在 char/int/enum 选择器的情况下生成跳转表代码,则这比使用您自己的哈希表执行得更快。

For the string selector case, the IL code still has to do a hash lookup so any performance difference from using your own hash table is probably a wash.

对于字符串选择器的情况,IL 代码仍然需要进行哈希查找,因此使用您自己的哈希表的任何性能差异都可能是一种清洗。

In general, though, you shouldn't dwell on these compiler nuances too much when writing application code. Switch statements are generally much easier to read and understand than a hash table of function pointers. Switch statements that are big enough to push the compiler into jump table mode are often too big to be humanly readable.

不过,一般而言,在编写应用程序代码时,您不应过多地关注这些编译器的细微差别。Switch 语句通常比函数指针的哈希表更容易阅读和理解。足够大以将编译器推入跳转表模式的 switch 语句通常太大而无法人类阅读。

If you find a switch statement is in a performance hotspot of your code, and you have measured with a profiler that it has tangible performance impact, then changing your code to use your own dictionary is a reasonable tradeoff for a performance gain.

如果您发现 switch 语句位于您代码的性能热点中,并且您已经使用分析器测量它对性能有明显影响,那么更改代码以使用您自己的字典是对性能提升的合理权衡。

Writing your code to use a hash table from the start, with no performance measurements to justify this choice, is over-engineering that will lead to unfathomable code with an unnecessarily higher maintenance cost. Keep It Simple.

编写您的代码以从一开始就使用哈希表,没有性能测量来证明这种选择是合理的,这是过度设计的,这将导致深不可测的代码和不必要的更高的维护成本。把事情简单化。