C++ 跳表开关案例问题

Question

提问by Brock Woolf

I am trying to understand some things about jump tables and its relationship between a switch case statement.

我试图了解有关跳转表及其 switch case 语句之间关系的一些事情。

I was told that a jump table is a O(1) structure that the compiler generates which makes lookup of values essentially about as fast as you can get. However in some cases a Hashtable/Dictionary might be faster. I was also told this will only work if the switch case contains ordereddata values.

有人告诉我，跳转表是编译器生成的 O(1) 结构，它使值的查找基本上尽可能快。但是在某些情况下，哈希表/字典可能会更快。我还被告知这仅在 switch case 包含ordered数据值时才有效。

Can someone please confirm or deny this and explain what a jump table is, it's importance and the time complexity versus using a dictionary or hashtable. Thanks.

有人可以确认或否认这一点并解释什么是跳转表，它的重要性以及与使用字典或哈希表相比的时间复杂度。谢谢。

Answer 1

回答by

A jump tableis an abstract structure used to transfer controlto another location. Goto, continue, and break are similar, except they always transfer to a specific location instead of one possibility from many. In particular, this control flow is not the same as a function call. (Wikipedia's article on branch tablesis related.)

甲跳转表是用于将一个抽象结构控制转移到另一个位置。Goto、continue 和 break 是相似的，除了它们总是转移到特定位置，而不是从许多可能中转移。特别是，这个控制流与函数调用不同。（维基百科关于分支表的文章是相关的。）

A switch statementis how to write jump tables in C/C++. Only a limited form is provided (can only switch on integral types) to make implementations easier and faster in this common case. (How to implement jump tables efficiently has been studied much more for integral types than for the general case.) A classic example is Duff's Device.

一个switch语句是怎么写的跳转表在C / C ++。只提供了一种有限的形式（只能打开整数类型），以便在这种常见情况下更容易和更快地实现。（对于整数类型，如何有效地实现跳转表的研究比一般情况要多得多。）一个经典的例子是Duff 的 Device。

However, the full capability of a jump table is often not required, such as when every case would have a break statement. These "limited jump tables" are a different pattern, which is only taking advantage of a jump table's well-studied efficiency, and are common when each "action" is independent of the others.

但是，通常不需要跳转表的全部功能，例如当每个 case 都有一个 break 语句时。这些“有限跳转表”是一种不同的模式，它只是利用了跳转表的充分研究的效率，并且在每个“动作”独立于其他“动作”时很常见。

Actual implementations of jump tables take different forms, mostly differing in how the key to index mapping is done. That mapping is where terms like "dictionary" and "hash table" come in, and those techniques can be used independently of a jump table. Saying that some code "uses a jump table" doesn't imply by itself that you have O(1) lookup.

跳转表的实际实现采用不同的形式，主要区别在于如何完成索引映射的键。这种映射是“字典”和“哈希表”之类的术语出现的地方，这些技术可以独立于跳转表使用。说某些代码“使用跳转表”本身并不意味着您有 O(1) 查找。

The compiler is free to choose the lookup method for each switch statement, and there is no guarantee you'll get one particular implementation; however, compiler options such as optimize-for-speed and optimize-for-size should be taken into account.

编译器可以自由地为每个 switch 语句选择查找方法，并且不能保证您会得到一个特定的实现；但是，应该考虑诸如优化速度和优化大小之类的编译器选项。

You should look into studying data structuresto get a handle on the different complexity requirements imposed by them. Briefly, if by "dictionary" you mean a balanced binary tree, then it is O(log n); and a hash table depends on its hash function and collision strategy. In the particular case of switch statements, since the compiler has full information, it can generate a perfect hash functionwhich means O(1) lookup. However, don't get lost by just looking at overall algorithmic complexity: it hides important factors.

您应该研究数据结构，以了解它们强加的不同复杂性要求。简而言之，如果“字典”是指平衡二叉树，那么它是 O(log n)；一个哈希表取决于它的哈希函数和碰撞策略。在 switch 语句的特殊情况下，由于编译器拥有完整的信息，它可以生成一个完美的哈希函数，这意味着 O(1) 查找。然而，不要仅仅看整体算法的复杂性就迷失了方向：它隐藏了重要的因素。

Answer 2

回答by Jerry Coffin

A jump table is basically an array of pointers to pieces of code to handle the various cases in the switch statement. It's most likely to be generated when your cases are dense (i.e. you have a case for every possible value in a range). For example, given a statement like:

跳转表基本上是一个指向代码段的指针数组，用于处理 switch 语句中的各种情况。它最有可能在您的案例密集时生成（即您有一个范围内每个可能值的案例）。例如，给定如下语句：

switch (i) {
   case 1: printf("case 1"); break;
   case 2: printf("case 2"); break;
   case 3: printf("case 3"); break;
}

it could generate code roughly equivalent to something like this:

它可以生成大致相当于这样的代码：

void case1() { printf("case 1"); }
void case2() { printf("case 2"); }
void case3() { printf("case 3"); }

typedef void (*pfunc)(void);

pfunc functions[3] = {case1, case2, case3};

if ((unsigned)i<3)    
    functions[i]();

This has O(K) complexity. A typical hash table also has roughly O(K) expectedcomplexity, though the worst case is typically O(N). The jump table will usually be faster, but it will usually only be used if the table will be quite dense, whereas a hash table/dictionary works quite well even when the cases would be quite sparse.

这具有 O(K) 复杂度。一个典型的哈希表也有大约 O(K) 的预期复杂度，尽管最坏的情况通常是 O(N)。跳转表通常会更快，但通常只有在表非常密集的情况下才会使用它，而哈希表/字典即使在案例非常稀疏的情况下也能很好地工作。

Answer 3

回答by Jonathan Graehl

Suppose you had an array of procedures:

假设你有一个过程数组：

void fa() { 
 printf("a\n");
}

...

void fz() { 
 printf("it's z!\n");
}



typedef void (*F)();
F table[26]={fa,fb,...,fz};

Suppose you accept a character (from a-z) of input from the user and run fc:

假设您接受来自用户的输入字符（来自 az）并运行 fc：

char c;
switch(c) {
   case 'a': fa();break;
   case 'b': fb();break;
   ...
   case 'z': fz();break;       
   default: exit(-1);
}

Ideally this would be replaced with something like:

理想情况下，这将被替换为：

if (c<'a' || c>'z') exit(-1);
else (*table[c-'a'])();

Naturally, you might make the table bigger so the range check wouldn't be necessary.

自然地，您可以将表变大，这样就不需要范围检查了。

The compiler would do this for arbitrary code, not necessarily function calls only, and would do it by storing the address to jump to (essentially, a goto). C doesn't directly support any sort of computed goto (indexing into a table or otherwise), but the CPU instructions for it are pretty simple.

编译器会针对任意代码执行此操作，不一定只针对函数调用，并且会通过存储要跳转到的地址（本质上是 goto）来执行此操作。C 不直接支持任何类型的计算 goto（索引到表或其他方式），但它的 CPU 指令非常简单。

Answer 4

回答by Richard Pennington

Compiling for a switch statement can take many forms, depending on the cases. If the cases are close together, it is a no brainer: use a jump table. If the cases are far apart, use if (case == value) or use a map. Or a compiler can use a combination: islands of jump tables determined by if checks of the jump table ranges.

编译 switch 语句可以采用多种形式，具体取决于情况。如果案例很接近，那就不用动脑子了：使用跳转表。如果案例相距很远，请使用 if (case == value) 或使用地图。或者编译器可以使用组合：由跳转表范围的 if 检查确定的跳转表岛。

Answer 5

回答by Dave

A jump table is simple an array of function pointers, you can picture a jump table roughly like so:

跳转表是一个简单的函数指针数组，你可以大致像这样描绘一个跳转表：

int (*functions[10])(); /* Array of 10 Function Pointers */

From my understanding, this is used with a case statement like so: each condition, case _, will be an index into this array, so for example:

根据我的理解，这与这样的 case 语句一起使用：每个条件 case _ 将是该数组的索引，例如：

switch( a ) {
    case 1:  // (*functions[1])() // Call function containing actions in case of 1
        ...  
    case 2:  // (*functions[2])() // Call function containing actions in case of 2
        ...

Each case, transforms to become simply functions[a]. This means that accessing functions[9] is just as quick as accessing functions[1]. Giving you the O(1) time you mentioned.

每种情况都转换为简单的函数[a]。这意味着访问函数[9] 和访问函数[1] 一样快。给你你提到的 O(1) 时间。

Obviously, if you have case 1, and case 4907, this isn't going to be a good method, and the hash table/dictionary methods you mentioned may come into play.

显然，如果您有案例 1 和案例 4907，这将不是一个好方法，您提到的哈希表/字典方法可能会起作用。

Answer 6

回答by Glenn Teitelbaum

To further elaborate on Jerry's answerand others

进一步详细说明杰瑞的回答和其他人

Given:

鉴于：

int x=1;
switch (i) {
   case 1: x=6; break;
   case 2: x++;
   // Fall through
   case 3: x+=7; break;
}

you could have something like the following:

你可以有如下内容：

int f1() {return 6;}
int f2() {return 1+f3();}
int f3() {return 8;}

The the compiler could use a jump table to index {f1, f2, f3}

编译器可以使用跳转表来索引 {f1, f2, f3}

The compiler can do inlining when creating the table having f1, f2, f3setting xdirectly to 6,9,8

编译器可以在创建直接f1, f2, f3设置x为的表时进行内联6,9,8

But if you wrote the functions, and rolled your own jump table, f1,f2,f3could be anywhere, but the compiler will know to put them close to the switchcreating much better code locality than you could.

但是，如果您编写函数并滚动您自己的跳转表，则f1,f2,f3可以在任何地方，但编译器将知道将它们置于switch创建比您更好的代码位置附近。

Note that in many cases the compiler will generate a guard to check if iis in range (or to handle the default) and if you are sure that it always is one of the cases, you could skip that

请注意，在许多情况下，编译器会生成一个守卫来检查是否i在范围内（或处理default），如果您确定它始终是其中一种情况，您可以跳过它

The interesting thing is that for under a small number of cases, and under different compiler flags (compiler dependent) the switchwould not use a table, but would just do ifs, similar to:

有趣的是，在少数情况下，在不同的编译器标志（依赖于编译器）下switch，不会使用表，而只会执行 ifs，类似于：

if (i==1) x=f1();
else if (i==2) x=f2();
else if (i==3) x=f3();

or it might optimize this (where simple tests are one instruction) to:

或者它可能会优化它（其中简单的测试是一个指令）：

x=(i==1) ? f1()
: (i==2) ? f2()
: (i==3) ? f3()
: x;

The best advice is to look at the assembly generated to see what the compiler did to your code on your architecture, g++ on Linux/intel will generate something like the following, if there is a jump table

最好的建议是查看生成的程序集，看看编译器对您的架构上的代码做了什么，Linux/intel 上的 g++ 将生成如下内容，如果有跳转表

(note I had to go to 5 casestatements to force the jump table, it used ifs below that number of casestatements)

（请注意，我必须转到 5 个case语句来强制跳转表，它使用了低于该case语句数的 ifs）

Note that small holes will be in the jump table to do the default

注意跳转表中会有小孔来做 default

int foo(int i)
{
   int x=1;
   switch (i) {
       case 1: x=6; break;
       case 2: x++;
        // Fall through
       case 3: x+=7; break;
       case 4: x+=2; break;
       case 5: x+=9; break;
    }
  return x;
}

would generate the following assembly code (// comments are mine):

将生成以下汇编代码（// 注释是我的）：

        cmp     edi, 5                     //make sure it is not over 5
        ja      .L2                        //jump to default case
        mov     edi, edi
        jmp     [QWORD PTR .L4[0+rdi*8]]   // use the jump table at label L4:
.L4:
        .quad   .L2                        // if i=0, set x=1 (default)
        .quad   .L9                        // f1() see below
        .quad   .L10                       // f2() see below
        .quad   .L6                        // f3() see below
        .quad   .L7                        // f4() see below
        .quad   .L8                        // f5() see below
.L10:
        mov     eax, 9                     // x=9
        ret
.L9:
        mov     eax, 6                     // x=6
        ret
.L8:
        mov     eax, 10                    // x=10
        ret
.L6:
        mov     eax, 8                     // x=8
        ret
.L7:
        mov     eax, 3                     // x=3
        ret
.L2:
        mov     eax, 1                     // default, x was 1, noop is: x=1
        ret

C++ 跳表开关案例问题

提问by Brock Woolf

回答by

回答by Jerry Coffin

回答by Jonathan Graehl

回答by Richard Pennington

回答by Dave

回答by Glenn Teitelbaum

相关推荐

最近更新

标签

C++ 跳表开关案例问题

提问by Brock Woolf

回答by

回答by Jerry Coffin

回答by Jonathan Graehl

回答by Richard Pennington

回答by Dave

回答by Glenn Teitelbaum

相关推荐

C++ 初始化程序无法确定字符数组的大小

C++ 在C下通过引用传递指针参数？

C++ 链接：致命错误 LNK1104：无法打开文件“glut32.lib”。如何修复这个错误？

C/C++ 程序的最大堆栈大小

相关推荐

最近更新

标签