Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/216008/

Date: 2020-08-03 18:28:43  Source: igfitidea

C#: Virtual Function invocation is even faster than a delegate invocation?

Tags: c#, performance, delegates, virtual

Asked by Morgan Cheng

I just ran into a code design question. Say I have a "template" method that invokes some functions that may vary. An intuitive design is to follow the Template Method design pattern: define the varying functions as virtual functions to be overridden in subclasses. Alternatively, I can use delegates instead of virtual functions. The delegates are injected, so they can be customized too.

Originally, I thought the second, delegate-based way would be faster than the virtual way, but a code snippet shows that this is not correct.

In the code below, the first DoSomething method follows the template pattern. It calls the virtual method IsTokenChar. The second DoSomething method doesn't depend on a virtual function. Instead, it takes a delegate as a parameter. On my computer, the first DoSomething is always faster than the second. The result is something like 1645:1780 (elapsed milliseconds).

"Virtual invocation" is dynamic binding and should be more time-consuming than a direct delegate invocation, right? But the result shows it is not.

Can anybody explain this?

using System;
using System.Diagnostics;

class Foo
{
    public virtual bool IsTokenChar(string word)
    {
        return String.IsNullOrEmpty(word);
    }

    // this is a template method
    public int DoSomething(string word)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (IsTokenChar(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    public int DoSomething(Predicate<string> predicator, string word)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (predicator(word))
            {
                ++trueCount;
            }
        }
        return trueCount;
    }

    private int repeat = 200000000;
}

class Program
{
    static void Main(string[] args)
    {
        Foo f = new Foo();

        {
            Stopwatch sw = Stopwatch.StartNew();
            f.DoSomething(null);
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }

        {
            Stopwatch sw = Stopwatch.StartNew();
            f.DoSomething(str => String.IsNullOrEmpty(str), null);
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
    }
}

Answered by Daniel Plaisted

It is possible that, since you don't have any methods that override the virtual method, the JIT is able to recognize this and use a direct call instead.
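
One way to probe this guess is to time the same loop with and without an override in play. The sketch below is only an illustration: the Bar subclass, the Time helper and the repeat parameter are hypothetical additions, not part of the original program, and whether the timings differ at all depends on the runtime and JIT version.

```csharp
using System;
using System.Diagnostics;

class Foo
{
    public virtual bool IsTokenChar(string word)
    {
        return String.IsNullOrEmpty(word);
    }

    // Same template method as in the question, with repeat passed in.
    public int DoSomething(string word, int repeat)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (IsTokenChar(word)) ++trueCount;
        }
        return trueCount;
    }
}

// Hypothetical subclass, added only so that an override of IsTokenChar exists.
class Bar : Foo
{
    public override bool IsTokenChar(string word)
    {
        return word == null || word.Length == 0;
    }
}

static class OverrideExperiment
{
    // Returns the elapsed milliseconds for one run of the template method.
    public static long Time(Foo f, int repeat)
    {
        Stopwatch sw = Stopwatch.StartNew();
        f.DoSomething(null, repeat);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int repeat = 200000000;
        Console.WriteLine("no override used: " + Time(new Foo(), repeat));
        Console.WriteLine("override used:    " + Time(new Bar(), repeat));
    }
}
```

If the two timings are about the same, that would suggest the presence of an override is not what makes the virtual call fast on that machine.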

For something like this it's generally better to test it out as you have done than try to guess what the performance will be. If you want to know more about how delegate invocation works, I suggest the excellent book "CLR Via C#" by Jeffrey Richter.

Answered by Yes Fish...

Virtual overrides have some sort of redirection table, or something similar, that is hardcoded and fully optimized at compile time. It's set in stone and very fast.

Delegates are dynamic, which always carries some overhead, and they seem to be objects too, so that adds up.

You shouldn't worry about these small performance differences (unless you are developing performance-critical software for the military). For most purposes, good code structure wins over optimization.

Answered by Michael Burr

I doubt it accounts for all of your difference, but one thing off the top of my head that may account for some of it is that virtual method dispatch already has the this pointer ready to go. When calling through a delegate, the this pointer has to be fetched from the delegate.

Note that according to this blog article, the difference was even greater in .NET v1.x.

Answered by Franci Penov

A virtual call dereferences two pointers at a well-known offset in memory. It's not actually dynamic binding; there is no code at runtime that reflects over the metadata to discover the right method. The compiler generates a couple of instructions to do the call, based on the this pointer. In fact, the virtual call is a single IL instruction.

A predicate call is creating an anonymous class to encapsulate the predicate. That class has to be instantiated and there is some code generated to actually check whether the predicate function pointer is null or not.

I would suggest you look at the IL constructs for both. Compile a simplified version of your source above with a single call to each of the two DoSomething methods. Then use ILDASM to see what the actual code is for each pattern.
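
A stripped-down source for that experiment could look like the sketch below. It goes one step further than the suggestion and drops the loops, so each pattern is a single call site; the method names CallVirtual and CallDelegate are made up for this illustration. With the Microsoft C# compiler, both call sites typically show up in ILDASM as a callvirt instruction, one targeting Foo::IsTokenChar and the other targeting Invoke on Predicate<string>, but check the output of your own toolchain.

```csharp
using System;

class Foo
{
    public virtual bool IsTokenChar(string word)
    {
        return String.IsNullOrEmpty(word);
    }

    // Pattern 1: one virtual call on this.
    public bool CallVirtual(string word)
    {
        return IsTokenChar(word);
    }

    // Pattern 2: one call through an injected delegate.
    public bool CallDelegate(Predicate<string> predicator, string word)
    {
        return predicator(word);
    }
}

class Program
{
    static void Main()
    {
        Foo f = new Foo();
        Console.WriteLine(f.CallVirtual(null));                                      // True
        Console.WriteLine(f.CallDelegate(str => String.IsNullOrEmpty(str), null));  // True
    }
}
```

Compile it and open the resulting assembly in ILDASM, then compare the bodies of CallVirtual, CallDelegate and Main.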

(And I am sure I'll get downvoted for not using the right terminology :-))

Answered by Jon Skeet

Think about what's required in each case:

Virtual call

  • Check for nullity
  • Navigate from object pointer to type pointer
  • Look up method address in instruction table
  • (Not sure - even Richter doesn't cover this) Go to base type if method isn't overridden? Recurse until we find the right method address. (I don't think so - see edit at bottom.)
  • Push original object pointer onto stack ("this")
  • Call method

Delegate call

  • Check for nullity
  • Navigate from object pointer to array of invocations (all delegates are potentially multicast)
  • Loop over array, and for each invocation:
    • Fetch method address
    • Work out whether or not to pass the target as first argument
    • Push arguments onto stack (may have been done already - not sure)
    • Optionally (depending on whether the invocation is open or closed) push the invocation target onto the stack
    • Call method

There may be some optimisation so that there's no looping involved in the single-call case, but even so that will take a very quick check.
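
To make the multicast point concrete, here is a small sketch that inspects a delegate's invocation list through the public Delegate.GetInvocationList method. The MulticastDemo class and its First and Second methods are hypothetical. It says nothing about how the runtime optimises the single-target case internally; it only shows that any Predicate<string> may carry one target or several.

```csharp
using System;

static class MulticastDemo
{
    public static int Calls;

    static bool First(string s) { Calls++; return true; }
    static bool Second(string s) { Calls++; return false; }

    // A delegate with exactly one target.
    public static Predicate<string> Single()
    {
        return First;
    }

    // The same delegate type, now carrying two targets.
    public static Predicate<string> Combined()
    {
        Predicate<string> p = First;
        p += Second;
        return p;
    }

    static void Main()
    {
        Console.WriteLine(Single().GetInvocationList().Length);  // 1
        Predicate<string> both = Combined();
        Console.WriteLine(both.GetInvocationList().Length);      // 2
        // Both targets run; the result is that of the last one invoked.
        Console.WriteLine(both("x"));                            // False
        Console.WriteLine(Calls);                                // 2
    }
}
```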

But basically there's just as much indirection involved with a delegate. Given the bit I'm unsure of in the virtual method call, it's possible that a call to an unoverridden virtual method in a massively deep type hierarchy would be slower... I'll give it a try and edit with the answer.

EDIT: I've tried playing around with the depth of the inheritance hierarchy (up to 20 levels), the point of "most derived overriding" and the declared variable type, and none of them seems to make a difference.

EDIT: I've just tried the original program using an interface (which is passed in) - that ends up having about the same performance as the delegate.
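
The interface variant is not shown in the answer, so the following is only a guess at what such a test might look like. ITokenChecker, NullOrEmptyChecker and InterfaceExperiment are hypothetical names; the loop mirrors DoSomething from the question, with the checker passed in the same way the delegate was.

```csharp
using System;
using System.Diagnostics;

// Hypothetical interface standing in for the injected behaviour.
interface ITokenChecker
{
    bool IsTokenChar(string word);
}

class NullOrEmptyChecker : ITokenChecker
{
    public bool IsTokenChar(string word)
    {
        return String.IsNullOrEmpty(word);
    }
}

static class InterfaceExperiment
{
    // Same loop as in the question, dispatching through the interface.
    public static int DoSomething(ITokenChecker checker, string word, int repeat)
    {
        int trueCount = 0;
        for (int i = 0; i < repeat; ++i)
        {
            if (checker.IsTokenChar(word)) ++trueCount;
        }
        return trueCount;
    }

    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        DoSomething(new NullOrEmptyChecker(), null, 200000000);
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}
```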

Answered by Jon Skeet

Just wanted to add a few corrections to Jon Skeet's response:

A virtual method call does not need to do a null check (automatically handled with hardware traps).

It also does not need to walk up the inheritance chain to find non-overridden methods (that's what the virtual method table is for).

A virtual method call is essentially one extra level of indirection when invoking. It is slower than a normal call because of the table look-up and subsequent function pointer call.

A delegate call also involves an extra level of indirection.

Calls to a delegate do not involve putting arguments in an array unless you are performing a dynamic invoke using the DynamicInvoke method.

A delegate call involves the calling method calling a compiler-generated Invoke method on the delegate type in question. A call to predicator(value) is turned into predicator.Invoke(value).
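
The three ways of calling a delegate mentioned here can be put side by side. In this sketch (the InvokeDemo class and its method names are made up for illustration), the first two forms are equivalent, while DynamicInvoke is the late-bound path that takes its arguments as an object array and returns a boxed result.

```csharp
using System;

static class InvokeDemo
{
    // Call syntax: the compiler turns this into predicator.Invoke(value).
    public static bool ViaCallSyntax(Predicate<string> predicator, string value)
    {
        return predicator(value);
    }

    // Explicit call to the Invoke method of the delegate type.
    public static bool ViaInvoke(Predicate<string> predicator, string value)
    {
        return predicator.Invoke(value);
    }

    // Late-bound call: arguments travel in an object[] and the result is boxed.
    public static bool ViaDynamicInvoke(Predicate<string> predicator, string value)
    {
        return (bool)predicator.DynamicInvoke(new object[] { value });
    }

    static void Main()
    {
        Predicate<string> p = String.IsNullOrEmpty;
        Console.WriteLine(ViaCallSyntax(p, null));     // True
        Console.WriteLine(ViaInvoke(p, null));         // True
        Console.WriteLine(ViaDynamicInvoke(p, null));  // True
    }
}
```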

The Invoke method in turn is implemented by the JIT to call the function pointer(s) (stored internally in the delegate object).

In your example, the delegate you passed should have been implemented as a compiler-generated static method, since the implementation does not access any instance variables or locals. The need to fetch the "this" pointer from the heap should therefore not be an issue.
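
Whether a given delegate carries a target object can be checked at runtime through the Delegate.Target and Delegate.Method properties. The sketch below uses hypothetical names (Checker, TargetDemo). The results for the method-group cases are fixed by the framework; how a non-capturing lambda is compiled depends on the compiler version, and newer compilers may emit it as an instance method on a hidden class, so the lambda case is printed without an expected value.

```csharp
using System;

class Checker
{
    public bool IsEmpty(string s)
    {
        return String.IsNullOrEmpty(s);
    }
}

static class TargetDemo
{
    public static Predicate<string> FromStaticMethod()
    {
        return String.IsNullOrEmpty;
    }

    public static Predicate<string> FromInstanceMethod(Checker c)
    {
        return c.IsEmpty;
    }

    public static Predicate<string> FromLambda()
    {
        return str => String.IsNullOrEmpty(str);
    }

    static void Main()
    {
        Predicate<string> s = FromStaticMethod();
        // Static method: no target object is stored in the delegate.
        Console.WriteLine(s.Method.IsStatic + " " + (s.Target == null));  // True True

        Checker c = new Checker();
        Predicate<string> i = FromInstanceMethod(c);
        // Instance method: the delegate holds the object to pass as this.
        Console.WriteLine(i.Method.IsStatic + " " + ReferenceEquals(i.Target, c));  // False True

        Predicate<string> l = FromLambda();
        // Compiler-dependent, so inspect the output rather than assume it.
        Console.WriteLine(l.Method.IsStatic + " " + (l.Target == null));
    }
}
```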

The performance of delegate and virtual function calls should be mostly the same, and your performance tests show that they are very close.

The difference could be due to the need for additional checks and branches because of multicast (as suggested by Jon). Another reason could be that the JIT compiler does not inline the Delegate.Invoke method, and the implementation of Delegate.Invoke does not handle arguments as well as the implementation used when performing virtual method calls.
