Virtual functions and performance - C++

Question

Asked by Navaneeth K N

In my class design, I use abstract classes and virtual functions extensively. I have a feeling that virtual functions affect performance. Is this true? I suspect the difference is not noticeable and that worrying about it is premature optimization. Right?

Answer 1

Accepted answer by Greg Hewgill

A good rule of thumb is:

It's not a performance problem until you can prove it.

The use of virtual functions will have a very slight effect on performance, but it's unlikely to affect the overall performance of your application. Better places to look for performance improvements are in algorithms and I/O.

An excellent article that talks about virtual functions (and more) is Member Function Pointers and the Fastest Possible C++ Delegates.

Answer 2

Answered by Crashworks

Your question made me curious, so I went ahead and ran some timings on the 3 GHz in-order PowerPC CPU we work with. The test I ran was to make a simple 4D vector class with get/set functions:

class TestVec
{
    float x, y, z, w;
public:
    float GetX() { return x; }
    float SetX(float to) { return x = to; }  // and so on for the other three
};

Then I set up three arrays each containing 1024 of these vectors (small enough to fit in L1) and ran a loop that added them to one another (A.x = B.x + C.x) 1000 times. I ran this with the functions defined as inline, virtual, and regular function calls. Here are the results:

inline: 8ms (0.65ns per call)
direct: 68ms (5.53ns per call)
virtual: 160ms (13ns per call)

So, in this case (where everything fits in cache) the virtual function calls were about 20x slower than the inline calls. But what does this really mean? Each trip through the loop caused exactly 3 * 4 * 1024 = 12,288 function calls (1024 vectors times four components times three calls per add), so these times represent 1000 * 12,288 = 12,288,000 function calls. The virtual loop took 92 ms longer than the direct loop, so the additional overhead was about 7 nanoseconds per call.

From this I conclude: yes, virtual functions are much slower than direct functions, and no, unless you're planning on calling them ten million times per second, it doesn't matter.

See also: comparison of the generated assembly.

Answer 3

Answered by Chuck

When Objective-C (where all methods are virtual) is the primary language for the iPhone and freakin' Java is the main language for Android, I think it's pretty safe to use C++ virtual functions on our 3 GHz dual-core towers.

Answer 4

Answered by Mark James

In very performance critical applications (like video games) a virtual function call can be too slow. With modern hardware, the biggest performance concern is the cache miss. If data isn't in the cache, it may be hundreds of cycles before it's available.

A normal function call can generate an instruction cache miss when the CPU fetches the first instruction of the new function and it's not in the cache.

A virtual function call first needs to load the vtable pointer from the object. This can result in a data cache miss. Then it loads the function pointer from the vtable which can result in another data cache miss. Then it calls the function which can result in an instruction cache miss like a non-virtual function.

In many cases, two extra cache misses are not a concern, but in a tight loop on performance critical code it can dramatically reduce performance.

Answer 5

Answered by Boojum

From page 44 of Agner Fog's "Optimizing Software in C++" manual:

The time it takes to call a virtual member function is a few clock cycles more than it takes to call a non-virtual member function, provided that the function call statement always calls the same version of the virtual function. If the version changes then you will get a misprediction penalty of 10 - 30 clock cycles. The rules for prediction and misprediction of virtual function calls is the same as for switch statements...

Answer 6

Answered by Jason S

There's another performance criterion besides execution time. A vtable takes up memory space as well, and in some cases it can be avoided: ATL uses compile-time "simulated dynamic binding" with templates to get the effect of "static polymorphism", which is sort of hard to explain. You basically pass the derived class as a parameter to a base class template, so at compile time the base class "knows" what its derived class is in each instance. This won't let you store multiple different derived classes in a collection of base types (that's run-time polymorphism). But in the static sense, if you want to make a class Y that is the same as a preexisting template class X which has the hooks for this kind of overriding, you just need to override the methods you care about, and you get the base methods of class X without having to have a vtable.

In classes with large memory footprints, the cost of a single vtable pointer is not much, but some of the ATL classes in COM are very small, and it's worth the vtable savings if the run-time polymorphism case is never going to occur.

Answer 7

Answered by gbjbaanb

Absolutely. It was a problem way back when computers ran at 100 MHz, as every method call required a lookup in the vtable before it was called. But today, on a 3 GHz CPU that has a first-level cache with more memory than my first computer had? Not at all. Allocating memory from main RAM will cost you more time than if all your functions were virtual.

It's like the old, old days when people said structured programming was slow because all the code was split into functions, and each function required stack allocations and a function call!

The only time I would even think of bothering to consider the performance impact of a virtual function, is if it was very heavily used and instantiated in templated code that ended up throughout everything. Even then, I wouldn't spend too much effort on it!

PS think of other 'easy to use' languages - all their methods are virtual under the covers and they don't crawl nowadays.

Answer 8

Answered by Serge

Yes, you're right, and if you are curious about the cost of a virtual function call you might find this post interesting.

Answer 9

Answered by Daemin

The only way I can see a virtual function becoming a performance problem is if many virtual functions are called within a tight loop, and if and only if they cause a page fault or other "heavy" memory operation to occur.

Though like other people have said it's pretty much never going to be a problem for you in real life. And if you think it is, run a profiler, do some tests, and verify if this really is a problem before trying to "undesign" your code for a performance benefit.

Answer 10

Answered by Evgueny Sedov

When a class method is not virtual, the compiler can usually inline it. In contrast, when you call a virtual function through a pointer to a class, the real address is generally known only at run time.

This is well illustrated by a test, where the time difference is ~700% (!):

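#include <stdio.h>   // printf
#include <time.h>    // clock, CLOCKS_PER_SEC

class Direct
{
public:
    // Non-virtual: the compiler can see the body and inline the call.
    unsigned Perform(unsigned &ia) { return ++ia; }
};

class AbstrBase
{
public:
    virtual unsigned Perform(unsigned &ia) = 0;
    virtual ~AbstrBase() {}
};

class Derived : public AbstrBase
{
public:
    virtual unsigned Perform(unsigned &ia) { return ++ia; }
};


int main()
{
    Direct *pdir, dir;
    pdir = &dir;

    // The counter is unsigned so that wrapping around to 0 is well defined;
    // overflowing a signed int would be undefined behavior.
    unsigned ia = 0;
    double start = clock();
    while( pdir->Perform(ia) );   // runs until ia wraps around to 0
    double end = clock();
    printf( "Direct: %.3f, ia=%u\n", (end-start)/CLOCKS_PER_SEC, ia );

    Derived drv;
    AbstrBase *ab = &drv;

    ia = 0;
    start = clock();
    while( ab->Perform(ia) );     // same loop through the virtual interface
    end = clock();
    printf( "Virtual: %.3f, ia=%u\n", (end-start)/CLOCKS_PER_SEC, ia );

    return 0;
}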

The impact of a virtual function call depends heavily on the situation. If there are few calls and a significant amount of work inside the function, it could be negligible.

But when a virtual call is made many times over and each call does only a simple operation, it could be really big.
