C++: pass by value is faster than pass by reference

Question

Asked by Björn Hallström

I wrote a simple C++ program to compare the performance of two approaches: pass by value and pass by reference. Pass by value actually performed better than pass by reference.

The conclusion would seem to be that passing by value requires fewer clock cycles (instructions).

I would be really glad if someone could explain in detail why pass by value requires fewer clock cycles.

#include <iostream>
#include <stdlib.h>
#include <time.h>

using namespace std;

void function(int *ptr);
void function2(int val);

int main() {

   int nmbr = 5;

   clock_t start, stop;
   start = clock();
   for (long i = 0; i < 1000000000; i++) {
       function(&nmbr);
       //function2(nmbr);
   }
   stop = clock();

   cout << "time: " << stop - start;

   return 0;
}

/**
* pass by reference
*/
void function(int *ptr) {
    *ptr *= 5;
}

/**
* pass by value
*/
void function2(int val) {
   val *= 5;
}

Answer 1

Answered by jliv902

A good way to find out why there is any difference is to check the disassembly. Here are the results I got on my machine with Visual Studio 2012.

With optimization flags, both functions generate the same code:

009D1270 57                   push        edi  
009D1271 FF 15 D4 30 9D 00    call        dword ptr ds:[9D30D4h]  
009D1277 8B F8                mov         edi,eax  
009D1279 FF 15 D4 30 9D 00    call        dword ptr ds:[9D30D4h]  
009D127F 8B 0D 48 30 9D 00    mov         ecx,dword ptr ds:[9D3048h]  
009D1285 2B C7                sub         eax,edi  
009D1287 50                   push        eax  
009D1288 E8 A3 04 00 00       call        std::operator<<<std::char_traits<char> > (09D1730h)  
009D128D 8B C8                mov         ecx,eax  
009D128F FF 15 2C 30 9D 00    call        dword ptr ds:[9D302Ch]  
009D1295 33 C0                xor         eax,eax  
009D1297 5F                   pop         edi  
009D1298 C3                   ret

This is basically equivalent to:

int main ()
{
    clock_t start, stop ;
    start = clock () ;
    stop = clock () ;
    cout << "time: " << stop - start ;
    return 0 ;
}

Without optimization flags, you will probably get different results.

function (no optimizations):

00114890 55                   push        ebp  
00114891 8B EC                mov         ebp,esp  
00114893 81 EC C0 00 00 00    sub         esp,0C0h  
00114899 53                   push        ebx  
0011489A 56                   push        esi  
0011489B 57                   push        edi  
0011489C 8D BD 40 FF FF FF    lea         edi,[ebp-0C0h]  
001148A2 B9 30 00 00 00       mov         ecx,30h  
001148A7 B8 CC CC CC CC       mov         eax,0CCCCCCCCh  
001148AC F3 AB                rep stos    dword ptr es:[edi]  
001148AE 8B 45 08             mov         eax,dword ptr [ptr]  
001148B1 8B 08                mov         ecx,dword ptr [eax]  
001148B3 6B C9 05             imul        ecx,ecx,5  
001148B6 8B 55 08             mov         edx,dword ptr [ptr]  
001148B9 89 0A                mov         dword ptr [edx],ecx  
001148BB 5F                   pop         edi  
001148BC 5E                   pop         esi  
001148BD 5B                   pop         ebx  
001148BE 8B E5                mov         esp,ebp  
001148C0 5D                   pop         ebp  
001148C1 C3                   ret

function2 (no optimizations):

00FF4850 55                   push        ebp  
00FF4851 8B EC                mov         ebp,esp  
00FF4853 81 EC C0 00 00 00    sub         esp,0C0h  
00FF4859 53                   push        ebx  
00FF485A 56                   push        esi  
00FF485B 57                   push        edi  
00FF485C 8D BD 40 FF FF FF    lea         edi,[ebp-0C0h]  
00FF4862 B9 30 00 00 00       mov         ecx,30h  
00FF4867 B8 CC CC CC CC       mov         eax,0CCCCCCCCh  
00FF486C F3 AB                rep stos    dword ptr es:[edi]  
00FF486E 8B 45 08             mov         eax,dword ptr [val]  
00FF4871 6B C0 05             imul        eax,eax,5  
00FF4874 89 45 08             mov         dword ptr [val],eax  
00FF4877 5F                   pop         edi  
00FF4878 5E                   pop         esi  
00FF4879 5B                   pop         ebx  
00FF487A 8B E5                mov         esp,ebp  
00FF487C 5D                   pop         ebp  
00FF487D C3                   ret

Why is pass by value faster (in the unoptimized case)?

Well, function() has two extra mov operations. Let's take a look at the first extra mov operation:

001148AE 8B 45 08             mov         eax,dword ptr [ptr]  
001148B1 8B 08                mov         ecx,dword ptr [eax]  
001148B3 6B C9 05             imul        ecx,ecx,5

Here we are dereferencing the pointer. In function2(), we already have the value, so we avoid this step. We first load the pointer itself (the address it holds) into register eax. Then we load the value it points to into register ecx. Finally, we multiply the value by five.

Let's look at the second extra mov operation:

001148B3 6B C9 05             imul        ecx,ecx,5  
001148B6 8B 55 08             mov         edx,dword ptr [ptr]  
001148B9 89 0A                mov         dword ptr [edx],ecx

Now we are going in the opposite direction. We have just finished multiplying the value by 5, and we need to store the result back to the memory address. Because this is unoptimized code, the pointer is reloaded from the stack into edx before the store.

Because function2() does not have to dereference a pointer for the read or for the write-back, it gets to skip these two extra mov operations.

Answer 2

Answered by green lantern

Overhead with passing by reference:

each access needs a dereference, i.e., there is one more memory read

Overhead with passing by value:

the value needs to be copied onto the stack or into registers

For small objects, such as an integer, passing by value will be faster. For bigger objects (for example, a large structure), the copying would create too much overhead, so passing by reference will be faster.

Answer 3

Answered by Cosmic Bacon

Imagine you walk into a function and you're supposed to come in with an int value. The code in the function wants to do stuff with that int value.

Pass by value is like walking into the function, and when someone asks for the int foo value, you just give it to them.

Pass by reference is walking into the function with the address of the int foo value. Now whenever someone needs the value of foo, they have to go and look it up. Everyone's gonna complain about having to dereference foo all the freaking time. I've been in this function for 2 milliseconds now and I must have looked up foo a thousand times! Why didn't you just give me the value in the first place? Why didn't you pass by value?

This analogy helped me see why passing by value is often the fastest choice.

Answer 4

Answered by Kahler

Some reasoning first: on most popular machines, an integer is 32 bits and a pointer is 32 or 64 bits.

So you have to pass that much information.

To multiply an integer you have to:

Multiply it.

To multiply an integer pointed to by a pointer you have to:

Dereference the pointer. Multiply it.

Hope it's clear enough :)

Now to some more specific stuff:

As has been pointed out, your by-value function does nothing with the result, but the by-pointer one actually saves the result in memory. Why are you so unfair to the poor pointer? :( (just kidding)

It's hard to say how valid your benchmark is, since compilers come packed with all kinds of optimizations. (Of course you can control how much freedom the compiler has, but you haven't provided any info on that.)

And finally (and probably most important), pointers, values and references do not have a speed associated with them. Who knows, you may find a machine that is faster with pointers and has a hard time with values, or the opposite. Okay, okay, there are some patterns in hardware and we make all these assumptions. The most widely accepted one seems to be:

Pass simple objects by value and more complex ones by reference (or pointer). (But then again, what's complex? What's simple? It changes over time as hardware evolves.)

Recently I sense the standard opinion is becoming: pass by value and trust the compiler. And that's cool. Compilers are backed by years of accumulated expertise and by angry users demanding that they always get better.

Answer 5

Answered by Spacemoose

When you pass by value, you are telling the compiler to make a copy of the entity you are passing.

When you pass by reference, you are telling the compiler that it must use the actual memory that the reference refers to. The compiler does not know whether you are doing this in an attempt to optimize, or because the referenced value might be changing in some other thread (for example). It has to use that area of memory.

Passing by reference means the processor has to access that specific memory block. That may or may not be the most efficient process, depending on the state of the registers. When you pass by value, the memory on the stack can be used, which increases the chance of hitting (much faster) cache memory.

Finally, depending on the architecture of your machine and the type you are passing, a reference may actually be larger than the value you are copying. Copying a 32-bit integer involves less copying than passing a reference on a 64-bit machine.

So passing by reference should only be done when you need a reference (to mutate the value, or because the value might be mutated elsewhere), or when copying the referenced object is more expensive than dereferencing the necessary memory.

While that last point is non-trivial, a good rule of thumb is to do what Java does: pass fundamental types by value, and complex types by (const) reference.

Answer 6

Answered by Mark Ransom

In this case, the compiler probably realized that the result of the multiplication wasn't being used in the pass-by-value case and optimized it out entirely. Without seeing the disassembled code, it's impossible to be sure.

Answer 7

Answered by PureW

Passing by value is often very quick for small types, since most of them are smaller than a pointer on modern (64-bit) systems. There may also be certain optimizations that are only possible when passing by value.

As a general rule, pass built-in types by value.

Answer 8

Answered by Drunken Code Monkey

Quite often, executing 32-bit memory manipulation instructions is slower on a native 64-bit platform, because the processor has to run 64-bit instructions regardless. If the compiler does it correctly, 32-bit instructions get "paired" in the instruction cache, but if a 32-bit read is executed with a 64-bit instruction, 4 additional bytes are copied as filler and then discarded. In short, a value being smaller than the pointer size does not necessarily mean it is faster.

It depends on the situation and on the compiler, and should absolutely not be taken into consideration for performance, except for composite types where the value is definitely larger than the pointer by an order of magnitude, or in cases where you need the absolute best performance on one particular platform without regard to portability. The choice between passing by reference or by value should depend only on whether or not you want the called procedure to be able to modify the object passed. If it's only a read of a type smaller than 128 bits, pass by value; it's safer.
