When should I use memcpy instead of the standard operators in C++?

Question

Asked by Patryk Czachurski

When can I get better performance using memcpy, or how do I benefit from using it? For example:

float a[3]; float b[3];

Is this code:

memcpy(a, b, 3*sizeof(float));

faster than this one?

a[0] = b[0];
a[1] = b[1];
a[2] = b[2];

Answer 1

Answered by Martin York

Efficiency should not be your concern.
Write clean maintainable code.

It bothers me that so many answers indicate that memcpy() is inefficient. It is designed to be the most efficient way of copying blocks of memory (for C programs).

So I wrote the following as a test:

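#include <algorithm>  // std::copy
#include <cstring>    // std::memcpy

extern float a[3];
extern float b[3];
extern void base();   // defined in a separate file (base.o in the timing run below)

int main()
{
    base();

#if defined(M1)
    a[0] = b[0];
    a[1] = b[1];
    a[2] = b[2];
#elif defined(M2)
    std::memcpy(a, b, 3*sizeof(float));
#elif defined(M3)
    // Note: as written this copies a into b, the reverse of M1 and M2.
    // The amount of work is the same, and the assembly below reflects it.
    std::copy(&a[0], &a[3], &b[0]);
#endif

    base();
}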

Then, to compare the code each variant produces:

g++ -O3 -S xr.cpp -o s0.s
g++ -O3 -S xr.cpp -o s1.s -DM1
g++ -O3 -S xr.cpp -o s2.s -DM2
g++ -O3 -S xr.cpp -o s3.s -DM3

echo "=======" >  D
diff s0.s s1.s >> D
echo "=======" >> D
diff s0.s s2.s >> D
echo "=======" >> D
diff s0.s s3.s >> D

This resulted in: (comments added by hand)

=======   // Copy by hand
10a11,18
>   movq    _a@GOTPCREL(%rip), %rcx
>   movq    _b@GOTPCREL(%rip), %rdx
>   movl    (%rdx), %eax
>   movl    %eax, (%rcx)
>   movl    4(%rdx), %eax
>   movl    %eax, 4(%rcx)
>   movl    8(%rdx), %eax
>   movl    %eax, 8(%rcx)

=======    // memcpy()
10a11,16
>   movq    _a@GOTPCREL(%rip), %rcx
>   movq    _b@GOTPCREL(%rip), %rdx
>   movq    (%rdx), %rax
>   movq    %rax, (%rcx)
>   movl    8(%rdx), %eax
>   movl    %eax, 8(%rcx)

=======    // std::copy()
10a11,14
>   movq    _a@GOTPCREL(%rip), %rsi
>   movl    $12, %edx
>   movq    _b@GOTPCREL(%rip), %rdi
>   call    _memmove

Added timing results for running the above inside a loop of 1000000000 iterations.

   g++ -c -O3 -DM1 X.cpp
   g++ -O3 X.o base.o -o m1
   g++ -c -O3 -DM2 X.cpp
   g++ -O3 X.o base.o -o m2
   g++ -c -O3 -DM3 X.cpp
   g++ -O3 X.o base.o -o m3
   time ./m1

   real 0m2.486s
   user 0m2.478s
   sys  0m0.005s
   time ./m2

   real 0m1.859s
   user 0m1.853s
   sys  0m0.004s
   time ./m3

   real 0m1.858s
   user 0m1.851s
   sys  0m0.006s
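
The loop used for these timings is not shown in the answer. As a rough sketch of what such a harness might look like: the function name time_memcpy, the barrier() helper and the use of std::chrono are all assumptions for illustration, and barrier() merely stands in for the external base() call. It relies on a GCC/Clang inline-assembly extension.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

float a[3];
float b[3] = {1.0f, 2.0f, 3.0f};

// Stand-in for the external base() above (an assumption, not the author's
// code): an empty asm statement with a "memory" clobber, a GCC/Clang
// extension that keeps the optimizer from hoisting or deleting the copies.
inline void barrier() { __asm__ __volatile__("" ::: "memory"); }

// Runs the memcpy variant `iterations` times and returns elapsed seconds.
double time_memcpy(std::size_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        barrier();
        std::memcpy(a, b, 3 * sizeof(float));
        barrier();
    }
    const auto stop = std::chrono::steady_clock::now();
    // Sanity check: the copy really happened.
    assert(iterations == 0 || a[2] == b[2]);
    return std::chrono::duration<double>(stop - start).count();
}
```

Swapping the memcpy line for the three assignments or for std::copy gives the other two variants. Absolute numbers will depend on the compiler, flags and CPU.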

Answer 2

Answered by crazylammer

You can use memcpy only if the objects you're copying have no explicit constructors, and neither do their members (so-called POD, "Plain Old Data"; in current C++ terms, the type must be trivially copyable). So it is OK to call memcpy for float, but it is wrong for, e.g., std::string.
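
One way to make the compiler enforce that restriction is a static_assert on std::is_trivially_copyable (C++11 and later). The sketch below is illustrative only: raw_copy, Vec3 and vec3_copy_ok are hypothetical names, not part of the original answer.

```cpp
#include <cassert>
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

// Hypothetical helper: copies one object byte-wise, but only compiles for
// types where a byte-wise copy is well-defined.
template <typename T>
void raw_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(&dst != &src);  // memcpy requires non-overlapping objects
    std::memcpy(&dst, &src, sizeof(T));
}

struct Vec3 { float x, y, z; };  // illustrative POD type

// Returns true if a Vec3 copied with raw_copy matches its source.
bool vec3_copy_ok() {
    const Vec3 src{1.0f, 2.0f, 3.0f};
    Vec3 dst{};
    raw_copy(dst, src);
    return dst.x == 1.0f && dst.y == 2.0f && dst.z == 3.0f;
}
```

Calling raw_copy on two std::string objects would fail to compile, which is the point: the mistake is caught at build time rather than at run time.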

But part of the work has already been done for you: std::copy from <algorithm> is specialized for built-in types (and possibly for every other POD type, depending on the STL implementation). So writing std::copy(a, a + 3, b) is as fast (after compiler optimization) as memcpy, but is less error-prone.
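
To illustrate why std::copy is less error-prone, here is a small sketch in which the same call works for float and for std::string elements. The copy_array helper is a hypothetical name introduced for this example.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t
#include <string>

// Hypothetical helper: copies a fixed-size array src into dst. The element
// count comes from the array type, so there is no sizeof arithmetic to get
// wrong, and non-POD element types are copied with their own assignment.
template <typename T, std::size_t N>
void copy_array(T (&dst)[N], const T (&src)[N]) {
    assert(&dst[0] != &src[0]);  // source and destination must differ
    std::copy(src, src + N, dst);
}
```

With float a[3], b[3], the call copy_array(a, b) copies b into a, the same direction as the memcpy in the question. The identical call on two std::string arrays is also valid, where memcpy would not be.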

Answer 3

Answered by ismail

Compilers specifically optimize memcpy calls; at least clang and gcc do. So you should prefer it wherever you can.

Answer 4

Answered by Paul R

Don't go for premature micro-optimisations such as using memcpy like this. Using assignment is clearer and less error-prone, and any decent compiler will generate suitably efficient code. If, and only if, you have profiled the code and found the assignments to be a significant bottleneck, then you can consider some kind of micro-optimisation, but in general you should always write clear, robust code in the first instance.
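
If writing three separate assignments feels clumsy, one option (an assumption here, not something the answer proposes) is to hold the values in a std::array, which can be assigned as a whole. Float3 and the function names below are illustrative.

```cpp
#include <array>
#include <cassert>

using Float3 = std::array<float, 3>;  // illustrative alias

// Element-by-element assignment, as in the question.
void copy_elementwise(Float3& a, const Float3& b) {
    a[0] = b[0];
    a[1] = b[1];
    a[2] = b[2];
}

// Whole-object assignment: one statement, no sizes to get wrong.
void copy_whole(Float3& a, const Float3& b) {
    a = b;
}

// Small self-check with arbitrary values: both forms give the same result.
void self_check() {
    const Float3 b{1.0f, 2.0f, 3.0f};
    Float3 x{}, y{};
    copy_elementwise(x, b);
    copy_whole(y, b);
    assert(x == b && y == b);
}
```

Both forms stay type-safe, and the optimizer is free to lower either one to the same block move it would emit for memcpy.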

Answer 5

Answered by Thanatos

Use std::copy(). As the header file for g++ notes:

This inline function will boil down to a call to @c memmove whenever possible.

Visual Studio's implementation is probably not much different. Go with the normal way, and optimize once you're aware of a bottleneck. In the case of a simple copy, the compiler is probably already optimizing for you.

Answer 6

Answered by Jamie

The benefits of memcpy? Probably readability. Otherwise, you would have to either do a number of assignments or have a for loop for copying, neither of which is as simple and clear as just doing memcpy (of course, as long as your types are simple and don't require construction/destruction).

Also, memcpy is generally relatively optimized for specific platforms, to the point that it won't be all that much slower than simple assignment, and may even be faster.

Answer 7

Answered by Simone

Supposedly, as Nawaz said, the assignment version should be faster on most platforms. That's because memcpy() will copy byte by byte, while the second version could copy 4 bytes at a time.

As is always the case, you should profile your application to be sure that what you expect to be the bottleneck matches reality.

Edit
The same applies to dynamic arrays. Since you mention C++, you should use the std::copy() algorithm in that case.
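
For a dynamically sized buffer, the std::copy() call might look like the sketch below. The name copy_n_floats and its signature are assumptions for illustration.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t

// Hypothetical helper: copies n floats from s to d. The destination must
// not start inside the source range [s, s + n).
void copy_n_floats(float* d, const float* s, std::size_t n) {
    assert(n == 0 || (d != nullptr && s != nullptr));
    std::copy(s, s + n, d);
}
```

With two std::vector<float> objects of equal size, this would be called as copy_n_floats(dst.data(), src.data(), src.size()), although plain dst = src is simpler when whole vectors are copied.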

Edit
This is the code output on Windows XP with GCC 4.5.0, compiled with the -O3 flag:

#include <cstring>  // std::memcpy, std::size_t

extern "C" void cpy(float* d, float* s, std::size_t n)
{
    std::memcpy(d, s, sizeof(float)*n);
}

I wrote this function because the OP mentioned dynamic arrays too.

Output assembly is the following:

_cpy:
LFB393:
    pushl   %ebp
LCFI0:
    movl    %esp, %ebp
LCFI1:
    pushl   %edi
LCFI2:
    pushl   %esi
LCFI3:
    movl    8(%ebp), %eax
    movl    12(%ebp), %esi
    movl    16(%ebp), %ecx
    sall    $2, %ecx
    movl    %eax, %edi
    rep movsb
    popl    %esi
LCFI4:
    popl    %edi
LCFI5:
    leave
LCFI6:
    ret

Of course, I assume all of the experts here know what rep movsb means: it copies ECX bytes from [ESI] to [EDI], one byte per iteration.

This is the assignment version:

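#include <cstddef>  // std::size_t

extern "C" void cpy2(float* d, float* s, std::size_t n)
{
    // Note: as written, this copies indices n down to 1 and never touches
    // index 0 (an off-by-one); the assembly below matches this exact code.
    while (n > 0) {
        d[n] = s[n];
        n--;
    }
}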

which yields the following code:

_cpy2:
LFB394:
    pushl   %ebp
LCFI7:
    movl    %esp, %ebp
LCFI8:
    pushl   %ebx
LCFI9:
    movl    8(%ebp), %ebx
    movl    12(%ebp), %ecx
    movl    16(%ebp), %eax
    testl   %eax, %eax
    je  L2
    .p2align 2,,3
L5:
    movl    (%ecx,%eax,4), %edx
    movl    %edx, (%ebx,%eax,4)
    decl    %eax
    jne L5
L2:
    popl    %ebx
LCFI10:
    leave
LCFI11:
    ret

This loop moves 4 bytes at a time.
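
Note that cpy2 as posted copies indices n down to 1 rather than n-1 down to 0, so it reads and writes one element past the end and skips the first. A corrected sketch of the same loop follows; the name cpy2_fixed is hypothetical, and the assembly it compiles to was not measured here.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Hypothetical corrected variant of cpy2: copies elements n-1 down to 0.
extern "C" void cpy2_fixed(float* d, const float* s, std::size_t n) {
    assert(n == 0 || (d != nullptr && s != nullptr));
    while (n > 0) {
        --n;          // step to the last uncopied index first
        d[n] = s[n];  // one 4-byte float per iteration
    }
}
```

The loop still moves one float per iteration, so the point about 4-byte moves is unchanged.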
