memcpy vs for loop - What's the proper way to copy an array from a pointer?
Source: http://stackoverflow.com/questions/4729046/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by mjn12
I have a function foo(int[] nums), which I understand is essentially equivalent to foo(int* nums). Inside foo I need to copy the contents of the array pointed to by nums into some int[10] declared within the scope of foo. I understand the following is invalid:
void foo (int[] nums)
{
myGlobalArray = *nums
}
What is the proper way to copy the array? Should I use memcpy like so:
void foo (int[] nums)
{
memcpy(&myGlobalArray, nums, 10);
}
or should I use a for loop?
void foo(int[] nums)
{
for(int i =0; i < 10; i++)
{
myGlobalArray[i] = nums[i];
}
}
Is there a third option that I'm missing?
Accepted answer by Jay
Memcpy will probably be faster, but it's more likely you will make a mistake using it. It may depend on how smart your optimizing compiler is.
Your code is incorrect though. It should be:
memcpy(myGlobalArray, nums, 10 * sizeof(int) );
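To make the difference concrete, here is a minimal sketch of both calls side by side. The helper names copy_ten_bytes and copy_ten_ints are made up for illustration, and myGlobalArray is assumed to be a global int[10] as the question implies.

```cpp
#include <cstddef>
#include <cstring>

// Assumed stand-in for the question's global destination array.
int myGlobalArray[10];

// Variant from the question: the third argument is a byte count, so this
// copies only 10 bytes (two ints plus half of a third when sizeof(int) == 4).
void copy_ten_bytes(const int* nums)
{
    std::memcpy(myGlobalArray, nums, 10);
}

// Corrected variant: 10 elements times the size of one element.
void copy_ten_ints(const int* nums)
{
    std::memcpy(myGlobalArray, nums, 10 * sizeof(int));
}
```

After copy_ten_bytes the tail of myGlobalArray still holds whatever it held before the call, which is why this kind of mistake can go unnoticed for a while.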
Answer by Oliver Charlesworth
Yes, the third option is to use a C++ construct:
std::copy(&nums[0], &nums[10], myGlobalArray);
With any sane compiler, it:
- should be optimum in the majority of cases (will compile to memcpy() where possible),
- is type-safe,
- gracefully copes when you decide to change the data-type to a non-primitive (i.e. it calls copy constructors, etc.),
- gracefully copes when you decide to change to a container class.
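The last two points can be sketched as follows. The function names copy_names and copy_to_vector are hypothetical and only serve the illustration; the same std::copy call shape is used for a non-primitive element type and for a container destination.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Element type changed from int to std::string: std::copy assigns element
// by element, so each destination string ends up owning its own buffer.
void copy_names(const std::string* names, std::string (&out)[3])
{
    std::copy(names, names + 3, out);
}

// Destination changed to a container: the algorithm only needs iterators.
std::vector<int> copy_to_vector(const int* nums, std::size_t count)
{
    std::vector<int> out(count);
    std::copy(nums, nums + count, out.begin());
    return out;
}
```

Neither function would be safe to write with memcpy in the std::string case, whereas the std::copy form needs no change when the element type changes.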
Answer by kfsone
Generally speaking, the worst case scenario will be in an un-optimized debug build where memcpy is not inlined and may perform additional sanity/assert checks amounting to a small number of additional instructions vs a for loop.
However, memcpy is generally well implemented to leverage things like intrinsics, though this will vary with target architecture and compiler. It is unlikely that memcpy will ever be worse than a for-loop implementation.
People often trip over the fact that memcpy takes its size in bytes, and they write things like these:
// wrong unless we're copying bytes.
memcpy(myGlobalArray, nums, numNums);
// wrong if an int isn't 4 bytes or the type of nums changed.
memcpy(myGlobalArray, nums, numNums * 4);
// wrong if nums is no-longer an int array.
memcpy(myGlobalArray, nums, numNums * sizeof(int));
You can protect yourself here by using language features that let you do some degree of reflection, that is: do things in terms of the data itself rather than what you know about the data, because in a generic function you generally don't know anything about the data:
void foo (int* nums, size_t numNums)
{
memcpy(myGlobalArray, nums, numNums * sizeof(*nums));
}
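A slightly fuller sketch of the same idea derives both the element size and the destination capacity from the objects themselves. The capacity clamp and the returned count are additions for illustration and are not part of the original answer; myGlobalArray is assumed to be a global int[10].

```cpp
#include <cstddef>
#include <cstring>

// Assumed stand-in for the question's global destination array.
int myGlobalArray[10];

// Copies at most as many elements as the destination can hold and
// returns the number of elements actually copied.
std::size_t foo(const int* nums, std::size_t numNums)
{
    // Element count of the destination, computed from the array itself.
    const std::size_t capacity = sizeof(myGlobalArray) / sizeof(myGlobalArray[0]);
    const std::size_t count = numNums < capacity ? numNums : capacity;

    // The byte count follows the pointee type of nums, not a hard-coded int.
    std::memcpy(myGlobalArray, nums, count * sizeof(*nums));
    return count;
}
```

If the element type of nums changes later, the sizeof expressions follow it without any edit to the call.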
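Note that you don't need the "&" in front of "myGlobalArray" because arrays automatically decay to pointers. For a true array, "&myGlobalArray" happens to yield the same address (with a different type, pointer-to-array), so the call still works. If myGlobalArray were a pointer, however, "&myGlobalArray" would be the address where the pointer itself is held, and memcpy would overwrite the pointer rather than the data it points to.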
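What "&" yields for an array versus a pointer can be checked with a short sketch. The names myGlobalPointer and the two helper functions are hypothetical; only the address comparison matters here.

```cpp
// A real array and, for contrast, a pointer that refers to some buffer.
int myGlobalArray[10];
int* myGlobalPointer = nullptr;

// For an array, the decayed pointer and &array hold the same address;
// only the types differ (int* versus int (*)[10]).
bool array_and_address_of_array_match()
{
    return static_cast<void*>(myGlobalArray) == static_cast<void*>(&myGlobalArray);
}

// For a pointer, &pointer is where the pointer variable itself is stored,
// which is not the buffer it points to.
bool pointer_and_address_of_pointer_match(int* buffer)
{
    myGlobalPointer = buffer;
    return static_cast<void*>(myGlobalPointer) == static_cast<void*>(&myGlobalPointer);
}
```

Passing the plain name to memcpy is correct in both cases, which is a good reason to drop the "&" habit.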
(Edit note: I'd typo'd int[] nums when I meant int nums[], but I decided that adding C array-pointer-equivalence chaos helped nobody, so now it's int *nums :))
Using memcpy on objects can be dangerous. Consider:
struct Foo {
std::string m_string;
std::vector<int> m_vec;
};
Foo f1;
Foo f2;
f2.m_string = "hello";
f2.m_vec.push_back(42);
memcpy(&f1, &f2, sizeof(f2));
This is the WRONG way to copy objects that aren't POD (plain old data). After the memcpy, both f1 and f2 have a std::string that thinks it owns "hello", and they both think they own the same vector of integers that contains 42. The behaviour is undefined; typically the shared storage is released twice and the program crashes when the objects are destroyed.
The best practice for C++ programmers is to use std::copy:
std::copy(nums, nums + numNums, myGlobalArray);
Or, per Remy Lebeau's note, since C++11:
std::copy_n(nums, numNums, myGlobalArray);
This can make compile-time decisions about what to do, including using memcpy or memmove and potentially using SSE/vector instructions if possible. Another advantage is that if you write this:
struct Foo {
int m_i;
};
Foo f1[10], f2[10];
memcpy(&f1, &f2, sizeof(f1));
and later on change Foo to include a std::string, your code will break. If you instead write:
struct Foo {
int m_i;
};
enum { NumFoos = 10 };
Foo f1[NumFoos], f2[NumFoos];
std::copy(f2, f2 + NumFoos, f1);
the compiler will switch your code to do the right thing without any additional work for you, and your code is a little more readable.
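As a sketch of that "later on" scenario, here is Foo after a std::string member has been added. The member name m_name and the helper copy_foos are hypothetical; std::begin and std::end are used so the array bounds come from the arrays themselves.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

struct Foo {
    int m_i;
    std::string m_name; // hypothetical member added after the fact
};

enum { NumFoos = 10 };

// Copies element by element through Foo's copy assignment operator,
// so every destination Foo gets its own independent string.
void copy_foos(const Foo (&from)[NumFoos], Foo (&to)[NumFoos])
{
    std::copy(std::begin(from), std::end(from), std::begin(to));
}
```

The call site is unchanged from the int-only version of Foo, while the memcpy version would now copy the string's internals byte for byte.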
Answer by James
Essentially, as long as you are dealing with POD types (Plain Ol' Data), such as int, unsigned int, pointers, data-only structs, etc... you are safe to use mem*.
If your array contains objects, use the for loop, as the = operator may be required to ensure proper assignment.
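One way to have the compiler enforce this rule is a static_assert on std::is_trivially_copyable, the standard trait for types that memcpy may legally copy. The wrapper name raw_copy is hypothetical; this is a sketch under that assumption, not an established API.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper: refuses to compile for types that memcpy must not
// be used on (for example, anything holding a std::string).
template <typename T>
void raw_copy(T* dst, const T* src, std::size_t count)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw_copy requires a trivially copyable element type");
    std::memcpy(dst, src, count * sizeof(T));
}
```

With this in place, raw_copy on int arrays compiles and copies as expected, while an attempt to instantiate it with std::string elements is rejected at compile time instead of failing at run time.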
Answer by Jason Williams
For performance, use memcpy (or equivalents). It's highly optimised platform-specific code for shunting lots of data around fast.
For maintainability, consider what you're doing - the for loop may be more readable and easier to understand. (Getting a memcpy wrong is a fast route to a crash or worse)
Answer by yyny
A simple loop is slightly faster for copies of about 10-20 bytes or less (it's a single compare+branch, see OP_T_THRES), but for larger sizes, memcpy is faster and portable.
Additionally, if the amount of memory you want to copy is constant, you can use memcpy to let the compiler decide what method to use.
Side note: the optimizations that memcpy uses may significantly slow your program down in a multithreaded environment when you're copying a lot of data above the OP_T_THRES size mark, since the instructions this invokes are not atomic and the speculative execution and caching behavior for such instructions doesn't behave nicely when multiple threads are accessing the same memory. The easiest solution is to not share memory between threads and only merge the memory at the end. This is good multi-threading practice anyway.