C language: Understanding the source code of memcpy()

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17591624/

Time: 2020-09-02 06:53:37  Source: igfitidea

Understanding the source code of memcpy()

c

Asked by Angus

#include <stddef.h>     /* size_t */
#include <stdint.h>     /* uintptr_t */

void *memcpy(void *dst, const void *src, size_t len)
{
        size_t i;

        /*
         * memcpy does not support overlapping buffers, so always do it
         * forwards. (Don't change this without adjusting memmove.)
         *
         * For speedy copying, optimize the common case where both pointers
         * and the length are word-aligned, and copy word-at-a-time instead
         * of byte-at-a-time. Otherwise, copy by bytes.
         *
         * The alignment logic below should be portable. We rely on
         * the compiler to be reasonably intelligent about optimizing
         * the divides and modulos out. Fortunately, it is.
         */

        if ((uintptr_t)dst % sizeof(long) == 0 &&
            (uintptr_t)src % sizeof(long) == 0 &&
            len % sizeof(long) == 0) {

                long *d = dst;
                const long *s = src;

                for (i=0; i<len/sizeof(long); i++) {
                        d[i] = s[i];
                }
        }
        else {
                char *d = dst;
                const char *s = src;

                for (i=0; i<len; i++) {
                        d[i] = s[i];
                }
        }

        return dst;
}

I was just going through an implementation of memcpy to understand how it differs from using a loop. But I couldn't see any difference between using a loop and using memcpy, since memcpy itself uses a loop internally to copy.

I couldn't understand the if part they do for integers: i < len/sizeof(long). Why is this calculation required?

Answered by Andreas Fester

"I couldn't understand the if part they do for integers: i < len/sizeof(long). Why is this calculation required?"

Because in this case they are copying words, not individual bytes. As the comment says, it is an optimization: it requires fewer iterations, and the CPU can handle word-aligned data more efficiently.

len is the number of bytes to copy, and sizeof(long) is the size of a single word, so the number of elements to copy (that is, the number of loop iterations to execute) is len / sizeof(long).
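
To make the arithmetic concrete, here is a minimal sketch of the word-copy loop on its own. The name copy_words is hypothetical and not part of the quoted source, and the sketch assumes the caller has already established that len is a multiple of sizeof(long).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the quoted source): copy len bytes
 * one long at a time and return how many loop iterations were needed.
 * Assumes len is a multiple of sizeof(long), as the if-condition ensures.
 */
size_t copy_words(long *dst, const long *src, size_t len)
{
    size_t i;
    size_t words = len / sizeof(long);   /* bytes -> number of longs */

    assert(len % sizeof(long) == 0);
    for (i = 0; i < words; i++) {
        dst[i] = src[i];
    }
    return words;
}
```

Called on an array of four longs with len set to the array's size in bytes, this runs four iterations regardless of how wide long is on the platform.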

Answered by m0skit0

"...to understand how it differs from using a loop. But I couldn't see any difference between using a loop and using memcpy, as memcpy uses a loop again internally to copy."

Well, then it uses a loop. Other libc implementations may not do it that way. In any case, what is the problem if it does use a loop? Also, as you can see, it does more than loop: it checks for alignment and performs a different kind of loop depending on the result.

"I couldn't understand the if part they do for integers: i < len/sizeof(long). Why is this calculation required?"

This checks for memory word alignment. If the destination and source addresses are word-aligned and the length to copy is a multiple of the word size, it performs an aligned copy by word (long). That is faster than copying by byte (char), not only because each iteration moves more data, but also because most architectures perform word-aligned copies much faster.
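
As a rough illustration, the three conditions can be pulled out into small predicates. The names is_word_aligned and can_copy_by_words are hypothetical; only the uintptr_t cast and the modulo test come from the quoted source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a sizeof(long) boundary. */
int is_word_aligned(const void *p)
{
    return (uintptr_t)p % sizeof(long) == 0;
}

/* Nonzero when all three conditions of the quoted if-statement hold. */
int can_copy_by_words(const void *dst, const void *src, size_t len)
{
    return is_word_aligned(dst) && is_word_aligned(src) &&
           len % sizeof(long) == 0;
}
```

On typical platforms an array of long starts on such a boundary, while a pointer one byte past that start does not, so the second case would fall back to the byte loop.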

Answered by huseyin tugrul buyukisik

len % sizeof(long) checks that you are copying whole longs, not a fraction of a long.

        if ((uintptr_t)dst % sizeof(long) == 0 &&
            (uintptr_t)src % sizeof(long) == 0 &&
            len % sizeof(long) == 0) {

                long *d = dst;
                const long *s = src;

                for (i=0; i<len/sizeof(long); i++) {
                        d[i] = s[i];
                }
        }

This checks for alignment and, if it holds, copies quickly (sizeof(long) bytes at a time).

        else {
                char *d = dst;
                const char *s = src;

                for (i=0; i<len; i++) {
                        d[i] = s[i];
                }
        }

This branch handles everything else (misaligned arrays, or a length that is not a multiple of sizeof(long)) with a slow copy, 1 byte at a time.
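
One way to see which branch a given call takes is to instrument the function. The sketch below repeats the logic of the quoted code under a different name and returns the branch that ran; traced_copy and the enum constants are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum copy_path { COPY_BY_WORDS, COPY_BY_BYTES };   /* hypothetical names */

/* Copies like the quoted function, but reports which branch ran. */
enum copy_path traced_copy(void *dst, const void *src, size_t len)
{
    size_t i;

    if ((uintptr_t)dst % sizeof(long) == 0 &&
        (uintptr_t)src % sizeof(long) == 0 &&
        len % sizeof(long) == 0) {
        long *d = dst;
        const long *s = src;

        for (i = 0; i < len / sizeof(long); i++) {
            d[i] = s[i];            /* fast path: one long per iteration */
        }
        return COPY_BY_WORDS;
    } else {
        char *d = dst;
        const char *s = src;

        for (i = 0; i < len; i++) {
            d[i] = s[i];            /* slow path: one byte per iteration */
        }
        return COPY_BY_BYTES;
    }
}
```

Note that a single failing condition is enough to send the whole copy down the byte path: even with both pointers aligned, a length of sizeof(long) - 1 is copied byte by byte.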

Answered by Yu Hao

for (i=0; i<len/sizeof(long); i++) {
    d[i] = s[i];
}

In this for loop, one long is copied per iteration, and there are len bytes to copy in total, which is why the loop terminates on the condition i < len/sizeof(long).

Answered by Subham Sarda

"I was just going through an implementation of memcpy, to understand how it differs from using a loop. But I couldn't see any difference between using a loop and using memcpy, as memcpy uses a loop again internally to copy."

A loop (a control statement) is one of the basic language elements, alongside if (a decision statement) and a few others. So the question here is not really about the difference between ordinary looping and using memcpy.

memcpy simply helps by giving you a ready-to-use API call, so you don't have to write 20 lines of code for a trivial task. If you wish, you can write your own code to provide the same functionality.

The second point, as already noted, is the optimization it provides by copying as long rather than as smaller types. With long, it copies a block of data (what we call a word) at once instead of copying byte by byte, which would take longer. Where a long is 8 bytes, an operation that would need 8 byte-wise iterations is completed by memcpy in a single iteration by copying the whole word at once.
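
The iteration counts can be compared directly with a small sketch. Both function names are hypothetical, and the 8-to-1 ratio only holds on platforms where sizeof(long) is 8; on a platform with a 4-byte long the ratio would be 4 to 1.

```c
#include <stddef.h>

/* Hypothetical helpers: iterations each loop needs for len bytes. */
size_t byte_loop_iterations(size_t len)
{
    return len;                   /* one char per iteration */
}

size_t word_loop_iterations(size_t len)
{
    return len / sizeof(long);    /* one long per iteration */
}
```

For len equal to sizeof(long), the byte loop needs sizeof(long) iterations and the word loop needs exactly one.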

Answered by Pankaj Suryawanshi

If you look at the assembly code for memcpy, you can see that on a 32-bit system each register is 32 bits wide and can hold 4 bytes at a time. If you copy only one byte through a 32-bit register, the CPU needs extra instruction cycles.

If len/count is aligned to a multiple of 4, we can copy 4 bytes in one cycle:

    MOV FROM, R2
    MOV TO,   R3
    MOV R2,   R4
    ADD LEN,  R4
CP: MOV (R2+), (R3+) ; "(Rx+)" means "*Rx++" in C
    CMP R2, R4
    BNE CP
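
For readers less familiar with this assembly notation, the loop above corresponds roughly to the following C. This is a sketch under assumptions: copy_like_asm is a hypothetical name, a word is taken to be a 4-byte uint32_t, and len is taken to be a byte count that is a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical C rendering of the assembly loop. Assumes 4-byte words
 * (uint32_t) and that len is a byte count and a multiple of 4.
 */
void copy_like_asm(const uint32_t *from, uint32_t *to, size_t len)
{
    const uint32_t *end = from + len / sizeof(uint32_t);  /* R4 = R2 + LEN */

    while (from != end) {      /* CMP R2, R4 / BNE CP */
        *to++ = *from++;       /* MOV (R2+), (R3+) */
    }
}
```

One deliberate difference: the assembly tests the condition after the first copy, so it always moves at least one word, whereas this version tests first and so does nothing when len is 0.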