Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute the content to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/4597893/

Time: 2020-08-05 02:22:41  Source: igfitidea

Specifically, how does fork() handle dynamically allocated memory from malloc() in Linux?

c linux malloc heap fork

Asked by pcd6623

I have a program with a parent and a child process. Before the fork(), the parent process called malloc() and filled in an array with some data. After the fork(), the child needs that data. I know that I could use a pipe, but the following code appears to work:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main( int argc, char *argv[] ) {
    char *array;
    array = malloc( 20 );
    strcpy( array, "Hello" );
    switch( fork() ) {
    case 0:
        printf( "Child array: %s\n", array );
        strcpy( array, "Goodbye" );
        printf( "Child array: %s\n", array );
        free( array );
        break;
    case -1:
        printf( "Error with fork()\n" );
        break;
    default:
        printf( "Parent array: %s\n", array );
        sleep(1);
        printf( "Parent array: %s\n", array );
        free( array );
    }
    return 0;
}

The output is:

Parent array: Hello
Child array: Hello
Child array: Goodbye
Parent array: Hello

I know that data allocated on the stack is available in the child, but it appears that data allocated on the heap is also available to the child. Similarly, just as the child cannot modify the parent's data on the stack, it cannot modify the parent's data on the heap. So I assume the child gets its own copy of both stack and heap data.

Is this always the case in Linux? If so, where is the documentation that supports this? I checked the fork() man page, but it didn't specifically mention dynamically allocated memory on the heap.

Thank you

Accepted answer by abyx

Each page that is allocated for the process (be it a virtual memory page that has the stack on it or the heap) is copied for the forked process to be able to access it.

Actually, it is not copied right at the start. It is set to copy-on-write, meaning that once one of the processes (parent or child) tries to modify a page, that page is copied. This way the two processes cannot harm one another, and each still has access to all the data as it was at the point of fork().

For example, the code pages, those that the actual executable was mapped to in memory, are usually read-only and thus are reused among all the forked processes. They will not be copied again, since no one writes there, only reads, and so copy-on-write will never be needed.

More information is available here and here.

Answer by Noah Watkins

After a fork the child is completely independent from the parent, but may inherit certain things that are copies of the parent's. In the case of the heap, the child will conceptually have a copy of the parent's heap at the time of the fork. However, modifications to the heap in the child's address space will only modify the child's copy (e.g. through copy-on-write).

As for the documentation: I've noticed that documentation will usually state that everything is copied, except for blah, blah blah.

Answer by Dirk-Willem van Gulik

The short answer is 'dirty on write' - the longer answer is .. a lot longer.

But for all intents and purposes - the working model which is safe to assume at the C level is that just after the fork() the two processes are absolutely identical -- i.e. the child gets a 100% exact copy (but for a wee bit around the return value of fork()) -- and then start to diverge as each side modifies its memory, stack and heap.

So your conclusion is slightly off - the child starts off with the same data as the parent, copied into its own space - then modifies it - and sees it as modified - while the parent continues with its own copy.

In reality things are a bit more complex - as the system tries to avoid a complete copy by doing something dirty: it avoids copying until it has to.

Dw.
