How does fork() work in C?
Source: http://stackoverflow.com/questions/15102328/
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How does fork() work?
Asked by PhoonOne
I'm really new to forking. What is the pid doing in this code? Can someone please explain what comes out at line X and line Y?
#include <sys/types.h>
#include <sys/wait.h>   /* declares wait() */
#include <stdio.h>
#include <unistd.h>

#define SIZE 5

int nums[SIZE] = {0,1,2,3,4};

int main()
{
    int i;
    pid_t pid;

    pid = fork();
    if (pid == 0) {
        for (i = 0; i < SIZE; i++) {
            nums[i] *= -i;
            printf("CHILD: %d ",nums[i]); /* LINE X */
        }
    }
    else if (pid > 0) {
        wait(NULL);
        for (i = 0; i < SIZE; i++)
            printf("PARENT: %d ",nums[i]); /* LINE Y */
    }
    return 0;
}
Answered by MatthewD
fork() duplicates the process, so after calling fork() there are actually two instances of your program running.
How do you know which process is the original (parent) one, and which is the new (child) one?
In the parent process, the PID of the child process (which will be a positive integer) is returned from fork(). That's why the if (pid > 0) { /* PARENT */ } code works. In the child process, fork() just returns 0.
Thus, because of the if (pid > 0) check, the parent process and the child process will produce different output, which you can see here (as provided by @jxh in the comments).
Answered by Punit Vara
Simplest example for fork()
printf("I'm printed once!\n");
fork();
// Now there are two processes running: one is the parent and the other is the child.
// Each process will print out the next line.
printf("You see this line twice!\n");
The return value of fork(): -1 = failed; 0 = in the child process; positive = in the parent process (and the return value is the child's process id).
pid_t id = fork();
if (id == -1) exit(1); // fork failed
if (id > 0)
{
    // I'm the original parent and
    // I just created a child process with id 'id'
    // Use waitpid to wait for the child to finish
} else { // returned zero
    // I must be the newly made child process
}
What is different in the child process than the parent process?
- The parent is notified via a signal when the child process finishes but not vice versa.
- The child does not inherit pending signals or timer alarms. For a complete list see the fork()
- Here the process id can be returned by getpid(). The parent process id can be returned by getppid().
Now let's visualize your program code
pid_t pid;
pid = fork();
Now the OS makes two identical copies of the address space, one for the parent and the other for the child.
Both the parent and the child process start their execution right after the fork() system call. Since both processes have identical but separate address spaces, the variables initialized before the fork() call have the same values in both address spaces. Every process has its own address space, so any modification is independent of the others. If the parent changes the value of one of its variables, the modification only affects the variable in the parent process's address space. Other address spaces created by fork() system calls are not affected, even though they have identical variable names.
Here the parent's pid is non-zero, so it calls the function ParentProcess(). The child, on the other hand, has a zero pid, so it calls ChildProcess(), as shown below:
In your code, the parent process calls wait(), so it pauses at that point until the child exits. So the child's output appears first.
if (pid == 0) {
    // The child runs this part because fork returns 0 to the child
    for (i = 0; i < SIZE; i++) {
        nums[i] *= -i;
        printf("CHILD: %d ",nums[i]); /* LINE X */
    }
}
Output from the child process (what comes out at line X):
CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16
Then after the child exits, the parent continues from after the wait() call and prints its output next.
else if (pid > 0) {
    wait(NULL);
    for (i = 0; i < SIZE; i++)
        printf("PARENT: %d ",nums[i]); /* LINE Y */
}
Output from the parent process (what comes out at line Y):
PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4
Finally, the combined output of the child and parent processes appears on the terminal as follows:
CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4
For more info, refer to this link.
Answered by dbush
The fork() function is special because it actually returns twice: once to the parent process and once to the child process. In the parent process, fork() returns the pid of the child. In the child process, it returns 0. In the event of an error, no child process is created and -1 is returned to the parent.
After a successful call to fork(), the child process is basically an exact duplicate of the parent process. Both have their own copies of all local and global variables, and their own copies of any open file descriptors. Both processes run concurrently, and because their file descriptors refer to the same open files (the same terminal, for example), the output of each process will likely interleave with the other's.
Taking a closer look at the example in the question:
pid_t pid;
pid = fork();
// When we reach this line, two processes now exist,
// with each one continuing to run from this point
if (pid == 0) {
    // The child runs this part because fork returns 0 to the child
    for (i = 0; i < SIZE; i++) {
        nums[i] *= -i;
        printf("CHILD: %d ",nums[i]); /* LINE X */
    }
}
else if (pid > 0) {
    // The parent runs this part because fork returns the child's pid to the parent
    wait(NULL); // this causes the parent to wait until the child exits
    for (i = 0; i < SIZE; i++)
        printf("PARENT: %d ",nums[i]); /* LINE Y */
}
This will output the following:
CHILD: 0 CHILD: -1 CHILD: -4 CHILD: -9 CHILD: -16 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4
Because the parent process calls wait(), it pauses at that point until the child exits. So the child's output appears first. Then, after the child exits, the parent continues from after the wait() call and prints its output next.
Answered by Jonathan Leffler
In the simplest cases, the behaviour of fork() is very simple, if a bit mind-blowing on your first encounter with it. It either returns once with an error, or it returns twice, once in the original (parent) process, and once in a brand new almost exact duplicate of the original process (the child process). After return, the two processes are nominally independent, though they share a lot of resources.
pid_t original = getpid();
pid_t pid = fork();
if (pid == -1)
{
    /* Failed to fork - one return */
    /* …handle error situation… */
}
else if (pid == 0)
{
    /* Child process - distinct from original process */
    assert(original == getppid() || getppid() == 1);
    assert(original != getpid());
    /* …be childish here… */
}
else
{
    /* Parent process - distinct from child process */
    assert(original != pid);
    /* …be parental here… */
}
The child process is a copy of the parent. It has the same set of open file descriptors, for example; each file descriptor N that was open in the parent is open in the child, and they share the same open file description. That means that if one of the processes alters the read or write position in a file, that also affects the other process. On the other hand, if one of the processes closes a file, that has no direct effect on the file in the other process.
It also means that if there was data buffered in the standard I/O package in the parent process (e.g. some data had been read from the standard input file descriptor (STDIN_FILENO) into the data buffer for stdin), then that data is available to both the parent and the child, and both can read that buffered data without affecting the other, which will also see the same data. On the other hand, once the buffered data is read, if the parent reads another buffer full, that moves the current file position for both the parent and the child, so the child won't then see the data that the parent just read (but if the child also reads a block of data, the parent won't see that). This can be confusing. Consequently, it is usually a good idea to make sure that there's no pending standard I/O before forking; fflush(0) is one way to do that.
In the code fragment, assert(original == getppid() || getppid() == 1); allows for the possibility that by the time the child executes the statement, the parent process may have exited, in which case the child will have been inherited by a system process, which normally has PID 1 (I know of no POSIX system where orphaned children are inherited by a different PID, but there probably is one).
Other shared resources, such as memory-mapped files or shared memory, continue to be available in both. The subsequent behaviour of a memory-mapped file depends on the options used to create the mapping; MAP_PRIVATE means that the two processes have independent copies of the data, and MAP_SHARED means that they share the same copy of the data and changes made by one process will be visible in the other.
However, not every program that forks is as simple as the story described so far. For example, the parent process might have acquired some (advisory) locks; those locks are not inherited by the child. The parent may have been multi-threaded; the child has a single thread of execution, and there are constraints placed on what the child may do safely.
The POSIX specification for fork() specifies the differences in detail:
The fork() function shall create a new process. The new process (child process) shall be an exact copy of the calling process (parent process) except as detailed below:
- The child process shall have a unique process ID.
- The child process ID also shall not match any active process group ID.
- The child process shall have a different parent process ID, which shall be the process ID of the calling process.
- The child process shall have its own copy of the parent's file descriptors. Each of the child's file descriptors shall refer to the same open file description with the corresponding file descriptor of the parent.
- The child process shall have its own copy of the parent's open directory streams. Each open directory stream in the child process may share directory stream positioning with the corresponding directory stream of the parent.
- The child process shall have its own copy of the parent's message catalog descriptors.
- The child process values of tms_utime, tms_stime, tms_cutime, and tms_cstime shall be set to 0.
- The time left until an alarm clock signal shall be reset to zero, and the alarm, if any, shall be canceled; see alarm.
- [XSI] All semadj values shall be cleared.
- File locks set by the parent process shall not be inherited by the child process.
- The set of signals pending for the child process shall be initialized to the empty set.
- [XSI] Interval timers shall be reset in the child process.
- Any semaphores that are open in the parent process shall also be open in the child process.
- [ML] The child process shall not inherit any address space memory locks established by the parent process via calls to mlockall() or mlock().
- Memory mappings created in the parent shall be retained in the child process. MAP_PRIVATE mappings inherited from the parent shall also be MAP_PRIVATE mappings in the child, and any modifications to the data in these mappings made by the parent prior to calling fork() shall be visible to the child. Any modifications to the data in MAP_PRIVATE mappings made by the parent after fork() returns shall be visible only to the parent. Modifications to the data in MAP_PRIVATE mappings made by the child shall be visible only to the child.
- [PS] For the SCHED_FIFO and SCHED_RR scheduling policies, the child process shall inherit the policy and priority settings of the parent process during a fork() function. For other scheduling policies, the policy and priority settings on fork() are implementation-defined.
- Per-process timers created by the parent shall not be inherited by the child process.
- [MSG] The child process shall have its own copy of the message queue descriptors of the parent. Each of the message descriptors of the child shall refer to the same open message queue description as the corresponding message descriptor of the parent.
- No asynchronous input or asynchronous output operations shall be inherited by the child process. Any use of asynchronous control blocks created by the parent produces undefined behavior.
- A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls. When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.
- [OB TRC TRI] If the Trace option and the Trace Inherit option are both supported: If the calling process was being traced in a trace stream that had its inheritance policy set to POSIX_TRACE_INHERITED, the child process shall be traced into that trace stream, and the child process shall inherit the parent's mapping of trace event names to trace event type identifiers. If the trace stream in which the calling process was being traced had its inheritance policy set to POSIX_TRACE_CLOSE_FOR_CHILD, the child process shall not be traced into that trace stream. The inheritance policy is set by a call to the posix_trace_attr_setinherited() function.
- [OB TRC] If the Trace option is supported, but the Trace Inherit option is not supported: The child process shall not be traced into any of the trace streams of its parent process.
- [OB TRC] If the Trace option is supported, the child process of a trace controller process shall not control the trace streams controlled by its parent process.
- [CPT] The initial value of the CPU-time clock of the child process shall be set to zero.
- [TCT] The initial value of the CPU-time clock of the single thread of the child process shall be set to zero.
All other process characteristics defined by POSIX.1-2008 shall be the same in the parent and child processes. The inheritance of process characteristics not defined by POSIX.1-2008 is unspecified by POSIX.1-2008.
After fork(), both the parent and the child processes shall be capable of executing independently before either one terminates.
Most of these issues do not affect most programs, but multi-threaded programs that fork need to be very careful. It is worth reading the Rationale section of the POSIX definition of fork().
Inside the kernel, the system manages all the issues highlighted in the definition above. Memory page mapping tables have to be replicated. The kernel will typically mark the (writable) memory pages as COW (copy on write) so that until one or the other process modifies memory, they can access the same memory. This minimizes the cost of replicating the process; memory pages are only made distinct when they're modified. Many resources, though, such as file descriptors, have to be replicated, so fork() is quite an expensive operation (though not as expensive as the exec*() functions). Note that replicating a file descriptor leaves both descriptors referring to the same open file description; see the open() and dup2() system calls for a discussion of the distinctions between file descriptors and open file descriptions.