C: waitpid - WIFEXITED returns 0 although the child exited normally

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/23262887/

Time: 2020-09-02 11:02:20  Source: igfitidea

Tags: c, fork, waitpid, execv

Asked by Andreas Grapentin

I have been writing a program that spawns a child process and calls waitpid to wait for the termination of the child process. The code is below:

  // fork & exec the child
  pid_t pid = fork();
  if (pid == -1)
    {
      // here is error handling code that is **not** triggered
    }

  if (!pid)
    {
      // binary_invocation is an array of the child process program and its arguments
      execv(args.binary_invocation[0], (char * const*)args.binary_invocation);
      // here is some error handling code that is **not** triggered
    }
  else
    {
      int status = 0;
      pid_t res = waitpid(pid, &status, 0);

      // here I see res being a positive integer > 0
      // and status being 11, which means WIFEXITED(status) is 0.
      // this triggers a warning in my program's output.
    }

The man page of waitpid states for WIFEXITED:

WIFEXITED(status)
    returns  true  if  the child terminated normally, that is, by calling exit(3) or
    _exit(2), or by returning from main().

I interpret this to mean that it should return a nonzero integer on success. That is not happening in my program, since I observe WIFEXITED(status) == 0.

However, executing the same program from the command line results in $? == 0, and starting it from gdb results in:

[Inferior 1 (process 31934) exited normally]

The program behaves normally, except for the triggered warning, which makes me think that something else is going on here that I am missing.

EDIT:
As suggested below in the comments, I checked whether the child is terminated by a segfault, and indeed, WIFSIGNALED(status) returns 1 and WTERMSIG(status) returns 11, which is SIGSEGV.

What I don't understand, though, is why a call via execv would fail with a segfault while the same call via gdb or a shell would succeed.

EDIT2:
The behaviour of my application heavily depends on the behaviour of the child process, in particular on a file the child writes in a function declared __attribute__ ((destructor)). After the waitpid call returns, this file exists and is generated correctly, which means the segfault occurs somewhere in another destructor, or somewhere outside of my control.

Answered by rob mayoff

On Unix and Linux systems, the status returned from wait or waitpid (or any of the other wait variants) has this structure:

bits   meaning

0-6    signal number that caused child to exit,
       or 0177 if child stopped / continued
       or zero if child exited without a signal

 7     1 if core dumped, else 0

8-15   low 8 bits of value passed to _exit/exit or returned by main,
       or signal that caused child to stop/continue

(Note that POSIX doesn't define the bits, just macros, but these are the bit definitions used by at least Linux, Mac OS X/iOS, and Solaris. Also note that waitpid only returns for stop events if you pass it the WUNTRACED flag and for continue events if you pass it the WCONTINUED flag.)

So a status of 11 means the child exited due to signal 11, which is SIGSEGV (again, the number is not fixed by POSIX but is conventional).

Either your program is passing invalid arguments to execv (which is a C library wrapper around execve or some other kernel-specific call), or the child runs differently when you execv it than when you run it from the shell or gdb.

If you are on a system that supports strace, run your (parent) program under strace -f to see whether execv is causing the signal.