
Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/174853/

Date: 2020-11-03 19:36:48  Source: igfitidea

What is the best way to run multiple subprocesses via fork()?

python linux

Asked by victorz

A Python script needs to spawn multiple sub-processes via fork(). All of those child processes should run simultaneously, and the parent process should wait for all of them to finish. The ability to set a timeout on a "slow" child would be nice. The parent process goes on processing the rest of the script after all kids are collected.


What is the best way to work it out? Thanks.


Answer by ephemient

Simple example:


import os

children = []
for job in jobs:          # jobs is assumed to be defined elsewhere
    child = os.fork()
    if child:
        children.append(child)  # parent: remember the child's pid
    else:
        pass  # child: really should exec the job
for child in children:
    os.waitpid(child, 0)

Timing out a slow child is a little more work; you can use wait instead of waitpid, and cull the returned values from the list of children, instead of waiting on each one in turn (as here). If you set up an alarm with a SIGALRM handler, you can terminate the waiting after a specified delay. This is all standard UNIX stuff, not Python-specific...

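A sketch of that alarm-based timeout, under my own assumptions (the function name run_with_timeout and the TimeoutError-raising handler are mine, not ephemient's): the parent reaps children with wait() until either all are collected or SIGALRM fires, then kills the stragglers.

```python
import os
import signal

def run_with_timeout(jobs, timeout):
    """Fork one child per job; wait for all, killing any that outlive `timeout` seconds."""
    children = []
    for job in jobs:
        pid = os.fork()
        if pid:
            children.append(pid)   # parent: remember the child's pid
        else:
            job()                  # child: do the work...
            os._exit(0)            # ...and exit without running parent cleanup

    def on_alarm(signum, frame):
        raise TimeoutError         # interrupt the blocking os.wait()

    signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(timeout)
    try:
        while children:
            pid, status = os.wait()    # reap whichever child exits first
            children.remove(pid)
    except TimeoutError:
        for pid in children:           # deadline passed: kill the stragglers
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)         # and reap them so no zombies remain
    finally:
        signal.alarm(0)                # cancel any pending alarm
```

Because the handler raises an exception, the interrupted os.wait() propagates it instead of being retried, which is what breaks the parent out of the loop.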

Answer by Federico A. Ramponi

Ephemient: each child in your code will stay in the for loop after its job ends, so it will fork again and again. Moreover, the children that start when children[] is not empty will try to wait for some of their siblings at the end of the loop. Eventually something will crash. This is a workaround:


import os, time

def doTheJob(job):
    for i in range(10):
        print(job, i)
        time.sleep(0.01 * ord(os.urandom(1)))
        # random.random() would be the same for each process

jobs = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
imTheFather = True
children = []

for job in jobs:
    child = os.fork()
    if child:
        children.append(child)
    else:
        imTheFather = False
        doTheJob(job)
        break

# in the meanwhile
# ps aux|grep python|grep -v grep|wc -l == 11 == 10 children + the father

if imTheFather:
    for child in children:
        os.waitpid(child, 0)

Answer by Bryan Oakley

Have you looked at the pyprocessing module? (It has since been merged into the standard library as multiprocessing, as of Python 2.6.)


Answer by Dan Lenski

The traditional, UNIX-y way to communicate with sub-processes is to open pipes to their standard input/output, and use the select() system call to multiplex the communications in the parent process (available in Python via... the select module).


If you need to kill a slow-running child process, you can just save its process ID (returned by the os.fork() call) and then use os.kill() to kill it when it's no longer needed. Of course, it would probably be cleaner to communicate with the child process explicitly and tell it to shut itself down.

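A sketch of that pipe-plus-select() pattern, under my own assumptions (the message format, job count, and 5-second deadline are mine): each child writes its result to a pipe, and the parent multiplexes reads until every pipe hits EOF or the deadline passes.

```python
import os
import select

def spawn(job_id):
    """Fork a child that writes its result to a pipe; return (pid, read_fd)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child
        os.close(r)
        os.write(w, b"result from job %d\n" % job_id)
        os._exit(0)                    # closing w at exit signals EOF to the parent
    os.close(w)                        # parent keeps only the read end
    return pid, r

children = dict(spawn(i) for i in range(3))    # pid -> read_fd

fds = list(children.values())
results = []
while fds:
    readable, _, _ = select.select(fds, [], [], 5.0)   # 5 s deadline per wait
    if not readable:
        break                          # slow children: could os.kill() them here
    for fd in readable:
        data = os.read(fd, 4096)
        if data:
            results.append(data)
        else:                          # EOF: the child closed its end
            os.close(fd)
            fds.remove(fd)
for pid in children:
    os.waitpid(pid, 0)                 # reap everyone so no zombies remain
```

Closing the write end in the parent right after the fork is what makes EOF detection work: once the child exits, no process holds the write end open, and the read returns empty.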

Answer by Radiumcola

I have done this in Perl a time or two. I'm learning Python and wanted to replicate the functionality. A scheduler for an unknown number of forked tasks must keep track of running tasks, ended tasks, and return codes. This code includes the SIGCHLD handler, the parent task, and a simple child task.


#!/usr/bin/env python3
import os, signal, time

#
# SIGCHLD handler for reaping dead children
#
def handler(signum, frame):
    # report the state of child tasks
    print(children)
    # use waitpid to collect the dead task's pid and status
    pid, stat = os.waitpid(-1, 0)
    term = (pid, stat)
    print('Reaped: pid=%d stat=%d\n' % term)
    # add pid and return code to the dead-kids list for post-processing
    ripkids.append(term)
    print(ripkids)
    print('\n')
    # update children to remove the pid just reaped
    children.remove(pid)
    print(children)
    print('\n')

# set the signal handler
signal.signal(signal.SIGCHLD, handler)

def child():
    print('\nA new child ', os.getpid())
    print('\n')
    time.sleep(15)
    os._exit(0)

def parent():
    # lists for started and dead children
    global children
    children = []
    global ripkids
    ripkids = []

    while True:
        newpid = os.fork()
        if newpid == 0:
            child()
        else:
            pidx = (os.getpid(), newpid)
            children = children + [newpid]
            print("parent: %d, child: %d\n" % pidx)
            print(children)
            print('\n')
        reply = input("q for quit / c for new fork")
        if reply == 'c':
            continue
        else:
            break

parent()
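One standard refinement worth noting (my addition, not part of the answer above): POSIX does not queue SIGCHLD, so a single delivery can stand for several dead children, and a handler that calls waitpid exactly once can leave zombies behind. The usual fix is to loop on waitpid with WNOHANG until there is nothing left to reap:

```python
import os
import signal

reaped = []   # (pid, raw wait status) pairs collected by the handler

def reap_children(signum, frame):
    # One SIGCHLD may cover several exits: drain all zombies in a loop.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:    # no children at all
            break
        if pid == 0:                 # children exist, but none are dead yet
            break
        reaped.append((pid, status))

signal.signal(signal.SIGCHLD, reap_children)
```

With this shape, the handler is safe no matter how many children die between deliveries, and the parent can inspect `reaped` at its leisure.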