Python threading multiple bash subprocesses?

Warning: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/14533458/

Date: 2020-08-18 11:44:48  Source: igfitidea

Python threading multiple bash subprocesses?

python, multithreading, subprocess

Asked by Andrew

How does one use the threading and subprocess modules to spawn parallel bash processes? When I start threads as in the first answer here: How to use threading in Python?, the bash processes run sequentially instead of in parallel.


Answered by rzzzwilson

A simple threading example:


import threading
import queue
import subprocess
import time

# thread class to run a command
class ExampleThread(threading.Thread):
    def __init__(self, cmd, queue):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.queue = queue

    def run(self):
        # execute the command, queue the result
        (status, output) = subprocess.getstatusoutput(self.cmd)
        self.queue.put((self.cmd, output, status))

# queue where results are placed
result_queue = queue.Queue()

# define the commands to be run in parallel, run them
cmds = ['date; ls -l; sleep 1; date',
        'date; sleep 5; date',
        'date; df -h; sleep 3; date',
        'date; hostname; sleep 2; date',
        'date; uname -a; date',
       ]
for cmd in cmds:
    thread = ExampleThread(cmd, result_queue)
    thread.start()

# print results as we get them
while threading.active_count() > 1 or not result_queue.empty():
    while not result_queue.empty():
        (cmd, output, status) = result_queue.get()
        print('%s:' % cmd)
        print(output)
        print('='*60)
    time.sleep(1)

Note that there are better ways to do some of this, but this is not too complicated. The example uses one thread for each command. Complexity starts to creep in when you want to do things like use a limited number of threads to handle an unknown number of commands. Those more advanced techniques don't seem too complicated once you have a grasp of threading basics. And multiprocessing gets easier once you have a handle on those techniques.

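As a sketch of the "limited number of threads for an unknown number of commands" idea mentioned above: a fixed-size pool of worker threads can pull commands from a queue until a sentinel value tells each thread to stop. The command list and worker count here are illustrative, not from the original answer.

```python
import queue
import subprocess
import threading

NUM_WORKERS = 2  # fixed number of threads, no matter how many commands arrive

def worker(cmd_queue, result_queue):
    while True:
        cmd = cmd_queue.get()
        if cmd is None:  # sentinel: no more work for this thread
            break
        status, output = subprocess.getstatusoutput(cmd)
        result_queue.put((cmd, output, status))

cmd_queue = queue.Queue()
result_queue = queue.Queue()

workers = [threading.Thread(target=worker, args=(cmd_queue, result_queue))
           for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

# feed an arbitrary number of commands, then one sentinel per worker
for cmd in ['echo one', 'echo two', 'echo three']:
    cmd_queue.put(cmd)
for _ in workers:
    cmd_queue.put(None)
for w in workers:
    w.join()

results = []
while not result_queue.empty():
    results.append(result_queue.get())
for cmd, output, status in results:
    print('%s -> %r (exit %d)' % (cmd, output, status))
```

The sentinel trick (one None per worker) is what lets each thread exit cleanly once the queue is drained.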

Answered by jfs

You don't need threads to run subprocesses in parallel:


from subprocess import Popen

commands = [
    'date; ls -l; sleep 1; date',
    'date; sleep 5; date',
    'date; df -h; sleep 3; date',
    'date; hostname; sleep 2; date',
    'date; uname -a; date',
]
# run in parallel
processes = [Popen(cmd, shell=True) for cmd in commands]
# do other things here..
# wait for completion
for p in processes: p.wait()


To limit the number of concurrent commands you could use multiprocessing.dummy.Pool, which uses threads and provides the same interface as multiprocessing.Pool, which uses processes:

from functools import partial
from multiprocessing.dummy import Pool
from subprocess import call

pool = Pool(2) # two concurrent commands at a time
for i, returncode in enumerate(pool.imap(partial(call, shell=True), commands)):
    if returncode != 0:
        print("%d command failed: %d" % (i, returncode))

This answer demonstrates various techniques to limit the number of concurrent subprocesses: it shows solutions based on multiprocessing.Pool, concurrent.futures, and threading + Queue.

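For completeness, a minimal sketch of the concurrent.futures variant mentioned above, assuming the same kind of shell command list as in the earlier snippets (the commands here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
from subprocess import call

commands = ['echo one', 'echo two', 'echo three']

# map each command to its return code, running at most 2 at a time
with ThreadPoolExecutor(max_workers=2) as executor:
    returncodes = list(executor.map(lambda cmd: call(cmd, shell=True), commands))

for i, returncode in enumerate(returncodes):
    if returncode != 0:
        print("%d command failed: %d" % (i, returncode))
```

executor.map preserves the input order, so returncodes[i] belongs to commands[i], just like pool.imap in the snippet above.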



You could limit the number of concurrent child processes without using a thread/process pool:


from subprocess import Popen
from itertools import islice

max_workers = 2  # no more than 2 concurrent processes
processes = (Popen(cmd, shell=True) for cmd in commands)
running_processes = list(islice(processes, max_workers))  # start new processes
while running_processes:
    for i, process in enumerate(running_processes):
        if process.poll() is not None:  # the process has finished
            running_processes[i] = next(processes, None)  # start new process
            if running_processes[i] is None: # no new processes
                del running_processes[i]
                break

On Unix, you could avoid the busy loop and block on os.waitpid(-1, 0) to wait for any child process to exit.

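A sketch of that waitpid-based variant (Unix only), assuming the same kind of shell command list as above; the helper name start_next and the command list are illustrative, not from the original answer:

```python
import os
from subprocess import Popen

commands = ['sleep 1', 'echo hi', 'sleep 1', 'echo bye']
max_workers = 2

queued = iter(commands)
running = {}  # pid -> Popen object
reaped = 0

def start_next():
    cmd = next(queued, None)
    if cmd is not None:
        p = Popen(cmd, shell=True)
        running[p.pid] = p

for _ in range(max_workers):
    start_next()

while running:
    pid, _status = os.waitpid(-1, 0)  # block until any child exits
    if pid in running:
        del running[pid]
        reaped += 1
        start_next()  # keep up to max_workers processes going

print('reaped %d children' % reaped)
```

Unlike the polling loop above, this version sleeps in the kernel until a child actually exits, at the cost of being Unix-specific.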

Answered by Magaly Alonzo

This is because that is exactly what it is supposed to do; what you want here is not multithreading but multiprocessing. See this Stack Overflow page.