Original question: http://stackoverflow.com/questions/22700164/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Python: running multiple processes simultaneously
Asked by user3470516
I am attempting to create a program in Python that runs multiple instances (15) of a function simultaneously over different processors. I have been researching this, and have the program below set up using the Process tool from multiprocessing.
Unfortunately, the program executes each instance of the function sequentially (it seems to wait for one to finish before moving on to the next part of the loop).
from __future__ import print_function
from multiprocessing import Process
import sys
import os
import re

for i in range(1,16):
    exec("path%d = 0" % (i))
    exec("file%d = open('%d-path','a', 1)" % (i, i))

def stat(first, last):
    for j in range(1,40000):
        input_string = "water" + str(j) + ".xyz.geocard"
        if os.path.exists('./%s' % input_string) == True:
            exec("out%d = open('output%d', 'a', 1)" % (first, first))
            exec('print("Processing file %s...", file=out%d)' % (input_string, first))
            with open('./%s' % input_string,'r') as file:
                for line in file:
                    for i in range(first,last):
                        search_string = " " + str(i) + " path:"
                        for result in re.finditer(r'%s' % search_string, line):
                            exec("path%d += 1" % i)
            for i in range(first,last):
                exec("print(path%d, file=file%d)" % (i, i))

processes = []

for m in range(1,16):
    n = m + 1
    p = Process(target=stat, args=(m, n))
    p.start()
    processes.append(p)

for p in processes:
    p.join()
I am reasonably new to programming, and have no experience with parallelization - any help would be greatly appreciated.
I have included the entire program above, replacing "Some Function" with the actual function, to demonstrate that this is not a timing issue. The program can take days to cycle through all 40,000 files (each of which is quite large).
Accepted answer by ruscur
Are you sure? I just tried it and it worked for me; the results are out of order on every execution, so they're being executed concurrently.
Have a look at your function. It takes "first" and "last", so is its execution time shorter for lower values? If so, the processes started with smaller arguments would tend to finish first, so the results would come back in order and the program would appear to run sequentially even though it is running in parallel.
ps ux | grep python | grep -v grep | wc -l
> 16
If you execute the code repeatedly (e.g. using a bash script) you can see that every process is starting up. If you want to confirm this, import os and have the function print out os.getpid() so you can see that each call runs under a different process ID.
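As a minimal sketch of that os.getpid() check: the helper names report_pid and collect_pids below are hypothetical and not part of the original program. Each worker sends its label and process ID back to the parent through a multiprocessing Queue, so the IDs can be compared in one place instead of being read off interleaved prints.

```python
import os
from multiprocessing import Process, Queue


def report_pid(label, queue):
    # Runs in the child: send this worker's label and process ID to the parent.
    queue.put((label, os.getpid()))


def collect_pids(count):
    """Start `count` worker processes and return a {label: pid} dict."""
    queue = Queue()
    workers = [Process(target=report_pid, args=(m, queue))
               for m in range(1, count + 1)]
    for p in workers:
        p.start()
    # Read one result per worker before joining, so no child is left
    # blocked while writing to the queue.
    pids = dict(queue.get() for _ in workers)
    for p in workers:
        p.join()
    return pids


if __name__ == "__main__":
    pids = collect_pids(15)
    for label in sorted(pids):
        print(label, pids[label])
    print("distinct processes:", len(set(pids.values())))
```

If the count of distinct IDs matches the number of workers, each call ran in its own process. That alone does not show the runs overlapped in time, only that separate processes were created.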
So yeah, double check your results, because it seems to me that your code is already running concurrently just fine!
Answer by mdadm
I think what is happening is that you are not doing enough in some_function to observe work happening in parallel. It spawns a process, and it completes before the next one gets spawned. If you introduce a random sleep time into some_function, you'll see that they are in fact running in parallel.
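from __future__ import print_function
from multiprocessing import Process
import random
import time

def some_function(first, last):
    # Sleep for 1 to 3 seconds so the processes finish in a random order.
    time.sleep(random.randint(1, 3))
    print(first, last)

if __name__ == "__main__":
    processes = []

    # Start all 15 processes before waiting on any of them.
    for m in range(1,16):
        n = m + 1
        p = Process(target=some_function, args=(m, n))
        p.start()
        processes.append(p)

    # Wait for every process to finish.
    for p in processes:
        p.join()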
Output
2 3
3 4
5 6
12 13
13 14
14 15
15 16
1 2
4 5
6 7
9 10
8 9
7 8
11 12
10 11
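Out-of-order output is one sign of overlap; wall-clock time is another. The sketch below is an illustration under assumptions, not part of the original answer: the names sleepy and timed_run are hypothetical, and it needs Python 3 for time.perf_counter. It starts several workers that each sleep for a fixed time and measures how long the whole batch takes. If the workers ran one after another, the total would be at least the number of workers times the sleep time.

```python
import time
from multiprocessing import Process


def sleepy(seconds):
    # Stand-in for real work: each worker just sleeps.
    time.sleep(seconds)


def timed_run(count, seconds):
    """Start `count` sleeping workers at once; return the wall-clock time taken."""
    start = time.perf_counter()
    workers = [Process(target=sleepy, args=(seconds,)) for _ in range(count)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    count, seconds = 15, 1.0
    elapsed = timed_run(count, seconds)
    print("elapsed: %.2f s, sequential would need at least %.0f s"
          % (elapsed, count * seconds))
```

Note that sleeping workers overlap even on a single core, so this only shows that the processes run concurrently. Whether CPU-heavy work such as the file scan in the question speeds up also depends on how many cores are available and on whether disk access is the bottleneck.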

