Original question: http://stackoverflow.com/questions/4706494/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
OpenMP: Get total number of running threads
Asked by Konrad Rudolph
I need to know the total number of threads that my application has spawned via OpenMP. Unfortunately, the omp_get_num_threads() function does not work here since it only yields the number of threads in the current team.
However, my code runs recursively (divide and conquer, basically) and I want to spawn new threads as long as there are still idle processors, but no more.
Is there a way to get around the limitations of omp_get_num_threads and get the total number of running threads?
If more detail is required, consider the following pseudo-code that models my workflow quite closely:
function divide_and_conquer(Job job, int total_num_threads):
    if job.is_leaf(): # Recurrence base case.
        job.process()
        return

    left, right = job.divide()

    current_num_threads = omp_get_num_threads()

    if current_num_threads < total_num_threads: # (1)
        #pragma omp parallel num_threads(2)
            #pragma omp section
            divide_and_conquer(left, total_num_threads)

            #pragma omp section
            divide_and_conquer(right, total_num_threads)
    else:
        divide_and_conquer(left, total_num_threads)
        divide_and_conquer(right, total_num_threads)

    job = merge(left, right)
If I call this code with a total_num_threads value of 4, the conditional annotated with (1) will always evaluate to true (because each thread team will contain at most two threads) and thus the code will always spawn two new threads, no matter how many threads are already running at a higher level.
I am searching for a platform-independent way of determining the total number of threads that are currently running in my application.
Accepted answer by jweyrich
Bearing in mind that you know the exact number of threads being created, the simplest solution I can come up with is keeping your own thread counter.
Be aware I'm completely in the dark about OpenMP as I've never really used it.
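A minimal sketch of what such a hand-rolled counter might look like, assuming a recursive sum over a vector as the divide-and-conquer job. The names g_active_threads, g_peak_threads, try_reserve_thread and sum_range are hypothetical and not part of OpenMP. The sketch also assumes nested parallelism has been enabled (for example through OMP_MAX_ACTIVE_LEVELS); if it is not, the inner regions simply run with a single thread and the result is unchanged.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical bookkeeping, not an OpenMP facility: threads currently reserved.
static std::atomic<int> g_active_threads{1};  // the initial thread
static std::atomic<int> g_peak_threads{1};    // highest value ever reserved

// Atomically reserve one extra thread; fails once the limit is reached.
// Doing the check and the increment in one compare-exchange avoids the
// "everyone sees one free slot" race.
static bool try_reserve_thread(int limit) {
    int cur = g_active_threads.load();
    while (cur < limit) {
        if (g_active_threads.compare_exchange_weak(cur, cur + 1)) {
            int peak = g_peak_threads.load();
            while (peak < cur + 1 &&
                   !g_peak_threads.compare_exchange_weak(peak, cur + 1)) {
            }
            return true;
        }
    }
    return false;
}

// Sum v[lo, hi), forking a two-thread team only while the budget allows it.
static long sum_range(const std::vector<int>& v, std::size_t lo,
                      std::size_t hi, int limit) {
    assert(lo <= hi);
    if (hi - lo <= 1024)  // base case: small enough to process directly
        return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);

    const std::size_t mid = lo + (hi - lo) / 2;
    long left = 0, right = 0;
    if (try_reserve_thread(limit)) {
        #pragma omp parallel sections num_threads(2)
        {
            #pragma omp section
            left = sum_range(v, lo, mid, limit);
            #pragma omp section
            right = sum_range(v, mid, hi, limit);
        }
        g_active_threads.fetch_sub(1);  // the extra thread is done
    } else {
        left = sum_range(v, lo, mid, limit);
        right = sum_range(v, mid, hi, limit);
    }
    return left + right;
}
```

Calling sum_range(v, 0, v.size(), 4) would then keep the reserved count at or below 4. Note that the counter tracks reservations made by this code, not threads the OpenMP runtime actually created.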
Answer by Jonathan Dursi
I think there isn't any such routine in at least OpenMP 3; and if there was, I'm not sure it would help, as there's obviously a huge race condition in between the counting of the number of threads and the forking. You could end up overshooting your target number of threads by almost a factor of 2 if everyone sees that there's room for one thread left and then everyone spawns a thread.
If this really is the structure of your program, though, and you just want to limit the total number of threads, there are options (all of these are OpenMP 3.0):
- Use the OMP_THREAD_LIMIT environment variable to limit the total number of OpenMP threads.
- Use OMP_MAX_ACTIVE_LEVELS, or omp_set_max_active_levels(), or test against omp_get_level(), to limit how deeply nested your threads are; if you only want 16 threads, limit to 4 levels of nesting.
- If you want finer control than powers of two, you can use omp_get_level() to find your level, and call omp_get_ancestor_thread_num(int level) at various levels to find out which thread was your parent, grandparent, etc. and from that (using this simple left-right forking) determine a global thread ID. (I think in this case it would go something like $\sum_{l=0}^{L-1} a_l 2^{L-l}$, where $l$ is the level number starting at 0 and $a_l$ is the ancestor thread number at that level.) This would let you (say) allow threads 0-3 to fork but not 4-7, so that you'd end up with 12 rather than 16 threads. I think this only works in such a regular situation; if each parent thread forked a different number of child threads, I don't think you could determine a unique global thread ID because it looks like you can only query your direct ancestors.
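The second option can be sketched roughly as follows, again assuming a recursive sum as the job. sum_limited and sum_with_depth_limit are hypothetical names; omp_get_level() and omp_set_max_active_levels() are the OpenMP 3.0 routines mentioned above. The serial stand-ins in the #else branch are only there so the sketch also builds when OpenMP is not enabled, and older runtimes may additionally need OMP_NESTED=true before nested regions become active.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-ins (assumption: only needed when building without OpenMP).
inline int omp_get_level() { return 0; }
inline void omp_set_max_active_levels(int) {}
#endif

// Sum v[lo, hi); fork a two-thread team only while the nesting level is
// below max_levels, which gives at most 2^max_levels threads in total.
static long sum_limited(const std::vector<int>& v, std::size_t lo,
                        std::size_t hi, int max_levels) {
    assert(lo <= hi);
    if (hi - lo <= 1024)  // base case
        return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);

    const std::size_t mid = lo + (hi - lo) / 2;
    long left = 0, right = 0;
    // When the if clause is false the region runs with a team of one thread.
    #pragma omp parallel sections num_threads(2) if(omp_get_level() < max_levels)
    {
        #pragma omp section
        left = sum_limited(v, lo, mid, max_levels);
        #pragma omp section
        right = sum_limited(v, mid, hi, max_levels);
    }
    return left + right;
}

// Entry point: allow nesting up to max_levels, then start the recursion.
static long sum_with_depth_limit(const std::vector<int>& v, int max_levels) {
    omp_set_max_active_levels(max_levels);
    return sum_limited(v, 0, v.size(), max_levels);
}
```

With max_levels set to 2 this would use at most four threads. The explicit omp_get_level() test and the omp_set_max_active_levels() call overlap in effect; either one alone would be enough to cap the nesting depth, provided nesting is enabled in the first place.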
Answer by ejd
The code you have shown has a problem in that an "omp section" has to be within the lexical scope of an "omp sections". I am assuming that you meant the "omp parallel" to be an "omp parallel sections". Another way to do this is to use "omp task", and then you don't have to keep count of the number of threads. You would just assign the threads to the parallel region and allow the OpenMP implementation to assign the tasks to the threads.
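A rough sketch of the task-based variant, assuming the same kind of recursive sum as the job; sum_tasks and parallel_sum are hypothetical names. One parallel region creates the team, a single thread starts the recursion, and every split becomes two tasks that the runtime hands to whichever thread is idle.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum v[lo, hi) by spawning one task per half; no thread counting needed.
static long sum_tasks(const std::vector<int>& v, std::size_t lo,
                      std::size_t hi) {
    assert(lo <= hi);
    if (hi - lo <= 1024)  // base case: process directly, no further tasks
        return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);

    const std::size_t mid = lo + (hi - lo) / 2;
    long left = 0, right = 0;
    // shared(v) avoids copying the vector into each task; left and right
    // must be shared so the results are visible after the taskwait.
    #pragma omp task shared(left, v)
    left = sum_tasks(v, lo, mid);
    #pragma omp task shared(right, v)
    right = sum_tasks(v, mid, hi);
    #pragma omp taskwait  // both halves must finish before merging
    return left + right;
}

// Entry point: the team size comes from the usual OpenMP settings
// (for example OMP_NUM_THREADS); one thread seeds the task tree.
static long parallel_sum(const std::vector<int>& v) {
    long total = 0;
    #pragma omp parallel
    {
        #pragma omp single
        total = sum_tasks(v, 0, v.size());
    }
    return total;
}
```

The total number of threads is then fixed by the size of the one parallel region, so the question of how many threads are running across nested teams does not arise.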