C: How to "multithread" C code
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/3908031/
How to "multithread" C code
Asked by Open the way
I have a number-crunching application written in C. It is essentially a main loop that, for increasing values of "i", calls a function that performs some calculations. I have read about multithreading and am considering learning a bit about it in C. I wonder whether general code like mine could somehow be multithreaded automatically, and if so, how.
Thanks
P.S. To give an idea of my code, let's say that it is something like this:
main(...)
for(i=0;i<=ntimes;i++)get_result(x[i],y[i],&result[i]);
...
...
void get_result(float x,float y,float *result){
    /* the result is written through a pointer so that the caller can see it */
    *result=sqrt(log (x) + log (y) + cos (exp (x + y)));
    /* (and some more similar mathematical operations) */
}
Accepted answer by Novikov
One option for multithreading your code is pthreads (which gives more precise control than OpenMP).
Assuming x, y and result are global arrays:
#include <pthread.h>
#include <stdlib.h>   // for malloc
...
void *get_result(void *param)  // param is a dummy pointer
{
...
}
int main()
{
...
    // one thread handle per call to get_result
    pthread_t *tid = malloc( ntimes * sizeof(pthread_t) );
    for( i=0; i<ntimes; i++ )
        pthread_create( &tid[i], NULL, get_result, NULL );
    ... // do some tasks unrelated to result
    // wait for every thread to finish before using result
    for( i=0; i<ntimes; i++ )
        pthread_join( tid[i], NULL );
...
}
(Compile your code with gcc prog.c -lpthread)
Answer by Novikov
If the task is highly parallelizable and your compiler is modern, you could try OpenMP: http://en.wikipedia.org/wiki/OpenMP
Answer by nategoose
You should have a look at OpenMP for this. The C/C++ example on this page is similar to your code: https://computing.llnl.gov/tutorials/openMP/#SECTIONS
#include <omp.h>
#define N 1000
int main(void)
{
    int i;
    float a[N], b[N], c[N], d[N];
    /* Some initializations */
    for (i=0; i < N; i++) {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }
    #pragma omp parallel shared(a,b,c,d) private(i)
    {
        #pragma omp sections nowait
        {
            /* each section is executed by one thread of the team */
            #pragma omp section
            for (i=0; i < N; i++)
                c[i] = a[i] + b[i];
            #pragma omp section
            for (i=0; i < N; i++)
                d[i] = a[i] * b[i];
        } /* end of sections */
    } /* end of parallel section */
    return 0;
}
If you prefer not to use OpenMP, you could use either pthreads or clone/wait directly.
No matter which route you choose, you are just dividing your arrays into chunks, each of which is processed by one thread. If all of your processing is purely computational (as your example function suggests), then you would do well to have only as many threads as you have logical processors.
Adding threads for parallel processing carries some overhead, so make sure that you give each thread enough work to make up for it. Usually you will, but if each thread ends up with only one computation to do and the computations are not that expensive, then you may actually slow things down. In that case you can always use fewer threads than you have processors.
If your work does involve some IO, then you may find that having more threads than processors is a win, because while one thread is blocked waiting for IO to complete, another thread can be doing its computations. You have to be careful when doing IO to the same file from several threads, though.
Answer by asveikau
If you are hoping to provide concurrency for a single loop in some kind of scientific computing or similar, OpenMP, as @Novikov says, really is your best bet; this is what it was designed for.
If you're looking to learn the more classical approach that you would more typically see in an application written in C... On POSIX you want pthread_create() et al. I'm not sure what your background with concurrency in other languages might be, but before going too deeply into that, you will want to know your synchronization primitives (mutexes, semaphores, etc.) fairly well, and to understand when you need to use them. That topic could be a whole book or a set of SO questions unto itself.
Answer by darklon
Intel's C++ compiler is actually capable of automatically parallelizing your code. It's just a compiler switch you need to enable. It doesn't work as well as OpenMP, though (i.e. it doesn't always succeed, or the resulting program is slower). From Intel's website: "Auto-parallelization, which is triggered by the -parallel (Linux* OS and Mac OS* X) or /Qparallel (Windows* OS) option, automatically identifies those loop structures that contain parallelism. During compilation, the compiler automatically attempts to deconstruct the code sequences into separate threads for parallel processing. No other effort by the programmer is needed."
Answer by Marcin
A good exercise for learning concurrent programming in any language is to work on a thread pool implementation.
In this pattern you create some threads in advance. Those threads are treated as a resource. A thread pool object/structure is used to assign user-defined tasks to those threads for execution. When a task is finished you can collect its results. You can use a thread pool as a general-purpose design pattern for concurrency.
The main idea could look similar to this:
#define number_of_threads_to_be_created 42
// create some user defined tasks
Tasks_list_t* task_list_elem = CreateTasks();
// Create the thread pool with 42 threads
Thpool_handle_t* pool = Create_pool(number_of_threads_to_be_created);
// populate the thread pool with tasks
for ( ; task_list_elem; task_list_elem = task_list_elem->next) {
    add_a_task_to_thpool (task_list_elem, pool);
}
// kick start the thread pool
thpool_run (pool);
// Now decide on the mechanism for collecting the results from tasks list.
// Some of the candidates are:
// 1. sleep till all is done (naive)
// 2. poll the tasks in the list for some state variable describing that the task has
//    finished. This can work quite well in some situations
// 3. Implement signal/callback mechanism that a task can use to signal that it has
//    finished executing.
The mechanism for collecting data from tasks and the number of threads used in the pool should be chosen to reflect your requirements and the capabilities of the hardware and runtime environment.
Also please note that this pattern says nothing about how you should "synchronize" your tasks with each other or with their surroundings. Error handling can also be a bit tricky (for example: what to do when one task fails). Those two aspects need to be thought through in advance, as they can restrict the usage of the thread pool pattern.
About thread pool:
http://en.wikipedia.org/wiki/Thread_pool_pattern
http://docs.oracle.com/cd/E19253-01/816-5137/ggedn/index.html
Good introductory reading on pthreads:
http://www.advancedlinuxprogramming.com/alp-folder/alp-ch04-threads.pdf
Answer by ValiRossi
Depending on the OS, you could use POSIX threads. You could instead implement stack-less multithreading using state machines. There is a really good book entitled "Embedded Multitasking" by Keith E. Curtis. It's just a neatly crafted set of switch-case statements. Works great; I've used it on everything from Apple Macs to Rabbit Semiconductor, AVR, and PC.
Vali
Answer by AlcubierreDrive
To specifically address the "automatically multithreaded" part of the OP's question:
One really interesting view of how to program parallelism was designed into a language called Cilk Plus, invented by MIT and now owned by Intel. To quote Wikipedia, the idea is that
"the programmer should be responsible for exposing the parallelism, identifying elements that can safely be executed in parallel; it should then be left to the run-time environment, particularly the scheduler, to decide during execution how to actually divide the work between processors."
Cilk Plus is a superset of standard C++. It just contains a few extra keywords (_Cilk_spawn, _Cilk_sync, and _Cilk_for) that allow the programmer to tag parts of their program as parallelizable. The programmer does not mandate that any code be run on a new thread; they just allow the lightweight runtime scheduler to spawn a new thread if and only if it is actually the right thing to do under the particular runtime conditions.
To use Cilk Plus, just add its keywords into your code, and build with Intel's C++ compiler.
Answer by Mecki
Your code is not automatically multi-threaded by the compiler, if that was your question. Please note that the C standards themselves know nothing about multi-threading, since whether you can use multi-threading or not does not depend on the language you use for coding, but on the destination platform you are coding for. Code written in C can run on pretty much anything for which a C compiler exists. A C compiler even exists for the C64 computer (almost completely ISO C99 conformant); however, to support multiple threads, the platform must have an operating system that supports them, and usually this means that at least certain CPU functionality must be present. An operating system can do multithreading almost exclusively in software; this will be awfully slow and there won't be memory protection, but it is possible. Even in that case, however, you need at least programmable interrupts.
So how to write multi-threaded C code depends entirely on the operating system of your target platform. There exist POSIX-conformant systems (OS X, FreeBSD, Linux, etc.) and systems that have their own library for this (Windows). Some systems have more than one library for it (e.g. OS X has the POSIX library, but there is also the Carbon Thread Manager, which you can use in C, though I think it is rather legacy nowadays).
Of course there exist cross-platform thread libraries, and some modern compilers support things like OpenMP, where the compiler automatically builds code to create threads on your chosen target platform; but not many compilers support it, and those that do are usually not feature complete. Usually you get the widest system support by using POSIX threads, more often called "pthreads". The only major platform not supporting them is Windows, and here you can use free 3rd party libraries like this one. Several other ports exist as well (Cygwin has one for sure). If you will have a UI of some kind one day, you may want to use a cross-platform library like wxWidgets or SDL, both of which offer consistent multi-thread support on all supported platforms.
Answer by ifyes
If each iteration of the loop is independent of the ones before it, then there's a very simple approach: try multi-processing rather than multi-threading.
Say you have 2 cores and ntimes is 100; then 100/2=50, so create 2 versions of the program, where the first iterates from 0 to 49 and the other from 50 to 99. Run them both, and your cores should be kept quite busy.
This is a very simplistic approach, yet you don't have to mess with thread creation, synchronization, etc.

