C++: multithreading vs. multiprocessing
Original question: http://stackoverflow.com/questions/6388031/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Multithreading vs multiprocessing
Asked by bdelmas
I am new to this kind of programming and need your point of view.
I have to build an application but I can't get it to compute fast enough. I have already tried Intel TBB, and it is easy to use, but I have never used other libraries.
For multiprocessor programming, I have been reading about OpenMP and Boost for multithreading, but I don't know their pros and cons.
In C++, when is multithreaded programming advantageous compared to multiprocessor programming, and vice versa? Which is best suited to heavy computations or to launching many tasks? What are their pros and cons when we build an application designed around them? And finally, which library is best to work with?
Answered by Jason
Multithreading means exactly that, running multiple threads. This can be done on a uni-processor system, or on a multi-processor system.
On a single-processor system, the appearance of the computer doing multiple things at the same time (i.e., multitasking) is an illusion: under the hood, a software scheduler is time-slicing the single CPU. Only one task is running at any given moment, but the scheduler switches between tasks fast enough that you never notice multiple processes, threads, etc. contending for the same CPU resource.
On a multi-processor system, the need for time-slicing is reduced. The time-slicing effect is still there, because a modern OS could have hundreds of threads contending for two or more processors, and there is rarely a 1-to-1 relationship between the number of threads and the number of processing cores available. So at some point, one thread will have to stop so that another can start on a CPU that the two threads are sharing. This is again handled by the OS's scheduler. That being said, with a multi-processor system you can have two things happening at the same time, unlike with the uni-processor system.
In the end, the two paradigms are somewhat orthogonal, in the sense that you will need multithreading whenever you want two or more tasks running asynchronously, but because of time-slicing you do not necessarily need a multi-processor system to accomplish that. If you are running multiple threads on a task that is highly parallel (e.g., numerically evaluating an integral), then yes, the more cores you can throw at the problem, the better. You won't necessarily need a 1-to-1 relationship between threads and processing cores, but at the same time you don't want to spin off so many threads that most of them sit idle, waiting to be scheduled on one of the available CPU cores. On the other hand, if your parallel tasks require some sequential component, i.e., a thread must wait for the result from another thread before it can continue, then you may be able to run more threads with some type of barrier or synchronization method, so that the threads that need to be idle are not spinning away using CPU time and only the threads that need to run are contending for CPU resources.
Answered by davka
There are a few important points that I believe should be added to the excellent answer by @Jason.
First, multithreading is not always an illusion, even on a single processor: there are operations that do not involve the processor. These are mainly I/O (disk, network, terminal, etc.). The basic form of such an operation is blocking or synchronous, i.e., your program waits until the operation is completed and then proceeds. While it waits, the CPU is switched to another process/thread.
If you have anything you can do during that time (e.g., background computation while waiting for user input, serving another request, etc.) you have basically two options:
- use asynchronous I/O: you call a non-blocking I/O function, providing it with a callback and telling it "call this function when you are done". The call returns immediately and the I/O operation continues in the background. You go on with the other stuff.
- use multithreading: you have a dedicated thread for each kind of task. While one waits for the blocking I/O call, the other goes on.
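The second option (a dedicated thread for the blocking call) can be sketched with `std::async`. In this sketch `blocking_read` is a hypothetical stand-in that just sleeps, since the answer does not name a specific I/O API, and `sum_to` is an arbitrary piece of CPU-bound work chosen for illustration.

```cpp
#include <chrono>
#include <cstdint>
#include <future>
#include <string>
#include <thread>

// Stand-in for a blocking I/O call (hypothetical): sleeps, then returns data.
std::string blocking_read() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    return "payload";
}

// CPU-bound work that can proceed while the read is pending.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) total += i;
    return total;
}

struct Result {
    std::string data;
    std::uint64_t sum;
};

Result overlap_io_and_compute() {
    // std::launch::async makes blocking_read run on its own thread.
    std::future<std::string> pending = std::async(std::launch::async, blocking_read);
    std::uint64_t sum = sum_to(1000000);  // runs while the other thread is blocked
    return Result{pending.get(), sum};    // get() waits for the read to finish
}
```

Calling `overlap_io_and_compute()` returns both results once the slower of the two finishes. A callback-based version of the first option would need an event loop or an OS-specific asynchronous I/O facility, which is outside what this sketch tries to show.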
Both approaches are difficult programming paradigms, and each has its pros and cons.
- with async I/O, the program's logic is less obvious and is difficult to follow and debug. However, you avoid thread-safety issues.
- with threads, the challenge is to write thread-safe programs. Thread-safety faults are nasty bugs that are quite difficult to reproduce. Over-use of locking can actually degrade performance instead of improving it.
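To make the thread-safety point concrete, here is a minimal sketch of a counter shared between threads and protected by a `std::mutex`. The `Counter` class and `count_with_threads` function are invented for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// A counter that is safe to increment from several threads at once.
class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);  // unlocked when `lock` is destroyed
        ++value_;
    }
    long value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so that value() can lock in a const method
    long value_ = 0;
};

// Starts `num_threads` threads that each increment a shared counter
// `increments_per_thread` times, then returns the final count.
long count_with_threads(unsigned num_threads, long increments_per_thread) {
    Counter counter;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (long i = 0; i < increments_per_thread; ++i) counter.increment();
        });
    }
    for (auto& w : workers) w.join();
    return counter.value();
}
```

Without the lock, `++value_` from several threads would be a data race and the final count could come out low. Taking the lock on every single increment is also an instance of the over-locking cost mentioned above; accumulating in a thread-local variable and adding it to the shared counter once per thread would contend far less.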
(Coming to multiprocessing.)
Multithreading became popular on Windows because manipulating processes is quite heavy there (creating a process, context-switching, etc.), as opposed to threads, which are much more lightweight (at least this was the case when I worked on Win2K).
On Linux/Unix, processes are much more lightweight. Also (AFAIK) threads on Linux are actually implemented internally as a kind of process, so there is no gain in context-switching threads vs. processes. However, with separate processes you need to use some form of IPC (inter-process communication), such as shared memory, pipes, or message queues.
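As a small illustration of pipe-based IPC between processes, here is a POSIX-only sketch using `fork` and `pipe`. The function name `send_through_pipe` is made up for this example, and error handling is kept to a minimum (a partial `write` is not retried), so treat it as a sketch rather than production code.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that writes `message` into a pipe; the parent reads it back.
// Returns the text received by the parent, or an empty string on failure.
std::string send_through_pipe(const std::string& message) {
    int fds[2];                       // fds[0] = read end, fds[1] = write end
    if (pipe(fds) != 0) return "";

    pid_t pid = fork();
    if (pid < 0) {                    // fork failed: clean up and give up
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {                   // child: works on its own copy of memory
        close(fds[0]);
        ssize_t ignored = write(fds[1], message.data(), message.size());
        (void)ignored;
        close(fds[1]);
        _exit(0);                     // exit without running the parent's cleanup
    }

    close(fds[1]);                    // parent: close the write end to see EOF
    std::string received;
    char buffer[256];
    ssize_t n;
    while ((n = read(fds[0], buffer, sizeof buffer)) > 0) {
        received.append(buffer, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);         // reap the child so it does not linger
    return received;
}
```

Unlike the threaded examples, the child here cannot touch the parent's variables after the fork, so the data has to travel explicitly through the pipe. That copying step is the price of the isolation that processes provide.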
On a lighter note, look at the SQLite FAQ, which declares "Threads are evil"! :)
Answered by Mike C
To answer the first question: The best approach is to just use multithreading techniques in your code until you get to the point where even that doesn't give you enough benefit. Assume the OS will handle delegation to multiple processors if they're available.
If you actually are working on a problem where multithreading isn't enough, even with multiple processors (or if you're running on an OS that isn't using its multiple processors), then you can worry about discovering how to get more power, which might mean spawning processes across a network on other machines.
I haven't used TBB, but I have used IPP and found it to be efficient and well-designed. Boost is portable.
Answered by Paul Morrison
Just wanted to mention that the Flow-Based Programming (http://www.jpaulmorrison.com/fbp) paradigm is a naturally multiprogramming/multiprocessing approach to application development. It provides a consistent application view from high level to low level. The Java and C# implementations take advantage of all the processors on your machine, but the older C++ implementation only uses one processor. However, it could fairly easily be extended to use Boost (or pthreads, I assume) by the use of locking on connections. I had started converting it to use fibers, but I'm not sure if there's any point in continuing on this route. :-) Feedback would be appreciated. BTW, the Java and C# implementations can even intercommunicate using sockets.
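To give a feel for what "locking on connections" might look like, here is a minimal sketch of a bounded connection between two components running on separate threads, built from `std::mutex` and `std::condition_variable`. This is not taken from the author's C++ implementation; `Connection` and `run_pipeline` are hypothetical names, and the sketch assumes a capacity of at least 1.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// A bounded, lock-protected connection between two components (hypothetical).
template <typename T>
class Connection {
public:
    explicit Connection(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the connection is full.
    void send(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        not_empty_.notify_one();
    }

    // Blocks while the connection is empty and still open.
    // Returns false once the connection is closed and drained.
    bool receive(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return true;
    }

    // Signals that no more items will be sent.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    bool closed_ = false;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Two components joined by one connection: a generator sends 1..n,
// and a summing component adds up whatever it receives.
long run_pipeline(int n) {
    Connection<int> connection(4);
    long total = 0;
    std::thread generator([&] {
        for (int i = 1; i <= n; ++i) connection.send(i);
        connection.close();
    });
    std::thread summer([&] {
        int value = 0;
        while (connection.receive(value)) total += value;
    });
    generator.join();
    summer.join();
    return total;
}
```

With this shape, `run_pipeline(100)` returns 5050. The components share nothing except the connection, so all locking stays inside `Connection` and the component code itself does not need to be written with thread safety in mind.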