Notice: this page is a Chinese/English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question on StackOverflow: http://stackoverflow.com/questions/806499/

Threading vs Parallelism, how do they differ?

multithreading parallel-processing

Asked by Dhana

What is the difference between threading and parallelism?

Which one has an advantage over the other?

Answered by RichardOD

Daniel Moth (a former coworker of mine) explains it all in his article "Threading/Concurrency vs Parallelism".

Quoted:

To take advantage of multiple cores from our software, ultimately threads have to be used. Because of this fact, some developers fall in the trap of equating multithreading to parallelism. That is not accurate...You can have multithreading on a single core machine, but you can only have parallelism on a multi core machine

The quick test: If on a single core machine you are using threads and it makes perfect sense for your scenario, then you are not "doing parallelism", you are just doing multithreading.

Answered by jldugger

Parallelism is a general technique of using more than one flow of instructions to complete a computation. The critical aspect of all parallel techniques is communication between the flows, so that they can collaborate on a final answer.

Threading is a specific implementation of parallelism. Each flow of instructions is given its own stack to keep a record of local variables and function calls, and communicates with the other flows implicitly through shared memory.

One example might be to have one thread simply queue up disk requests and pass them to a worker thread, effectively parallelizing disk and CPU. The traditional UNIX pipes method is to split these into two complete programs, say cat and grep in the command:

cat /var/log/Xorg.0.log | grep "EE"

Threading could conceivably reduce the communication costs of copying disk I/O from the cat process to the grep process.
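
As a rough illustration of the idea above (not code from the answer), here is a minimal Python sketch in which one thread plays the role of cat and another plays the role of grep. The names threaded_grep and _DONE are made up for this example. The two threads hand lines over through an in-memory queue, so each line is passed by reference instead of being copied between two processes through a pipe.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the stream


def threaded_grep(lines, needle):
    """Filter `lines` for `needle` using a reader thread and a worker thread."""
    q = queue.Queue(maxsize=64)  # bounded, loosely like a pipe buffer
    matches = []

    def reader():
        # Stands in for `cat`: produces lines from any iterable.
        for line in lines:
            q.put(line)
        q.put(_DONE)

    def worker():
        # Stands in for `grep`: consumes lines and keeps the matching ones.
        while True:
            line = q.get()
            if line is _DONE:
                break
            if needle in line:
                matches.append(line)

    threads = [threading.Thread(target=reader), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matches
```

Passing an open file object as lines (for example the Xorg log from the command above) would roughly correspond to the shell pipeline; whether this is actually faster than the pipe depends on the workload.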

Answered by Martin P. Hellwig

Threading usually refers to having multiple processes working at the same time on a single CPU (well, not actually at the same time: you think they are, but the CPU switches between them very fast).

Parallelism is having multiple processes working at the same time on multiple CPUs.

Both have their pros and cons, which depend heavily on the scheduler used by your operating system. Usually the computational cost of creating a thread is much lower than that of spawning a process on another CPU; however, having a 'whole' CPU to yourself increases the overall speed of that process. Then again, if that process needs to communicate with another process on another CPU, you need to solve the IPC (inter-process communication) problem, which may add so much overhead that it is effectively better to just use a thread on the same CPU.

Most operating systems are aware of multiple CPUs/cores and can use them, but this usually makes the scheduler quite complex.

If you are programming in a language that uses a VM (virtual machine), be aware that the VM needs to implement its own scheduler (if it has one at all). Python (the CPython implementation), for example, uses a GIL (global interpreter lock), which pretty much means that only one thread in that VM runs Python code at any given moment, so the threads in that VM effectively share a single CPU's worth of execution. That said, some OSs can migrate a heavy process to another CPU that is less busy at the moment, which of course means the whole process has to be paused while that happens.

Some operating systems, like DragonFlyBSD, take a completely different approach to scheduling than what is currently the 'standard' approach.

I think this answer gives you enough keywords to search for more information :-)

Answered by PaulJWilliams

Threading is a technology; parallelism is a paradigm that may be implemented using threading (but could just as easily be done using single threads on multiple processors).

Answered by Iqra.

Here is the best answer to clear up anyone's doubts about parallelism and threading.

Threads are a software construct. I can start as many pthreads as I want, even on an old single core processor. So multi-threading is not necessarily parallel: it's only parallel if the hardware can support it. So if you have multiple cores and/or hyperthreading, your multi-threading becomes parallel. And these days that is in fact most of the time.

Concurrency is about activities that have no clear temporal ordering. So again, if the hardware supports it, they can be done in parallel, if not, not.

So, traditionally multi-threading is almost synonymous with concurrency. And both of them only become parallel if the hardware supports it. Even then you can start many more threads than the hardware supports, and you are left with concurrency.
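
To make the quoted point concrete, here is a small Python sketch (an illustration only, not part of the quoted answer) that starts more threads than the machine has logical CPUs. The function name run_many_threads is hypothetical. All the threads still run to completion, which shows that the thread count is a software decision; whether any of them ever execute in parallel is up to the hardware and the runtime.

```python
import os
import threading


def run_many_threads(n):
    """Start n threads and return how many of them ran to completion."""
    finished = []
    lock = threading.Lock()

    def task(i):
        # Each thread records that it ran; the lock guards the shared list.
        with lock:
            finished.append(i)

    threads = [threading.Thread(target=task, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(finished)


cores = os.cpu_count() or 1   # logical CPUs; may include hyperthreads
oversubscribed = cores * 2    # deliberately more threads than CPUs
completed = run_many_threads(oversubscribed)
```

After this runs, completed equals oversubscribed even on a single-core machine: the work was concurrent, but not necessarily parallel.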

From an answer by Victor Eijkhout on Quora.

Answered by Alexander Crescent

There are two different kinds of concurrency:

  1. Threading: the CPU switches between different threads really fast, giving an illusion of concurrency. Key point: only one thread is running at any given time. While one thread is running, the others wait. You might ask how this is any more useful than just running procedurally. Think of it as a priority queue: threads can be scheduled. The CPU scheduler can give each thread a certain amount of time to run, pause it, pass data to other threads, and assign different priorities for running later. It's a must for processes that do not finish instantly and that interact with each other. It's used extensively in servers: thousands of clients can request something at the same time and get what they requested later (if done procedurally, only one client can be served at a time). Philosophy: do different things together. It doesn't reduce the total time (a moot point for a server, because one client doesn't care about the other clients' requests).
  2. Parallelism: threads run in parallel, usually on different CPU cores; this is true concurrency. Key point: multiple threads are running at any given time. It's useful for heavy computations and very long-running processes. The same applies to a fleet of single-core machines: split the data into sections for each machine to compute, then pool the results together at the end. Different machines/cores are hard to make interact with each other. Philosophy: do one thing in less time.
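
The two items above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: handle_request, serve_with_threads and split_and_pool are hypothetical names, and the sleep stands in for real network or disk waiting. The first half serves many "clients" together; the second half splits data into sections and pools the partial results. Note that in CPython the second half only shows the split-and-pool structure; real parallel speedup for CPU-bound work would typically need a process pool or separate machines.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(client_id):
    time.sleep(0.01)  # simulated I/O wait (assumption for the example)
    return f"response for client {client_id}"


def serve_sequentially(client_ids):
    # "Procedural" style: one client at a time.
    return [handle_request(c) for c in client_ids]


def serve_with_threads(client_ids, max_workers=20):
    # Item 1: do different things together; waits overlap across threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_request, client_ids))  # keeps input order


def split_and_pool(data, sections=4):
    # Item 2: split the data into sections, compute each, pool the results.
    size = max(1, -(-len(data) // sections))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=sections) as pool:
        return sum(pool.map(sum, chunks))
```

Both serving functions return the same responses; the threaded one simply does not make each client wait for all the earlier ones to finish.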

As you can see, they solve totally different kinds of problems.

Answered by Sonu

If we think of a CPU as a company and threads as its workers, it helps us understand threading and parallelism more easily.

Just as a company has many workers, a CPU has many threads.

There may also be more than one company, and likewise more than one CPU.

Therefore, when workers (threads) work in a company (CPU), it is called threading.

And when two or more companies (CPUs) work independently or together, it is called parallelism.

Answered by Michael Borgwardt

How do you define "parallelism"? Multithreading is a concrete implementation of the concept of parallel program execution.

The article RichardOD linked to seems to be mainly concerned with whether threads are actually executed in parallel on a concrete machine.

However, your question seems to see multithreading and parallelism as opposites. Do you perhaps mean programs that use multiple processes rather than multiple threads? If so, the differences are:

  • Threads are much cheaper to create than processes. This is why using threads rather than processes resulted in a huge speedup in web applications - this was called "FastCGI".
  • Multiple threads on the same machine have access to shared memory. This makes communication between threads much easier, but also very dangerous (it's easy to create bugs like race conditions that are very hard to diagnose and fix).
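
As a hedged sketch of the shared-memory point (the Counter class and count_with_threads function are invented for this example), several Python threads below update one shared counter. The read-modify-write in increment is exactly the kind of step where a race condition can creep in, so it is guarded with a lock.

```python
import threading


class Counter:
    """A counter shared between threads; the lock guards each update."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the updates could be lost.
        with self._lock:
            self.value += 1


def count_with_threads(n_threads=4, increments=10_000):
    counter = Counter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the result is always n_threads * increments. The convenience of sharing the counter directly is what makes the communication easy, and the need for the lock is what makes it dangerous.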

Answered by Thevs

Threading is a poor man's parallelism.

EDIT: To be more precise:

Threading has nothing to do with parallelism, and vice versa. Threading is about making it feel as though some processes run in parallel. However, this doesn't make the processes complete ALL their actions any faster in total.
