Threads & Processes vs. Multithreading & Multi-Core/Multiprocessor: How Are They Mapped?
Original question: http://stackoverflow.com/questions/1713554/
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Asked by claws
I was very confused but the following thread cleared my doubts:
Multiprocessing, Multithreading, HyperThreading, Multi-core
But it answers the questions from the hardware point of view. I want to know how these hardware features map to software.
One thing that is obvious is that there is no difference between multiprocessor (= multi-CPU) and multi-core, other than that in multi-core all CPUs reside on one chip (die), whereas in a multiprocessor system each CPU is on its own chip and they are connected together.
So, multi-core/multiprocessor systems are capable of executing multiple processes (Firefox, media player, Google Talk) at the "same time" (unlike context switching these processes on a single-processor system). Right?
If that is correct, I'm clear so far. But the confusion arises when multithreading comes into the picture.
1. Multithreading "is for" parallel processing, right?
2. What elements are involved in multithreading inside the CPU? Is there a diagram? For me to exploit parallel processing of two independent tasks, what are the requirements on the CPU?
3. When people talk about context switching of threads, I don't really get it. If threads are context switched, then it's not parallel processing; the threads must be executed "strictly simultaneously", right?
   My notion of multithreading is this: consider a system with a single CPU. When the process is context switched to Firefox, (suppose) each tab of Firefox is a thread and all the threads execute strictly at the same time, rather than one thread executing for some time and then another thread taking over until the next context switch.
4. What happens if I run multithreaded software on a processor which can't handle threads? I mean, how does the CPU handle such software?
5. If everything is good so far, the question now is HOW MANY THREADS? It must be limited by hardware, I guess? If the hardware can support only 2 threads and I start 10 threads in my process, how would the CPU handle it? Pros/cons? From a software engineering point of view, when developing software that will be used on a wide variety of systems, how would I decide whether to go for multithreading? If so, how many threads?
Answered by minjang
First, try to understand the concepts of 'process' and 'thread'. A thread is the basic unit of execution: a thread is scheduled by the operating system and executed by the CPU. A process is a sort of container that holds multiple threads.
1. Yes, both multi-processing and multi-threading are for parallel processing. More precisely, they exploit thread-level parallelism.
2. Okay, multi-threading could mean hardware multi-threading (one example is HyperThreading). But I assume you just mean multithreading in software. In this sense, the CPU should support context switching.
3. Context switching is needed to implement multi-tasking, even on a physically single core, by time division.
4. Say there are two physical cores and four very busy threads. In this case, two threads are just waiting until they get the chance to use a CPU. Read some articles on preemptive OS scheduling.
5. The number of threads that can physically run concurrently is just the number of logical processors. You are asking about a general thread scheduling problem from the OS literature, such as round-robin.
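To make points 4 and 5 concrete, here is a minimal Python sketch (not part of the original answer) that reads the number of logical processors and then deliberately starts 10 threads regardless of that number. The names `worker`, `results` and `logical_cpus` are made up for illustration. On any mainstream OS all 10 threads should finish, because the scheduler simply time-shares the available processors.

```python
import os
import threading

# Logical processors visible to the OS (os.cpu_count() may return None).
logical_cpus = os.cpu_count() or 1

results = []
lock = threading.Lock()

def worker(index):
    # Each thread records that it ran; the OS decides when and on which core.
    with lock:
        results.append(index)

# Deliberately start more threads than many machines have logical processors.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"logical processors: {logical_cpus}, threads completed: {len(results)}")
```

The order in which the indices land in `results` is up to the scheduler, so the sketch only counts them.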
I strongly suggest you study the basics of operating systems first, then move on to multithreading issues. It seems you're still unclear about key concepts such as context switching and scheduling. It will take a couple of months, but if you really want to be an expert in computer software, you should know such very basic concepts. Pick up any OS book and lecture slides.
Answered by Philip Derbeko
Threads running on the same core are not technically parallel. They only appear to be executed in parallel, as the CPU switches between them very fast (for us humans). This switch is called a context switch.
Now, threads executing on different cores are executed in parallel.
Most modern CPUs have a number of cores; however, most modern OSes (Windows, Linux and friends) usually execute a much larger number of threads, which still causes context switches.
Even if no user program is running, the OS itself still performs context switches for maintenance work.
This should answer 1-3.
About 4: basically, every processor can work with threads; threading is much more a characteristic of the operating system. A thread is basically memory (optional), a stack and registers; once those are replaced, you are in another thread.
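The idea that a thread is "a stack and registers" that get swapped can be sketched with a toy model. Everything below (`ThreadContext`, `ToyCpu`, the `pc` and `acc` registers) is hypothetical and only illustrates the save-and-restore step; a real OS does this in kernel code with real hardware registers.

```python
from dataclasses import dataclass, field

@dataclass
class ThreadContext:
    """Hypothetical saved state of one thread: registers plus its own stack."""
    name: str
    registers: dict = field(default_factory=lambda: {"pc": 0, "acc": 0})
    stack: list = field(default_factory=list)

class ToyCpu:
    """A made-up single-core CPU that holds exactly one thread's state at a time."""
    def __init__(self):
        self.registers = {"pc": 0, "acc": 0}
        self.stack = []
        self.current = None

    def switch_to(self, ctx):
        # Save the outgoing thread's registers and stack ...
        if self.current is not None:
            self.current.registers = dict(self.registers)
            self.current.stack = list(self.stack)
        # ... then load the incoming thread's; the CPU is now "in" that thread.
        self.registers = dict(ctx.registers)
        self.stack = list(ctx.stack)
        self.current = ctx

    def step(self):
        # One pretend instruction: advance the program counter and accumulate.
        self.registers["pc"] += 1
        self.registers["acc"] += 10

cpu = ToyCpu()
a, b = ThreadContext("A"), ThreadContext("B")

cpu.switch_to(a)
cpu.step()
cpu.step()           # thread A runs two instructions
cpu.switch_to(b)
cpu.step()           # thread B runs one instruction
cpu.switch_to(a)     # A resumes exactly where it left off
print(cpu.registers)
```

After the last switch the CPU holds thread A's saved values again, which is the sense in which replacing the registers and stack puts you "in another thread".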
5: the number of threads can be pretty high and is limited by the OS. Usually it is higher than a regular programmer can successfully handle :) The number of threads is dictated by your program:
- Is it I/O bound?
- Can the task be divided into a number of smaller tasks?
- How small is the task? It can be too small to make spawning threads worthwhile at all.
- Synchronization: if extensive synchronization is required, the penalty might be too heavy and you should reduce the number of threads.
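One way to turn the first question in that list into a number is a commonly cited rule of thumb that is not from this answer: size a thread pool as cores × target utilization × (1 + wait time / compute time). The sketch below assumes you can estimate how long a task waits versus how long it computes; `suggest_thread_count` and its parameters are hypothetical names, and the result is a starting point for measurement rather than a guarantee.

```python
import math
import os

def suggest_thread_count(wait_time, compute_time, cores=None, target_utilization=1.0):
    """Rule-of-thumb pool size: cores * utilization * (1 + wait/compute)."""
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    cores = cores or os.cpu_count() or 1
    return max(1, math.ceil(cores * target_utilization * (1 + wait_time / compute_time)))

# A purely CPU-bound task (no waiting) on 4 cores.
cpu_bound = suggest_thread_count(wait_time=0, compute_time=1, cores=4)
# A task that waits on I/O nine times longer than it computes, on 4 cores.
io_bound = suggest_thread_count(wait_time=9, compute_time=1, cores=4)
print(cpu_bound, io_bound)
```

The formula ignores synchronization cost and task granularity, the other two factors in the list, so those still have to be judged separately.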
Answered by yu_sha
Multiple threads are separate 'chains' of commands within one process. From the CPU's point of view, threads are more or less like processes. Each thread has its own set of registers and its own stack.
The reason you can have more threads than CPUs is that most threads don't need the CPU all the time. A thread can be waiting for user input, downloading something from the web or writing to disk. While it is doing that, it does not need the CPU, so the CPU is free to execute other threads.
In your example, each tab of Firefox can probably even have several threads, or the tabs can share some threads. You need one for downloading, one for rendering, one for the message loop (user input), and perhaps one to run JavaScript. You cannot easily combine them because while you download you still need to react to the user's input. However, the download thread is sleeping most of the time, and even when it's downloading it needs the CPU only occasionally, and the message loop thread only wakes up when you press a button.
If you open the task manager, you'll see that despite all these threads your CPU usage is still quite low.
Of course, if all your threads do number-crunching tasks, then you shouldn't create too many of them, as you get no performance benefit (though there may be architectural benefits!).
However, if they are mainly I/O bound, then create as many threads as your architecture dictates. It's hard to give advice without knowing your particular task.
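The point about I/O-bound threads mostly waiting can be illustrated with a small sketch. It assumes `time.sleep` is a fair stand-in for waiting on a network or disk, and `fake_download` is a made-up function. Eight "downloads" of 0.05 s each, handed to eight threads, should finish in far less than the 0.4 s they would take one after another, even on a single core.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(task_id, delay=0.05):
    # time.sleep stands in for waiting on the network or disk: no CPU needed.
    time.sleep(delay)
    return task_id

task_ids = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    finished = list(pool.map(fake_download, task_ids))
elapsed = time.perf_counter() - start

sequential_estimate = 0.05 * len(task_ids)
print(f"threaded: {elapsed:.2f}s, one-at-a-time estimate: {sequential_estimate:.2f}s")
```

If `fake_download` did pure computation instead of sleeping, the extra threads would not help in the same way, which matches the number-crunching caveat above.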
Answered by Moishe Lettvin
1. Broadly speaking, yeah, but "parallel" can mean different things.
2. It depends on what tasks you want to run in parallel.
3. Not necessarily. Some (indeed most) threads spend a lot of time doing nothing. The CPU might as well switch away from them to a thread that wants to do something.
4. The OS handles thread switching. It will delegate to different cores if it wants to. If there's only one core, it'll divide time between the different threads and processes.
5. The number of threads is limited by software and hardware. Threads consume processor time and memory to varying degrees depending on what they're doing. The thread management software may impose its own limits as well.
Answered by sybreon
The key thing to remember is the separation between logical/virtual parallelism and real/hardware parallelism. With your average OS, a system call is performed to spawn a new thread. What actually happens (whether it is mapped to a different core, a different hardware thread on the same core, or queued into the pool of software threads) is up to the OS.
- Parallel processing uses all of these methods, not just multi-threading.
- Generally speaking, if you want real parallel processing, you need to perform it in hardware. Take the example of the Niagara: it has up to 8 cores, each capable of executing 4 threads in hardware.
- Context switching is needed when there are more threads than the hardware can execute in parallel. Even then, when executed in series (switching from one thread to the next), they are considered concurrent because there is no guarantee on the order of switching. So, it may go T0, T1, T2, T1, T3, T0, T2 and so on. For all intents and purposes, the threads are parallel.
- Time slicing.
- That would be up to the OS.
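Time slicing on a single core can be sketched as a toy round-robin simulation. This is an illustrative model only: the `round_robin` function, the work-unit counts and the fixed quantum are assumptions, and a real OS scheduler also weighs priorities, blocking and preemption, so its switching order is not predictable in the way this one is.

```python
from collections import deque

def round_robin(work_units, quantum=1):
    """Return the order in which thread names get the single core."""
    ready = deque(work_units.items())   # (name, remaining units) pairs
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        timeline.append(name)           # this thread holds the core for one slice
        remaining -= quantum
        if remaining > 0:
            ready.append((name, remaining))  # not finished: back of the queue
    return timeline

timeline = round_robin({"T0": 3, "T1": 1, "T2": 2})
print(timeline)
```

Only one name occupies each slot of `timeline`, yet every thread makes progress, which is why such threads are called concurrent even though they never run at the same instant.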
Answered by Paul Davies
Multithreading is the execution of more than one thread at a time. It can happen both on single-core processors and on multi-core systems. On single-processor systems, context switching brings it about. Context switching in this environment refers to time slicing by the operating system, so do not get confused. The operating system controls the execution of other programs. It allows one program to execute on the CPU at a time, but the frequency at which threads are switched in and out of the CPU determines how transparently the system appears to run them in parallel.
In a multi-core environment, multithreading occurs when each core executes a thread. Even so, context switching can still occur on the individual cores.
Answered by Dom Grey
I think the answers so far are pretty much to the point and give you a good basic context. In essence, say you have a quad-core processor, but each core is capable of executing 2 simultaneous threads.
Note that there is only a slight (or no) increase in speed if you run 2 simultaneous threads on 1 core compared with running the first thread and then the second, one after the other. However, each physical core adds speed to your general workflow.
Now, say you have a process running on your OS that has multiple threads (i.e. needs to run multiple things in "parallel") and keeps its tasks in a queue (or some other system with priority rules). The software sends tasks to the queue and your processor attempts to execute them as fast as it can. Now you have 2 cases:
- If the software supports multiprocessing, then tasks will be sent to any available processor (one that is idle, or that has just finished some other job when the job sent by your software is first in the queue).
- If your software does not support multiprocessing, then all of your jobs will be done in a similar manner, but only by one of your cores.
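The two cases can be sketched with a shared job queue drained by worker threads. This is an illustration under assumptions, not the answer's own code: `run_jobs` and the worker names are hypothetical, and in CPython the global interpreter lock means these threads interleave rather than run Python code on several cores at once, so the sketch shows the dispatch pattern, not a speedup.

```python
import queue
import threading

def run_jobs(jobs, worker_count):
    """Feed jobs through a shared queue to worker_count threads."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    done = []                      # (worker name, job) pairs
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                job = pending.get_nowait()   # take the job at the front of the queue
            except queue.Empty:
                return                       # nothing left to do
            with lock:
                done.append((name, job))

    workers = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(worker_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return done

many = run_jobs(range(20), worker_count=4)    # any free worker picks up the next job
single = run_jobs(range(20), worker_count=1)  # same jobs, one worker does them all
print(len(many), len(single))
```

With one worker the jobs complete strictly in queue order; with several, which worker handles which job depends on scheduling.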
I suggest reading the Wikipedia page on threads. The very first picture there already gives you a nice insight. :)