Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/2986931/
How do SMP cores, processes, and threads work together exactly?
Asked by Karl
On a single core CPU, each process runs in the OS, and the CPU jumps around from one process to another to best utilize itself. A process can have many threads, in which case the CPU runs through these threads when it is running on the respective process.
Now, on a multiple core CPU:
1) Do the cores run in every process together, or can the cores run separately in different processes at one particular point in time? For instance, you have program A running two threads. Can a dual core CPU run both threads of this program? I think the answer should be yes if we are using something like OpenMP. But while the cores are running in this OpenMP-embedded process, can one of the cores simply switch to another process?
2) For programs that are created for single core, when running at 100%, why is the CPU utilization of each core distributed? (E.g. a dual core CPU at 80% and 20%. The utilization percentages of all cores always add up to 100% in this case.) Do the cores try to help each other by running each thread, of each process, in some way?
Accepted answer by BjoernD
Cores (or CPUs) are the physical elements of your computer that execute code. Usually, each core has all the elements needed to perform computations on its own: register files, interrupt lines, etc.
Most operating systems represent applications as processes. This means that the application has its own address space (== view of memory), where the OS makes sure that this view and its content are isolated from other applications.
A process consists of one or more threads, which carry out the real work of an application by executing machine code on a CPU. The operating system determines which thread executes on which CPU (using clever heuristics to improve load balance, energy consumption, etc.). If your application consists only of a single thread, then your whole multi-CPU system won't help you much, as it will still only use one CPU at a time for your application. (However, overall performance may still improve, as the OS will run other applications on the other CPUs so they don't intermingle with the first one.)
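To make the process/thread relationship concrete, here is a minimal Python sketch (not part of the original answer). It starts a few threads and records, for each one, the process ID and the OS-level thread ID. The function name `collect_thread_identities` is made up for this illustration, and `threading.get_native_id()` needs Python 3.8 or newer.

```python
import os
import threading


def collect_thread_identities(n_threads=4):
    """Run n_threads threads at once; return one (pid, native_thread_id) per thread."""
    barrier = threading.Barrier(n_threads)  # holds every thread until all have started
    records = []                            # shared: every thread sees the same list
    lock = threading.Lock()

    def worker():
        identity = (os.getpid(), threading.get_native_id())
        barrier.wait()  # all threads are alive here, so no thread id can be reused
        with lock:
            records.append(identity)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records


if __name__ == "__main__":
    for pid, tid in collect_thread_identities():
        print(f"process {pid}, OS thread {tid}")
```

Every line should show the same process ID but a different thread ID: the threads live in one address space (they all append to the same `records` list), yet each one is a separate entity that the OS can schedule.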
Now to your specific questions:
1) The OS usually allows you to at least give hints about which core you want certain threads to execute on. What OpenMP does is generate code that spawns a certain number of threads and distributes shared computational work from the loops of your program across them. It can use the OS's hint mechanism (see: thread affinity) to do so. However, OpenMP applications still run concurrently with others, and thus the OS is free to interrupt one of the threads and schedule other (potentially unrelated) work on a CPU. In reality, there are many different scheduling schemes you might want to apply depending on your situation, but this is highly specific, and most of the time you should be able to trust your OS to do the right thing for you.
2) Even if you are running a single-threaded application on a multi-core CPU, you will notice other CPUs doing work as well. This comes a) from the OS doing its job in the meantime and b) from the fact that your application is never running alone: each running system consists of a whole bunch of concurrently executing tasks. Check Windows' Task Manager (or ps/top on Linux) to see what is running.
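As an illustration of the affinity mechanism mentioned in 1), here is a small Python sketch (not from the original answer). It assumes Linux, where `os.sched_getaffinity` and `os.sched_setaffinity` are available; on other systems the function simply returns `None`. The name `pin_caller_to_one_cpu` is made up for this example. Note that on Linux the affinity mask is enforced by the scheduler rather than treated as a mere hint.

```python
import os


def pin_caller_to_one_cpu():
    """Pin the caller to one allowed CPU, then restore the original mask.

    Returns (original, pinned, restored) CPU sets, or None where the
    Linux-only os.sched_*affinity calls are unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)   # 0 means "the caller"
    target = min(original)               # arbitrary choice for the demo
    os.sched_setaffinity(0, {target})    # from now on, run only on `target`
    pinned = os.sched_getaffinity(0)
    os.sched_setaffinity(0, original)    # undo the restriction
    restored = os.sched_getaffinity(0)
    return original, pinned, restored


if __name__ == "__main__":
    print(pin_caller_to_one_cpu())
```

Even while pinned, the caller still shares that CPU with whatever else the OS decides to run there, which is the point made above about OpenMP threads being interrupted.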
Answer by John Saunders
Note also that the OS doesn't much care which process the threads are from. It will usually schedule threads to processors / cores regardless of which process the thread is from. This could lead to four threads from one process running at the same time, as easily as one thread from four processes running at the same time.
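A toy model may help here. The sketch below is a deliberately simplified round-robin scheduler written for this illustration; real OS schedulers use priorities, load balancing and much more. The function name `round_robin` and the `(process, thread)` labels are hypothetical. The point it shows is that the process name never enters the scheduling decision.

```python
from collections import deque


def round_robin(threads, n_cores, n_ticks):
    """Toy scheduler: each tick, run the next n_cores ready threads, one per core.

    threads is a list of (process_name, thread_name) pairs. The process name is
    carried along only for display; it plays no role in choosing what runs.
    Returns one list per tick holding the thread that ran on each core.
    """
    ready = deque(threads)
    timeline = []
    for _ in range(n_ticks):
        running = [ready.popleft() for _ in range(min(n_cores, len(ready)))]
        timeline.append(running)
        ready.extend(running)  # time slice over: back to the end of the queue
    return timeline


if __name__ == "__main__":
    threads = [("A", "t1"), ("A", "t2"), ("B", "t1")]
    for tick, running in enumerate(round_robin(threads, n_cores=2, n_ticks=3)):
        print(tick, running)
```

With two cores and these three threads, tick 0 runs both threads of process A side by side, while ticks 1 and 2 each mix one thread from A with the thread from B.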
Answer by NitinS
@BjoernD, you mentioned that:
"... If your application consists only of a single thread, then your whole multi-CPU-system won't help you much as it will still only use one CPU for your application ..."
I think that even if it's a single-threaded application, the application thread may be executed on different cores during its lifetime. On each preemption and later reassignment to a CPU, a different core may get assigned to that thread.
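One way to observe this on Linux is to ask the kernel which CPU the thread last ran on. The sketch below (not from the original answer) assumes Linux and reads field 39 ("processor") of `/proc/self/stat`, which describes the process's main thread; on systems without that file it yields no samples. The helper names `last_cpu` and `sample_cpus` are made up for this example.

```python
import time


def last_cpu():
    """Return the CPU the main thread last ran on, or None if it cannot be read.

    Assumes Linux: field 39 of /proc/self/stat is the last-used processor.
    """
    try:
        with open("/proc/self/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # Field 2 (the command name) is in parentheses and may contain spaces,
    # so split after the last ')'. The remainder starts at field 3.
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[39 - 3])


def sample_cpus(n_samples=5, pause=0.01):
    """Sample last_cpu() a few times, sleeping in between."""
    samples = []
    for _ in range(n_samples):
        cpu = last_cpu()
        if cpu is not None:
            samples.append(cpu)
        time.sleep(pause)  # sleeping gives the scheduler a chance to move us
    return samples


if __name__ == "__main__":
    print(sample_cpus())
```

Whether the numbers actually change depends on the machine and its load. On an idle system the thread may well stay on one core for the whole run.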
Answer by Nick Bastin
Yes, threads and processes can run concurrently on multi-core CPUs, so this works as you describe (regardless of how you create those threads and processes, OpenMP or otherwise). A single thread (and therefore a single-threaded process) only runs on a single core at a time. If there are more threads requesting CPU time than available cores (generally the case), the operating system scheduler will move threads on and off cores as needed.
The reason why single-threaded processes run on more than one CPU or core is related to your operating system, and not specifically any feature of the hardware. Some operating systems have no sense of "thread affinity" - they don't care what processor a thread is running on - so when time comes to re-evaluate what resources are being used (several times a second, at least), they'll move a thread/process from one core/CPU to another. Other than causing cache misses, this generally doesn't affect the performance of your process.
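This also accounts for the 80%/20% split from the question. A short arithmetic sketch, with a made-up trace and a hypothetical helper `per_core_utilization`: if a single busy thread is moved between two cores, each core is only busy for the ticks the thread spent on it, so the per-core figures add up to 100%.

```python
from collections import Counter


def per_core_utilization(trace, n_cores):
    """Percentage of ticks the thread spent on each core.

    trace holds one entry per scheduler tick: the core the thread ran on,
    or None if it was not running during that tick. Assumes a non-empty trace.
    """
    busy = Counter(core for core in trace if core is not None)
    return [100.0 * busy[core] / len(trace) for core in range(n_cores)]


if __name__ == "__main__":
    # Hypothetical trace: 8 ticks on core 0, then 2 ticks on core 1.
    trace = [0] * 8 + [1] * 2
    print(per_core_utilization(trace, n_cores=2))  # [80.0, 20.0]
```

The thread never runs on both cores at once. The monitoring tool just averages over an interval that is much longer than one time slice.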
Answer by Vasudeva Reddy
If there is one multithreaded application with, say, 10 threads, it will initially start on the same CPU/core. Over time, the threads will be distributed to other cores/CPUs by the load balancer in Linux. If there are several such multithreaded applications, I think each application's threads mostly keep running on the same core/CPU, because the locals/globals of the threads are readily available in the L1/L2 cache of the core on which they were running. Moving them off that core is more time-consuming than their execution time. If the threads need to run on a different core, I think one has to supply the affinity info for the thread.
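Supplying affinity info per thread could look roughly like the following Python sketch. It is an illustration under assumptions, not the answerer's code: it relies on Linux semantics, where calling `os.sched_setaffinity(0, ...)` from inside a thread affects only that thread, and the name `pin_worker_thread` is made up. On systems without these calls it returns `None`.

```python
import os
import threading


def pin_worker_thread():
    """Pin one worker thread to a single CPU and leave the main thread alone.

    Returns (worker_mask, main_before, main_after), or None where the
    Linux-only os.sched_*affinity calls are unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    main_before = os.sched_getaffinity(0)
    result = {}

    def worker():
        # Called inside the worker, pid 0 refers to this thread on Linux.
        os.sched_setaffinity(0, {max(main_before)})
        result["mask"] = os.sched_getaffinity(0)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["mask"], main_before, os.sched_getaffinity(0)


if __name__ == "__main__":
    print(pin_worker_thread())
```

The worker starts with the mask inherited from the main thread and then narrows it to one CPU, while the main thread's mask stays as it was.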