Java threads and number of cores

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/34689709/

Time: 2020-11-02 23:12:06  Source: igfitidea

Tags: java, multithreading, concurrency, cpu-cores

Asked by user5765683

I just had a quick question on how processors and threads work. According to my current understanding, a core can only perform 1 process at a time. But we are able to produce a thread pool (let's say 30) with a larger number than the number of cores that we possess (let's say 4) and have them run concurrently. How is this possible if we only have 4 cores? I am also able to run my 30-thread program on my local computer and continue to perform other activities on it, such as watching movies or browsing the internet.

I have read somewhere that thread scheduling occurs, and that this gives the illusion that these 30 threads are being run concurrently by the 4 cores. Is this true, and if so, can someone explain how this works and recommend some good reading on it?

Thank you in advance for the help.

Answered by gardenhead

Processes vs Threads

In days of old, each process had precisely one thread of execution, so processes were scheduled onto cores directly (and in those days, there was almost always only one core to schedule onto). However, in operating systems that support threading (which is almost all modern OSes), it is threads, not processes, that are scheduled. So for the rest of this discussion we will talk exclusively about threads, and you should understand that each running process has one or more threads of execution.

Parallelism vs Concurrency

When two threads are running in parallel, they are both running at the same time. For example, if we have two threads, A and B, then their parallel execution would look like this:

CPU 1: A ------------------------->

CPU 2: B ------------------------->

When two threads are running concurrently, their execution overlaps. Overlapping can happen in one of two ways: either the threads are executing at the same time (i.e. in parallel, as above), or their executions are being interleaved on the processor, like so:

CPU 1: A -----------> B ----------> A -----------> B ---------->

So, for our purposes, parallelism can be thought of as a special case of concurrency.*

Scheduling

"But we are able to produce a thread pool (let's say 30) with a larger number than the number of cores that we possess (let's say 4) and have them run concurrently. How is this possible if we only have 4 cores?"

In this case, they can run concurrently because the CPU scheduler is giving each one of those 30 threads some share of CPU time. Some threads will be running in parallel (if you have 4 cores, then up to 4 threads will be running in parallel at any one time), but all 30 threads will be running concurrently. The reason you can then go play games or browse the web is that the threads of those other programs are added to the same scheduler queue of runnable threads and are also given a share of CPU time.
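
As a rough illustration of the point above, here is a minimal Java sketch (the class and method names are made up for this example). It starts 30 pool threads and makes each one wait until all 30 have started, so every task must be in progress at the same time, however many cores the machine has.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ThirtyThreadsDemo {
    // Runs `threadCount` tasks on a pool of the same size. Each task blocks until
    // every task has started, so all of them are alive at the same time.
    // Returns the number of distinct threads that took part.
    static int countOverlappingThreads(int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        CountDownLatch allStarted = new CountDownLatch(threadCount);
        Set<String> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < threadCount; i++) {
            pool.execute(() -> {
                seen.add(Thread.currentThread().getName());
                allStarted.countDown();
                try {
                    // No task gets past this point until all tasks have begun.
                    allStarted.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            // The 10 second limit is an arbitrary safety net for this sketch.
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical cores: " + cores);
        System.out.println("Threads in progress together: " + countOverlappingThreads(30));
    }
}
```

On a 4-core machine this is expected to report 30 overlapping threads: they are concurrent, even though only a handful can be executing in parallel at any instant.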

Logical vs Physical Cores

"According to my current understanding, a core can only perform 1 process at a time."

This is not quite true. Due to very clever hardware design and pipelining that would be much too long to go into here (plus I don't understand it), it is possible for one physical core to actually be executing two completely different threads of execution at the same time. Chew over that sentence a bit if you need to -- it still blows my mind.

This amazing feat is called simultaneous multithreading (popularly known as Hyper-Threading, although that is Intel's proprietary name for one implementation of the technology). Thus, we have physical cores, which are the actual hardware CPU cores, and logical cores, which are the cores that the operating system reports to software as available for use. Logical cores are essentially an abstraction. In many modern Intel CPUs, each physical core acts as two logical cores.
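
From Java, the count you can query is the logical one. A minimal sketch (the class name is made up for this example): `Runtime.availableProcessors()` returns the number of processors available to the JVM, which on a machine with simultaneous multithreading is typically the logical core count and may be lower if the process is restricted, for instance by a container. As far as I know, the standard library has no call that reports physical cores.

```java
class CoreCount {
    // Number of processors the JVM may use; this reflects logical cores,
    // not physical ones, and is never smaller than 1.
    static int logicalCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Logical cores visible to the JVM: " + logicalCores());
    }
}
```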

"Can someone explain how this works and also recommend some good reading on this?"

I would recommend Operating System Concepts if you really want to understand how processes, threads, and scheduling all work together.

* The precise meanings of the terms parallel and concurrent are hotly debated, even here on our very own Stack Overflow. What one means by these terms depends a lot on the application domain.

Answered by user5765683

Java does not perform thread scheduling itself; it leaves thread scheduling to the operating system.

For computationally intensive tasks, it is recommended to make the thread pool size equal to the number of available cores. For I/O-bound tasks, we should use a larger number of threads. There are many variations in between when both types of task are present and compete for CPU time slices.
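
The sizing advice above could be sketched in Java roughly as follows. The class, the method names and the multiplier of 8 for I/O-bound work are assumptions made for this illustration, not recommended values; a sensible multiplier depends on how long the tasks spend waiting.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {
    // CPU-bound work: one thread per available core.
    static int cpuBoundPoolSize(int cores) {
        return Math.max(1, cores);
    }

    // I/O-bound work: more threads than cores, because most of them are waiting.
    // `ioMultiplier` is a hypothetical tuning knob for this sketch.
    static int ioBoundPoolSize(int cores, int ioMultiplier) {
        return Math.max(1, cores * ioMultiplier);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService cpuPool = Executors.newFixedThreadPool(cpuBoundPoolSize(cores));
        ExecutorService ioPool = Executors.newFixedThreadPool(ioBoundPoolSize(cores, 8));
        System.out.println("CPU-bound pool size: " + cpuBoundPoolSize(cores));
        System.out.println("I/O-bound pool size: " + ioBoundPoolSize(cores, 8));
        // Shut the pools down so the JVM can exit.
        cpuPool.shutdown();
        ioPool.shutdown();
    }
}
```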

"A core can only perform 1 process at a time."

Yes, but cores can multitask and create an illusion that they are handling more than one process at a time.

"How is this possible if we only have 4 cores? I am also able to run my 30-thread program on my local computer and also continue to perform other activities on my computer."

This is possible due to multitasking (which is a form of concurrency). Let's say you started 30 threads and the OS is also running 50 threads. All 80 threads will share the 4 CPU cores by receiving CPU time slices in turn (one thread per core at a time). This means that, on average, each core will run 80/4 = 20 threads concurrently, and it will feel as if all threads and processes are running at the same time.
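
To make the 80/4 = 20 arithmetic concrete, here is a toy round-robin model in Java. It is only a sketch under simplifying assumptions: equal fixed time slices, no priorities, and no thread ever blocks. Real OS schedulers are far more sophisticated, and the class and method names are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RoundRobinToy {
    // Simulates `ticks` scheduling rounds. In each round, every core runs the
    // thread at the head of the run queue for one time slice, then that thread
    // goes back to the tail. Returns how many slices each thread received.
    static int[] simulate(int threads, int cores, int ticks) {
        int[] slices = new int[threads];
        Deque<Integer> runQueue = new ArrayDeque<>();
        for (int t = 0; t < threads; t++) {
            runQueue.addLast(t);
        }
        int running = Math.min(cores, threads); // cannot run more threads than exist
        for (int tick = 0; tick < ticks; tick++) {
            for (int core = 0; core < running; core++) {
                int t = runQueue.removeFirst();
                slices[t]++;
                runQueue.addLast(t);
            }
        }
        return slices;
    }

    public static void main(String[] args) {
        // 80 threads on 4 cores: after 20 rounds every thread has had one turn.
        int[] slices = simulate(80, 4, 20);
        System.out.println("Slices for thread 0: " + slices[0]);
        System.out.println("Slices for thread 79: " + slices[79]);
    }
}
```

In this model each core serves 20 different threads per full cycle, which matches the average in the paragraph above.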

"Can someone explain how this works?"

All of this happens at the OS level. If you are a programmer, you should not need to worry about it. But if you are a student of operating systems, pick any OS book and learn about multithreading at the OS level in detail, or find some good research papers for more depth. One thing you should know is that each OS handles these things in a different way (although the general concepts are the same).

There are some languages, like Erlang, that use green threads (or processes), which lets their runtime map and schedule threads on its own rather than relying on the OS. So do some research on green threads as well if you are interested.

Note: You can also research actors, which are another abstraction over threads. Languages like Erlang and Scala use actors to accomplish tasks. One thread can host hundreds of actors; each actor can perform a different task (similar to threads in Java).

This is a vast and active research topic, and there are many things to learn.

Answered by Andy Guibert

In short, your understanding of a core is correct. A core can execute 1 thread (aka process) at a time.

However, your program doesn't really run 30 threads at once. Of those 30 threads, only 4 are running at a time, and the other 26 are waiting. The operating system's scheduler gives each thread a slice of time to run on a core, so all the threads take turns running.

A common misconception:

"Having more threads will make my program run faster."

FALSE: Having more threads will NOT always make your program run faster. It just means the CPU has to do more switching, and in fact, having too many threads will make your program run slower because of the overhead of switching between all the different threads.
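
One way to check this on your own machine is a rough timing experiment like the Java sketch below. The class name, the workload (summing a range of numbers) and the thread counts tried are arbitrary choices for illustration. It is not a rigorous benchmark: JIT warm-up and other running programs will affect the numbers, so treat the timings as indicative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadCountTiming {
    // Sums the numbers 0..n-1, splitting the range across `threads` pool threads.
    // The result is the same for any thread count; only the elapsed time differs.
    static long parallelSum(long n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = n / threads;
            for (int i = 0; i < threads; i++) {
                final long from = i * chunk;
                // The last thread also takes any remainder of the range.
                final long to = (i == threads - 1) ? n : from + chunk;
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (long k = from; k < to; k++) {
                        sum += k;
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        long n = 200_000_000L;
        for (int threads : new int[] {1, cores, 64, 1024}) {
            long start = System.nanoTime();
            long result = parallelSum(n, threads);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " threads: sum=" + result + " in " + millis + " ms");
        }
    }
}
```

For CPU-bound work like this, going from 1 thread to one per core usually helps, while going far beyond the core count usually does not help further and can cost time in thread creation and switching.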
