Java: number of processor cores vs. thread pool size

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/14556037/

Time: 2020-10-31 16:39:22  Source: igfitidea

Number of processor cores vs. the size of a thread pool

Tags: java, multithreading

Asked by Prasad Weera

Many times I've heard that it is better to maintain the number of threads in a thread pool below the number of cores in that system. Having twice or more threads than the number of cores is not only a waste, but also could cause performance degradation.

Are those claims true? If not, what are the fundamental principles that debunk them (specifically relating to Java)?

Answered by Stephen C

Many times I've heard that it is better to maintain the number of threads in a thread pool below the number of cores in that system. Having twice or more threads than the number of cores is not only a waste, but also could cause performance degradation.

The claims are not true as a general statement. That is to say, sometimes they are true (or true-ish) and other times they are patently false.

A couple of things are indisputably true:

  1. More threads means more memory usage. Each thread requires a thread stack. For recent HotSpot JVMs, the minimum thread stack size is 64 KB, and the default can be as much as 1 MB. That can be significant. In addition, any thread that is alive is likely to own or share objects in the heap whether or not it is currently runnable. Therefore it is reasonable to expect that more threads means a larger memory working set.

  2. A JVM cannot have more threads actually running than there are cores (or hyperthread cores or whatever) on the execution hardware. A car won't run without an engine, and a thread won't run without a core.
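Both points above can be checked from Java itself. The following is a minimal sketch, not part of the original answer: the class name, method names, and the 256 KB figure are illustrative assumptions. Runtime.availableProcessors() reports the number of logical processors the JVM may use, and the four-argument Thread constructor accepts a requested stack size, which the JVM is free to round up or ignore.

```java
final class CoreAndStackInfo {
    /** Logical processors visible to the JVM (hyperthreads count as processors). */
    static int logicalCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    /**
     * Starts a thread with a requested stack size in bytes.
     * The size is only a hint: the JVM may round it or ignore it entirely.
     */
    static Thread startWithStackHint(Runnable task, long stackBytes) {
        Thread t = new Thread(null, task, "small-stack-worker", stackBytes);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Logical cores: " + logicalCores());
        // 256 KB is an arbitrary illustrative value, not a recommendation.
        Thread t = startWithStackHint(
                () -> System.out.println("Hello from " + Thread.currentThread().getName()),
                256 * 1024);
        t.join();
    }
}
```

The default stack size for all threads can also be set at JVM startup with the -Xss option, for example -Xss512k.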

Beyond that, things get less clear cut. The "problem" is that a live thread can be in a variety of "states". For instance:

  • A live thread can be running; i.e. actively executing instructions.
  • A live thread can be runnable; i.e. waiting for a core so that it can be run.
  • A live thread can be synchronizing; i.e. waiting for a signal from another thread, or waiting for a lock to be released.
  • A live thread can be waiting on an external event; e.g. waiting for some external server / service to respond to a request.

The "one thread per core" heuristic assumes that threads are either running or runnable (according to the above). But for a lot of multi-threaded applications, the heuristic is wrong ... because it doesn't take account of threads in the other states.
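Some of the states listed above can be observed through Thread.getState(). The sketch below is an illustration only, and the class and method names are made up for it. It parks one thread on a CountDownLatch and makes another contend for a monitor, then reads their states. Note that Java's Thread.State does not distinguish "running" from "runnable" (both are reported as RUNNABLE), and a thread blocked in a native I/O call is typically reported as RUNNABLE as well, so the mapping to the list above is only approximate.

```java
import java.util.concurrent.CountDownLatch;

final class ThreadStateDemo {
    /** Observes a thread that is waiting for a signal from another thread. */
    static Thread.State stateWhileWaiting() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread waiter = new Thread(() -> {
            try {
                release.await(); // parks until countDown() is called
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        // Poll until the thread has actually parked.
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.sleep(1);
        }
        Thread.State observed = waiter.getState();
        release.countDown();
        waiter.join();
        return observed;
    }

    /** Observes a thread that is waiting for a lock to be released. */
    static Thread.State stateWhileBlockedOnLock() throws InterruptedException {
        Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; acquiring the monitor is the point.
            }
        });
        Thread.State observed;
        synchronized (lock) {
            contender.start();
            // Poll until the contender is stuck at the monitor we hold.
            while (contender.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);
            }
            observed = contender.getState();
        }
        contender.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Waiting on a signal: " + stateWhileWaiting());
        System.out.println("Waiting on a lock:   " + stateWhileBlockedOnLock());
    }
}
```

Neither of these two threads needs a core while it waits, which is why counting them against the core budget overstates the load.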

Now "too many" threads clearly can cause significant performance degradation, simply by using too much memory. (Imagine that you have 4 GB of physical memory and you create 8,000 threads with 1 MB stacks. That is a recipe for virtual memory thrashing.)
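The arithmetic behind that example can be spelled out as follows. This is a rough sketch with hypothetical names, and it treats 1 MB and 4 GB as binary units. It computes a worst-case figure: operating systems usually reserve stack address space up front but commit physical pages only as the stack is actually used, so real consumption is often lower.

```java
final class StackMemoryEstimate {
    /** Worst-case bytes reserved for thread stacks alone. */
    static long stackBytes(int threads, long bytesPerStack) {
        return (long) threads * bytesPerStack;
    }

    public static void main(String[] args) {
        long perStack = 1L << 20;   // 1 MB stack, as in the example above
        long physical = 4L << 30;   // 4 GB of physical memory
        long needed = stackBytes(8_000, perStack);

        System.out.println("Stack reservation: " + needed + " bytes");
        System.out.println("Physical memory:   " + physical + " bytes");
        System.out.println("Exceeds physical memory: " + (needed > physical));
    }
}
```

Under these assumptions the stacks alone would ask for nearly twice the physical memory, before counting the heap at all.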

But what about other things? Can having too many threads cause excessive context switching?

I don't think so. If you have lots of threads, your application's use of those threads can result in excessive context switches, and that is bad for performance. However, I posit that the root cause of the context switches is not the actual number of threads. The root of the performance problems is more likely that the application is:

  • synchronizing in a particularly wasteful way; e.g. using Object.notifyAll() when Object.notify() would be better, OR
  • synchronizing on a highly contended data structure, OR
  • doing too much synchronization relative to the amount of useful work that each thread is doing, OR
  • trying to do too much I/O in parallel.

(In the last case, the bottleneck is likely to be the I/O system rather than context switches ... unless the I/O is IPC with services / programs on the same machine.)

The other point is that in the absence of the confounding factors above, having more threads is not going to increase context switches. If your application has N runnable threads competing for M processors, and the threads are purely computational and contention free, then the OS's thread scheduler is going to attempt to time-slice between them. But the length of a time slice is likely to be measured in tenths of a second (or more), so that the context switch overhead is negligible compared with the work that a CPU-bound thread actually performs during its slice. And if we assume that the length of a time slice is constant, then the context switch overhead will be constant too. Adding more runnable threads (increasing N) won't change the ratio of work to overhead significantly.
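As a rough worked version of this argument, let s be the length of a time slice and c the cost of one context switch. The symbols and the value of c below are assumptions made for illustration; the answer gives no figure for the switch cost. Each core then alternates between s of useful work and c of switching, so the fraction of time lost is:

```latex
\[
  \text{overhead fraction} = \frac{c}{s + c}
\]
% Illustrative numbers: s = 0.1 s ("tenths of a second"),
% c = 10 microseconds (an assumed order of magnitude).
\[
  \frac{10^{-5}\,\mathrm{s}}{0.1\,\mathrm{s} + 10^{-5}\,\mathrm{s}} \approx 10^{-4} = 0.01\%
\]
```

Neither N nor M appears in the expression: once there are more runnable threads than processors, adding further threads lengthens the wait between a thread's turns but leaves the per-core overhead fraction unchanged, provided s stays constant.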

In summary, it is true that "too many threads" is harmful for performance. However, there is no reliable universal "rule of thumb" for how many is "too many". And (fortunately) you generally have considerable leeway before the performance problems of "too many" become significant.

Answered by Jerry Coffin

Having fewer threads than cores generally means you can't take advantage of all available cores.

The usual question is how many more threads than cores you want. That, however, varies, depending on the amount of time (overall) that your threads spend doing things like I/O vs. the amount of time they spend doing computation. If they're all doing pure computation, then you'd normally want about the same number of threads as cores. If they're doing a fair amount of I/O, you'd typically want quite a few more threads than cores.

Looking at it from the other direction for a moment, you want enough threads running to ensure that whenever one thread blocks for some reason (typically waiting on I/O) you have another thread (that's not blocked) available to run on that core. The exact number that takes depends on how much of its time each thread spends blocked.
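One common way to turn this reasoning into a number is the heuristic threads = cores × (1 + wait time / compute time), which appears in "Java Concurrency in Practice" (with an additional target-utilization factor, taken as 1 here). The sketch below is an illustration under that assumption, not something this answer prescribes. The class name, the method name, and the 90/10 split are hypothetical, and the wait and compute times would have to be measured for a real workload.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolSizing {
    /**
     * Suggested pool size: cores * (1 + wait / compute).
     * waitTime and computeTime may be in any unit, as long as it is the same one.
     */
    static int suggestedThreads(int cores, double waitTime, double computeTime) {
        if (cores < 1 || waitTime < 0 || computeTime <= 0) {
            throw new IllegalArgumentException("cores >= 1, waitTime >= 0, computeTime > 0 required");
        }
        return (int) Math.max(1, Math.round(cores * (1.0 + waitTime / computeTime)));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // Pure computation: no blocking, so about one thread per core.
        int cpuBound = suggestedThreads(cores, 0, 1);
        // Hypothetical I/O-heavy task: blocked 90 ms for every 10 ms of computation.
        int ioHeavy = suggestedThreads(cores, 90, 10);

        System.out.println("CPU-bound pool size: " + cpuBound);
        System.out.println("I/O-heavy pool size: " + ioHeavy);

        ExecutorService pool = Executors.newFixedThreadPool(ioHeavy);
        pool.shutdown(); // no tasks submitted in this sketch
    }
}
```

With equal wait and compute time the formula gives twice the number of cores, and with no waiting at all it reduces to one thread per core.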

Answered by David Schwartz

That's not true, unless the number of threads is vastly more than the number of cores. The reasoning is that additional threads will mean additional context switches. But it's not true because an operating system will only make unforced context switches if those context switches are beneficial, and additional threads don't force additional context switches.

If you create an absurd number of threads, that wastes resources. But none of this is anything compared to how bad creating too few threads is. If you create too few threads, an unexpected block (such as a page fault) can result in CPUs sitting idle, and that swamps any possible harm from a few extra context switches.

Answered by valdo

Not exactly true; this depends on the overall software architecture. There is a reason to keep more threads than available cores: some of the threads may be suspended by the OS because they are waiting for an I/O operation to complete. This may be an explicit I/O invocation (such as a synchronous read from a file) or an implicit one, such as the system handling a page fault.

Actually, I've read in one book that keeping the number of threads at twice the number of CPU cores is a good practice.
