How many threads should I use in my Java program?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/130506/
Asked by Andrew
I recently inherited a small Java program that takes information from a large database, does some processing and produces a detailed image regarding the information. The original author wrote the code using a single thread, then later modified it to allow it to use multiple threads.
In the code, he defines a constant:
// number of threads
public static final int THREADS = Runtime.getRuntime().availableProcessors();
This constant sets the number of threads used to create the image.
I understand his reasoning that the number of threads cannot be greater than the number of available processors, so he set it to that amount to get the full potential out of the processor(s). Is this correct? Or is there a better way to utilize the full potential of the processor(s)?
EDIT: To clarify, the specific algorithm being threaded scales with the resolution of the picture being created (1 thread per pixel). That is obviously not the best solution, though. The work this algorithm does is what takes all the time, and it consists wholly of mathematical operations; there are no locks or other factors that would cause any given thread to sleep. I just want to maximize the program's CPU utilization to decrease the time to completion.
Answered by Kevin Day
Threads are fine, but as others have noted, you have to be highly aware of your bottlenecks. Your algorithm sounds like it would be susceptible to cache contention between multiple CPUs - this is particularly nasty because it has the potential to hit the performance of all of your threads (normally you think of using multiple threads to continue processing while waiting for slow or high latency IO operations).
Cache contention is a very important aspect of using multiple CPUs to process a highly parallelized algorithm: make sure that you take your memory utilization into account. If you can construct your data objects so each thread has its own memory that it is working on, you can greatly reduce cache contention between the CPUs. For example, it may be easier to have a big array of ints and have different threads working on different parts of that array - but in Java, the bounds checks on that array are going to be trying to access the same address in memory, which can cause a given CPU to have to reload data from L2 or L3 cache.
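A minimal sketch of the "each thread has its own memory" idea, assuming a hypothetical histogram workload (the class name `PrivateHistograms` and its method are illustrative, not from the original program). Each task fills a private array, and the partial results are merged once at the end, so workers never write to shared counters while they run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PrivateHistograms {
    // Counts the low byte of every element, using one private array per task.
    static long[] histogram(int[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<long[]>> parts = new ArrayList<>();
            int chunk = Math.max(1, (data.length + threads - 1) / threads);
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(data.length, start + chunk);
                Callable<long[]> task = () -> {
                    long[] local = new long[256]; // private to this task, never shared
                    for (int i = from; i < to; i++) {
                        local[data[i] & 0xFF]++;
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            // Merge the private results on the calling thread.
            long[] total = new long[256];
            for (Future<long[]> part : parts) {
                long[] local = part.get();
                for (int b = 0; b < 256; b++) {
                    total[b] += local[b];
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i * 31;
        }
        long[] h = histogram(data, Runtime.getRuntime().availableProcessors());
        System.out.println("bucket 0 holds " + h[0] + " values");
    }
}
```

Whether this beats a shared array for a given workload is something to measure rather than assume.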
Split the data into its own data structures, and configure those data structures so they are thread local (it might even be more optimal to use ThreadLocal - that actually uses constructs in the OS that provide guarantees that the CPU can use to optimize cache).
The best piece of advice I can give you is test, test, test. Don't make assumptions about how CPUs will perform - there is a huge amount of magic going on in CPUs these days, often with counterintuitive results. Note also that JIT runtime optimization will add another layer of complexity here (maybe good, maybe not).
Answered by Will Hartung
On the one hand, you'd like to think Threads == CPU/Cores makes perfect sense. Why have a thread if there's nothing to run it?
The detail boils down to "what are the threads doing". A thread that's idle waiting for a network packet or a disk block is CPU time wasted.
If your threads are CPU heavy, then a 1:1 correlation makes some sense. If you have a single "read the DB" thread that feeds the other threads, and a single "dump the data" thread that pulls data from the CPU threads and creates output, those two could most likely share a CPU easily while the CPU-heavy threads keep churning away.
The real answer, as with all sorts of things, is to measure it. Since the number is configurable (apparently), configure it! Run it with 1:1 threads to CPUs, 2:1, 1.5:1, whatever, and time the results. The fastest one wins.
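The "time each ratio" suggestion can be sketched as a small harness. This is only an illustration under stated assumptions: `crunch` is a made-up CPU-bound stand-in for the real image math, and the ratios are the ones mentioned above. Wall-clock timings like this are noisy, so repeat the runs before trusting them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadCountTiming {
    // Hypothetical CPU-bound unit of work standing in for the image math.
    static double crunch(int seed) {
        double acc = seed;
        for (int i = 1; i <= 200_000; i++) {
            acc += Math.sqrt(i + acc % 7);
        }
        return acc;
    }

    // Runs `tasks` units of work on `threads` threads and returns elapsed nanoseconds.
    static long timeRun(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> work = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int seed = i;
                work.add(() -> crunch(seed));
            }
            long start = System.nanoTime();
            for (Future<Double> f : pool.invokeAll(work)) {
                f.get(); // wait for completion and surface worker failures
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while timing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int tasks = cores * 16;
        timeRun(cores, tasks); // warm-up run, gives the JIT a chance to compile crunch()
        for (double ratio : new double[] {1.0, 1.5, 2.0}) {
            int threads = Math.max(1, (int) Math.round(cores * ratio));
            long nanos = timeRun(threads, tasks);
            System.out.printf("%.1f:1 (%d threads): %d ms%n", ratio, threads, nanos / 1_000_000);
        }
    }
}
```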
Answered by Rob
The number that your application needs; no more, and no less.
Obviously, if you're writing an application which contains some parallelisable algorithm, then you can probably start benchmarking to find a good balance in the number of threads, but bear in mind that hundreds of threads won't speed up any operation.
If your algorithm can't be parallelised, then no number of additional threads is going to help.
Answered by Derek Park
Yes, that's a perfectly reasonable approach. One thread per processor/core will maximize processing power and minimize context switching. I'd probably leave that as-is unless I found a problem via benchmarking/profiling.
One thing to note is that the JVM does not guarantee that availableProcessors() will be constant, so technically, you should check it immediately before spawning your threads. I doubt that this value is likely to change at runtime on typical computers, though.
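A minimal sketch of one worker per core, with availableProcessors() queried right before the pool is created instead of in a static constant. The class `PerCoreRenderer` and the `shade` function are hypothetical stand-ins for the real renderer. The image is split into bands of rows, one band per worker, not one thread per pixel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerCoreRenderer {
    // Hypothetical stand-in for the real per-pixel math.
    static int shade(int x, int y) {
        return (x * 31 + y * 17) & 0xFF;
    }

    static int[] render(int width, int height) {
        // Query the core count immediately before spawning the workers.
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int[] pixels = new int[width * height];
        try {
            List<Future<?>> pending = new ArrayList<>();
            int rowsPerBand = Math.max(1, (height + threads - 1) / threads);
            for (int start = 0; start < height; start += rowsPerBand) {
                final int from = start;
                final int to = Math.min(height, start + rowsPerBand);
                pending.add(pool.submit(() -> {
                    // Each task writes only to its own band of rows.
                    for (int y = from; y < to; y++) {
                        for (int x = 0; x < width; x++) {
                            pixels[y * width + x] = shade(x, y);
                        }
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the band and surface worker failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while rendering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a render band failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return pixels;
    }

    public static void main(String[] args) {
        int[] image = render(640, 480);
        System.out.println("rendered " + image.length + " pixels");
    }
}
```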
P.S. As others have pointed out, if your process is not CPU-bound, this approach is unlikely to be optimal. Since you say these threads are being used to generate images, though, I assume you are CPU-bound.
Answered by Javier
The number of processors is a good start, but if those threads do a lot of I/O, you might be better off with more... or fewer.
First, think about what resources are available and what you want to optimise (least time to finish, least impact on other tasks, etc.). Then do the math.
Sometimes it is better to dedicate a thread or two to each I/O resource and let the others fight for CPU. The analysis is usually easier with these designs.
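A sketch of that design, assuming one dedicated "reader" thread (standing in for an I/O resource such as the database) that feeds a bounded queue, and several CPU workers that drain it. All names are hypothetical, and the per-item work is a toy calculation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class ReaderWorkerPipeline {
    private static final int POISON = -1; // sentinel telling a worker to stop

    // Feeds the integers 0..items-1 to the workers and returns the sum of their squares.
    static long run(int items, int workers) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(64);
        AtomicLong total = new AtomicLong();

        // Dedicated I/O-style thread: it blocks in put() when the workers fall behind.
        Thread reader = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i); // stand-in for reading a record
                }
                for (int w = 0; w < workers; w++) {
                    queue.put(POISON); // one stop signal per worker
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");

        Thread[] cpu = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            cpu[w] = new Thread(() -> {
                try {
                    long local = 0; // accumulate privately, publish once
                    for (int item = queue.take(); item != POISON; item = queue.take()) {
                        local += (long) item * item;
                    }
                    total.addAndGet(local);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "cpu-" + w);
        }

        reader.start();
        for (Thread t : cpu) {
            t.start();
        }
        try {
            reader.join();
            for (Thread t : cpu) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println("sum of squares: " + run(1_000, workers));
    }
}
```

The bounded queue keeps the reader from running arbitrarily far ahead of the workers, which is one reason this layout tends to be easier to reason about.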
Answered by user19113
The benefit of using threads is to reduce the wall-clock execution time of your program by allowing it to work on a different part of the job while another part is waiting for something to happen (usually I/O). If your program is totally CPU-bound, adding threads beyond the number of processors will only slow it down. If it is fully or partially I/O-bound, adding threads may help, but there's a balance to be struck between the overhead of adding threads and the additional work that gets accomplished. Making the number of threads equal to the number of processors will yield peak performance if the program is totally, or near-totally, CPU-bound.
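One common rule of thumb for a starting point (it appears in Brian Goetz's Java Concurrency in Practice and is not something this answer states) is threads ≈ cores × (1 + wait time / compute time), assuming you want the CPUs fully busy. The sketch below just encodes that arithmetic; the class and method names are hypothetical, and the result is a first guess to benchmark, not a final answer.

```java
final class PoolSizeHeuristic {
    // waitTime and computeTime are average per-task times in the same unit.
    static int suggestedThreads(int cores, double waitTime, double computeTime) {
        if (cores < 1 || waitTime < 0 || computeTime <= 0) {
            throw new IllegalArgumentException("cores >= 1, waitTime >= 0, computeTime > 0 required");
        }
        return Math.max(1, (int) Math.round(cores * (1.0 + waitTime / computeTime)));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Purely CPU-bound work: no waiting, so the suggestion equals the core count.
        System.out.println("CPU-bound: " + suggestedThreads(cores, 0.0, 1.0));
        // Work that waits three times as long as it computes.
        System.out.println("I/O-heavy: " + suggestedThreads(cores, 3.0, 1.0));
    }
}
```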
As with many questions with the word "should" in them, the answer is, "It depends". If you think you can get better performance, adjust the number of threads up or down and benchmark the application's performance. Also take into account any other factors that might influence the decision (if your application is eating 100% of the computer's available horsepower, the performance of other applications will be reduced).
This assumes that the multi-threaded code is written properly etc. If the original developer only had one CPU, he would never have had a chance to experience problems with poorly-written threading code. So you should probably test behaviour as well as performance when adjusting the number of threads.
By the way, you might want to consider allowing the number of threads to be configured at run time instead of compile time to make this whole process easier.
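One way to make the thread count configurable at run time is a JVM system property that falls back to the processor count. This is a sketch only: the property name `render.threads` and the class `ThreadCountConfig` are made up for illustration.

```java
final class ThreadCountConfig {
    // Hypothetical property name; pass it as -Drender.threads=6 on the command line.
    static final String PROPERTY = "render.threads";

    // Returns the parsed value if it is a positive integer, otherwise the fallback.
    static int parse(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return n > 0 ? n : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Reads the property at call time, defaulting to one thread per available processor.
    static int threadCount() {
        return parse(System.getProperty(PROPERTY), Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        System.out.println("using " + threadCount() + " threads");
    }
}
```

Running `java -Drender.threads=6 ThreadCountConfig` would then select six threads without recompiling, which makes the benchmarking described above much less tedious.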
Answered by user19113
After seeing your edit, it's quite possible that one thread per CPU is as good as it gets. Your application seems quite parallelizable. If you have extra hardware you can use GridGain to grid-enable your app and have it run on multiple machines. That's probably about the only thing, beyond buying faster / more cores, that will speed it up.

