Java: does multi-threading improve performance? Scenario: Java

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/27578208/

Time: 2020-11-02 12:02:42  Source: igfitidea

Does multi-threading improve performance? Scenario: Java

java multithreading

Asked by Rory Lester

I have a List<Object> objectsToProcess. Let's say it contains 1,000,000 items. Each item in the list is then processed like this:

for (Object object : objectsToProcess) {
    // Go to the database and retrieve data.
    // Process it.
    // Save the data.
}

My question is: would multi-threading improve performance? I would have thought that multiple threads are allocated by the processor by default anyway.

Answered by Christian Hujer

In the described scenario, given that process is a time-consuming task, and given that the CPU has more than one core, multi-threading will indeed improve the performance.

The processor is not the one who allocates the threads. The processor is the one who provides the resources (virtual CPUs / virtual processors) that can be used by threads by providing more than one execution unit / execution context. Programs need to create multiple threads themselves in order to utilize multiple CPU cores at the same time.
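A minimal sketch of what "creating threads yourself" means in plain Java. The class and method names are made up for illustration, and the shared counter only stands in for real work.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the names are illustrative only.
class ExplicitThreadsSketch {

    // Starts `threadCount` threads, lets each one increment a shared counter
    // `incrementsPerThread` times, and returns the final counter value.
    static int countWithThreads(int threadCount, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet();
                }
            });
            threads[i].start(); // the program, not the CPU, creates the thread
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // wait until every thread has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1_000));
    }
}
```

Without the explicit `new Thread(...)` calls, the same loop would run on a single thread no matter how many cores the machine has.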

The two major reasons for multi-threading are:

  • Making use of multiple CPU cores which would otherwise be unused or at least not contribute to reducing the time it takes to solve a given problem - if the problem can be divided into subproblems which can be processed independently of each other (parallelization possible).
  • Making the program act and react on multiple things at the same time (e.g. the event thread vs. a SwingWorker).

There are programming languages and execution environments in which threads are created automatically to process problems that can be parallelized. Java is not (yet) one of them, but since Java 8 it has been well on its way there, and Java 9 may bring even more.

Usually you do not want significantly more threads than the CPU has cores, for the simple reason that thread switching and thread synchronization are overhead that slows things down.
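One way to follow this rule of thumb is to size a fixed pool from `Runtime.getRuntime().availableProcessors()`. The sketch below is only an illustration: the class name and the sum-of-squares workload are assumptions, not part of the original question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class; the workload is a stand-in for real processing.
class CoreSizedPoolSketch {

    // Sums the squares of 1..n on a pool with one thread per available core.
    static long sumOfSquares(int n) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        AtomicLong total = new AtomicLong();
        for (int i = 1; i <= n; i++) {
            final long value = i;
            pool.execute(() -> total.addAndGet(value * value));
        }
        pool.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100));
    }
}
```

The tasks here are far too small to pay for their scheduling cost; the point is only the pool sizing and the shutdown sequence.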

The package java.util.concurrent provides many classes that help with typical problems of multithreading. What you want is an ExecutorService to which you assign the tasks that should be run and completed in parallel. The class Executors provides factory methods for creating popular types of ExecutorService. If your problem just needs to be solved in parallel, you might want to go for Executors.newCachedThreadPool(). If your problem is urgent, you might want to go for Executors.newWorkStealingPool().

Your code thus could look like this:

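import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final ExecutorService service = Executors.newWorkStealingPool();
for (final Object object : objectsToProcess) {
    service.submit(() -> {
        // Go to the database and retrieve data.
        // Process it.
        // Save the data.
    });
}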

Please note that the sequence in which the objects would be processed is no longer guaranteed if you go for this approach of multithreading.
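If the order of the results matters even though the processing order does not, one option is `ExecutorService.invokeAll`, which returns its futures in the same order as the task list. The following is a sketch under assumptions: `process` is a hypothetical stand-in for the retrieve/process/save step, and the pool size of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the names are illustrative only.
class OrderedResultsSketch {

    // Hypothetical stand-in for "retrieve data, process, save data".
    static String process(String input) {
        return "processed:" + input;
    }

    static List<String> processAll(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String input : inputs) {
                tasks.add(() -> process(input));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and returns the
            // futures in the same order as the task list, regardless of
            // the order in which the tasks actually ran.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

This only restores the order of the collected results; side effects such as database writes can still happen in any order.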

If your objectsToProcess is something that can provide a parallel stream, you could also do this:

objectsToProcess.parallelStream().forEach(object -> {
    // Go to the database and retrieve data.
    // Process it.
    // Save the data.
});

This will leave the decisions about how to handle the threads to the VM, which often will be better than implementing the multi-threading ourselves.
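A self-contained sketch of the parallel-stream variant, assuming a hypothetical `process` method that doubles an id in place of the real database work. By default, parallel streams run on the common ForkJoinPool, so no pool is created here.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names are illustrative only.
class ParallelStreamSketch {

    // Hypothetical stand-in for the per-item work.
    static int process(int id) {
        return id * 2;
    }

    // Processes all ids in parallel. Collecting to a list keeps the
    // encounter order of the source list, even though the individual
    // elements may be processed in any order.
    static List<Integer> processAll(List<Integer> ids) {
        return ids.parallelStream()
                  .map(ParallelStreamSketch::process)
                  .collect(Collectors.toList());
    }
}
```

Because the common pool is sized for CPU-bound work, long blocking calls such as database round trips may be a poorer fit for it than a dedicated ExecutorService.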

Answered by djna

Depends on where the time is spent.

If you have a load of calculations to do, then allocating work to more threads can help; as you say, each thread may execute on a separate CPU. In such a situation there is no value in having more threads than CPUs. As Corbin says, you have to figure out how to split the work across the threads, and you are responsible for starting the threads, waiting for completion and aggregating the results.
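The split, wait and aggregate steps for a CPU-bound job might look like the sketch below. Summing the numbers 1..n is an assumed workload chosen only because its result is easy to check; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the names are illustrative only.
class SplitAndAggregateSketch {

    // Sums 1..n by splitting the range into roughly one chunk per core.
    static long parallelSum(long n) {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + workers - 1) / workers; // ceiling division
            // Split: one task per sub-range [from, to].
            for (long start = 1; start <= n; start += chunk) {
                final long from = start;
                final long to = Math.min(n, start + chunk - 1);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = from; i <= to; i++) {
                        sum += i;
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            // Wait and aggregate: get() blocks until each part is done.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```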

If, as in your case, you are waiting for a database, then there can be additional value in using threads. A database can serve several requests in parallel (the database server itself is multi-threaded), so instead of coding

for (Object object : objectsToProcess) {
    // Go to the database and retrieve data.
    // Process it.
    // Save the data.
}

where you wait for each response before issuing the next request, you want to have several worker threads, each performing

 // Go to the database and retrieve data.
 // Process it.
 // Save the data.

Then you get better throughput. The trick though is not to have too many worker threads. Several reasons for that:

  1. Each thread uses some resources: it has its own stack and its own connection to the database. You would not want 10,000 such threads.
  2. Each request uses resources on the server, each connection uses memory, and each database server will only serve so many requests in parallel. There is no benefit in submitting thousands of simultaneous requests if it can only serve tens of them in parallel. Also, if the database is shared, you probably don't want to saturate it with your requests; you need to be a "good citizen".
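A fixed-size pool is one simple way to cap how many requests are in flight at once. The sketch below assumes a 5 ms `Thread.sleep` as a stand-in for a database round trip and records the peak number of simultaneous calls; all names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the names are illustrative only.
class BoundedWorkersSketch {

    // Runs `taskCount` simulated database calls on at most `maxWorkers`
    // threads and returns the highest number of calls in flight at once.
    static int maxConcurrentCalls(int taskCount, int maxWorkers) {
        ExecutorService pool = Executors.newFixedThreadPool(maxWorkers);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for a database round trip
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }
}
```

The remaining tasks wait in the pool's queue, so the database never sees more than `maxWorkers` simultaneous requests from this program.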

Net: you will almost certainly benefit from having a number of worker threads. The number of threads that helps will be determined by factors such as the number of CPUs you have and the ratio between the amount of processing you do and the response time from the DB. You can only really determine that by experiment, so make the number of threads configurable and investigate. Start with, say, 5, then 10. Keep your eye on the load on the DB as you increase the number of threads.
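One way to make the thread count configurable is a JVM system property. In this sketch the property name `worker.threads`, the default of 5 and the 10 ms simulated latency are all assumptions chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the names are illustrative only.
class ConfigurableWorkersSketch {

    // Reads the worker count from the hypothetical system property
    // "worker.threads", falling back to 5 when it is not set.
    static int workerCount() {
        return Integer.getInteger("worker.threads", 5);
    }

    // Runs `tasks` simulated database calls on `threads` workers and
    // returns how many of them completed.
    static int runSimulatedLoad(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // stand-in for database latency
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int threads = workerCount();
        long start = System.nanoTime();
        int done = runSimulatedLoad(threads, 200);
        long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(threads + " threads: " + done + " tasks in " + millis + " ms");
    }
}
```

Running it as `java -Dworker.threads=10 ConfigurableWorkersSketch` and comparing the timings for different values is one way to carry out the experiment; against a real database the load on the server should be watched as well.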
