Java: forcing multiple threads to use multiple CPUs when they are available

Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1223072/

Date: 2020-08-12 01:05:47  Source: igfitidea

Forcing multiple threads to use multiple CPUs when they are available

Tags: java, concurrency, multithreading, multicore

Asked by Nosrama

I'm writing a Java program which uses a lot of CPU because of the nature of what it does. However, lots of it can run in parallel, and I have made my program multi-threaded. When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?

Accepted answer by Stephen C

When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?

I interpret this part of your question as meaning that you have already addressed the problem of making your application multi-thread capable. And despite that, it doesn't immediately start using multiple cores.

The answer to "is there any way to force ..." is (AFAIK) not directly. Your JVM and/or the host OS decide how many 'native' threads to use, and how those threads are mapped to physical processors. You do have some options for tuning. For example, I found this page, which talks about how to tune Java threading on Solaris, and this page, which talks about other things that can slow down a multi-threaded application.

Answer by S.Lott

The easiest thing to do is break your program into multiple processes. The OS will allocate them across the cores.

Somewhat harder is to break your program into multiple threads and trust the JVM to allocate them properly. This is -- generally -- what people do to make use of available hardware.

Edit

How can a multi-processing program be "easier"? Here's a step in a pipeline.

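import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class SomeStep {
    public static void main(String[] args) throws IOException {
        // System.in and System.out are byte streams, so wrap them in
        // character streams before buffering.
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        BufferedWriter stdout = new BufferedWriter(new OutputStreamWriter(System.out));
        String line = stdin.readLine();
        while (line != null) {
            // process line, writing to stdout
            line = stdin.readLine();
        }
        stdout.flush(); // push buffered output on to the next process in the pipe
    }
}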

Each step in the pipeline is similarly structured. Only a handful of lines of overhead surround whatever processing is included.

This may not be the absolute most efficient. But it's very easy.

The overall structure of your concurrent processes is not a JVM problem. It's an OS problem, so use the shell.

java -cp pipeline.jar FirstStep | java -cp pipeline.jar SomeStep | java -cp pipeline.jar LastStep

The only thing left is to work out some serialization for your data objects in the pipeline. Standard Serialization works well. Read http://java.sun.com/developer/technicalArticles/Programming/serialization/ for hints on how to serialize. You can replace the BufferedReader and BufferedWriter with ObjectInputStream and ObjectOutputStream to accomplish this.
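
As a rough illustration of that swap, here is a minimal sketch of a pipeline step built on ObjectInputStream and ObjectOutputStream. The class name ObjectStepSketch, the upper-casing "processing", and the in-memory demo helper are assumptions made for this example and are not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ObjectStepSketch {
    // Reads serialized objects from `in` until end of stream, applies a
    // placeholder transformation, and writes the results to `out`.
    public static void process(InputStream in, OutputStream out)
            throws IOException, ClassNotFoundException {
        // Create the output stream first: its constructor writes the stream
        // header that the downstream ObjectInputStream constructor waits for.
        ObjectOutputStream objOut = new ObjectOutputStream(out);
        objOut.flush();
        ObjectInputStream objIn = new ObjectInputStream(in);
        try {
            while (true) {
                Object item = objIn.readObject();
                // Placeholder work (assumed): upper-case the item's text form.
                objOut.writeObject(String.valueOf(item).toUpperCase());
            }
        } catch (EOFException endOfInput) {
            // Upstream closed its end of the pipe: nothing more to read.
        }
        objOut.flush();
    }

    // In-memory round trip so the step can be tried without a shell pipe.
    public static List<String> demo(List<String> inputs) {
        try {
            ByteArrayOutputStream upstream = new ByteArrayOutputStream();
            try (ObjectOutputStream writer = new ObjectOutputStream(upstream)) {
                for (String s : inputs) {
                    writer.writeObject(s);
                }
            }
            ByteArrayOutputStream downstream = new ByteArrayOutputStream();
            process(new ByteArrayInputStream(upstream.toByteArray()), downstream);
            List<String> results = new ArrayList<>();
            try (ObjectInputStream reader = new ObjectInputStream(
                    new ByteArrayInputStream(downstream.toByteArray()))) {
                for (int i = 0; i < inputs.size(); i++) {
                    results.add((String) reader.readObject());
                }
            }
            return results;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // In a real pipeline the step reads stdin and writes stdout.
        process(System.in, System.out);
    }
}
```

When a step writes binary data this way, nothing else in that process should print to System.out, or the downstream reader will see a corrupted stream. For long-running steps it may also be worth calling reset() on the ObjectOutputStream periodically, since the stream keeps references to objects it has already written.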

Answer by BobMcGee

There are two basic ways to multi-thread in Java. Each logical task you create with these methods should run on a fresh core when needed and available.

Method one: define a Runnable or Thread object (a Thread can take a Runnable in its constructor) and start it with the Thread.start() method. It will execute on whatever core the OS gives it, generally the least loaded one.

Tutorial: Defining and Starting Threads
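
A minimal sketch of method one, assuming a made-up CPU-bound job (summing a range of integers) split across one thread per core. The class and method names are hypothetical and serve only to show Thread.start() and join() in use.

```java
public class StartThreadsSketch {
    // Hypothetical CPU-bound work: sum the integers in [from, to).
    public static long sumRange(long from, long to) {
        long total = 0;
        for (long i = from; i < to; i++) {
            total += i;
        }
        return total;
    }

    // Splits [0, n) into one slice per thread and adds up the partial sums.
    public static long parallelSum(long n, int threadCount) {
        long[] partial = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        long chunk = n / threadCount;
        for (int t = 0; t < threadCount; t++) {
            final int index = t;
            final long from = index * chunk;
            final long to = (index == threadCount - 1) ? n : from + chunk;
            // Each thread writes only to its own slot, so no locking is needed.
            threads[t] = new Thread(() -> partial[index] = sumRange(from, to));
            threads[t].start(); // from here on the OS scheduler picks the core
        }
        long total = 0;
        try {
            for (int t = 0; t < threadCount; t++) {
                threads[t].join(); // after join() the worker's write is visible
                total += partial[t];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return total;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(1_000_000L, cores));
    }
}
```

Nothing in this code pins a thread to a core. It only gives the scheduler several runnable threads, which is usually all a Java program can do.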

Method two: define objects implementing the Runnable (if they don't return values) or Callable (if they do) interface, which contain your processing code. Pass these as tasks to an ExecutorService from the java.util.concurrent package. The java.util.concurrent.Executors class has a bunch of methods to create standard, useful kinds of ExecutorServices. Link to Executors tutorial.

From personal experience, the Executors fixed & cached thread pools are very good, although you'll want to tweak thread counts. Runtime.getRuntime().availableProcessors() can be used at run-time to count available cores. You'll need to shut down thread pools when your application is done, otherwise the application won't exit because the ThreadPool threads stay running.
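
The sizing and shutdown advice above could look roughly like the sketch below. The class name, the squaring "work", and the 10-second timeout are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PoolShutdownSketch {
    // Runs `taskCount` small jobs on a fixed pool and returns how many completed.
    public static int runTasks(int taskCount) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                Callable<Integer> task = () -> id * id; // placeholder work
                futures.add(pool.submit(task));
            }
            int completed = 0;
            for (Future<Integer> f : futures) {
                f.get(); // blocks until that task has finished
                completed++;
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown(); // stop accepting work so idle workers can exit
            try {
                if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                    pool.shutdownNow(); // interrupt stragglers as a last resort
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runTasks(100));
    }
}
```

Putting the shutdown in a finally block means the pool's non-daemon worker threads are released even when a task throws, so the JVM can exit.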

Getting good multicore performance is sometimes tricky, and full of gotchas:

  • Disk I/O slows down a LOT when run in parallel. Only one thread should do disk read/write at a time.
  • Synchronization of objects provides safety to multi-threaded operations, but slows down work.
  • If tasks are too trivial (small work bits, execute fast) the overhead of managing them in an ExecutorService costs more than you gain from multiple cores.
  • Creating new Thread objects is slow. The ExecutorServices will try to re-use existing threads if possible.
  • All sorts of crazy stuff can happen when multiple threads work on something. Keep your system simple and try to make tasks logically distinct and non-interacting.

One other problem: controlling work is hard! A good practice is to have one manager thread that creates and submits tasks, and then a couple of worker threads with work queues (using an ExecutorService).

I'm just touching on key points here -- multithreaded programming is considered one of the hardest programming subjects by many experts. It's non-intuitive, complex, and the abstractions are often weak.

Edit -- Example using ExecutorService:

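import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public abstract class TaskThreader {
    // One task: pushes a single input through all three steps.
    class DoStuff implements Callable<Object> {
        private Object in;

        DoStuff(Object input) {
            in = input;
        }

        @Override
        public Object call() {
            in = doStep1(in);
            in = doStep2(in);
            in = doStep3(in);
            return in;
        }
    }

    public abstract Object doStep1(Object input);
    public abstract Object doStep2(Object input);
    public abstract Object doStep3(Object input);

    // Runs every input through the steps on a pool sized to the core count.
    public List<Object> runAll(List<?> inputs) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Object>> tasks = new ArrayList<>();
            for (Object input : inputs) {
                tasks.add(new DoStuff(input));
            }
            List<Object> results = new ArrayList<>();
            for (Future<Object> f : exec.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            exec.shutdown(); // otherwise the pool's threads keep the JVM alive
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder steps and inputs, assumed here only so the example runs.
        TaskThreader threader = new TaskThreader() {
            public Object doStep1(Object input) { return input + "-1"; }
            public Object doStep2(Object input) { return input + "-2"; }
            public Object doStep3(Object input) { return input + "-3"; }
        };
        for (Object result : threader.runAll(Arrays.asList("a", "b", "c"))) {
            System.out.println(result); // stands in for the original write(...)
        }
    }
}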

Answer by Thorbjørn Ravn Andersen

You should write your program to do its work in the form of a lot of Callables handed to an ExecutorService and executed with invokeAll(...).

You can then choose a suitable implementation at runtime from the Executors class. A suggestion would be to call Executors.newFixedThreadPool() with a number roughly corresponding to the number of cpu cores to keep busy.

Answer by brianegge

First, I'd suggest reading "Java Concurrency in Practice" by Brian Goetz.

This is by far the best book describing concurrent Java programming.

Concurrency is 'easy to learn, difficult to master'. I'd suggest reading plenty about the subject before attempting it. It's very easy to get a multi-threaded program to work correctly 99.9% of the time, and fail 0.1%. However, here are some tips to get you started:

There are two common ways to make a program use more than one core:

  1. Make the program run using multiple processes. An example is Apache compiled with the Pre-Fork MPM, which assigns requests to child processes. In a multi-process program, memory is not shared by default. However, you can map sections of shared memory across processes. Apache does this with its 'scoreboard'.
  2. Make the program multi-threaded. In a multi-threaded program, all heap memory is shared by default. Each thread still has its own stack, but can access any part of the heap. Typically, most Java programs are multi-threaded, and not multi-process.

At the lowest level, one can create and destroy threads. Java makes it easy to create threads in a portable cross platform manner.

As it tends to get expensive to create and destroy threads all the time, Java now includes Executors to create re-usable thread pools. Tasks can be assigned to the executors, and the result can be retrieved via a Future object.

Typically, one has a task which can be divided into smaller tasks, but the end results need to be brought back together. For example, with a merge sort, one can divide the list into smaller and smaller parts until every core is busy sorting. However, once each sublist is sorted, the sublists need to be merged to get the final sorted list. Since this "divide-and-conquer" problem is fairly common, there is a JSR framework which can handle the underlying distribution and joining. This framework will likely be included in Java 7.
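
The fork/join classes did ship in java.util.concurrent with Java 7. The sketch below shows the merge sort described above using RecursiveAction. The class name and the threshold value are assumptions for illustration, and ForkJoinPool.commonPool() requires Java 8.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinMergeSortSketch {
    // Below this size a slice is sorted sequentially. The value is a guess
    // and would need tuning for real workloads.
    private static final int THRESHOLD = 1_000;

    static class SortTask extends RecursiveAction {
        private final int[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SortTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                Arrays.sort(data, from, to);
                return;
            }
            int mid = (from + to) >>> 1;
            // Sort both halves, possibly on different worker threads.
            invokeAll(new SortTask(data, from, mid), new SortTask(data, mid, to));
            merge(mid);
        }

        // Merges the sorted halves [from, mid) and [mid, to) in place,
        // using a temporary copy of the left half only.
        private void merge(int mid) {
            int[] left = Arrays.copyOfRange(data, from, mid);
            int i = 0;
            int j = mid;
            int k = from;
            while (i < left.length && j < to) {
                data[k++] = (left[i] <= data[j]) ? left[i++] : data[j++];
            }
            while (i < left.length) {
                data[k++] = left[i++];
            }
            // Remaining right-half elements are already in their final place.
        }
    }

    // Returns a sorted copy of the input; the input array is left untouched.
    public static int[] sort(int[] input) {
        int[] copy = input.clone();
        ForkJoinPool.commonPool().invoke(new SortTask(copy, 0, copy.length));
        return copy;
    }

    public static void main(String[] args) {
        int[] sample = new int[50_000];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = sample.length - i; // descending input
        }
        int[] sorted = sort(sample);
        System.out.println("first=" + sorted[0] + ", last=" + sorted[sorted.length - 1]);
    }
}
```

The threshold keeps tasks from becoming too small to be worth scheduling, which is the same overhead trade-off the other answers mention for ExecutorService tasks.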

Answer by Iouri Goussev

There is no way to set CPU affinity in Java. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4234402

If you have to do it, use JNI to create native threads and set their affinity.

Answer by Nandika

I think this issue is related to the Java Parallel Processing Framework (JPPF). Using this you can run different jobs on different processors.

Answer by Zan Lynx

First, you should prove to yourself that your program would run faster on multiple cores. Many operating systems put effort into running program threads on the same core whenever possible.

Running on the same core has many advantages. The CPU cache is hot, meaning that data for that program is loaded into the CPU. The lock/monitor/synchronization objects are in CPU cache which means that other CPUs do not need to do cache synchronization operations across the bus (expensive!).

One thing that can very easily make your program run on the same CPU all the time is over-use of locks and shared memory. Your threads should not talk to each other. The less often your threads use the same objects in the same memory, the more often they will run on different CPUs. The more often they use the same memory, the more often they must block waiting for the other thread.

Whenever the OS sees one thread block for another thread, it will run that thread on the same CPU whenever it can. It reduces the amount of memory that moves over the inter-CPU bus. That is what I guess is causing what you see in your program.
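
One way to keep threads from talking to each other is to give every task its own private state and combine results only once at the end. The sketch below assumes a made-up counting job; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalAccumulationSketch {
    // Counts multiples of 3 in [0, n). Each task keeps a private counter and
    // the partial counts are combined once at the end, so no lock is contended.
    public static long countMultiplesOfThree(int n, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            int chunk = (n + taskCount - 1) / taskCount; // ceiling division
            for (int t = 0; t < taskCount; t++) {
                final int from = Math.min(n, t * chunk);
                final int to = Math.min(n, from + chunk);
                jobs.add(() -> {
                    long local = 0; // lives on this task's stack, never shared
                    for (int i = from; i < to; i++) {
                        if (i % 3 == 0) {
                            local++;
                        }
                    }
                    return local;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(jobs)) {
                total += f.get(); // the only point where results meet
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(countMultiplesOfThree(3_000_000, cores));
    }
}
```

A version that incremented one shared, synchronized counter from every thread would compute the same answer, but the threads would spend much of their time waiting on the same lock.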

Answer by Volker Stolz

JVM performance tuning has been mentioned before in "Why does this Java code not utilize all CPU cores?". Note that this only applies to the JVM, so your application must already be using threads (and more or less "correctly" at that):

http://ch.sun.com/sunnews/events/2009/apr/adworkshop/pdf/5-1-Java-Performance.pdf

Answer by Ravindra babu

You can use the API below from Executors, available as of Java 8:

public static ExecutorService newWorkStealingPool()

Creates a work-stealing thread pool using all available processors as its target parallelism level.

Thanks to the work-stealing mechanism, idle threads steal tasks from the queues of busy threads, which tends to increase overall throughput.

From grepcode, the implementation of newWorkStealingPool is as follows:

    /**
     * Creates a work-stealing thread pool using all
     * {@link Runtime#availableProcessors available processors}
     * as its target parallelism level.
     * @return the newly created thread pool
     * @see #newWorkStealingPool(int)
     * @since 1.8
     */
    public static ExecutorService newWorkStealingPool() {
        return new ForkJoinPool
            (Runtime.getRuntime().availableProcessors(),
             ForkJoinPool.defaultForkJoinWorkerThreadFactory,
             null, true);
    }
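
A short usage sketch of that factory method, assuming a trivial placeholder job (summing squares). The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkStealingSketch {
    // Submits one small task per value in 1..n and adds up the squares.
    public static long sumOfSquares(int n) {
        // Parallelism defaults to Runtime.getRuntime().availableProcessors().
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                tasks.add(() -> value * value); // placeholder work
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

The returned pool is a ForkJoinPool, whose worker threads are daemon threads, so a forgotten shutdown() is less likely to keep the JVM alive than with a fixed thread pool. Tasks this small are dominated by scheduling overhead; real tasks should do enough work to justify the hand-off.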