How to ensure Java threads run on different cores

Source: Stack Overflow, http://stackoverflow.com/questions/1896065/. Licensed under CC BY-SA 4.0: you are free to use and share it, but you must attribute it to the original authors.

Tags: java, multithreading, multicore, knapsack-problem

Asked by KBP

I am writing a multi-threaded application in Java in order to improve performance over the sequential version. It is a parallel version of the dynamic programming solution to the 0/1 knapsack problem. I have an Intel Core 2 Duo with both Ubuntu and Windows 7 Professional on different partitions. I am running in Ubuntu.

My problem is that the parallel version actually takes longer than the sequential version. I am thinking this may be because the threads are all being mapped to the same kernel thread or that they are being allocated to the same core. Is there a way I could ensure that each Java thread maps to a separate core?

I have read other posts about this problem but nothing seems to help.

Here is the end of main() and all of run() for the KnapsackThread class (which extends Thread). Notice that the way I use slice and extra to calculate myLowBound and myHiBound ensures that the threads' ranges in dynProgMatrix do not overlap. Therefore there will be no race conditions.

    dynProgMatrix = new int[totalItems+1][capacity+1];
    for (int w = 0; w<= capacity; w++)
        dynProgMatrix[0][w] = 0;
    for(int i=0; i<=totalItems; i++)
        dynProgMatrix[i][0] = 0;
    slice = Math.max(1,
            (int) Math.floor((double)(dynProgMatrix[0].length)/threads.length));
    extra = (dynProgMatrix[0].length) % threads.length;

    barrier = new CyclicBarrier(threads.length);
    for (int i = 0; i <  threads.length; i++){
        threads[i] = new KnapsackThread(Integer.toString(i));
    }
    for (int i = 0; i < threads.length; i++){
        threads[i].start();
    }

    for (int i = 0; i < threads.length; i++){
        try {
            threads[i].join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

public void run(){
    int myRank = Integer.parseInt(this.getName());

    int myLowBound;
    int myHiBound;

    if (myRank < extra){
        myLowBound = myRank * (slice + 1);
        myHiBound = myLowBound + slice;
    }
    else{
        myLowBound = myRank * slice + extra;
        myHiBound = myLowBound + slice - 1;
    }

    if(myHiBound > capacity){
        myHiBound = capacity;
    }

    for(int i = 1; i <= totalItems; i++){
        for (int w = myLowBound; w <= myHiBound; w++){

            if (allItems[i].weight <= w){
               if (allItems[i].profit + dynProgMatrix[i-1][w-allItems[i].weight]
                        > dynProgMatrix[i-1][w])
                {
                    dynProgMatrix[i][w] = allItems[i].profit +
                                      dynProgMatrix[i-1][w- allItems[i].weight];
                }
                else{
                    dynProgMatrix[i][w] = dynProgMatrix[i-1][w];
                }
            }
            else{
                dynProgMatrix[i][w] = dynProgMatrix[i-1][w];
            }
        }
        // now place a barrier to sync up the threads
        try {
            barrier.await(); 
        } catch (InterruptedException ex) { 
            ex.printStackTrace();
            return;
        } catch (BrokenBarrierException ex) { 
            ex.printStackTrace(); 
            return;
        }
    }
}

Update:

I have written another version of the knapsack that uses brute force. This version has very little synchronization because I only need to update a bestSoFar variable at the end of a single thread's execution. Therefore, each thread pretty much should execute completely in parallel except for that small critical section at the end.

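The pattern described in the update (each thread works on its own and touches the shared bestSoFar only once, at the end) can be sketched roughly as follows. This is not the asker's code: the class name, the score() function, and the strided partitioning are hypothetical stand-ins chosen only to make the sketch runnable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BestSoFarSketch {
    // Shared result; each worker updates it exactly once, when it finishes.
    static final AtomicInteger bestSoFar = new AtomicInteger(Integer.MIN_VALUE);

    // Hypothetical stand-in for evaluating one candidate solution.
    static int score(int candidate) {
        return (candidate * 31) % 1000;
    }

    static int search(int candidates, int threadCount) throws InterruptedException {
        bestSoFar.set(Integer.MIN_VALUE);
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int rank = t;
            workers[t] = new Thread(() -> {
                int localBest = Integer.MIN_VALUE;
                // Strided partition: this worker takes rank, rank + threadCount, ...
                for (int c = rank; c < candidates; c += threadCount) {
                    localBest = Math.max(localBest, score(c));
                }
                // The only shared-state step: merge the local result atomically.
                bestSoFar.accumulateAndGet(localBest, Math::max);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return bestSoFar.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("best = " + search(10_000, 2));
    }
}
```

With a structure like this the synchronized part is a single atomic update per thread, so if the parallel run is still slower, the cause is more likely thread start-up cost or too little work per thread than contention on bestSoFar.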

I ran this against the sequential brute-force version, and it still takes longer. I don't see any other explanation than that my threads are being run sequentially, either because they are being mapped to the same core or to the same native thread.

Does anybody have any insight?

Answered by Jon Skeet

I doubt that it will be due to using the same core for all threads. The scheduling is up to the OS, but you should be able to see what's going on if you bring up the performance manager for the OS - it will typically show how busy each core is.

Possible reasons for it taking longer:

  • Lots of synchronization (either necessary or unnecessary)
  • The tasks taking such a short time that thread creation takes a significant proportion of the time
  • Context switching, if you're creating too many threads - for CPU-intensive tasks, create as many threads as you have cores.
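
As a minimal sketch of the last point, the following sizes a fixed thread pool to the number of cores the JVM reports and reuses it for all the work, which also avoids repeated thread creation. The class name and the summing task are hypothetical; only Runtime.availableProcessors() and the java.util.concurrent classes are standard API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CoreSizedPool {
    // Sums 1..n using one task per core reported by the JVM.
    static long parallelSum(int n) throws InterruptedException, ExecutionException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + cores - 1) / cores; // ceiling division
            for (int t = 0; t < cores; t++) {
                final int lo = t * chunk + 1;
                final int hi = Math.min(n, lo + chunk - 1);
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (int i = lo; i <= hi; i++) {
                        sum += i; // empty loop when lo > hi
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that task to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cores = " + Runtime.getRuntime().availableProcessors());
        System.out.println("sum   = " + parallelSum(1_000_000));
    }
}
```

For a task this small the pool will probably not beat a plain loop, which illustrates the second bullet: the work per thread has to be large enough to outweigh the cost of starting and coordinating threads.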

Answered by arm3nio

I was having the same problem for a while. I had a CPU-hungry program that I divided into 2 threads (dual-core CPU), but one beautiful day, while processing some more data, it just stopped using both cores. I just raised the heap memory size (-Xmx1536m in my case), and it worked fine again.

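To see whether the heap limit could be a factor, one option is to print what the JVM actually has available. The sketch below uses only standard Runtime methods; the class name is hypothetical, and the answer itself does not say why a larger heap helped, so treat the link to core usage as an assumption to be checked rather than a diagnosis.

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx when set).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long usedMiB = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedMiB);
    }
}
```

Running it with and without a flag such as -Xmx1536m shows whether the setting took effect; if used heap sits close to the maximum during the real workload, raising the limit as this answer did is worth trying.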

Answered by Buhb

I suggest you take a look at how long it takes for each of your worker threads before they terminate. Perhaps one of the threads has a much more difficult task. If that's the case, then the overhead caused by synchronization and so on will easily eat up what you've gained from threading.

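A rough way to take that measurement is to record System.nanoTime() at the start and end of each worker's run. In this sketch the class name, the busyWork() task, and the workload sizes are hypothetical; in the real program the timed section would be the body of run().

```java
public class WorkerTiming {
    // Hypothetical CPU-bound task; `iterations` controls how much work it does.
    static long busyWork(long iterations) {
        long acc = 0;
        for (long i = 0; i < iterations; i++) {
            acc += i % 7;
        }
        return acc;
    }

    // Runs one thread per workload and returns each thread's elapsed time in ns.
    static long[] timeWorkers(long[] workloads) throws InterruptedException {
        long[] elapsedNanos = new long[workloads.length];
        long[] results = new long[workloads.length]; // keeps the work from being optimized away
        Thread[] workers = new Thread[workloads.length];
        for (int t = 0; t < workers.length; t++) {
            final int rank = t;
            workers[t] = new Thread(() -> {
                long start = System.nanoTime();
                results[rank] = busyWork(workloads[rank]);
                elapsedNanos[rank] = System.nanoTime() - start;
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // after join, each worker's writes are visible here
        }
        return elapsedNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] elapsed = timeWorkers(new long[] {50_000_000L, 200_000_000L});
        for (int t = 0; t < elapsed.length; t++) {
            System.out.println("worker " + t + ": " + elapsed[t] / 1_000_000 + " ms");
        }
    }
}
```

If one worker consistently takes much longer than the others, the total run time is bounded by that worker, and rebalancing the work is likely to help more than changing how threads map to cores.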