Parallel programming in Java

Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3350459/

Date: 2020-08-13 22:30:13  Source: igfitidea

Tags: java, parallel-processing

Asked by Alex Mathew

How can we do Parallel Programming in Java? Is there any special framework for that? How can we make the stuff work?

Here is what I need: suppose I have developed a web crawler that crawls a lot of data from the internet. A single crawling system cannot keep up, so I need several systems working in parallel. In that case, can I apply parallel computing? Can you give me an example?

Accepted answer by Adil Mehmood

If you are asking about pure parallel programming, i.e. not concurrent programming, then you should definitely try MPJ Express: http://mpj-express.org/. It is a thread-safe implementation of mpiJava and it supports both distributed and shared memory models. I have tried it and found it very reliable.

 1 import mpi.*;
 2
 3 /**
 4  * Compile: impl specific.
 5  * Execute: impl specific.
 6  */
 7
 8 public class Send {
 9
10     public static void main(String[] args) throws Exception {
11
12         MPI.Init(args);
13
14         int rank = MPI.COMM_WORLD.Rank(); // The current process.
15         int size = MPI.COMM_WORLD.Size(); // Total number of processes.
16         int peer;
17
18         int buffer[] = new int[10];
19         int len = 1;
20         int dataToBeSent = 99;
21         int tag = 100;
22
23         if (rank == 0) {
24
25             buffer[0] = dataToBeSent;
26             peer = 1;
27             MPI.COMM_WORLD.Send(buffer, 0, len, MPI.INT, peer, tag);
28             System.out.println("process <"+rank+"> sent a msg to "+
29                                "process <"+peer+">");
30
31         } else if (rank == 1) {
32
33             peer = 0;
34             Status status = MPI.COMM_WORLD.Recv(buffer, 0, buffer.length,
35                                                 MPI.INT, peer, tag);
36             System.out.println("process <"+rank+"> recv'ed a msg\n"+
37                                "\tdata   <"+buffer[0]    +"> \n"+
38                                "\tsource <"+status.source+"> \n"+
39                                "\ttag    <"+status.tag   +"> \n"+
40                                "\tcount  <"+status.count +">");
41
42         }
43
44         MPI.Finalize();
45
46     }
47
48 }

One of the most common functionalities provided by messaging libraries like MPJ Express is support for point-to-point communication between executing processes. In this context, two processes belonging to the same communicator (for instance the MPI.COMM_WORLD communicator) may communicate with each other by sending and receiving messages. The sender process sends the message with a variant of the Send() method, and the receiver process receives it with a variant of the Recv() method. Both sender and receiver specify a tag that is used to find a matching incoming message at the receiver side.

After initializing the MPJ Express library using the MPI.Init(args) method on line 12, the program obtains its rank and the size of the MPI.COMM_WORLD communicator. Both processes initialize an integer array of length 10 called buffer on line 18. The sender process (rank 0) stores the value 99 (dataToBeSent) in the first element of the buffer array. A variant of the Send() method is used to send one element of the buffer array to the receiver process.

The sender process calls the Send() method on line 27. The first three arguments are related to the data being sent. The send buffer (the buffer array) is the first argument, followed by 0 (offset) and 1 (count, the len variable). The data being sent is of type MPI.INT and the destination is 1 (the peer variable); the datatype and destination are specified as the fourth and fifth arguments to the Send() method. The sixth and last argument is the tag variable. A tag is used to identify messages at the receiver side; a message tag is typically an identifier of a particular message in a specific communicator. The receiver process (rank 1), on the other hand, receives the message using the blocking receive method, Recv(), on line 34.

Answer by stacker

In Java, parallel processing is done using threads, which are part of the runtime library.

The Concurrency Tutorial should answer a lot of questions on this topic if you're new to Java and parallel programming.

Answer by grigy

I heard about one at a conference a few years ago: ParJava. But I'm not sure about the current status of the project.

Answer by Manuel Selva

Java supports threads, so you can write multithreaded Java applications. I strongly recommend the book "Concurrent Programming in Java: Design Principles and Patterns" for that:

http://java.sun.com/docs/books/cp/

Answer by Dacav

As far as I know, on most operating systems Java's threading mechanism is based on real kernel threads. This is good from the parallel programming perspective. Other languages like Python simply time-multiplex the processor (namely, if you run a heavy multithreaded application on a multiprocessor machine, you'll see only one processor busy).

You can easily find something just by googling it: for example, this is the first result for "java threading": http://download-llnw.oracle.com/javase/tutorial/essential/concurrency/

Basically it boils down to extending the Thread class, overriding the "run" method with the code that should execute on the other thread, and calling the "start" method on an instance of the class you extended.
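
To make those three steps concrete, here is a minimal sketch. The names ThreadDemo, SumWorker, and parallelSum are made up for illustration; the only library calls used are Thread.start() and Thread.join(). It splits a sum over two threads and waits for both before combining the results.

```java
// Illustrative sketch: ThreadDemo, SumWorker and parallelSum are hypothetical names.
public class ThreadDemo {

    /** A worker that sums the integers in [from, to) on its own thread. */
    static class SumWorker extends Thread {
        private final int from;
        private final int to;
        private long result;

        SumWorker(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public void run() {
            // This body executes on the new thread once start() is called.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += i;
            }
            result = sum;
        }

        long getResult() {
            return result;
        }
    }

    /** Splits [0, n) into two halves and sums each half on a separate thread. */
    public static long parallelSum(int n) throws InterruptedException {
        SumWorker low = new SumWorker(0, n / 2);
        SumWorker high = new SumWorker(n / 2, n);
        low.start();   // start() schedules run() on a new thread
        high.start();
        low.join();    // join() waits for the worker to finish,
        high.join();   // so reading its result afterwards is safe
        return low.getResult() + high.getResult();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(parallelSum(1_000_000));
    }
}
```

Calling run() directly instead of start() would execute the loop on the calling thread, so nothing would run in parallel.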

Also, if you need to make something thread safe, have a look at synchronized methods.
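
As a rough illustration of synchronized methods, the sketch below has several threads incrementing one shared counter. SynchronizedDemo, Counter, and countWithThreads are hypothetical names chosen for this example. Because increment() and get() are synchronized on the Counter instance, only one thread at a time can execute either of them.

```java
// Illustrative sketch: SynchronizedDemo, Counter and countWithThreads are hypothetical names.
public class SynchronizedDemo {

    /** A counter whose methods lock on the instance before touching the field. */
    static class Counter {
        private int value;

        synchronized void increment() {
            value++;   // read-modify-write, protected by the instance lock
        }

        synchronized int get() {
            return value;
        }
    }

    /** Starts the given number of threads; each increments a shared counter perThread times. */
    public static int countWithThreads(int threads, int perThread) throws InterruptedException {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();   // wait for every worker before reading the total
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithThreads(4, 100_000));
    }
}
```

Without the synchronized keyword on increment(), concurrent value++ operations can overwrite each other and the final total may come out lower than threads * perThread.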

Answer by Thorbjørn Ravn Andersen

Answer by Nicolas78

You might want to check out Hadoop. It's designed to run jobs across an arbitrary number of machines and takes care of all the bookkeeping for you. It's inspired by Google's MapReduce and related tools, so its roots are in web indexing.
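
To give a feel for the MapReduce idea only, here is a word count written with plain Java parallel streams. It is not Hadoop code and uses none of Hadoop's API: it runs inside a single JVM, and WordCountSketch and wordCount are hypothetical names. The shape is the same as a MapReduce job, though: a map step that emits words, then a reduce step that groups and counts them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative sketch only: single-JVM parallel streams, not the Hadoop API.
public class WordCountSketch {

    /** Counts how often each lower-cased word occurs across all lines. */
    public static Map<String, Long> wordCount(List<String> lines) {
        return lines.parallelStream()
                // "map" step: split every line into its words
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                // "reduce" step: group identical words and count each group
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("to be or not", "to be")));
    }
}
```

A real Hadoop job distributes these two steps over many machines and handles failures and data placement, which is the bookkeeping mentioned above.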

Answer by Richard

This is the parallel programming resource I've been pointed to in the past:

http://www.jppf.org/

I have no idea whether it's any good or not, just that someone recommended it a while ago.

Answer by John Channing

You want to look at the Java Parallel Processing Framework (JPPF).

Answer by Bruno Thomas

The java.util.concurrent package and Brian Goetz's book "Java Concurrency in Practice".
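
As a small taste of java.util.concurrent applied to the crawler scenario from the question, the sketch below runs one task per URL on a fixed-size thread pool. CrawlerPoolDemo, fakeFetch, and fetchAll are hypothetical names, and fakeFetch is only a stand-in that returns the URL's length; a real crawler would download and parse the page there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch: CrawlerPoolDemo, fakeFetch and fetchAll are hypothetical names.
public class CrawlerPoolDemo {

    /** Stand-in for fetching a page; a real crawler would do network I/O here. */
    static int fakeFetch(String url) {
        return url.length();
    }

    /** Runs one fetch task per URL on a pool of poolSize threads; results keep the input order. */
    public static List<Integer> fetchAll(List<String> urls, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fakeFetch(url));
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task has completed and
            // returns the futures in the same order as the task list.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();   // let the worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/bb");
        System.out.println(fetchAll(urls, 2));
    }
}
```

Compared with creating Thread objects by hand, a pool caps the number of threads and reuses them, which matters when a crawler has far more URLs than it should have concurrent connections.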

There are also a lot of resources here about parallel patterns by Ralph Johnson (one of the GoF design pattern authors): http://parlab.eecs.berkeley.edu/wiki/patterns/patterns
