How to call a Java function from Python/NumPy?
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/10707671/
Asked by Mannaggia
It is clear to me how to extend Python with C++, but what if I want to write a function in Java to be used with numpy?
Here is a simple scenario: I want to compute the average of a numpy array using a Java class. How do I pass the numpy vector to the Java class and gather the result?
Thanks for any help!
Answered by Mannaggia
I spent some time on my own question and would like to share my answer, as I feel there is not much information on this topic on StackOverflow. I also think Java will become more relevant in scientific computing (e.g. see the WEKA package for data mining) because of its improving performance and its other good software development features.
In general, it turns out that, with the right tools, it is much easier to extend Python with Java than with C/C++!
Overview and assessment of tools to call Java from Python
JCC (http://pypi.python.org/pypi/JCC): because there is no proper documentation, this tool is useless.
Py4J: requires starting the Java process before using Python. As remarked by others, this is a possible point of failure. Moreover, not many examples of use are documented.
JPype: although development seems to be dead, it works well and there are many examples of it on the web (e.g. see http://kogs-www.informatik.uni-hamburg.de/~meine/weka-python/ for using data mining libraries written in Java). Therefore I decided to focus on this tool.
Installing JPype on Fedora 16
I am using Fedora 16. Since there are some issues when installing JPype on Linux, I describe my approach here. Download JPype, then modify the setup.py script by providing the JDK path, in line 48:
self.javaHome = '/usr/java/default'
then run:
sudo python setup.py install
After a successful installation, check this file:
/usr/lib64/python2.7/site-packages/jpype/_linux.py
and remove the method getDefaultJVMPath() or rename it to getDefaultJVMPath_old(), then add the following method:
def getDefaultJVMPath():
    return "/usr/java/default/jre/lib/amd64/server/libjvm.so"
Alternative approach: do not change the file _linux.py at all, but never use the method getDefaultJVMPath() (or methods which call it). Wherever you would call getDefaultJVMPath(), provide the path to the JVM directly. Note that there may be several candidate paths; for example, on my system I also have the following ones, referring to different versions of the JVM (it is not clear to me whether the client or the server JVM is better suited):
- /usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/client/libjvm.so
- /usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/server/libjvm.so
- /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so
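To see which JVM libraries are available on a given machine, a directory scan along the following lines may help. This is only a sketch: the helper name find_libjvm is hypothetical, and it assumes the shared library is called libjvm.so, as in the paths listed above.

```python
import os


def find_libjvm(root, name="libjvm.so"):
    """Return the sorted paths of every file called `name` below `root`."""
    matches = []
    # os.walk silently yields nothing if `root` does not exist, and it
    # does not descend into symbolic links to sub-directories by default
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling find_libjvm("/usr/lib/jvm") should list the candidates; one of the returned paths could then be passed to jpype.startJVM in place of the value of getDefaultJVMPath().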
Finally, add the following line to ~/.bashrc (or run it each time before opening a Python interpreter):
export JAVA_HOME='/usr/java/default'
(The above directory is in reality just a symbolic link to my latest version of the JDK, which is located at /usr/java/jdk1.7.0_04.)
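To check from Python which JDK directory JAVA_HOME really points to, something like the sketch below could be used. The helper name resolve_java_home is hypothetical; it only reads the environment variable and follows symbolic links, and it does not verify that a JDK is actually installed there.

```python
import os


def resolve_java_home(environ=None):
    """Return the real directory behind JAVA_HOME, or None if it is unset."""
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return None
    # realpath follows symbolic links such as /usr/java/default
    return os.path.realpath(java_home)
```

On the setup described above, resolve_java_home() would be expected to return the versioned JDK directory rather than the /usr/java/default link.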
Note that all the tests in the directory where JPype has been downloaded, i.e. JPype-0.5.4.2/test/testsuite.py, will fail (so just ignore them).
To see if it works, run this script in Python:
import jpype
jvmPath = jpype.getDefaultJVMPath()
jpype.startJVM(jvmPath)
# print a random text using a Java class
jpype.java.lang.System.out.println('Berlusconi likes women')
jpype.shutdownJVM()
Calling Java classes from Python, also using NumPy
Let's start by implementing a Java class containing some functions which I want to apply to numpy arrays. Since there is no concept of state, I use static functions so that I do not need to create any Java object (creating Java objects would not change anything).
/**
 * Cookbook to pass numpy arrays to Java via Jpype
 * @author Mannaggia
 */
package test.java;

public class Average2 {

    public static double compute_average(double[] the_array){
        // compute the average
        double result=0;
        int i;
        for (i=0;i<the_array.length;i++){
            result=result+the_array[i];
        }
        return result/the_array.length;
    }

    // multiplies array by a scalar
    public static double[] multiply(double[] the_array, double factor) {
        int i;
        double[] the_result= new double[the_array.length];
        for (i=0;i<the_array.length;i++) {
            the_result[i]=the_array[i]*factor;
        }
        return the_result;
    }

    /**
     * Matrix multiplication.
     */
    public static double[][] mult_mat(double[][] mat1, double[][] mat2){
        // find sizes
        int n1=mat1.length;
        int n2=mat2.length;
        int m1=mat1[0].length;
        int m2=mat2[0].length;
        // check that we can multiply
        if (n2 !=m1) {
            //System.err.println("Error: The number of columns of the first argument must equal the number of rows of the second");
            //return null;
            throw new IllegalArgumentException("Error: The number of columns of the first argument must equal the number of rows of the second");
        }
        // if we can, then multiply
        double[][] the_results=new double[n1][m2];
        int i,j,k;
        for (i=0;i<n1;i++){
            for (j=0;j<m2;j++){
                // initialize
                the_results[i][j]=0;
                for (k=0;k<m1;k++) {
                    the_results[i][j]=the_results[i][j]+mat1[i][k]*mat2[k][j];
                }
            }
        }
        return the_results;
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // test case
        double an_array[]={1.0, 2.0,3.0,4.0};
        double res=Average2.compute_average(an_array);
        System.out.println("Average is =" + res);
    }
}
The name of the class is a bit misleading, as it does more than compute the average of a numpy vector (method compute_average): it also multiplies a numpy vector by a scalar (method multiply) and performs matrix multiplication (method mult_mat).
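For cross-checking the values returned from Java, the three methods can be mirrored in a few lines of plain Python. This is only a reference sketch operating on lists and nested lists; it is not part of the JPype workflow itself, and it raises a Python ValueError where the Java code throws IllegalArgumentException.

```python
def compute_average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / float(len(values))


def multiply(values, factor):
    """Multiply every element of a sequence by a scalar."""
    return [v * factor for v in values]


def mult_mat(mat1, mat2):
    """Multiply two matrices given as nested lists (lists of rows)."""
    n1, m1 = len(mat1), len(mat1[0])
    n2, m2 = len(mat2), len(mat2[0])
    if n2 != m1:
        raise ValueError("The number of columns of the first argument "
                         "must equal the number of rows of the second")
    return [[sum(mat1[i][k] * mat2[k][j] for k in range(m1))
             for j in range(m2)]
            for i in range(n1)]
```

For the test case in the Java main method, compute_average([1.0, 2.0, 3.0, 4.0]) gives 2.5.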
After compiling the above Java class we can now run the following Python script:
import numpy as np
import jpype
jvmPath = jpype.getDefaultJVMPath()
# we need to specify the classpath used by the JVM
classpath='/home/mannaggia/workspace/TestJava/bin'
jpype.startJVM(jvmPath,'-Djava.class.path=%s' % classpath)
# numpy array
the_array=np.array([1.1, 2.3, 4, 6,7])
# build a JArray; note that we need to specify the Java double type using the jpype.JDouble wrapper
the_jarray2=jpype.JArray(jpype.JDouble, the_array.ndim)(the_array.tolist())
# testPkg is assumed to be the Java package test.java declared in Average2.java
testPkg=jpype.JPackage('test.java')
Class_average2=testPkg.Average2
res2=Class_average2.compute_average(the_jarray2)
np.abs(np.average(the_array)-res2) # ok perfect match!
# now try to multiply an array
res3=Class_average2.multiply(the_jarray2,jpype.JDouble(3))
# convert to numpy array
res4=np.array(res3) #ok
# matrix multiplication
the_mat1=np.array([[1,2,3], [4,5,6], [7,8,9]],dtype=float)
#the_mat2=np.array([[1,0,0], [0,1,0], [0,0,1]],dtype=float)
the_mat2=np.array([[1], [1], [1]],dtype=float)
the_mat3=np.array([[1, 2, 3]],dtype=float)
the_jmat1=jpype.JArray(jpype.JDouble, the_mat1.ndim)(the_mat1.tolist())
the_jmat2=jpype.JArray(jpype.JDouble, the_mat2.ndim)(the_mat2.tolist())
res5=Class_average2.mult_mat(the_jmat1,the_jmat2)
res6=np.array(res5) #ok
# other test
the_jmat3=jpype.JArray(jpype.JDouble, the_mat3.ndim)(the_mat3.tolist())
res7=Class_average2.mult_mat(the_jmat3,the_jmat2)
res8=np.array(res7)
res9=Class_average2.mult_mat(the_jmat2,the_jmat3)
res10=np.array(res9)
# test error due to invalid matrix multiplication
the_mat4=np.array([[1], [2]],dtype=float)
the_jmat4=jpype.JArray(jpype.JDouble, the_mat4.ndim)(the_mat4.tolist())
try:
    res11=Class_average2.mult_mat(the_jmat1,the_jmat4)
except Exception as exc:
    # the IllegalArgumentException thrown by Java surfaces as a Python exception;
    # catching it lets the script reach the shutdown below
    print(exc)
jpype.java.lang.System.out.println('Goodbye!')
jpype.shutdownJVM()
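The last mult_mat call in the script (the_mat1 with the_mat4) fails inside Java because the shapes do not match. A cheap check on the Python side could catch this before crossing into the JVM. The sketch below is one way to do it; the helper name check_mult_shapes is hypothetical, and it assumes the shapes are passed as tuples such as the .shape attribute of a numpy array.

```python
def check_mult_shapes(shape1, shape2):
    """Return the result shape of a matrix product, or raise ValueError."""
    if len(shape1) != 2 or len(shape2) != 2:
        raise ValueError("both arguments must be two-dimensional")
    if shape1[1] != shape2[0]:
        raise ValueError(
            "cannot multiply %r by %r: columns of the first argument "
            "must equal rows of the second" % (tuple(shape1), tuple(shape2)))
    # (rows of the first, columns of the second)
    return (shape1[0], shape2[1])
```

For example, check_mult_shapes(the_mat1.shape, the_mat2.shape) would return (3, 1), while passing the_mat4.shape as the second argument would raise a ValueError.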
Answered by user1669710
I consider Jython to be one of the best options, as it makes it seamless to use Java objects in Python. I actually integrated WEKA with my Python programs, and it was super easy. Just import the WEKA classes and call them within the Python code as you would in Java.
Answered by JoshAdel
I'm not sure about numpy support, but the following might be helpful: