PCA Implementation in Java
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/10604507/
Asked by Trup
I need an implementation of PCA in Java. I am interested in finding something that's well documented, practical and easy to use. Any recommendations?
Answered by LotiLotiLoti
There are now a number of Principal Component Analysis implementations for Java.
Apache Spark: https://spark.apache.org/docs/2.1.0/mllib-dimensionality-reduction.html#principal-component-analysis-pca
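    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.feature.PCA;
    import org.apache.spark.mllib.feature.PCAModel;
    import org.apache.spark.mllib.linalg.Matrix;
    import org.apache.spark.mllib.linalg.Vector;
    import org.apache.spark.mllib.linalg.Vectors;
    import org.apache.spark.rdd.RDD;

    SparkConf conf = new SparkConf().setAppName("PCAExample").setMaster("local");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
        // Create points as Spark Vectors
        List<Vector> vectors = Arrays.asList(
                Vectors.dense(-1.0, -1.0),
                Vectors.dense(-1.0, 1.0),
                Vectors.dense(1.0, 1.0));

        // Create Spark MLlib RDD
        JavaRDD<Vector> distData = sc.parallelize(vectors);
        RDD<Vector> vectorRDD = distData.rdd();

        // Execute PCA projection to 2 dimensions
        PCA pca = new PCA(2);
        PCAModel pcaModel = pca.fit(vectorRDD);
        Matrix matrix = pcaModel.pc();
    }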
ND4J: http://nd4j.org/doc/org/nd4j/linalg/dimensionalityreduction/PCA.html
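    import java.util.Arrays;
    import java.util.List;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.dimensionalityreduction.PCA;
    // NDArray is the backend's concrete INDArray class; its package depends
    // on which ND4J backend is on the classpath.

    // Create points as NDArray instances
    List<INDArray> ndArrays = Arrays.asList(
            new NDArray(new float[] {-1.0F, -1.0F}),
            new NDArray(new float[] {-1.0F, 1.0F}),
            new NDArray(new float[] {1.0F, 1.0F}));

    // Create matrix of points (rows are observations; columns are features)
    INDArray matrix = new NDArray(ndArrays, new int[] {3, 2});

    // Execute PCA, again to 2 dimensions
    INDArray factors = PCA.pca_factor(matrix, 2, false);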
Apache Commons Math (single threaded; no framework)
    import org.apache.commons.math3.linear.EigenDecomposition;
    import org.apache.commons.math3.linear.MatrixUtils;
    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.stat.correlation.Covariance;

    // Create points in a double array
    double[][] pointsArray = new double[][] {
        new double[] { -1.0, -1.0 },
        new double[] { -1.0, 1.0 },
        new double[] { 1.0, 1.0 }
    };

    // Create real matrix
    RealMatrix realMatrix = MatrixUtils.createRealMatrix(pointsArray);

    // Create covariance matrix of points, then find eigenvectors
    // See https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
    Covariance covariance = new Covariance(realMatrix);
    RealMatrix covarianceMatrix = covariance.getCovarianceMatrix();
    EigenDecomposition ed = new EigenDecomposition(covarianceMatrix);
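To make the "covariance matrix, then eigenvectors" step concrete without any library, here is a minimal plain-Java sketch. It is not taken from the answer: the class name TinyPca and all of its methods are hypothetical, and it uses power iteration (a different technique from the full eigendecomposition in Commons Math) to find only the first principal component. It assumes a non-zero covariance matrix and a start vector that is not orthogonal to the leading eigenvector.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of any library named above. */
final class TinyPca {

    /** Sample covariance matrix (divides by n - 1). Rows are observations, columns are features. */
    static double[][] covariance(double[][] points) {
        int n = points.length;
        int d = points[0].length;
        double[] mean = new double[d];
        for (double[] p : points) {
            for (int j = 0; j < d; j++) {
                mean[j] += p[j] / n;
            }
        }
        double[][] cov = new double[d][d];
        for (double[] p : points) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (p[i] - mean[i]) * (p[j] - mean[j]) / (n - 1);
                }
            }
        }
        return cov;
    }

    /** Matrix-vector product m * v. */
    static double[] multiply(double[][] m, double[] v) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < v.length; j++) {
                out[i] += m[i][j] * v[j];
            }
        }
        return out;
    }

    /** Unit eigenvector for the largest eigenvalue of a symmetric matrix, via power iteration. */
    static double[] leadingEigenvector(double[][] m, int iterations) {
        double[] v = new double[m.length];
        for (int i = 0; i < v.length; i++) {
            v[i] = i + 1.0; // arbitrary non-zero start vector (an assumption)
        }
        for (int it = 0; it < iterations; it++) {
            double[] next = multiply(m, v);
            double norm = 0.0;
            for (double x : next) {
                norm += x * x;
            }
            norm = Math.sqrt(norm);
            for (int i = 0; i < v.length; i++) {
                v[i] = next[i] / norm; // renormalize to unit length each step
            }
        }
        return v;
    }

    /** Rayleigh quotient v^T m v; for a unit eigenvector this is its eigenvalue. */
    static double eigenvalue(double[][] m, double[] unitVector) {
        double[] mv = multiply(m, unitVector);
        double sum = 0.0;
        for (int i = 0; i < mv.length; i++) {
            sum += unitVector[i] * mv[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] points = { {-1.0, -1.0}, {-1.0, 1.0}, {1.0, 1.0} };
        double[][] cov = covariance(points);
        double[] pc1 = leadingEigenvector(cov, 100);
        System.out.println("First principal component: " + Arrays.toString(pc1));
        System.out.println("Variance along it: " + eigenvalue(cov, pc1));
    }
}
```

For the three points used throughout this answer, the covariance matrix is [[4/3, 2/3], [2/3, 4/3]], so the sketch should converge to a direction of roughly (0.707, 0.707) with a variance of about 2.0. For real work, the library routines above are the better choice; power iteration only recovers one component at a time.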
Note that Singular Value Decomposition (SVD), which can also be used to find principal components, has equivalent implementations.
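Why SVD gives the same answer can be written out in a few lines. The symbols here are my own, not from the answer: assume X_c is the n-by-p data matrix with each column centered, C is its sample covariance matrix, and X_c = U Σ Vᵀ is a thin SVD.

```latex
X_c = U \Sigma V^{\top}
\quad\Longrightarrow\quad
C = \frac{1}{n-1} X_c^{\top} X_c
  = \frac{1}{n-1} V \Sigma U^{\top} U \Sigma V^{\top}
  = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top},
\qquad
\lambda_i = \frac{\sigma_i^{2}}{n-1}
```

Under those assumptions, the columns of V are the principal directions (the eigenvectors of C), and each eigenvalue of C is the squared singular value divided by n - 1.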
Answered by NPE
Here is one: PCA Class.
This class contains the methods necessary for a basic Principal Component Analysis with a varimax rotation. Options are available for an analysis using either the covariance or the correlation matrix. A parallel analysis, using Monte Carlo simulations, is performed. Extraction criteria based on eigenvalues greater than unity, greater than a Monte Carlo eigenvalue percentile or greater than the Monte Carlo eigenvalue means are available.
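The extraction criteria mentioned above all compare each eigenvalue against a threshold. The following is a rough plain-Java sketch of that idea only; it is not the API of the PCA class being quoted, and the class RetentionRule and its methods are hypothetical. It assumes eigenvalues are sorted in descending order and stops at the first component that fails its threshold, which is one common convention.

```java
import java.util.Arrays;

/** Hypothetical illustration of eigenvalue-based extraction criteria. */
final class RetentionRule {

    /**
     * Counts leading components whose eigenvalue exceeds the threshold at the
     * same rank. The thresholds could be Monte Carlo means or percentiles
     * supplied by the caller; this sketch does not compute them.
     */
    static int retainAbove(double[] eigenvaluesDescending, double[] thresholds) {
        int limit = Math.min(eigenvaluesDescending.length, thresholds.length);
        int kept = 0;
        while (kept < limit && eigenvaluesDescending[kept] > thresholds[kept]) {
            kept++;
        }
        return kept;
    }

    /** "Eigenvalue greater than unity" rule: every threshold is 1.0. */
    static int kaiser(double[] eigenvaluesDescending) {
        double[] ones = new double[eigenvaluesDescending.length];
        Arrays.fill(ones, 1.0);
        return retainAbove(eigenvaluesDescending, ones);
    }
}
```

As a usage example with made-up numbers, eigenvalues {2.1, 1.2, 0.5, 0.2} from a four-variable correlation matrix keep two components under the greater-than-unity rule, and only one if the simulated thresholds were {1.3, 1.25, 0.9, 0.8}.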
Answered by sash
Check http://weka.sourceforge.net/doc.stable/weka/attributeSelection/PrincipalComponents.html. Weka in fact has many other algorithms that can be used along with PCA, and it adds more algorithms from time to time. So I think, if you are working in Java, you should switch to the Weka API.
Answered by hrzafer
Smile is a full-fledged ML library for Java. You can give its PCA implementation a try. Please see: https://haifengl.github.io/smile/api/java/smile/projection/PCA.html
There is also a PCA tutorial with Smile, but the tutorial uses Scala.
Answered by Vlad11
You can see a few implementations of PCA in the DataMelt project:
https://jwork.org/dmelt/code/index.php?keyword=PCA
(they are rewritten in Jython). They include some graphical examples for dimensionality reduction. They show the usage of several Java packages, such as JSAT, DatumBox and others.