Original question: http://stackoverflow.com/questions/2129269/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Java Clustering Library
Asked by user238384
I am looking for a lightweight clustering library in Java. I don't need hundreds of clustering algorithms in that library; 5 to 7 would be fine for me.
I am sure you are going to ask: "What kind of algorithms do you need, and for what purpose?" :) I just need to classify my data with the help of clustering, for example with k-means.
P.S.: I know about Weka, but I don't want to use it because it is not specifically for clustering.
Answered by Sean Owen
Apache Mahout implements many clustering algorithms, via Hadoop. It's a little heavy for what you want, but: http://cwiki.apache.org/MAHOUT/syntheticcontroldata.html
Also you might be able to dig out and adapt the user clustering code from Mahout's TreeClusteringRecommender class, which uses clustering for recommender engine purposes.
Answered by Binary Nerd
I would take a look at JUNG. It has a number of clustering algorithms implemented, although I'm not sure if K-means is one of them.
Another option might be to take a look at KNIME, an Eclipse-based workflow editor. It includes a number of clustering primitives you can use as part of a workflow, including k-means.
Answered by Tim Gee
Some open-source clustering algorithms in Java are available here under the GPL. They require the Java Colt library (for matrices): http://open.trickl.com/
Answered by Has QUIT--Anony-Mousse
There is also ELKI, an open-source university project similar to WEKA, but with the focus on cluster analysis and outlier detection instead of machine learning algorithms. It's pretty advanced, uses index structures for efficiency, and has at least a dozen clustering algorithms.
Answered by lynxoid
Answered by Wilfred Springer
If Scala also works for you, then you might want to check this version of KMeans in Scala:
https://github.com/wspringer/kmeans
A related blog post is here:
Answered by Mark
Take a look at org.apache.commons.math4.ml.clustering.KMeansPlusPlusClusterer in Apache's Commons Math library.
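To make concrete what a k-means clusterer such as the one named above computes, here is a minimal, dependency-free sketch of Lloyd's k-means in plain Java. It is not the Commons Math implementation and it omits the k-means++ seeding; the class name `KMeansSketch`, its methods, and the choice of the first k points as initial centroids are assumptions made purely for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only: plain Lloyd's k-means, not a library API.
class KMeansSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Clusters the points into k groups and returns one label in [0, k) per point.
     * Assumption for this sketch: the first k points are the initial centroids,
     * so the caller must pass at least k points.
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int n = points.length;
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] labels = new int[n];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[i], centroids[c])
                            < squaredDistance(points[i], centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // no label moved, so the centroids would not move either
            }

            // Update step: move every centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[labels[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = {
            {1.0, 1.0}, {1.5, 2.0}, {1.0, 0.5},
            {8.0, 8.0}, {9.0, 9.5}, {8.5, 9.0}
        };
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
        // prints [0, 0, 0, 1, 1, 1]
    }
}
```

Taking the first k points as seeds keeps the sketch deterministic, but it can converge to a poor local optimum on unlucky input; randomized seeding such as k-means++ is the usual remedy.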
Answered by Phil
If you want some basic clustering algorithms in Java, you can check my software:
http://www.philippe-fournier-viger.com/spmf/
It offers an implementation of KMeans and a hierarchical clustering algorithm.
The other algorithms offered are for pattern mining. In total, there are 47 algorithms, but only 2 are for clustering. Another thing: there is a simple GUI for launching the algorithms.
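For readers unfamiliar with the hierarchical clustering mentioned in this answer, the idea can be sketched in a few lines of plain Java. This is a naive single-linkage agglomerative version written for illustration and is not taken from SPMF; the class name `SingleLinkageSketch`, its methods, and the choice of single linkage are all assumptions.

```java
import java.util.Arrays;

// Illustrative sketch only: naive single-linkage agglomerative clustering.
class SingleLinkageSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Starts with every point in its own cluster and repeatedly merges the two
     * clusters whose closest members are nearest, until targetClusters remain.
     * Returns one label per point, numbered 0, 1, ... in order of first appearance.
     */
    static int[] cluster(double[][] points, int targetClusters) {
        int n = points.length;
        if (targetClusters < 1 || targetClusters > n) {
            throw new IllegalArgumentException("targetClusters must be in [1, n]");
        }
        int[] labels = new int[n];
        for (int i = 0; i < n; i++) {
            labels[i] = i; // every point starts as its own cluster
        }

        int clusters = n;
        while (clusters > targetClusters) {
            // Single linkage: find the closest pair of points in different clusters.
            int bestI = -1;
            int bestJ = -1;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (labels[i] != labels[j]) {
                        double d = distance(points[i], points[j]);
                        if (d < bestDist) {
                            bestDist = d;
                            bestI = i;
                            bestJ = j;
                        }
                    }
                }
            }
            // Merge the cluster containing bestJ into the one containing bestI.
            int from = labels[bestJ];
            int to = labels[bestI];
            for (int i = 0; i < n; i++) {
                if (labels[i] == from) {
                    labels[i] = to;
                }
            }
            clusters--;
        }

        // Renumber the surviving labels as 0, 1, ... for readability.
        int[] remap = new int[n];
        Arrays.fill(remap, -1);
        int next = 0;
        for (int i = 0; i < n; i++) {
            if (remap[labels[i]] == -1) {
                remap[labels[i]] = next++;
            }
            labels[i] = remap[labels[i]];
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {5, 5}, {5, 6}, {20, 20}};
        System.out.println(Arrays.toString(cluster(points, 2)));
        // prints [0, 0, 0, 0, 1]
    }
}
```

Each merge rescans all pairs, so this runs in roughly cubic time and only suits small inputs; real libraries typically use a distance matrix or an index structure instead.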