C++: Fastest general machine learning library?

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/3167024/

Time: 2020-08-28 12:14:12  Source: igfitidea

Fastest general machine learning library?

c++ machine-learning

Asked by griffin

Weka is probably the most popular general-purpose machine learning library, but in my experience it can be quite slow.

I have been looking at Shark, Waffles, dlib, Plearn, and MLC++ as alternatives. Of these, Shark and dlib look the most promising.

Does anyone have any experience when it comes to performance testing of these libraries?

Answered by Davis King

For me, what matters most would be "Does this toolkit have the algorithm or feature I want to try out?" Since these toolkits provide a fairly diverse set of features, you should first try to narrow down what it is you want to do.

So, for example, if you have a burning desire to try out different evolutionary optimization algorithms then I would go with something like Shark.

On the other hand, I prefer dlib for most of my work, but that doesn't necessarily mean a lot, since I wrote it :) However, if you are interested in binary classification then let me suggest my current favorite method for that, the svm_c_ekm_trainer. I frequently use this to train non-linear SVMs on datasets of hundreds of thousands of points. It usually runs in a few minutes (or sometimes even seconds) while the classic SMO algorithm for this would take hours or days to finish.
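
Timing claims like this depend heavily on the data and the machine, so it is worth measuring on your own dataset. Below is a minimal sketch of a timing harness using only the C++ standard library. The helper name `median_seconds` and its parameters are made up for illustration and are not part of dlib, Shark, or any other library mentioned here.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper (not from any ML library): runs `work` `repeats`
// times and returns the median wall-clock time of one run, in seconds.
double median_seconds(const std::function<void()>& work, int repeats = 5) {
    assert(repeats > 0);
    std::vector<double> times;
    times.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        // steady_clock is monotonic, so it is safe for measuring intervals.
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        times.push_back(std::chrono::duration<double>(stop - start).count());
    }
    std::sort(times.begin(), times.end());
    // Middle element; for an even number of runs this is the upper median.
    return times[times.size() / 2];
}
```

To compare two libraries you would wrap each one's training call in a lambda, feed both the same data, and compare the returned medians. The median is used rather than the mean so that a single slow run (for example, a cold cache) does not dominate the result.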

There were also some good answers to a similar question asked not too long ago: "Which machine learning library to use".
