How to use Mahout in a Windows environment?
Warning: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/2735741/
Asked by user249210
I am trying to use Mahout in an application running on Windows. I want to build clusters from a Lucene index using k-means.
As soon as I have to create sequence files (creating vectors from a Lucene index), I get a Hadoop exception, because Hadoop makes command-line calls to programs that do not exist in a Windows environment (e.g. chmod). Running in Cygwin is not an option, since I want to be able to run the application from Eclipse.
So my question is
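The failure described in the question can be reproduced without Hadoop. The following is only a minimal sketch, assuming the exception comes from the JVM failing to start an external `chmod` process; the class name `ShellOutProbe` and the method `canLaunch` are hypothetical and are not part of Hadoop or Mahout.

```java
import java.io.IOException;

public class ShellOutProbe {

    /**
     * Tries to start the given command as an external process.
     * Returns false when the OS cannot find or start the program,
     * which is the situation on Windows when chmod is missing.
     */
    public static boolean canLaunch(String... command) {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            // We only care whether the process could be started at all.
            p.destroy();
            return true;
        } catch (IOException e) {
            // ProcessBuilder.start() throws IOException if the program cannot be run.
            System.err.println("Launch failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("chmod launchable: " + canLaunch("chmod", "--version"));
    }
}
```

If this prints `false` when started from the same Eclipse run configuration as the Mahout application, the JVM cannot see a `chmod` executable, which would match the behaviour described above.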
Answered by bajafresh4life
The only way to run Hadoop in a Windows environment is to install Cygwin. For more information, see this blog post:
http://hayesdavis.net/2008/06/14/running-hadoop-on-windows/
Cygwin will provide all the command-line utilities (like chmod) that Hadoop relies on. You can still run your Hadoop jobs from within Eclipse if you want.
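For a JVM started from Eclipse to find Cygwin's utilities, the directory that contains them has to be on the PATH that the JVM sees. Below is a rough sketch of a check for that, not anything Hadoop itself provides; `PathToolCheck` and `findTool` are hypothetical names, and the check only tests that a file with the expected name exists, not that it is a working executable.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

public class PathToolCheck {

    /**
     * Searches the directories of a PATH-style string for a tool such as "chmod".
     * Both the bare name and name + ".exe" are tried, since Windows builds
     * of the tool carry the .exe suffix.
     */
    public static Optional<Path> findTool(String name, String pathValue) {
        if (pathValue == null) {
            return Optional.empty();
        }
        String separator = Pattern.quote(File.pathSeparator);
        for (String dir : pathValue.split(separator)) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String candidate : new String[] {name, name + ".exe"}) {
                try {
                    Path p = Paths.get(dir, candidate);
                    if (Files.isRegularFile(p)) {
                        return Optional.of(p);
                    }
                } catch (InvalidPathException e) {
                    // Malformed PATH entry: skip it and keep searching.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> chmod = findTool("chmod", System.getenv("PATH"));
        System.out.println(chmod.map(p -> "chmod found at " + p)
                                .orElse("chmod not found on PATH"));
    }
}
```

Running `main` from the same Eclipse launch configuration as the Hadoop job shows whether that JVM would be able to locate `chmod` at all.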
Answered by Peter Wippermann
Do you know the SequenceFile API? Have a look here: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html
You can try to write and read the data yourself.
I think you can run Mahout from Eclipse on Windows in stand-alone mode, but you will run into several shortcomings and barriers. You should try it and see how far you get.
In my opinion, you shouldn't insist on running Mahout from Eclipse. ;-)
Answered by Alexander Davliatov
You can use a virtual machine to run your Hadoop environment. For me, the best solution is the http://hortonworks.com/ project. Everything works well.