How to build spark application using Scala IDE and Maven?
Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license, cite the original address, and attribute it to the original authors (not me) on StackOverflow.
Original post: http://stackoverflow.com/questions/31687479/
Asked by HHH
I'm new to Scala, Spark, and Maven, and would like to build the Spark application described here. It uses the Mahout library.
I have Scala IDE installed and would like to use Maven to manage the dependencies (the Mahout library as well as the Spark libraries). I couldn't find a good tutorial to get started. Could someone help me figure it out?
Answered by suztomo
First, try compiling a simple application with Maven in Scala IDE. The key to a Maven project is its directory structure and pom.xml. Although I don't use Scala IDE, this document seems helpful: http://scala-ide.org/docs/tutorials/m2eclipse/
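For reference, the standard Maven directory layout for a Scala project looks like this (myapp and the package path are just placeholders):

myapp/
  pom.xml
  src/
    main/
      scala/        <- application sources, e.g. src/main/scala/com/example/App.scala
    test/
      scala/        <- test sources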
The next step is to add a dependency on Spark in pom.xml; you can follow this document: http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
For the latest versions of the Spark and Mahout artifacts, you can check here: http://mvnrepository.com/artifact/org.apache.spark and http://mvnrepository.com/artifact/org.apache.mahout
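For example, a Mahout dependency entry could look like the sketch below (the artifact and version shown are only illustrative; pick whatever is current from the links above):

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-math</artifactId>
    <version>0.13.0</version>
</dependency>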
Hope this helps.
Answered by saurzcode
You need the following tools to get started (based on recent availability):

- Scala IDE for Eclipse – download the latest version of Scala IDE from here.
- Scala version – 2.11 (make sure the Scala compiler is set to this version as well; see the pom sketch after this list)
- Spark version 2.2 (provided via the Maven dependency)
- winutils.exe
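One way to pin the Scala version in Maven is a property referenced from an explicit scala-library dependency, as in this sketch (the exact 2.11.x patch version is an assumption, and the plugin that actually compiles Scala, e.g. scala-maven-plugin, is assumed to be configured separately):

<properties>
    <!-- assumed patch version; any 2.11.x release should do -->
    <scala.version>2.11.12</scala.version>
</properties>

<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
</dependency>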
For running in a Windows environment, you need Hadoop binaries built for Windows. winutils provides these, and we need to set the hadoop.home.dir system property to the directory whose bin folder contains winutils.exe. You can download winutils.exe here and place it at a path like this – c:/hadoop/bin/winutils.exe
And you can define the Spark Core dependency in your project's Maven pom.xml to get started:
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
And in your Java/Scala class, define this property to run on your local environment on Windows:
System.setProperty("hadoop.home.dir", "c://hadoop//");
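Putting the pieces together, a minimal Spark application might look like this sketch (SparkApp, the app name, and the sample data are placeholders; it assumes the spark-core dependency above and uses a local master, which you would remove when submitting to a real cluster):

import org.apache.spark.{SparkConf, SparkContext}

object SparkApp {
  def main(args: Array[String]): Unit = {
    // Windows only: point Hadoop at the directory whose bin folder holds winutils.exe
    System.setProperty("hadoop.home.dir", "c://hadoop//")

    val conf = new SparkConf()
      .setAppName("spark-maven-demo") // placeholder app name
      .setMaster("local[*]")          // use all local cores; drop this when running on a cluster
    val sc = new SparkContext(conf)

    // A tiny smoke test: word count over an in-memory collection
    val counts = sc.parallelize(Seq("hello spark", "hello maven"))
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}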
More details and the full setup can be found here.

