spark-submit for a .scala file

Disclaimer: This page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/47663695/


spark-submit for a .scala file

scala apache-spark

Asked by Codejoy

I have been running some test Spark Scala code, probably using a bad way of doing things, with spark-shell:

spark-shell --conf spark.neo4j.bolt.password=Stuffffit --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jsparkCluster.scala 

This would execute my code on spark and pop into the shell when done.

Now that I am trying to run this on a cluster, I think I need to use spark-submit, which I thought would be:

spark-submit --conf spark.neo4j.bolt.password=Stuffffit --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jsparkCluster.scala 

But it does not like the .scala file; does it somehow have to be compiled into a class? The scala code is a simple scala file with several helper classes defined in it and no real main class, so to speak. I don't see it in the help files, but maybe I am missing it: can I just spark-submit a file, or do I have to somehow give it the class, and thus change my scala code?

I did add this to my scala code too:

It went from this:

val conf = new SparkConf().setMaster("local").setAppName("neo4jspark")


val sc = new SparkContext(conf)  

To this:

val sc = new SparkContext(new SparkConf().setMaster("spark://192.20.0.71:7077"))
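
For reference, here is a minimal sketch of the same setup with the parentheses balanced and an app name set; the master URL is simply the one from the question:

import org.apache.spark.{SparkConf, SparkContext}

// Build the configuration explicitly, then create the context.
val conf = new SparkConf()
  .setMaster("spark://192.20.0.71:7077")
  .setAppName("neo4jspark")
val sc = new SparkContext(conf)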

Answer by shridharama

There are 2 quick and dirty ways of doing this:

  1. Without modifying the scala file

Simply use the spark shell with the -i flag:

$SPARK_HOME/bin/spark-shell -i neo4jsparkCluster.scala

  2. Modifying the scala file to include a main method (a minimal sketch of such a wrapper is shown after these steps)

a. Compile:

scalac -classpath <location of spark jars on your machine> neo4jsparkCluster.scala

b. Submit it to your cluster:

/usr/lib/spark/bin/spark-submit --class <qualified class name> --master <> .

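For illustration, here is a minimal sketch of the kind of wrapper step 2 describes; the object name is hypothetical, and the existing helper classes and logic from neo4jsparkCluster.scala would go inside main:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical entry point; spark-submit would be pointed at it with --class Neo4jSparkCluster.
object Neo4jSparkCluster {
  def main(args: Array[String]): Unit = {
    // The master is left to spark-submit's --master flag rather than hard-coded here.
    val conf = new SparkConf().setAppName("neo4jspark")
    val sc = new SparkContext(conf)

    // ... existing helper classes and logic from neo4jsparkCluster.scala ...

    sc.stop()
  }
}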

Answer by zachdb86

You will want to package your scala application with sbt and include Spark as a dependency within your build.sbt file.

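As a rough sketch, a minimal build.sbt could look like the following; the project name and the Scala/Spark versions are assumptions and should match what your cluster runs:

name := "neo4jsparkCluster"

version := "0.1"

scalaVersion := "2.11.8"

// "provided" keeps Spark out of the packaged jar, since the cluster supplies it at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"

Running sbt package then produces a jar under target/ that you can hand to spark-submit.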

See the Self-Contained Applications section of the quick start guide for full instructions: https://spark.apache.org/docs/latest/quick-start.html

Answer by Zouzias

You can take a look at the following Hello World example for Spark, which packages your application as @zachdb86 already mentioned.

spark-hello-world
