How to debug a Scala-based Spark program in IntelliJ IDEA
Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, distributed under the CC BY-SA 4.0 license. If you reuse it, you must do so under the same CC BY-SA terms, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/39885281/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): StackOverflow
Asked by lserlohn
I am currently setting up my development environment in IntelliJ IDEA. I followed exactly the same steps as http://spark.apache.org/docs/latest/quick-start.html
build.sbt file
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
Sample program file
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object MySpark {
  def main(args: Array[String]) {
    val logFile = "/IdeaProjects/hello/testfile.txt"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
If I use the command line:
sbt package
and then
spark-submit --class "MySpark" --master local[4] target/scala-2.11/myspark_2.11-1.0.jar
I am able to generate the jar package, and Spark runs well.
However, I want to use IntelliJ IDEA to debug the program inside the IDE. How can I set up a run configuration so that when I click "Debug", it automatically builds the jar package and automatically launches the task by executing the spark-submit command line?
I just want everything to be as simple as one click on the Debug button in IntelliJ IDEA.
Thanks.
Answered by Sandeep Purohit
You can simply export the following Spark option:
export SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=7777
And create the debug configuration as follows:
Run -> Edit Configurations -> click the "+" in the top-left corner -> Remote -> set the port and name
After the above configuration, run the Spark application with spark-submit or sbt run, and then launch the Remote debug configuration you created. Add breakpoints for debugging.
Answered by Alfredo Gimenez
If you're using the Scala plugin and have your project configured as an sbt project, it should basically work out of the box.
Go to Run -> Edit Configurations... and add your run configuration normally.
Since you have a main class, you probably want to add a new Application configuration.
You can also just click on the blue square icon to the left of your main code.
Once your run configuration is set up, you can use the Debug feature.
Answered by Jeffrey
I've run into this when switching between 2.10 and 2.11. SBT expects the primary object to be in src -> main -> scala-2.10 or src -> main -> scala-2.11, depending on your version.
Answered by balaudt
It is similar to the solution provided here: Debugging Spark Applications. You create a Remote debug run configuration in IDEA and pass the Java debug parameters to the spark-submit command. The only catch is that you need to start the Remote debug configuration in IDEA after triggering the spark-submit command. I read somewhere that a Thread.sleep just before your breakpoint should let you do this, and I too was able to use the suggestion successfully.
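The timing trick described above can be sketched as follows (a sketch, not the answerer's exact code; the 10-second window is an arbitrary choice, and it assumes the debug agent is started with suspend=n so the JVM does not wait on its own):

```scala
object MySpark {
  def main(args: Array[String]): Unit = {
    // Give yourself a window to attach IDEA's Remote debug configuration
    // after spark-submit has already launched the driver JVM.
    Thread.sleep(10000) // adjust as needed

    // Set a breakpoint on the first line after the sleep, then continue
    // with the Spark logic from the question (SparkConf, SparkContext, ...).
  }
}
```

The sleep only buys time for the attach; once the debugger is connected, execution stops at the breakpoint as usual.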

