Multiple SparkContext detected in the same JVM

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/34879414/



Tags: java, apache-spark, jvm

Asked by Guforu

According to my last question, I have to define multiple SparkContexts for my single JVM.

I did it the following way (using Java):

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

After that I created the following source code:

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

and later in the code:

JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<Row> testRDD = ctx.parallelize(AllList);

After executing the code I got the following error message:

16/01/19 15:21:08 WARN SparkContext: Multiple running SparkContexts detected in the same JVM!
org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:81)
test.MLlib.BinarryClassification.main(BinaryClassification.java:41)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning.apply(SparkContext.scala:2083)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning.apply(SparkContext.scala:2065)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2065)
    at org.apache.spark.SparkContext$.setActiveContext(SparkContext.scala:2151)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:2023)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at test.MLlib.BinarryClassification.main(BinaryClassification.java:105)

The numbers 41 and 105 are the lines where the two objects are defined in the Java code. My question is: is it possible to run multiple SparkContexts in the same JVM, and how do I do it, given that I already use the set method?

Accepted answer by mattinbits

Are you sure you need the JavaSparkContext as a separate context? The previous question that you refer to doesn't say so. If you already have a SparkContext, you can create a new JavaSparkContext from it rather than creating a separate context:

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

// Create a Java context which wraps the same Scala SparkContext under the hood
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);
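
With this wrapper there is no need for the later JavaSparkContext ctx = new JavaSparkContext(conf) from the question; the existing context can be reused directly (a minimal sketch, assuming AllList is the java.util.List<Row> from the question):

JavaRDD<Row> testRDD = jsc.parallelize(AllList);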

Answer by Khaled Sorino

A SparkContext is already running, so you have to stop that context first with sc.stop(); then you can continue without any problem.
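
A minimal sketch of that pattern, applied to the code from the question:

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

// ... work that needs sqlContext ...

// Stop the first context before creating the second one
sc.stop();

JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<Row> testRDD = ctx.parallelize(AllList);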

Answer by Gaurang Shah

Rather than using SparkContext, you should use the builder method on SparkSession, which instantiates the Spark and SQL contexts more robustly and ensures that there is no context conflict.

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("demo").getOrCreate()
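
Since the question uses Java, here is a sketch of the equivalent there (assuming Spark 2.0+; the appName and master values are placeholders):

import org.apache.spark.SparkContext;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .appName("demo")
        .master("local")
        .getOrCreate();

// The single underlying context is reachable from the session
SparkContext sc = spark.sparkContext();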