How to set heap size in Spark within the Eclipse environment?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/39023022/

Date: 2020-09-19 23:16:02  Source: igfitidea

How to set heap size in spark within the Eclipse environment?

eclipse, apache-spark, heap-memory

Asked by Yassir S

I am trying to run the simple following code using spark within Eclipse:


import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
object jsonreader {  
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
    val conf = new SparkConf()
      .setAppName("TestJsonReader")
      .setMaster("local")
      .set("spark.driver.memory", "3g") 
    val sc = new SparkContext(conf)

    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.format("json").load("text.json")

    df.printSchema()
    df.show   
  }
}

However, I get the following errors:


16/08/18 18:05:28 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.

I followed different tutorials, such as How to set Apache Spark Executor memory. Most of them either use the --driver-memory option (not possible from Eclipse) or modify the Spark configuration file, but there is no corresponding file in my setup.

Does anyone have any idea about how to solve this issue within Eclipse environment?


Answer by abaghel

In Eclipse, go to Run > Run Configurations... > Arguments > VM arguments and set the max heap size, e.g. -Xmx512m.

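When you launch the application from Eclipse, the Spark driver runs inside the JVM that Eclipse starts for it, so the run configuration's VM arguments control the heap that Spark checks. As a minimal example, the VM arguments field could contain something like the following (the exact value is your choice; it only needs to exceed the 471859200-byte, i.e. 450 MB, minimum from the error above):

-Xmx1g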

Answer by Duy Bui

I had this issue as well and this is how I solved it. Thought it might be helpful.


val conf: SparkConf = new SparkConf().setMaster("local[4]").setAppName("TestJsonReader").set("spark.driver.host", "localhost")
conf.set("spark.testing.memory", "2147480000")

Answer by user8789594

It worked fine for me once I modified the script with conf.set("spark.testing.memory", "2147480000").


complete code below:


import scala.math.random
import org.apache.spark._

object SparkPi {
  def main(args: Array[String]) {
    val conf: SparkConf = new SparkConf().setMaster("local").setAppName("Spark Pi").set("spark.driver.host", "localhost")

     conf.set("spark.testing.memory", "2147480000")         // if you face any memory issues


    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow

    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)

    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}


Step-2


Run it as “Scala Application”

Step-3: Creating the JAR file and executing it:


bin/spark-submit --class SparkPi --master local SparkPi.jar
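
If you submit from the command line instead of Eclipse, the --driver-memory flag mentioned in the question can also be passed here; for example (3g is an arbitrary value, chosen to match the question):

bin/spark-submit --class SparkPi --master local --driver-memory 3g SparkPi.jar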

Answer by allojo

If you are running locally, you also need to increase spark.testing.memory:


spark.driver.memory, 571859200
spark.testing.memory, 2147480000


Answer by StrongYoung

You can set the "spark.driver.memory" option by editing the "spark-defaults.conf" file in "${SPARK_HOME}/conf/". By default there is no file called "spark-defaults.conf" in that directory, only a template named "spark-defaults.conf.template", so you can create the "spark-defaults.conf" file with the following command:


cp spark-defaults.conf.template spark-defaults.conf

then, edit it:


# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"


spark.driver.memory              3g

Answer by Stanislav

In my case, mvn stopped packaging the project with the same exception (java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200.).


I started debugging this issue by changing the VM heap size settings: export MAVEN_OPTS="-Xms1024m -Xmx4096m -XX:PermSize=1024m". It did not work.


Then I tried adding the spark.driver.memory option, set to 1g, to the Spark config [SparkConfig.set("spark.driver.memory","1g")].


In the end it turned out that my Java installation had somehow gotten messed up. I reinstalled the JDK (a newer version), set up the JAVA_HOME paths again, and then everything worked from the terminal.


If you upgrade, then to use Netbeans/Intellij/Eclipse you would need to configure the JDK setting in each of them to point to the new installation of the Java Development Kit.
