How to set heap size in Spark within the Eclipse environment?
Original URL: http://stackoverflow.com/questions/39023022/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must cite the original URL and attribute them to the original authors (not me): StackOverflow
How to set heap size in spark within the Eclipse environment?
Asked by Yassir S
I am trying to run the following simple code using Spark within Eclipse:
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object jsonreader {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
    val conf = new SparkConf()
      .setAppName("TestJsonReader")
      .setMaster("local")
      .set("spark.driver.memory", "3g")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.format("json").load("text.json")
    df.printSchema()
    df.show
  }
}
However, I get the following errors:
16/08/18 18:05:28 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
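For context, here is a hedged sketch of where the two byte counts in this error come from; it assumes the check in Spark's UnifiedMemoryManager (Spark 1.6+), which reserves 300 MB of system memory and requires at least 1.5 times that amount of JVM heap. The object name is made up for illustration:

object ErrorMath extends App {
  // Assumed constants mirroring Spark's UnifiedMemoryManager (not copied from this page).
  val reservedBytes = 300L * 1024 * 1024              // 314572800 bytes reserved by Spark
  val minSystemMemory = (reservedBytes * 1.5).toLong  // 471859200, the "must be at least" value
  val reportedBytes = 259522560L                      // the heap the Eclipse launch actually provided
  println(s"minimum required: $minSystemMemory bytes (${minSystemMemory / (1024 * 1024)} MB)")
  println(s"available here:   $reportedBytes bytes (~${reportedBytes / (1024 * 1024)} MB)")
}

In other words, the default heap of an Eclipse launch (roughly 256 MB) falls short of the roughly 450 MB Spark expects, which is what the answers below address.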
I followed different tutorials like this one: How to set Apache Spark Executor memory. Most of the time I either use the --driver-memory option (which is not possible with Eclipse) or modify the Spark configuration, but there is no corresponding file.
Does anyone have any idea about how to solve this issue within Eclipse environment?
Answered by abaghel
In Eclipse, go to Run > Run Configurations... > Arguments > VM arguments and set the max heap size, for example -Xmx512m.
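As a quick sanity check (a sketch, not part of the original answer; the object name HeapCheck is made up), you can print what the JVM actually reports after setting the VM argument. Spark needs Runtime.getRuntime.maxMemory to be at least 471859200 bytes (~450 MB), so if -Xmx512m turns out to be borderline on your JVM, a larger value such as -Xmx1g is a safe choice:

object HeapCheck extends App {
  // Run with the same Eclipse Run Configuration to verify the -Xmx setting took effect.
  val max = Runtime.getRuntime.maxMemory
  println(s"JVM max heap reported: $max bytes (~${max / (1024 * 1024)} MB)")
}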
Answered by Duy Bui
I had this issue as well and this is how I solved it. Thought it might be helpful.
val conf: SparkConf = new SparkConf().setMaster("local[4]").setAppName("TestJsonReader").set("spark.driver.host", "localhost")
conf.set("spark.testing.memory", "2147480000")
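A minimal end-to-end sketch of this workaround applied to the program from the question (the Spark 1.x API and the text.json path are taken from the question; the object name is made up). spark.testing.memory overrides the value Spark would otherwise read from Runtime.getRuntime.maxMemory, so the minimum-heap check passes even with Eclipse's default JVM heap:

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object JsonReaderWithTestingMemory {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("TestJsonReader")
      .set("spark.driver.host", "localhost")
      .set("spark.testing.memory", "2147480000") // ~2 GB, comfortably above the 471859200-byte minimum
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.format("json").load("text.json")
    df.printSchema()
    df.show()
    sc.stop()
  }
}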
Answered by user8789594
It works fine for me once the script is modified with conf.set("spark.testing.memory", "2147480000").
Complete code below:
import scala.math.random
import org.apache.spark._

object SparkPi {
  def main(args: Array[String]) {
    val conf: SparkConf = new SparkConf().setMaster("local").setAppName("Spark Pi").set("spark.driver.host", "localhost")
    conf.set("spark.testing.memory", "2147480000") // if you face any memory issues
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
Step-2: Run it as "Scala Application"
Step-3: Create the JAR file and execute it:
bin/spark-submit --class SparkPi --master local SparkPi.jar
Answered by allojo
You also need to increase spark.testing.memory if you are running locally:
spark.driver.memory, 571859200
spark.testing.memory, 2147480000
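The same two settings expressed in code, as a sketch (the values are copied from this answer and the object name is made up; note that in local mode spark.driver.memory cannot enlarge a JVM that is already running, so it is spark.testing.memory that actually satisfies the startup check):

import org.apache.spark.SparkConf

object LocalMemorySettings extends App {
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("LocalTest")
    .set("spark.driver.memory", "571859200")   // value from this answer; does not resize the running JVM
    .set("spark.testing.memory", "2147480000") // value from this answer; read by the startup memory check
  conf.getAll.foreach { case (key, value) => println(s"$key = $value") }
}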
Answered by StrongYoung
You can set the "spark.driver.memory" option by editing the "spark-defaults.conf" file in "${SPARK_HOME}/conf/". By default there is no file called "spark-defaults.conf" in the "${SPARK_HOME}/conf/" directory, but there is a file "spark-defaults.conf.template"; you can use the following command to create the "spark-defaults.conf" file:
cp spark-defaults.conf.template spark-defaults.conf
Then, edit it:
# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.driver.memory 3g
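Note that spark-defaults.conf is read by spark-submit and spark-shell, not by a plain "Run As > Scala Application" launch inside Eclipse. If you do launch through spark-submit, here is a small sketch (the object name is made up) to confirm the value was picked up:

import org.apache.spark.{SparkConf, SparkContext}

object ConfCheck extends App {
  // The master and the spark-defaults.conf values are supplied by spark-submit here.
  val sc = new SparkContext(new SparkConf().setAppName("ConfCheck"))
  println(sc.getConf.getOption("spark.driver.memory").getOrElse("spark.driver.memory not set"))
  sc.stop()
}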
Answered by Stanislav
In my case mvn stopped packaging the project, with the same exception (java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200).
I started debugging this issue by changing the settings for the VM heap size: export MAVEN_OPTS="-Xms1024m -Xmx4096m -XX:PermSize=1024m". It did not work.
Then I tried adding the spark.driver.memory option, set to 1g, to the Spark config [SparkConfig.set("spark.driver.memory","1g")].
In the end it turned out that my Java installation had somehow gotten messed up. I reinstalled the JDK (to a newer version), had to set up the JAVA_HOME paths again, and then everything worked from the terminal.
If upgrading, then to use Netbeans/Intellij/Eclipse you would need to configure the JDK setting in each of them to point to the new installation of the Java Development Kit.