
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same license and attribute it to the original authors (not me) at StackOverflow. Original question: http://stackoverflow.com/questions/44129459/


Multiple driver-java-options in spark submit

Tags: bash, apache-spark

Asked by eboni

I am using spark-submit in a bash script, with the options specified as:

CLUSTER_OPTIONS=" \
--master yarn-cluster \
--files     file:///${CONF_DIR}/app.conf#app.conf,file:///${CONF_DIR}/log4j-executor.xml#log4j.xml \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml" \
--driver-java-options '-Dlog4j.configuration=file:log4j.xml -Dconfig.file=app.conf' \
--keytab ${KEYTAB} \
--principal ${PRINCIPAL} \
"

I am finding that app.conf is not being picked up, as I receive this error:

Error: Unrecognized option: -Dconfig.file=file:app.conf'

I have also attempted different ways to encapsulate the driver-java-options:

1)

--driver-java-options \"-Dlog4j.configuration=file:log4j.xml -Dconfig.file=app.conf\" \

Error: Unrecognized option: -Dconfig.file=file:app.conf"

2)

--driver-java-options "-Dlog4j.configuration=file:log4j.xml -Dconfig.file=file:transformation.conf" \


./start_app.sh: line 30: -Dconfig.file=file:app.conf --keytab /app/conf/keytab/principal.keytab --principal principal : No such file or directory

How can I specify multiple driver-java-options for use by my Spark app?

N.B. I am using Spark 1.5.0.
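
(Editorial note: the underlying problem here is bash quoting rather than Spark itself. The single quotes inside the double-quoted CLUSTER_OPTIONS assignment are stored as literal characters, and when $CLUSTER_OPTIONS is later expanded unquoted, word splitting breaks the value at the space between the two -D options — which is why the error message ends with a stray trailing quote. A minimal sketch of a fix using a bash array, with quoted expansion so each option stays a single argument; the jar and class names are placeholders, not from the question:

# Build the options as a bash array; each element survives expansion intact.
CLUSTER_OPTIONS=(
  --master yarn-cluster
  --files "file:///${CONF_DIR}/app.conf#app.conf,file:///${CONF_DIR}/log4j-executor.xml#log4j.xml"
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml"
  --driver-java-options "-Dlog4j.configuration=file:log4j.xml -Dconfig.file=app.conf"
  --keytab "${KEYTAB}"
  --principal "${PRINCIPAL}"
)

# Quoted array expansion: the space inside --driver-java-options is preserved.
spark-submit "${CLUSTER_OPTIONS[@]}" --class com.example.Main app.jar
)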

Answered by user3008410

Just writing this because it was so odd: I could not get this to work until I made --driver-java-options the first of all the arguments. I left the command as-is so you get it in its entirety.

Using pyspark local mode:

/opt/apache-spark/spark-2.3.0-bin-hadoop2.7/bin/spark-submit \
    --driver-java-options "-Xms2G -Doracle.jdbc.Trace=true -Djava.util.logging.config.file=/opt/apache-spark/spark-2.3.0-bin-hadoop2.7/conf/oraclejdbclog.properties -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=1098 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.net.preferIPv4Stack=true -Djava.rmi.server.hostname=192.168.2.120 -Dcom.sun.management.jmxremote.rmi.port=1095" \
    --driver-memory $_driver_memory \
    --executor-memory $_executor_memory \
    --total-executor-cores $_total_executor_cores \
    --verbose \
    --jars /opt/apache-spark/jars/log4j-1.2.17.jar main.py \
    --dbprefix  \
    --copyfrom 

Hope this helps someone.
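
(A plausible explanation for why the order mattered — an editorial inference, not part of the original answer: spark-submit treats everything after the primary resource, here main.py, as arguments to the application itself, so a submit option that slips behind it because of broken quoting is handed to the application rather than to spark-submit. Schematically, with placeholder names:

# Everything before main.py is parsed by spark-submit;
# everything after main.py is passed to the application untouched.
spark-submit --driver-java-options "-Dconfig.file=app.conf" main.py --app-arg value
)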

Answered by Nonontb

Try to use:

 --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.xml -Dconfig.file=app.conf"

In my case, it works great along with --files, as you used it.

You may want to add:

--conf "spark.executor.extraJavaOptions=...." 

if the files are accessed from the executors.
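
(For reference, a full invocation along the lines of this answer might look like the following sketch; the class, jar, and file names are placeholders, and the file: prefix follows the question's log4j usage:

spark-submit \
  --master yarn-cluster \
  --files "app.conf,log4j.xml" \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml -Dconfig.file=app.conf" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml -Dconfig.file=app.conf" \
  --class com.example.Main \
  app.jar
)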

Hope it helps. Regards.

Answered by Antonio Cachuan

Working in 2018 with Spark 2.3.0:

spark2-submit \
--class com.demo.Main \
--master yarn --deploy-mode client \
--driver-memory 10G --driver-cores 8 --executor-memory 13G --executor-cores 4 \
--num-executors 10 \
--verbose \
--conf "spark.driver.extraJavaOptions=-Dconfig.file=/$HOME/application.conf -Dlog4j.configuration=$HOME/log4j.properties" \
--conf "spark.executor.extraJavaOptions=-Dconfig.file=$HOME/application.conf -Dlog4j.configuration=$HOME/log4j.properties" \
--files "application.conf,log4j.properties" \
$HOME/ingestion-1.0-SNAPSHOT-jar-with-dependencies.jar
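
(One note on the paths — an editorial observation, not from the original answer: in client mode the driver runs on the submitting machine and can read $HOME/application.conf directly, while --files ships copies of the listed files into each YARN container's working directory, where they are addressable by bare file name. A hedged cluster-mode variant of the same submit would therefore reference the shipped copies by name on both sides:

spark2-submit \
  --class com.demo.Main \
  --master yarn --deploy-mode cluster \
  --conf "spark.driver.extraJavaOptions=-Dconfig.file=application.conf -Dlog4j.configuration=file:log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dconfig.file=application.conf -Dlog4j.configuration=file:log4j.properties" \
  --files "application.conf,log4j.properties" \
  $HOME/ingestion-1.0-SNAPSHOT-jar-with-dependencies.jar
)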