How do I pass program arguments to the main function when running spark-submit with a JAR?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not the translator): StackOverflow
Original question: http://stackoverflow.com/questions/36024565/
Asked by Eric Na
I know this is a trivial question, but I could not find the answer on the internet.
I am trying to run a Java class whose main function takes program arguments (String[] args).
However, when I submit the job using spark-submit and pass the program arguments the way I would with
java -cp <some jar>.jar <Some class name> <arg1> <arg2>
it does not read the args.
The command I tried running was
bin/spark-submit analytics-package.jar --class full.package.name.ClassName 1234 someargument someArgument
and this gives
Error: No main class set in JAR; please specify one with --class
and when I tried:
bin/spark-submit --class full.package.name.ClassName 1234 someargument someArgument analytics-package.jar
I get
Warning: Local jar /mnt/disk1/spark/1 does not exist, skipping.
java.lang.ClassNotFoundException: com.relcy.analytics.query.QueryAnalytics
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:176)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:693)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:183)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:208)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:122)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
How can I pass these arguments? They change frequently on each run of the job, and they need to be passed as arguments.
Accepted answer by Matt Clark
Arguments passed before the .jar file will be arguments to the JVM, whereas arguments passed after the jar file will be passed on to the user's program.
bin/spark-submit --class classname -Xms256m -Xmx1g something.jar someargument
Here, s will equal someargument, while -Xms and -Xmx are passed to the JVM.
public static void main(String[] args) {
    String s = args[0]; // "someargument" from the command above
}
Answered by Eric Na
I found the correct command from this tutorial.
The command should be of the form:
bin/spark-submit --class full.package.name.ClassName analytics-package.jar someargument someArgument
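For a quick sanity check, here is a minimal sketch of a main class that simply echoes the arguments it receives (the package and class names just mirror the placeholders above):

package full.package.name;

public class ClassName {
    public static void main(String[] args) {
        // With the command above: args[0] == "someargument", args[1] == "someArgument"
        for (int i = 0; i < args.length; i++) {
            System.out.println("args[" + i + "] = " + args[i]);
        }
    }
}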
Answered by Sushruth
spark-submit --class SparkWordCount --master yarn --jars <jar1.jar>,<jar2.jar> sparkwordcount-1.0.jar /user/user01/input/alice.txt /user/user01/output
Answered by rahul
The first unrecognized argument is treated as the primaryResource (the jar file, in our case). See SparkSubmitArguments.handleUnknown.
All the arguments after the primaryResource are treated as arguments to the application. See SparkSubmitArguments.handleExtraArgs.
To better understand how the arguments are parsed, see SparkSubmitOptionParser.parse; the two methods above are called from there.
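As a rough illustration of that flow, here is a deliberately simplified Java sketch (not Spark's actual code, which lives in SparkSubmitOptionParser.parse and also handles forms like --flag=value and flags that take no value): recognized options are consumed until the first unrecognized token, which becomes the primary resource, and everything after it becomes application arguments.

import java.util.ArrayList;
import java.util.List;

public class SubmitArgsSketch {
    public static void main(String[] args) {
        List<String> submitOpts = new ArrayList<>();
        String primaryResource = null;
        List<String> appArgs = new ArrayList<>();

        for (int i = 0; i < args.length; i++) {
            if (primaryResource != null) {
                // Everything after the primary resource goes to the application.
                appArgs.add(args[i]);
            } else if (args[i].startsWith("--")) {
                // Recognized option: consume the flag and (in this sketch) its value.
                submitOpts.add(args[i]);
                if (i + 1 < args.length) {
                    submitOpts.add(args[++i]);
                }
            } else {
                // First unrecognized token becomes the primary resource (the jar).
                primaryResource = args[i];
            }
        }

        System.out.println("submit options:   " + submitOpts);
        System.out.println("primary resource: " + primaryResource);
        System.out.println("application args: " + appArgs);
    }
}

Run against the failing commands from the question, this logic shows why each one breaks: when analytics-package.jar comes first, --class and everything after it are treated as application arguments (hence "No main class set in JAR"); when 1234 is the first unrecognized token, it is taken as the jar path (hence the "Local jar ... does not exist" warning).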