Scala Spark: how to run a Spark file from the spark shell

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license; if you reuse or share it, you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/27717379/

Date: 2020-10-22 06:48:19  Source: igfitidea

Spark : how to run spark file from spark shell

Tags: scala, apache-spark, cloudera-cdh, cloudera-manager

Asked by Ramakrishna


I am using CDH 5.2. I am able to use spark-shell to run the commands.

  1. How can I run a file (file.spark) that contains Spark commands?
  2. Is there any way to run/compile Scala programs in CDH 5.2 without sbt?

Thanks in advance

Answered by Ziyao Li

On the command line, you can use

spark-shell -i file.scala

to run the code written in file.scala.
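
For example, a minimal file.scala might look like the following (the data and names are just an illustration; sc is the SparkContext that spark-shell creates for you):

// file.scala -- each line is evaluated by spark-shell as if typed into the REPL
val nums = sc.parallelize(1 to 100)
val evens = nums.filter(_ % 2 == 0)
println("Even count: " + evens.count())

Note that -i evaluates the file and then leaves you at the interactive prompt rather than exiting.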

Answered by Steve

To load an external file from spark-shell, simply do

:load PATH_TO_FILE

This will execute everything in your file.
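
A quick sketch of how this is typically used (the path, file contents and function name are hypothetical):

// contents of /tmp/helpers.scala (hypothetical)
def squareSum(n: Int): Long = (1L to n).map(x => x * x).sum

Then, inside the running shell:

scala> :load /tmp/helpers.scala
scala> squareSum(10)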

I don't have a solution for your SBT question though, sorry :-)

Answered by javadba

You can use either sbt or Maven to compile Spark programs. With Maven, simply add Spark as a dependency. The Apache Spark artifacts are published on Maven Central, so no extra repository entry is needed:

<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.2.0</version>
</dependency>
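
For completeness, a minimal build.sbt sketch with the equivalent dependency (in case you do end up using sbt) would be roughly:

// build.sbt (sketch; Spark 1.2.0 is built against Scala 2.10)
name := "spark-app"
scalaVersion := "2.10.4"
// "provided" because the Spark jars are already on the cluster at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0" % "provided"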

As for running a file of Spark commands: you can simply do this:

echo '
   import org.apache.spark.sql._
   val ssc = new SQLContext(sc)
   ssc.sql("select * from mytable").collect
' > spark.input

Now run the commands script:

cat spark.input | spark-shell

Answered by loneStar

Just to give some more perspective on the answers:

spark-shell is a Scala REPL.

You can type :help to see the list of operations that are possible inside the Scala shell:

scala> :help
All commands can be abbreviated, e.g., :he instead of :help.
:edit <id>|<line>        edit history
:help [command]          print this summary or command-specific help
:history [num]           show the history (optional num is commands to show)
:h? <string>             search the history
:imports [name name ...] show import history, identifying sources of names
:implicits [-v]          show the implicits in scope
:javap <path|class>      disassemble a file or class name
:line <id>|<line>        place line(s) at the end of history
:load <path>             interpret lines in a file
:paste [-raw] [path]     enter paste mode or paste a file
:power                   enable power user mode
:quit                    exit the interpreter
:replay [options]        reset the repl and replay all previous commands
:require <path>          add a jar to the classpath
:reset [options]         reset the repl to its initial state, forgetting all session entries
:save <path>             save replayable session to a file
:sh <command line>       run a shell command (result is implicitly => List[String])
:settings <options>      update compiler options, if possible; see reset
:silent                  disable/enable automatic printing of results
:type [-v] <expr>        display the type of an expression without evaluating it
:kind [-v] <expr>        display the kind of expression's type
:warnings                show the suppressed warnings from the most recent line which had any

In particular, :load interprets the lines in a file.
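
Besides :load, the :paste command from the same list is handy when you want to paste a multi-line block directly instead of keeping a file around; a rough sketch (the RDD below is just an example):

scala> :paste
// (paste the block, then press Ctrl+D to have it evaluated as a whole)
val nums = sc.parallelize(1 to 10)
nums.map(_ * 2).collect()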

Answered by Phu Ngo

Tested on both spark-shell version 1.6.3 and spark2-shell version 2.3.0.2.6.5.179-4: you can pipe directly to the shell's stdin, like

spark-shell <<< "1+1"

or in your use case,

spark-shell < file.spark

Answered by amarnath pimple

You can run it the same way you run a shell script. This example runs it from the command-line environment:

./bin/spark-shell is the path to spark-shell under bin, and /home/fold1/spark_program.py is the path to your Python program.

So:

./bin/spark-shell /home/fold1/spark_program.py