Python: How to run a script in PySpark

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/40028919/

Date: 2020-08-19 23:05:58 | Source: igfitidea

How to run a script in PySpark

python, apache-spark, pyspark

Asked by Daniel Rodríguez

I'm trying to run a script in the pyspark environment but so far I haven't been able to. How can I run a script like python script.py but in pyspark? Thanks

Answered by Ulas Keles

You can do: ./bin/spark-submit mypythonfile.py

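For reference, a minimal sketch of what mypythonfile.py might contain (the app name and sample data below are illustrative assumptions, not from the original answer); in Spark 2.x a script run via spark-submit typically builds its own SparkSession:

from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession; spark-submit supplies the cluster configuration.
spark = SparkSession.builder.appName("MyApp").getOrCreate()

# Example work: build a small DataFrame and print it.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()

spark.stop()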

Running Python applications through pyspark is not supported as of Spark 2.0.

Answered by Jussi Kujala

pyspark 2.0 and later executes the script file named in the PYTHONSTARTUP environment variable, so you can run:

PYTHONSTARTUP=code.py pyspark

Compared to the spark-submit answer, this is useful for running initialization code before using the interactive pyspark shell.

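A minimal sketch of what such a code.py startup file could look like (the view name and sample data are hypothetical, not from the original answer). Because PYTHONSTARTUP is executed inside the interactive shell, the spark and sc objects already exist, and anything the script defines stays available at the prompt:

# Runs inside the pyspark shell, so `spark` and `sc` are already defined.
# Preload a DataFrame and register it so it is ready at the interactive prompt.
sales_df = spark.createDataFrame([(1, 10.0), (2, 20.5)], ["order_id", "amount"])
sales_df.createOrReplaceTempView("sales")
print("Startup complete: 'sales' view registered with", sales_df.count(), "rows")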

Answered by Selva

Just spark-submit mypythonfile.py should be enough.

Answered by Arun Annamalai

You can execute "script.py" as follows:

pyspark < script.py

or

# if you want to run pyspark in yarn cluster
pyspark --master yarn < script.py
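Note that piping a file into pyspark this way feeds it to the interactive interpreter, so the script can rely on the preconfigured spark and sc objects instead of creating its own session. A small sketch of such a script.py (contents are illustrative, not from the original answer):

# script.py, piped into the pyspark shell: `spark` and `sc` already exist.
rdd = sc.parallelize(range(100))
print("Sum of 0..99 is", rdd.sum())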

Answered by Krish.Venkat

The Spark environment provides a command to execute an application file, whether it is written in Scala or Java (packaged as a JAR), Python, or R. The command is:

$ spark-submit --master <url> <SCRIPTNAME>.py

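For example, assuming a hypothetical script named wordcount.py, the master URL could be local mode on a single machine or a YARN cluster:

spark-submit --master local[4] wordcount.py
spark-submit --master yarn --deploy-mode cluster wordcount.py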

I'm running Spark on a 64-bit Windows system with JDK 1.8.

P.S. The original answer attached a screenshot of the terminal window showing the code snippet.
