bash - How to redirect the entire output of spark-submit to a file

Disclaimer: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/46429962/

How to redirect entire output of spark-submit to a file

linux bash apache-spark

Asked by timbram

So, I am trying to redirect the output of an Apache spark-submit command to a text file, but some of the output does not end up in the file. Here is the command I am using:

spark-submit something.py > results.txt

I can see the output in the terminal but I do not see it in the file. What am I forgetting or doing wrong here?

Edit:

If I use

spark-submit something.py | less

I can see all of the output being piped into less.

Answered by philantrovert

spark-submit prints most of its output to STDERR.

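You can check this yourself with plain bash redirection (a quick diagnostic, not specific to Spark): send each stream to its own file and see where the log lines end up.

spark-submit something.py > stdout.txt 2> stderr.txt
# stderr.txt ends up with most of Spark's log output; stdout.txt only gets what the script itself prints
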
To redirect the entire output to one file, you can use:

spark-submit something.py > results.txt 2>&1

Or

spark-submit something.py &> results.txt
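
Note that &> is a bash-specific shorthand for redirecting both streams. If you also want to watch the output live while it is written to the file, a common bash pattern (again, not specific to spark-submit) is to merge the streams and pipe them through tee:

spark-submit something.py 2>&1 | tee results.txt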

Answered by Avishek Bhattacharya

If you are running spark-submit on a cluster, the logs are stored under the application ID. You can see the logs once the application finishes.

yarn logs --applicationId <your applicationId> > myfile.txt

This should fetch the logs for your job.

The applicationId of your job is shown when you submit the Spark job. You can see it in the console where you submit, or in the Hadoop UI.

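If you want to script this end to end, a rough sketch is shown below. It assumes a YARN cluster and that the application ID appears in the submission output in the usual application_<timestamp>_<sequence> form; the file names are only placeholders.

# capture the submission output and pull out the application ID (assumes YARN prints it)
APP_ID=$(spark-submit something.py 2>&1 | tee submit.log | grep -oE 'application_[0-9]+_[0-9]+' | head -n 1)

# then fetch the aggregated logs once the application has finished
yarn logs --applicationId "$APP_ID" > myfile.txt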