How to redirect entire output of spark-submit to a file

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/46429962/
Asked by timbram
So, I am trying to redirect the output of an Apache spark-submit command to a text file, but some of the output never makes it into the file. Here is the command I am using:
spark-submit something.py > results.txt
I can see the output in the terminal, but I do not see it in the file. What am I forgetting or doing wrong here?
Edit:
If I use
spark-submit something.py | less
I can see all of the output being piped into less.
Answered by philantrovert
spark-submit prints most of its output to STDERR.
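This also explains the edit above: when piping into less, STDERR bypasses the pipe and goes straight to the terminal, where it appears on screen alongside what less is showing. You can verify which stream carries what by sending the two streams to separate files (the file names here are only illustrative):

spark-submit something.py 1> stdout.txt 2> stderr.txt
# stderr.txt collects the Spark log lines; stdout.txt holds only what your program prints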
To redirect the entire output to one file, you can use:
spark-submit something.py > results.txt 2>&1
Or
spark-submit something.py &> results.txt
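Note that &> is a bash extension, while the 2>&1 form works in any POSIX shell. If you also want to watch the output scroll by while capturing it, a common variant is to merge the streams and pipe them through tee:

spark-submit something.py 2>&1 | tee results.txt
# 2>&1 must come before the pipe so STDERR is merged into the stream that tee receives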
Answered by Avishek Bhattacharya
If you are running spark-submit on a cluster, the logs are stored under the application ID. You can see the logs once the application finishes.
yarn logs --applicationId <your applicationId> > myfile.txt
This should fetch the logs of your job.
The applicationId of your job is printed when you submit the Spark job. You can see it in the console from which you submitted, or in the Hadoop UI.
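Combining the two answers, you can capture the application ID from the spark-submit client output (which, as noted above, goes to STDERR) and feed it to yarn logs. This is only a sketch, assuming a YARN deployment and a grep that supports -o; the file names are illustrative:

spark-submit something.py > results.txt 2> submit.log
# the YARN application ID (application_<timestamp>_<counter>) appears in the client log
appId=$(grep -o 'application_[0-9]*_[0-9]*' submit.log | head -n 1)
yarn logs --applicationId "$appId" > myfile.txt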