Python: subprocess.call, stdout to file, stderr to file, display stderr on screen in real time

Note: this is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow.
Original: http://stackoverflow.com/questions/18344932/
Asked by Ben S.
I have a command line tool (actually, several) that I am writing a wrapper for in Python.
The tool is generally used like this:
$ path_to_tool -option1 -option2 > file_out
The user gets the output written to file_out, and is also able to see various status messages of the tool as it is running.
I want to replicate this behavior, while also logging stderr (the status messages) to a file.
What I have is this:
from subprocess import call
call(['path_to_tool', '-option1', 'option2'], stdout=file_out, stderr=log_file)
This works fine EXCEPT that stderr is not written to the screen. I can add code to print the contents of the log_file to the screen of course, but then the user will see it after everything is done rather than while it is happening.
To recap, desired behavior is:
- use call(), or subprocess()
- direct stdout to a file
- direct stderr to a file, while also writing stderr to the screen in real time as if the tool had been called directly from the command line.
I have a feeling I'm either missing something really simple, or this is much more complicated than I thought...thanks for any help!
EDIT: this only needs to work on Linux.
Accepted answer by abarnert
You can do this with subprocess, but it's not trivial. If you look at the Frequently Used Arguments in the docs, you'll see that you can pass PIPE as the stderr argument, which creates a new pipe, passes one side of the pipe to the child process, and makes the other side available to use as the stderr attribute.*
So, you will need to service that pipe, writing to the screen and to the file. In general, getting the details right for this is very tricky.** In your case, there's only one pipe, and you're planning on servicing it synchronously, so it's not that bad.
import subprocess
import sys

# file_out and log_file are already-open file objects, e.g.:
#   file_out = open('file_out', 'w'); log_file = open('log_file', 'w')
proc = subprocess.Popen(['path_to_tool', '-option1', 'option2'],
                        stdout=file_out, stderr=subprocess.PIPE)
for line in proc.stderr:
    sys.stdout.write(line)   # echo to the screen as each line arrives
    log_file.write(line)
proc.wait()
(Note that there are some issues using for line in proc.stderr: — basically, if what you're reading turns out not to be line-buffered for any reason, you can sit around waiting for a newline even though there's actually half a line's worth of data to process. You can read chunks at a time with, say, read(128), or even read(1), to get the data more smoothly if necessary. If you need to actually get every byte as soon as it arrives, and can't afford the cost of read(1), you'll need to put the pipe in non-blocking mode and read asynchronously.)
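As an illustration of that chunked reading (a sketch of my own, not from the original answer): os.read returns as soon as any data is available rather than waiting for a full line or a full buffer.

import os
import sys

# os.read returns as soon as *any* data is available, up to 128 bytes,
# so partial lines are forwarded without waiting for a newline.
fd = proc.stderr.fileno()
while True:
    chunk = os.read(fd, 128)
    if not chunk:                  # empty read means EOF on the pipe
        break
    sys.stdout.write(chunk)        # on Python 3, write to sys.stdout.buffer
    sys.stdout.flush()
    log_file.write(chunk)          # assumes log_file is open in binary mode
proc.wait()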
But if you're on Unix, it might be simpler to use the tee command to do it for you.
For a quick&dirty solution, you can use the shell to pipe through it. Something like this:
# bash process substitution tees stderr to log_file and back to the screen
subprocess.call('path_to_tool -option1 option2 2> >(tee log_file >&2)',
                shell=True, executable='/bin/bash', stdout=file_out)
But I don't want to debug shell piping; let's do it in Python, as shown in the docs:
tool = subprocess.Popen(['path_to_tool', '-option1', 'option2'],
                        stdout=file_out, stderr=subprocess.PIPE)
tee = subprocess.Popen(['tee', 'log_file'], stdin=tool.stderr)
tool.stderr.close()   # so tee sees EOF on its stdin when the tool exits
tee.communicate()
Finally, there are a dozen or more higher-level wrappers around subprocesses and/or the shell on PyPI — sh, shell, shell_command, shellout, iterpipes, sarge, cmd_utils, commandwrapper, etc. Search for "shell", "subprocess", "process", "command line", etc. and find one you like that makes the problem trivial.
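For example, with the sh package something like the following should work (a sketch of my own, not from the original answer; it assumes sh's _out/_err redirection keywords, which take a file object or a per-line callback — check the docs for the version you install):

import sys
import sh

log_file = open('log_file', 'w')

def tee_stderr(line):
    # sh calls this for each line of stderr; copy it to screen and log.
    sys.stderr.write(line)
    log_file.write(line)

with open('file_out', 'w') as file_out:
    sh.Command('path_to_tool')('-option1', 'option2',
                               _out=file_out, _err=tee_stderr)
log_file.close()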
What if you need to gather both stderr and stdout?
The easy way to do it is to just redirect one to the other, as Sven Marnach suggests in a comment. Just change the Popen parameters like this:
tool = subprocess.Popen(['path_to_tool', '-option1', 'option2'],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
And then everywhere you used tool.stderr, use tool.stdout instead — e.g., for the last example:
tee = subprocess.Popen(['tee', 'log_file'], stdin=tool.stdout)
tool.stdout.close()   # so tee sees EOF on its stdin when the tool exits
tee.communicate()
But this has some tradeoffs. Most obviously, mixing the two streams together means you can't log stdout to file_out and stderr to log_file, or copy stdout to your stdout and stderr to your stderr. But it also means the ordering can be non-deterministic—if the subprocess always writes two lines to stderr before writing anything to stdout, you might end up getting a bunch of stdout between those two lines once you mix the streams. And it means they have to share stdout's buffering mode, so if you were relying on the fact that linux/glibc guarantees stderr to be line-buffered (unless the subprocess explicitly changes it), that may no longer be true.
If you need to handle the two pipes separately, it gets more difficult. Earlier, I said that servicing the pipe on the fly is easy as long as you only have one pipe and can service it synchronously. If you have two pipes, that's obviously no longer true. Imagine you're waiting on tool.stdout.read(), and new data comes in from tool.stderr. If there's too much data, it can cause the pipe to overflow and the subprocess to block. But even if that doesn't happen, you obviously won't be able to read and log the stderr data until something comes in from stdout.
If you use the pipe-through-tee solution, that avoids the initial problem… but only by creating a new problem that's just as bad. You have two tee instances, and while you're calling communicate on one, the other one is sitting around waiting forever.
So, either way, you need some kind of asynchronous mechanism. You can do this with threads, a select reactor, something like gevent, etc.
Here's a quick and dirty example:
import subprocess
import sys
import threading

proc = subprocess.Popen(['path_to_tool', '-option1', 'option2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

def tee_pipe(pipe, f1, f2):
    # Copy each line from the pipe to both destinations.
    for line in pipe:
        f1.write(line)
        f2.write(line)

t1 = threading.Thread(target=tee_pipe, args=(proc.stdout, file_out, sys.stdout))
t2 = threading.Thread(target=tee_pipe, args=(proc.stderr, log_file, sys.stderr))
t3 = threading.Thread(target=proc.wait)
t1.start(); t2.start(); t3.start()
t1.join(); t2.join(); t3.join()
However, there are some edge cases where that won't work. (The problem is the order in which SIGCHLD and SIGPIPE/EPIPE/EOF arrive. I don't think any of that will affect us here, since we're not sending any input… but don't trust me on that without thinking it through and/or testing.) The subprocess.communicate function from 3.3+ gets all the fiddly details right. But you may find it a lot simpler to use one of the async-subprocess wrapper implementations you can find on PyPI and ActiveState, or even the subprocess stuff from a full-fledged async framework like Twisted.
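For completeness, here is what the select-reactor route mentioned above can look like (a sketch of my own using Python 3.4+'s selectors module, which postdates the original answer; POSIX only, and it assumes file_out and log_file are open in binary mode):

import selectors
import subprocess
import sys

# bufsize=0 gives raw (unbuffered) pipe objects, so each .read() returns
# whatever is currently available instead of waiting to fill a buffer.
proc = subprocess.Popen(['path_to_tool', '-option1', 'option2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        bufsize=0)
sel = selectors.DefaultSelector()
# Map each pipe to the pair of destinations its data should be copied to.
sel.register(proc.stdout, selectors.EVENT_READ, (file_out, sys.stdout.buffer))
sel.register(proc.stderr, selectors.EVENT_READ, (log_file, sys.stderr.buffer))
open_pipes = 2
while open_pipes:
    for key, _ in sel.select():
        data = key.fileobj.read(4096)    # raw read: returns what's available
        if not data:                     # EOF: stop watching this pipe
            sel.unregister(key.fileobj)
            open_pipes -= 1
            continue
        for dest in key.data:
            dest.write(data)
proc.wait()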
* The docs don't really explain what pipes are, almost as if they expect you to be an old Unix C hand… But some of the examples, especially in the Replacing Older Functions with the subprocess Module section, show how they're used, and it's pretty simple.
** The hard part is sequencing two or more pipes properly. If you wait on one pipe, the other may overflow and block, preventing your wait on the other one from ever finishing. The only easy way to get around this is to create a thread to service each pipe. (On most *nix platforms, you can use a select or poll reactor instead, but making that cross-platform is amazingly difficult.) The source to the module, especially communicate and its helpers, shows how to do it. (I linked to 3.3, because in earlier versions, communicate itself gets some important things wrong…) This is why, whenever possible, you want to use communicate if you need more than one pipe. In your case, you can't use communicate, but fortunately you don't need more than one pipe.
Answered by Brandt
I think what you are looking for is something like:
import sys, subprocess

# cmdline is your command, e.g. ['path_to_tool', '-option1', 'option2']
p = subprocess.Popen(cmdline,
                     stdout=sys.stdout,
                     stderr=sys.stderr)
To have the output/log written to a file I would modify my cmdline to include the usual redirects, as it would be done on a plain linux bash/shell. For instance, I would append tee to the command line: cmdline += ' | tee -a logfile.txt'
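Put together, that suggestion looks something like this (a sketch; the pipe requires shell=True, and note that a plain | only tees stdout, not stderr):

import subprocess

# tee shows stdout on screen and appends it to logfile.txt;
# the shell interprets the pipe, hence shell=True.
cmdline = 'path_to_tool -option1 option2 | tee -a logfile.txt'
subprocess.call(cmdline, shell=True)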
Hope that helps.
Answered by Timmmm
I had to make a few changes to @abarnert's answer for Python 3. This seems to work:
import subprocess
import sys
import threading

def tee_pipe(pipe, f1, f2):
    for line in pipe:
        f1.write(line)
        f2.write(line)

proc = subprocess.Popen(["/bin/echo", "hello"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

# Open the output files for stdout/err in unbuffered mode.
out_file = open("stdout.log", "wb", 0)
err_file = open("stderr.log", "wb", 0)

stdout = sys.stdout
stderr = sys.stderr

# On Python 3 these are wrapped in TextIOWrapper objects that we don't
# want; write to the underlying binary buffers instead.
if sys.version_info[0] >= 3:
    stdout = stdout.buffer
    stderr = stderr.buffer

# Start threads to duplicate the pipes.
out_thread = threading.Thread(target=tee_pipe,
                              args=(proc.stdout, out_file, stdout))
err_thread = threading.Thread(target=tee_pipe,
                              args=(proc.stderr, err_file, stderr))
out_thread.start()
err_thread.start()

# Wait for the command to finish.
proc.wait()

# Join the pipe threads.
out_thread.join()
err_thread.join()