Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15793886/

Time: 2020-08-18 21:01:57  Source: igfitidea

How to avoid a Broken Pipe error when printing a large amount of formatted data?

Tags: python, format, string-formatting, ioerror, broken-pipe

Asked by Thanasis Petsas

I am trying to print a formatted list of tuples to stdout. For this, I use the str.format method. Everything works fine, but when I pipe the output to the head command to see the first lines, an IOError occurs.

Here is my code:

# creating the data
data = []
for i in range(0, 1000):
  pid = 'pid%d' % i
  uid = 'uid%d' % i
  pname = 'pname%d' % i
  data.append( (pid, uid, pname) )

# find max length string for each field
pids, uids, pnames = zip(*data)
max_pid = len("%s" % max( pids) )
max_uid = len("%s" % max( uids) )
max_pname = len("%s" % max( pnames) )

# my template for the formatted strings
template = "{0:%d}\t{1:%d}\t{2:%d}" % (max_pid, max_uid, max_pname)

# print the formatted output to stdout
for pid, uid, pname in data:
  print template.format(pid, uid, pname)

And here is the error I get after running the command: python myscript.py | head

Traceback (most recent call last):
  File "lala.py", line 16, in <module>
    print template.format(pid, uid, pname)
IOError: [Errno 32] Broken pipe

Can anyone help me with this?

I tried to put print in a try-except block to handle the error, but after that there was another message in the console:

close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

I also tried to flush the data immediately through two consecutive sys.stdout.write and sys.stdout.flush calls, but nothing happened.

Accepted answer by Martijn Pieters

head reads from stdout, then closes it. This causes print to fail: internally it writes to sys.stdout, which is now closed.
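
The failure can be reproduced without head. The following is a minimal sketch, assuming a POSIX system and Python 3; the function name write_to_closed_pipe is made up for illustration. It closes the read end of a pipe (playing the role of head exiting) and then writes to the other end, which fails with EPIPE, the same errno 32 shown in the traceback above:

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is already closed; return the errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader (think: head) goes away
    try:
        os.write(write_fd, b"one more line\n")
    except OSError as exc:  # BrokenPipeError is a subclass of OSError
        return exc.errno
    finally:
        os.close(write_fd)
    return None  # only reached if the write unexpectedly succeeded


# On a POSIX system this comparison is expected to hold:
is_broken_pipe = write_to_closed_pipe() == errno.EPIPE
```

Python ignores SIGPIPE by default, which is why the write raises an exception instead of silently killing the process.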

You can simply catch the IOError and exit silently:

import sys

try:
    for pid, uid, pname in data:
        print template.format(pid, uid, pname)
except IOError:
    # stdout is closed, no point in continuing
    # Attempt to close them explicitly to prevent cleanup problems:
    try:
        sys.stdout.close()
    except IOError:
        pass
    try:
        sys.stderr.close()
    except IOError:
        pass

Answer by Barron

The behavior you are seeing is linked to the buffered output implementation in Python 3. The problem can be avoided by using the -u option or by setting the environment variable PYTHONUNBUFFERED=x. In the transcript below, either one makes the extra "Exception ignored" message disappear. See the man page for more information on -u.
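
To see why buffering matters, here is a small sketch, assuming a POSIX system and Python 3. The helper name buffered_write_then_flush is hypothetical, and the pipe stands in for stdout piped into head. The write only fills the buffer and succeeds; the error appears when the buffer is flushed, which for sys.stdout may happen as late as interpreter shutdown:

```python
import os


def buffered_write_then_flush():
    """Show that a buffered stream reports a broken pipe at flush time."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader has already gone away
    stream = os.fdopen(write_fd, "w")  # buffered text stream, like piped stdout
    events = []

    stream.write("small line\n")  # stays in the buffer, no system call yet
    events.append("write ok")

    try:
        stream.flush()  # the real write happens here and hits the closed pipe
    except BrokenPipeError:
        events.append("flush failed")

    try:
        stream.close()  # close flushes again, so it can raise a second time
    except BrokenPipeError:
        pass
    return events
```

With unbuffered output every write goes straight to the pipe, so the error is raised at the print call, where a try block can catch it.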

$ python2.7 testprint.py | echo

Exc: <type 'exceptions.IOError'>
$ python3.5 testprint.py | echo

Exc: <class 'BrokenPipeError'>
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
$ python3.5 -u testprint.py | echo

Exc: <class 'BrokenPipeError'>
$ export PYTHONUNBUFFERED=x
$ python3.5 testprint.py | echo

Exc: <class 'BrokenPipeError'>

Answer by Walker Hale IV

In general, I try to catch the most specific error I can get away with. In this case it is BrokenPipeError:

import sys

# BrokenPipeError exists only in Python 3, so print is a function here.
try:
    # I usually call a function here that generates all my output:
    for pid, uid, pname in data:
        print(template.format(pid, uid, pname))
except BrokenPipeError as e:
    pass  # Ignore. Something like head is truncating output.
finally:
    sys.stderr.close()

If this is at the end of execution, I find I only need to close sys.stderr. If I don't close sys.stderr, I'll get a BrokenPipeError but without a stack trace.

This seems to be the minimum fix for writing tools that output to pipelines.
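
For Python 3, the standard library documentation describes a related pattern in its note on SIGPIPE: flush inside the try block so the error surfaces there, and on BrokenPipeError point stdout at os.devnull so the flush at shutdown cannot fail again. A sketch of that idea follows; the function name main and its lines parameter are illustrative names, not part of any answer above:

```python
import os
import sys


def main(lines):
    """Print every line; return 0 on success, 1 if the reader went away."""
    try:
        for line in lines:
            print(line)
        # Flush here so a broken pipe is raised inside this try block
        # rather than during interpreter shutdown.
        sys.stdout.flush()
    except BrokenPipeError:
        # Python flushes the standard streams on exit. Redirect what is
        # left to devnull so that final flush does not fail a second time.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        return 1
    return 0
```

At the bottom of a script this could be called as sys.exit(main(rows)), where rows is whatever iterable of output lines the script produces. Unlike closing sys.stderr, this keeps stderr available for real error messages.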

Answer by Gringo Suave

I had this problem with Python 3 as well, with debug logging piped into head. If your script talks to the network or does file IO, simply dropping IOErrors is not a good solution. Despite the mentions here, I was not able to catch BrokenPipeError for some reason.

I found a blog post about restoring the default signal handler for SIGPIPE: http://newbebweb.blogspot.com/2012/02/python-head-ioerror-errno-32-broken.html

In short, you add the following to your script before the bulk of the output:

if log.isEnabledFor(logging.DEBUG):  # optional
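    # restore the default SIGPIPE handler, so the OS ends the process
    # quietly when the pipe is closed instead of Python raising an error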
    from signal import signal, SIGPIPE, SIG_DFL
    signal(SIGPIPE, SIG_DFL)

This seems to happen with head, but not with other programs such as grep; as mentioned, head closes stdout. If you don't often use head with the script, it may not be worth worrying about.
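
A self-contained variant of the snippet above might look like the following sketch. The function name restore_default_sigpipe is hypothetical, and the hasattr guard is an addition for platforms such as Windows where SIGPIPE does not exist. Note that the Python documentation advises against setting SIGPIPE to SIG_DFL just to avoid BrokenPipeError, because the program would then also exit unexpectedly whenever a socket connection is interrupted while it is still writing to it, so this is probably best kept to simple command-line filters:

```python
import signal


def restore_default_sigpipe():
    """Restore the OS default SIGPIPE behavior; return True if it was set."""
    # SIGPIPE is not defined on every platform, so check before using it.
    if hasattr(signal, "SIGPIPE"):
        # SIG_DFL means the process is terminated by the signal instead of
        # Python turning the failed write into an exception.
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)
        return True
    return False
```

Call it once, from the main thread, before the script starts producing its bulk output.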
