Why is printing to stdout so slow? Can it be sped up?

Note: this page is a side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3857052/

Tags: python, linux, printing, stdout, tty

Asked by Russ

I've always been amazed/frustrated with how long it takes to simply output to the terminal with a print statement. After some recent painfully slow logging I decided to look into it and was quite surprised to find that almost all the time spent is waiting for the terminal to process the results.

Can writing to stdout be sped up somehow?

I wrote a script ('print_timer.py' at the bottom of this question) to compare timing when writing 100k lines to stdout, to file, and with stdout redirected to /dev/null. Here is the timing result:

$ python print_timer.py
this is a test
this is a test
<snipped 99997 lines>
this is a test
-----
timing summary (100k lines each)
-----
print                         :11.950 s
write to file (+ fsync)       : 0.122 s
print with stdout = /dev/null : 0.050 s

Wow. To make sure python isn't doing something behind the scenes like recognizing that I reassigned stdout to /dev/null or something, I did the redirection outside the script...

$ python print_timer.py > /dev/null
-----
timing summary (100k lines each)
-----
print                         : 0.053 s
write to file (+fsync)        : 0.108 s
print with stdout = /dev/null : 0.045 s

So it isn't a python trick, it is just the terminal. I always knew dumping output to /dev/null sped things up, but never figured it was that significant!

It amazes me how slow the tty is. How can it be that writing to physical disk is WAY faster than writing to the "screen" (presumably an all-RAM op), and is effectively as fast as simply dumping to the garbage with /dev/null?

This link talks about how the terminal will block I/O so it can "parse [the input], update its frame buffer, communicate with the X server in order to scroll the window and so on"... but I don't fully get it. What can be taking so long?

I expect there is no way out (short of a faster tty implementation?) but figured I'd ask anyway.



UPDATE: after reading some comments I wondered how much impact my screen size actually has on the print time, and it does have some significance. The really slow numbers above are with my Gnome terminal blown up to 1920x1200. If I reduce it very small I get...

-----
timing summary (100k lines each)
-----
print                         : 2.920 s
write to file (+fsync)        : 0.121 s
print with stdout = /dev/null : 0.048 s

That is certainly better (~4x), but doesn't change my question. It only adds to my question, as I don't understand why the terminal screen rendering should slow down an application writing to stdout. Why does my program need to wait for screen rendering to continue?

Are all terminal/tty apps not created equal? I have yet to experiment. It really seems to me like a terminal should be able to buffer all incoming data, parse/render it invisibly, and only render the most recent chunk that is visible in the current screen configuration at a sensible frame rate. So if I can write+fsync to disk in ~0.1 seconds, a terminal should be able to complete the same operation in something of that order (with maybe a few screen updates while it did it).

I'm still kind of hoping there is a tty setting that can be changed from the application side to make this behaviour better for programmers. If this is strictly a terminal application issue, then maybe this doesn't even belong on StackOverflow?

What am I missing?



Here is the python program used to generate the timing:

import time, sys, tty
import os

lineCount = 100000
line = "this is a test"
summary = ""

cmd = "print"
startTime_s = time.time()
for x in range(lineCount):
    print line
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

#Add a newline to match line outputs above...
line += "\n"

cmd = "write to file (+fsync)"
fp = file("out.txt", "w")
startTime_s = time.time()
for x in range(lineCount):
    fp.write(line)
os.fsync(fp.fileno())
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

cmd = "print with stdout = /dev/null"
sys.stdout = file(os.devnull, "w")
startTime_s = time.time()
for x in range(lineCount):
    # sys.stdout now points at /dev/null; write through it, not fp
    sys.stdout.write(line)
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

print >> sys.stderr, "-----"
print >> sys.stderr, "timing summary (100k lines each)"
print >> sys.stderr, "-----"
print >> sys.stderr, summary
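
For readers on Python 3, where the `print` statement and `file()` no longer exist, the same measurement could be sketched roughly as follows. This is not the script that produced the numbers above; `time_writes` and `main` are names made up for this illustration, and only the stdout and /dev/null cases are covered.

```python
import os
import sys
import time


def time_writes(stream, line_count=100_000, line="this is a test\n"):
    """Return the seconds taken to write line_count copies of line to stream."""
    start = time.perf_counter()
    for _ in range(line_count):
        stream.write(line)
    stream.flush()  # include the time needed to push buffered data out
    return time.perf_counter() - start


def main():
    results = [("print", time_writes(sys.stdout))]
    with open(os.devnull, "w") as devnull:
        results.append(("write to /dev/null", time_writes(devnull)))
    # Report on stderr so the summary survives redirecting stdout.
    for label, seconds in results:
        print(f"{label:<30}:{seconds:6.3f} s", file=sys.stderr)
```

Calling `main()` in a terminal, and again with stdout redirected to /dev/null, should show the same kind of gap as the timings in the question, though the absolute numbers will depend on the terminal emulator.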

Accepted answer by Russ

Thanks for all the comments! I've ended up answering it myself with your help. It feels dirty answering your own question, though.

Question 1: Why is printing to stdout slow?

Answer: Printing to stdout is not inherently slow. It is the terminal you work with that is slow. And it has pretty much zero to do with I/O buffering on the application side (e.g. Python file buffering). See below.

Question 2: Can it be sped up?

Answer: Yes it can, but seemingly not from the program side (the side doing the 'printing' to stdout). To speed it up, use a different, faster terminal emulator.

Explanation...

I tried a self-described 'lightweight' terminal program called wterm and got significantly better results. Below is the output of my test script (at the bottom of the question) when running in wterm at 1920x1200 on the same system where the basic print option took 12s using gnome-terminal:

-----
timing summary (100k lines each)
-----
print                         : 0.261 s
write to file (+fsync)        : 0.110 s
print with stdout = /dev/null : 0.050 s

0.26s is MUCH better than 12s! I don't know whether wterm is more intelligent about how it renders to screen along the lines of what I was suggesting (render the 'visible' tail at a reasonable frame rate), or whether it just "does less" than gnome-terminal. For the purposes of my question I've got the answer, though: gnome-terminal is slow.

So - If you have a long running script that you feel is slow and it spews massive amounts of text to stdout... try a different terminal and see if it is any better!

Note that I pretty much randomly pulled wterm from the ubuntu/debian repositories. This link might be the same terminal, but I'm not sure. I did not test any other terminal emulators.



Update: Because I had to scratch the itch, I tested a whole pile of other terminal emulators with the same script and full screen (1920x1200). My manually collected stats are here:

wterm           0.3s
aterm           0.3s
rxvt            0.3s
mrxvt           0.4s
konsole         0.6s
yakuake         0.7s
lxterminal        7s
xterm             9s
gnome-terminal   12s
xfce4-terminal   12s
vala-terminal    18s
xvt              48s

The recorded times are manually collected, but they were pretty consistent. I recorded the best(ish) value. YMMV, obviously.

As a bonus, it was an interesting tour of some of the various terminal emulators available out there! I'm amazed my first 'alternate' test turned out to be the best of the bunch.

Answer by shuttle87

Printing to the terminal is going to be slow. Unfortunately, short of writing a new terminal implementation, I can't really see how you'd speed this up significantly.

Answer by Hasturkun

Your redirection probably does nothing as programs can determine whether their output FD points to a tty.

It's likely that stdout is line buffered when pointing to a terminal (the same as C's stdout stream behaviour).

As an amusing experiment, try piping the output to cat.
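
To see what a program can tell about its own output, here is a minimal Python 3 sketch; `describe_stream` is a name invented for this example. It reports whether a stream is attached to a tty and, where the stream exposes it, whether line buffering is on.

```python
import sys


def describe_stream(stream):
    """Report whether a stream is attached to a terminal and how it buffers."""
    return {
        "isatty": stream.isatty(),
        # Text streams expose line_buffering; other stream types may not.
        "line_buffering": getattr(stream, "line_buffering", None),
    }


# Print to stderr so the report stays visible when stdout is redirected.
print(describe_stream(sys.stdout), file=sys.stderr)
```

Running this once directly in a terminal and once piped through cat should show the `isatty` value flip, which is the information a program can use to pick a buffering mode.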



I've tried my own amusing experiment, and here are the results.

$ python test.py 2>foo
...
$ cat foo
-----
timing summary (100k lines each)
-----
print                         : 6.040 s
write to file                 : 0.122 s
print with stdout = /dev/null : 0.121 s

$ python test.py 2>foo |cat
...
$ cat foo
-----
timing summary (100k lines each)
-----
print                         : 1.024 s
write to file                 : 0.131 s
print with stdout = /dev/null : 0.122 s

Answer by Katriel

I can't talk about the technical details because I don't know them, but this doesn't surprise me: the terminal was not designed for printing lots of data like this. Indeed, you even provide a link to a load of GUI stuff that it has to do every time you want to print something! Notice that if you call the script with pythonw instead, it does not take 15 seconds; this is entirely a GUI issue. Redirect stdout to a file to avoid this:

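import contextlib, io, sys

@contextlib.contextmanager
def redirect_stdout(stream):
    # Temporarily point sys.stdout at the given stream
    sys.stdout = stream
    try:
        yield
    finally:
        # Restore the real stdout even if the body raises
        sys.stdout = sys.__stdout__

output = io.StringIO()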
with redirect_stdout(output):
    ...
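
On Python 3.4 and later the standard library already ships an equivalent helper, `contextlib.redirect_stdout`, so the hand-written context manager is not strictly needed. A minimal sketch, with `buffer` and `captured` as illustrative names:

```python
import contextlib
import io

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # Captured in memory; nothing reaches the terminal inside this block.
    print("this is a test")

captured = buffer.getvalue()
```

The captured text can then be written to a file or to the terminal in one go once the noisy section is done.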

Answer by Liudvikas Bukys

In addition to the output probably defaulting to a line-buffered mode, output to a terminal also causes your data to flow either into a terminal and serial line with a limited maximum throughput, or into a pseudo-terminal and a separate process that is handling a display event loop, rendering characters from some font, and moving display bits to implement a scrolling display. The latter scenario is probably spread over multiple processes (e.g. telnet server/client, terminal app, X11 display server), so there are context switching and latency issues too.

Answer by Pi Delport

How can it be that writing to physical disk is WAY faster than writing to the "screen" (presumably an all-RAM op), and is effectively as fast as simply dumping to the garbage with /dev/null?

Congratulations, you have just discovered the importance of I/O buffering. :-)

The disk appears to be faster, because it is highly buffered: all of Python's write() calls return before anything is actually written to physical disk. (The OS does this later, combining many thousands of individual writes into big, efficient chunks.)

The terminal, on the other hand, does little or no buffering: each individual print / write(line) waits for the full write (i.e. display to output device) to complete.

To make the comparison fair, you must make the file test use the same output buffering as the terminal, which you can do by modifying your example to:

fp = file("out.txt", "w", 1)   # line-buffered, like stdout
[...]
for x in range(lineCount):
    fp.write(line)
    os.fsync(fp.fileno())      # wait for the write to actually complete

I ran your file writing test on my machine, and with buffering, it also takes 0.05s here for 100,000 lines.

However, with the above modifications to write unbuffered, it takes 40 seconds to write only 1,000 lines to disk. I gave up waiting for 100,000 lines to write, but extrapolating from the previous, it would take over an hour.

That puts the terminal's 11 seconds into perspective, doesn't it?

So to answer your original question, writing to a terminal is actually blazingly fast, all things considered, and there's not a lot of room to make it much faster (but individual terminals do vary in how much work they do; see Russ's comment to this answer).

(You could add more write buffering, like with disk I/O, but then you wouldn't see what was written to your terminal until after the buffer gets flushed. It's a trade-off: interactivity versus bulk efficiency.)
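
As a rough illustration of that trade-off, a program can collect lines and hand them to the stream in larger chunks. The helper below is a Python 3 sketch only; `write_batched` and `batch_size` are made-up names, and whether it helps at all depends on the terminal emulator in use.

```python
def write_batched(lines, stream, batch_size=1000):
    """Write lines to stream in chunks of batch_size.

    Returns the number of write() calls made.
    """
    calls = 0
    batch = []
    for line in lines:
        batch.append(line)
        if len(batch) >= batch_size:
            stream.write("".join(batch))
            calls += 1
            batch = []
    if batch:
        # Write out whatever is left over after the last full batch.
        stream.write("".join(batch))
        calls += 1
    stream.flush()
    return calls
```

For example, `write_batched(("this is a test\n" for _ in range(100000)), sys.stdout)` makes 100 write calls instead of 100,000, at the cost of nothing appearing until each batch of 1,000 lines is complete.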
