What is the use of buffering in Python's built-in open() function?
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/29712445/
Asked by Srivishnu
Python documentation: https://docs.python.org/2/library/functions.html#open
open(name[, mode[, buffering]])
The documentation above says: "The optional buffering argument specifies the file's desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size (in bytes). A negative buffering means to use the system default. If omitted, the system default is used."
When I use
filedata = open("file.txt", "r", 0)
or
filedata = open("file.txt", "r", 1)
or
filedata = open("file.txt", "r", 2)
or
filedata = open("file.txt", "r", -1)
or
filedata = open("file.txt", "r")
The output does not change. Each of the calls shown above prints at the same speed.
Output:
Mr. Bean is a British television programme series of fifteen 25-
minute episodes written by Robin Driscoll and starring Rowan Atkinson as
the title character. Different episodes were also written by Robin
Driscoll and Richard Curtis, and one by Ben Elton. Thirteen of the
episodes were broadcast on ITV, from the pilot on 1 January 1990, until
"Goodnight Mr. Bean" on 31 October 1995. A clip show, "The Best Bits of
Mr. Bean", was broadcast on 15 December 1995, and one episode, "Hair by
Mr. Bean of London", was not broadcast until 2006 on Nickelodeon.
How, then, is the buffering parameter of the open() function useful? Which value of the buffering parameter is best to use?
Accepted answer by Asad Saeeduddin
Enabling buffering means that you're not directly interfacing with the OS's representation of a file, or its file system API. Instead, a chunk of data is read from the raw OS file stream into a buffer and served from there until it is consumed, at which point more data is fetched into the buffer. In terms of the objects you get, you'll get a BufferedIOBase object wrapping an underlying RawIOBase (which represents the raw file stream). Both classes live in the io module, which backs open() in Python 3 and is available as io.open() in Python 2.
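To make the object relationship concrete, here is a minimal sketch. It assumes Python 3 (where open() is io.open) and binary mode, since buffering=0 is only accepted for binary files there. The temporary file and the variable names are made up for illustration.

```python
import io
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"Mr. Bean\n" * 10)

# buffering=0 hands back the raw stream itself (binary mode only).
with open(path, "rb", buffering=0) as raw_file:
    raw_type = type(raw_file).__name__
    raw_is_raw = isinstance(raw_file, io.RawIOBase)

# The default (buffering=-1) wraps the raw stream in a buffered object.
with open(path, "rb") as buffered_file:
    buffered_type = type(buffered_file).__name__
    is_buffered = isinstance(buffered_file, io.BufferedIOBase)
    # The wrapped raw stream is reachable through the .raw attribute.
    wraps_raw = isinstance(buffered_file.raw, io.RawIOBase)

os.remove(path)
print(raw_type, buffered_type)
```

In text mode ("r") Python 3 adds one more layer, a TextIOWrapper, on top of the buffered object.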
What is the benefit of this? Well, interfacing with the raw stream might have high latency, because the operating system has to fool around with physical objects like the hard disk, and this may not be acceptable in all cases. Let's say you want to read three letters from a file every 5 ms, and your file is on a crusty old hard disk or even a network file system. Instead of trying to read from the raw file stream every 5 ms, it is better to load a bunch of bytes from the file into a buffer in memory, then consume them at will.
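The effect of reading three letters at a time can be sketched by counting how often the raw stream is actually asked for data. This assumes Python 3. CountingRaw and read_in_threes are hypothetical helpers written for this illustration, and an in-memory stream stands in for the slow disk.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory raw stream that counts requests for data."""

    def __init__(self, data):
        super().__init__()
        self._source = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buffer):
        # Each call here stands in for one trip to the disk or network.
        self.calls += 1
        return self._source.readinto(buffer)


def read_in_threes(stream):
    """Consume the stream three bytes at a time, as in the example above."""
    pieces = []
    while True:
        piece = stream.read(3)
        if not piece:
            break
        pieces.append(piece)
    return b"".join(pieces)


data = b"abc" * 1000

# Unbuffered: every three-byte read goes to the raw stream.
unbuffered = CountingRaw(data)
unbuffered_result = read_in_threes(unbuffered)

# Buffered: the raw stream is only touched when the buffer runs dry.
raw = CountingRaw(data)
buffered_result = read_in_threes(io.BufferedReader(raw, buffer_size=1024))

print("unbuffered raw reads:", unbuffered.calls)
print("buffered raw reads:  ", raw.calls)
```

Both loops return the same bytes. Only the number of trips to the raw stream differs, which is why the asker saw identical output for every buffering value.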
The buffer size you choose will depend on how you're consuming the data. For the example above, a buffer size of 1 char would be awful, 3 chars would be alright, and any large multiple of 3 chars that doesn't cause a noticeable delay for your users would be ideal.
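A rough way to compare buffer sizes for the three-bytes-at-a-time pattern is to count raw reads for each size. This sketch assumes Python 3. CountingFileIO and raw_reads_for are hypothetical names for this illustration, and the sizes are arbitrary. The exact counts depend on the interpreter's implementation, which may bypass very small buffers, so sizes 1 and 3 can end up looking much alike.

```python
import io
import os
import tempfile


class CountingFileIO(io.FileIO):
    """Hypothetical raw file that counts how often it is read from."""

    calls = 0

    def readinto(self, buffer):
        self.calls += 1
        return super().readinto(buffer)


def raw_reads_for(path, buffer_size):
    """Read the file three bytes at a time and return the raw read count."""
    raw = CountingFileIO(path, "r")
    with io.BufferedReader(raw, buffer_size=buffer_size) as reader:
        while reader.read(3):
            pass
    return raw.calls


# A 3000-byte throwaway file keeps the sketch self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"abc" * 1000)

counts = {size: raw_reads_for(path, size) for size in (1, 3, 300, 3000)}
os.remove(path)

for size, calls in counts.items():
    print("buffer of", size, "bytes ->", calls, "raw reads")
```

On a fast local disk the difference is rarely noticeable, so leaving buffering at its default is usually a reasonable choice unless measurements suggest otherwise.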