Linux: When to use scatter/gather IO (readv, writev) vs a large buffer with fread
Original question: http://stackoverflow.com/questions/10520182/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not the translator): StackOverflow
Asked by Jimm
In scatter and gather I/O (i.e. readv and writev), Linux reads into multiple buffers and writes from multiple buffers.
Say I have a vector of 3 buffers. I can use readv, or I can use a single buffer whose size is the combined size of the 3 buffers and call fread.
Hence, I am confused: For which cases should scatter/gather be used and when should a single large buffer be used?
Accepted answer by ArjunShankar
The main conveniences offered by readv and writev are:
- They allow working with non-contiguous blocks of data, i.e. the buffers need not be part of one array; they can be separately allocated.
- The I/O is 'atomic', i.e. if you do a writev, all the elements in the vector will be written in one contiguous operation, and writes done by other processes will not occur in between them.
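To make the read side of the question concrete, here is a minimal sketch contrasting a scatter read with the single-large-buffer approach. The function names read_scattered and read_then_split and the 4096-byte scratch size are assumptions made for illustration, and plain read() stands in for the fread mentioned in the question.

```c
#include <assert.h>
#include <string.h>     /* memcpy */
#include <sys/types.h>  /* ssize_t */
#include <sys/uio.h>    /* struct iovec, readv */
#include <unistd.h>     /* read */

/* Scatter read: one readv() call fills two separately allocated buffers, in order. */
ssize_t read_scattered(int fd, void *head, size_t head_len,
                       void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = head;
    iov[0].iov_len  = head_len;
    iov[1].iov_base = body;
    iov[1].iov_len  = body_len;

    /* Returns the total bytes read across both buffers, 0 at EOF, -1 on error. */
    return readv(fd, iov, 2);
}

/* Single-buffer alternative: one read() into scratch space, then copy the pieces out. */
ssize_t read_then_split(int fd, void *head, size_t head_len,
                        void *body, size_t body_len)
{
    char scratch[4096];  /* assumed large enough for this illustration */
    ssize_t n;

    assert(head_len + body_len <= sizeof scratch);
    n = read(fd, scratch, head_len + body_len);
    if (n < 0)
        return -1;

    /* Copy out only as many bytes as were actually read. */
    size_t got = (size_t)n;
    size_t h = got < head_len ? got : head_len;
    memcpy(head, scratch, h);
    memcpy((char *)body, scratch + h, got - h);
    return n;
}
```

Both functions leave the same bytes in head and body; the difference is that the second one pays for an intermediate buffer and an extra copy.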
For example, say your data is naturally segmented and comes from different sources:
/* Three separately obtained objects; they are not adjacent in memory. */
struct foo *my_foo = get_my_foo();
struct bar *my_bar = get_my_bar();
struct baz *my_baz = get_my_baz();
Now, the three 'buffers' are not one big contiguous block. But you want to write them contiguously into a file, for whatever reason (say, for example, they are fields in a file header for a file format).
If you use write, you have to choose between:
- Copying them over into one block of memory using, say, memcpy (overhead), followed by a single write call. Then the write will be atomic.
- Making three separate calls to write (overhead). Also, write calls from other processes can intersperse between these writes (not atomic).
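The two write-based options can be sketched as follows. This is only an illustration: the helper names write_copied and write_separately are hypothetical, the buffers are passed as generic pointer/length pairs, and short writes are not retried.

```c
#include <stdlib.h>     /* malloc, free */
#include <string.h>     /* memcpy */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* write */

/* Option 1: copy the three pieces into one temporary block, then a single write(). */
ssize_t write_copied(int fd, const void *a, size_t alen,
                     const void *b, size_t blen,
                     const void *c, size_t clen)
{
    size_t total = alen + blen + clen;
    char *tmp;
    ssize_t n;

    if (total == 0)
        return 0;
    tmp = malloc(total);
    if (tmp == NULL)
        return -1;

    memcpy(tmp, a, alen);
    memcpy(tmp + alen, b, blen);
    memcpy(tmp + alen + blen, c, clen);

    n = write(fd, tmp, total);  /* one system call, at the cost of an extra copy */
    free(tmp);
    return n;
}

/* Option 2: three write() calls; another writer may get in between them. */
ssize_t write_separately(int fd, const void *a, size_t alen,
                         const void *b, size_t blen,
                         const void *c, size_t clen)
{
    const void *parts[3] = { a, b, c };
    size_t lens[3] = { alen, blen, clen };
    ssize_t total = 0;

    for (int i = 0; i < 3; i++) {
        ssize_t n = write(fd, parts[i], lens[i]);  /* one system call per piece */
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

With a single writer both helpers produce the same bytes in the file; they differ in copying cost, number of system calls, and whether other writers can interleave.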
If you use writev instead, it's all good:
- You make exactly one system call, and no memcpy to make a single buffer from the three.
- Also, the three buffers are written atomically, as one block write, i.e. if other processes also write, then these writes will not come in between the writes of the three vectors.
So you would do something like:
struct iovec iov[3];
iov[0].iov_base = my_foo;
iov[0].iov_len = sizeof (struct foo);
iov[1].iov_base = my_bar;
iov[1].iov_len = sizeof (struct bar);
iov[2].iov_base = my_baz;
iov[2].iov_len = sizeof (struct baz);
bytes_written = writev (fd, iov, 3);
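The fragment above leaves the struct types and the result check open. A self-contained sketch might look like the following, where the field layouts of struct foo, struct bar and struct baz and the helper name write_header are invented purely for illustration, and a short write is simply treated as a failure.

```c
#include <stdint.h>
#include <sys/types.h>  /* ssize_t */
#include <sys/uio.h>    /* struct iovec, writev */

/* Hypothetical header fields; the original leaves foo, bar and baz undefined. */
struct foo { uint32_t magic; };
struct bar { uint16_t version; uint16_t flags; };
struct baz { uint64_t length; };

/* Returns 0 if all three pieces were written by one writev(), -1 otherwise. */
int write_header(int fd, const struct foo *f,
                 const struct bar *b, const struct baz *z)
{
    struct iovec iov[3];
    ssize_t expected = (ssize_t)(sizeof *f + sizeof *b + sizeof *z);
    ssize_t n;

    /* iov_base is a plain void *, so const is cast away; writev only reads the data. */
    iov[0].iov_base = (void *)f;
    iov[0].iov_len  = sizeof *f;
    iov[1].iov_base = (void *)b;
    iov[1].iov_len  = sizeof *b;
    iov[2].iov_base = (void *)z;
    iov[2].iov_len  = sizeof *z;

    n = writev(fd, iov, 3);  /* total bytes written, or -1 with errno set */
    return n == expected ? 0 : -1;
}
```

A caller that has to cope with partial writes would need to advance through the iovec array and retry; that is deliberately left out to keep the sketch short.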