Linux: When to use scatter/gather I/O (readv, writev) vs. a large buffer with fread

Question

Asked by Jimm

In scatter and gather I/O (i.e. readv and writev), Linux reads into multiple buffers and writes from multiple buffers.

Say I have a vector of 3 buffers. I can use readv, or I can use a single buffer whose size is the combined size of the 3 buffers and call fread.

Hence, I am confused: For which cases should scatter/gather be used and when should a single large buffer be used?

Answer 1

Accepted answer by ArjunShankar

The main conveniences offered by readv and writev are:

It allows working with non-contiguous blocks of data, i.e. the buffers need not be part of one array; they can be allocated separately.
The I/O is 'atomic', i.e. if you do a writev, all the elements in the vector are written in one contiguous operation, and writes done by other processes will not occur in between them.
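
To illustrate the first point on the read side, here is a minimal sketch of a scatter read. The helper name read_scattered and the header/body split are assumptions made for illustration, not part of the original answer; only readv and struct iovec come from the system headers.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: fills two separately allocated buffers from fd
 * with a single readv call. Returns the number of bytes read, or -1. */
ssize_t read_scattered(int fd, void *hdr, size_t hdr_len,
                       void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = hdr;
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = body;
    iov[1].iov_len = body_len;

    /* Buffers are filled in array order: iov[0] is filled completely
     * before any data goes into iov[1]. */
    return readv(fd, iov, 2);
}
```

As with read, the return value can be smaller than hdr_len + body_len, so a caller should check it rather than assume both buffers were filled.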

For example, say your data is naturally segmented and comes from different sources:

struct foo *my_foo;
struct bar *my_bar;
struct baz *my_baz;

my_foo = get_my_foo();
my_bar = get_my_bar();
my_baz = get_my_baz();

Now, the three 'buffers' are not one big contiguous block. But you want to write them contiguously into a file, for whatever reason (say, for example, they are fields in a file header for a file format).

If you use write, you have to choose between:

Copying them into one block of memory using, say, memcpy (overhead), followed by a single write call. Then the write will be atomic.
Making three separate calls to write (overhead). Also, write calls from other processes can be interspersed between these writes (not atomic).
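
The two write-based options could be sketched roughly as follows. The function names write_copied and write_separately are hypothetical and exist only for this illustration; the code is a sketch of the trade-off, not a recommended implementation.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Option 1: copy the pieces into one temporary block, then one write.
 * Costs an allocation and three memcpy calls. */
ssize_t write_copied(int fd, const void *a, size_t alen,
                     const void *b, size_t blen,
                     const void *c, size_t clen)
{
    size_t total = alen + blen + clen;
    char *tmp;
    ssize_t n;

    if (total == 0)
        return 0;
    tmp = malloc(total);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, a, alen);
    memcpy(tmp + alen, b, blen);
    memcpy(tmp + alen + blen, c, clen);
    n = write(fd, tmp, total);
    free(tmp);
    return n;
}

/* Option 2: three system calls. Writes from other processes may land
 * between them. Returns total bytes written, or -1 on error. */
ssize_t write_separately(int fd, const void *a, size_t alen,
                         const void *b, size_t blen,
                         const void *c, size_t clen)
{
    const void *bufs[3] = { a, b, c };
    size_t lens[3] = { alen, blen, clen };
    ssize_t total = 0;

    for (int i = 0; i < 3; i++) {
        ssize_t n = write(fd, bufs[i], lens[i]);
        if (n < 0)
            return -1;
        total += n;
        if ((size_t)n < lens[i])
            break; /* short write: stop and report what was written */
    }
    return total;
}
```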

If you use writev instead, it's all good:

You make exactly one system call, and need no memcpy to build a single buffer from the three.
Also, the three buffers are written atomically, as one block write, i.e. if other processes also write, their writes will not come in between the writes of the three vectors.

So you would do something like:

struct iovec iov[3];

iov[0].iov_base = my_foo;
iov[0].iov_len = sizeof (struct foo);
iov[1].iov_base = my_bar;
iov[1].iov_len = sizeof (struct bar);
iov[2].iov_base = my_baz;
iov[2].iov_len = sizeof (struct baz);

bytes_written = writev (fd, iov, 3);
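
For completeness, the snippet above can be wrapped into a compilable function with basic error checking. This is a sketch under stated assumptions: the field layouts of struct foo, struct bar, and struct baz are invented placeholders (the answer never defines them), and write_header is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Placeholder definitions; the real layouts are not given in the answer. */
struct foo { char magic[4]; };
struct bar { unsigned int version; };
struct baz { unsigned int flags; };

/* Writes the three structs to fd with a single writev call.
 * Returns 0 if every byte was written, -1 on error or short write. */
int write_header(int fd, const struct foo *f,
                 const struct bar *b, const struct baz *z)
{
    struct iovec iov[3];
    ssize_t expected = (ssize_t)(sizeof *f + sizeof *b + sizeof *z);
    ssize_t bytes_written;

    assert(f != NULL && b != NULL && z != NULL);

    /* iov_base is a plain void *, so the const qualifier is cast away;
     * writev does not modify the buffers. */
    iov[0].iov_base = (void *)f;
    iov[0].iov_len = sizeof *f;
    iov[1].iov_base = (void *)b;
    iov[1].iov_len = sizeof *b;
    iov[2].iov_base = (void *)z;
    iov[2].iov_len = sizeof *z;

    bytes_written = writev(fd, iov, 3);
    if (bytes_written < 0) {
        perror("writev");
        return -1;
    }
    /* Like write, writev can return fewer bytes than requested. */
    return bytes_written == expected ? 0 : -1;
}
```

A caller would pass an open file descriptor and pointers to the three structs, then treat a return value of -1 as a failed or incomplete header write.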

