Linux: read() from standard input

Question

Asked by Ravi Gupta

Consider the following line of code:

while((n = read(STDIN_FILENO, buff, BUFSIZ)) > 0)

As I understand it, the read/write functions are part of unbuffered I/O. Does that mean read() will read only one character per call from stdin? In other words, will the value of n be:

    -1  in case of error
n =  0  in case of EOF
     1  otherwise

If that is not the case, when will the read() call above return, and why?

Note: I also thought that read() might wait until it has successfully read BUFSIZ characters from stdin. But what happens when fewer than BUFSIZ characters are available to read? Will read() wait forever, or until EOF arrives (Ctrl+D on Unix or Ctrl+Z on Windows)?

Also, let's say BUFSIZ = 100 and stdin = A Ctrl+D (i.e. EOF immediately following a single character). How many times will the while loop iterate?

Answer 1

Answered by Dave

As the read() man page states:

Return Value
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.

So each read() will read up to the specified number of bytes, but it may read fewer. "Non-buffered" means that if you call read(fd, bar, 1), read() will read only one byte. Buffered I/O attempts to read in chunks of BUFSIZ, even if you only want one character. This may sound wasteful, but it avoids the overhead of making a system call for every character, which makes it fast.
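
To make the "up to, but possibly fewer" behavior concrete, here is a minimal sketch of the loop from the question wrapped in a function. The name count_bytes and its calls parameter are hypothetical, introduced only for illustration, and the 4096-byte buffer is an arbitrary choice.

```c
#include <unistd.h>

/* Hypothetical helper: reads fd until EOF.
 * Returns the total number of bytes read, or -1 on a read error.
 * *calls receives the number of read() calls that returned data. */
long count_bytes(int fd, int *calls)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    *calls = 0;
    /* Each read() returns between 1 and sizeof buf bytes, 0 at EOF,
     * or -1 on error; the loop body runs only for the first case. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        total += n;
        (*calls)++;
    }
    return n < 0 ? -1 : total;
}
```

If the input is a single byte followed by EOF, the loop body should run once: the first read() returns 1 and the next returns 0, which ends the loop.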

Answer 2

Answered by jim mcnamara

read() attempts to get all of the characters requested. If EOF happens before all of the requested characters can be returned, it returns what it got. The next read() then returns 0 to let you know that you have reached the end of the file.

What happens when it tries to read and nothing is there involves something called blocking. You can open a file for blocking or non-blocking reads. "Blocking" means waiting until there is something to return.

This is what you see when a shell is waiting for input. It sits there until you hit Return.

Non-blocking means that read() returns without any data if none is available. Depending on a lot of other factors, which would make a completely correct answer unusable for you, read() returns -1 and sets errno to something like EWOULDBLOCK, which lets you know why no bytes were returned. It is not necessarily a fatal error.

Your code could test for a return value of zero to find EOF, or for a negative value to find errors.

Answer 3

Answered by Kyle Jones

The way read() behaves depends on what is being read. For regular files, if you ask for N characters, you get N characters if they are available, or fewer than N if end of file intervenes.

If read() is reading from a terminal in canonical/cooked mode, the tty driver provides data a line at a time. So if you tell read() to get 3 characters or 300, read will hang until the tty driver has seen a newline or the terminal's defined EOF key, and then read() will return with either the number of characters in the line or the number of characters you requested, whichever is smaller.

If read() is reading from a terminal in non-canonical/raw mode, read will have access to keypresses immediately. If you ask read() to get 3 characters it might return with anywhere from 0 to 3 characters depending on input timing and how the terminal was configured.

read() will behave differently in the face of signals, returning with less than the requested number of characters, or -1 with errno set to EINTR if a signal interrupted the read before any characters arrived.

read() will behave differently if the descriptor has been configured for non-blocking I/O. read() will return -1 with errno set to EAGAIN or EWOULDBLOCK if no input was immediately available. This applies to sockets.
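
As a rough sketch of the non-blocking case, the code below switches a descriptor to non-blocking mode with fcntl() and classifies the result of a single read(). The helper names set_nonblocking and try_read, and the return codes of try_read, are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: performs one read() and classifies the outcome.
 * Returns 1 if data arrived (*got holds the count), 0 on EOF,
 * 2 if nothing is available right now, and -1 on any other error. */
int try_read(int fd, char *buf, size_t len, ssize_t *got)
{
    ssize_t n = read(fd, buf, len);

    if (n > 0) {
        *got = n;
        return 1;
    }
    if (n == 0)
        return 0;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 2;   /* not fatal: try again later */
    return -1;
}
```

A caller that gets 2 back would typically wait for readiness (for example with poll()) and then call try_read again.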

So as you can see, you should be ready for surprises when you call read(). You won't always get the number of characters you requested, and you might get non-fatal errors like EINTR, which means you should retry the read().
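
One common way to cope with both surprises is a pair of small wrappers: one that retries on EINTR and one that keeps reading until the requested count is reached or EOF arrives. This is only a sketch; read_retry and read_full are hypothetical names, and it assumes a blocking descriptor.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: like read(), but retries if a signal
 * interrupted the call before any data arrived. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Hypothetical helper: keeps reading until len bytes have arrived,
 * EOF is reached, or an error occurs. Returns the number of bytes
 * stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read_retry(fd, (char *)buf + done, len - done);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* EOF before len bytes arrived */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

On a terminal, read_full would keep blocking across several lines until the buffer is full or the user signals EOF, so it suits files and pipes better than interactive input.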

Answer 4

Answered by Jonathan Leffler

Your code reads:

while((n = read(0, buff, BUFSIZ) != 0))

This is flawed - the parentheses mean it is interpreted as:

while ((n = (read(0, buff, BUFSIZ) != 0)) != 0)

where the comparison is evaluated before the assignment, so n will only ever take the values 0 (the condition is false) and 1 (the condition is true).

You should write:

while ((n = read(0, buff, BUFSIZ)) > 0)

This stops on EOF or a read error, and n lets you know which condition you encountered.
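
The difference between the two forms can be seen by isolating the assignment. The functions n_buggy and n_fixed below are hypothetical names used only for this illustration; each performs a single read() and returns what ends up in n.

```c
#include <unistd.h>

/* Misplaced parentheses: != binds tighter than =, so n receives the
 * result of the comparison (0 or 1), not the byte count. */
int n_buggy(int fd)
{
    char buf[64];
    int n;

    n = read(fd, buf, sizeof buf) != 0;
    return n;
}

/* Corrected form: n receives the byte count, which is then compared. */
int n_fixed(int fd)
{
    char buf[64];
    int n;

    if ((n = (int)read(fd, buf, sizeof buf)) > 0) {
        /* n bytes of data are now in buf */
    }
    return n;
}
```

With five bytes waiting on the descriptor, n_buggy should return 1 while n_fixed should return 5.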

Apparently, the code above was a typo in the question.

Unbuffered I/O will read up to the number of characters you request (but not more). It may read fewer on account of EOF or an error. It may also read fewer because fewer are available at the time of the call. Consider a terminal; typically, that will only read up to the end of the line because there isn't any more available than that. Consider a pipe; if the feeding process has generated 128 unread bytes, then if BUFSIZ is 4096, you'll only get 128 bytes from the read. A non-blocking file descriptor may return because nothing is available; a socket may return fewer bytes because there isn't more information available yet; a disk read may return fewer bytes because there are fewer than the requested number of bytes left in the file when the read is performed.
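
The pipe case can be reproduced in a single process. The sketch below writes a given number of bytes into a fresh pipe and then asks read() for a different amount; pipe_short_read is a hypothetical name, and the 4096-byte limit is an arbitrary bound chosen so that the write fits comfortably in the pipe.

```c
#include <unistd.h>

/* Hypothetical helper: writes `available` bytes into a new pipe, then
 * requests up to `requested` bytes with one read().
 * Returns what read() reported, or -1 on failure or bad arguments. */
ssize_t pipe_short_read(size_t available, size_t requested)
{
    char out[4096] = {0};
    char in[4096];
    int p[2];
    ssize_t n;

    /* available == 0 is rejected because read() would block forever. */
    if (available == 0 || available > sizeof out || requested > sizeof in)
        return -1;
    if (pipe(p) == -1)
        return -1;
    if (write(p[1], out, available) != (ssize_t)available) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    n = read(p[0], in, requested);  /* returns min(available, requested) here */
    close(p[0]);
    close(p[1]);
    return n;
}
```

Calling pipe_short_read(128, 4096) should yield 128, matching the scenario described above, while pipe_short_read(128, 100) should yield 100.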

In general, though, read() won't return just one byte if you request many bytes.

Answer 5

Answered by R.. GitHub STOP HELPING ICE

When we say read is unbuffered, it means that no buffering takes place at the level of your process after the data is pulled off the underlying open file description, which is a potentially shared resource. If stdin is a terminal, however, there are likely at least two additional buffers in play:

The terminal buffer, which can probably hold 1-4k of data off the line until.
The kernel's cooked/canonical mode buffer for line entry/editing on a terminal, which lets the user perform primitive editing (backspace, backword, erase line, etc.) on the line until it's submitted (to the buffer described above) by pressing enter.

read will pull whatever has already been submitted, up to the maximum read length you passed to it, but it cannot pull anything from the line-editing buffer. If you want to disable this extra layer of buffering, you need to look up how to disable cooked/canonical mode for a terminal using tcsetattr, etc.
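
As a starting point, here is a minimal sketch of turning canonical mode off with tcgetattr()/tcsetattr(). The helper names disable_canonical and restore_terminal are hypothetical, and the VMIN/VTIME values (return as soon as one byte is available, no timeout) are just one possible configuration.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: disables canonical (line-at-a-time) mode on fd.
 * The previous settings are stored in *saved so they can be restored.
 * Returns 0 on success, -1 on failure (for example, fd is not a tty). */
int disable_canonical(int fd, struct termios *saved)
{
    struct termios raw;

    if (tcgetattr(fd, saved) == -1)
        return -1;
    raw = *saved;
    raw.c_lflag &= ~(tcflag_t)ICANON;  /* no line editing buffer */
    raw.c_cc[VMIN] = 1;                /* read() returns after >= 1 byte */
    raw.c_cc[VTIME] = 0;               /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: puts back the settings saved above. */
int restore_terminal(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would typically call disable_canonical(STDIN_FILENO, &saved) at startup and restore_terminal(STDIN_FILENO, &saved) before exiting, so the shell is not left in non-canonical mode.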
