C++: reading a binary istream byte by byte

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5513532/

Time: 2020-08-28 18:20:22  Source: igfitidea

Reading binary istream byte by byte

c++ binaryfiles istream

Asked by Adrian McCarthy

I was attempting to read a binary file byte by byte using an ifstream. I've used istream methods like get() before to read entire chunks of a binary file at once without a problem. But my current task lends itself to going byte by byte and relying on the buffering in the io-system to make it efficient. The problem is that I seemed to reach the end of the file several bytes sooner than I should. So I wrote the following test program:

#include <iostream>
#include <fstream>

int main() {
    typedef unsigned char uint8;
    std::ifstream source("test.dat", std::ios_base::binary);
    while (source) {
        std::ios::pos_type before = source.tellg();
        uint8 x;
        source >> x;
        std::ios::pos_type after = source.tellg();
        std::cout << before << ' ' << static_cast<int>(x) << ' '
                  << after << std::endl;
    }
    return 0;
}

This dumps the contents of test.dat, one byte per line, showing the file position before and after.

Sure enough, if my file happens to have the two-byte sequence 0x0D-0x0A (which corresponds to carriage return and line feed), those bytes are skipped.

  • I've opened the stream in binary mode. Shouldn't that prevent it from interpreting line separators?
  • Do extraction operators always use text mode?
  • What's the right way to read byte by byte from a binary istream?

MSVC++ 2008 on Windows.

Accepted answer by James Kanze

The >> extractors are for formatted input; they skip white space (by default). For single-character unformatted input, you can use istream::get() (returns an int, either EOF if the read fails, or a value in the range [0, UCHAR_MAX]) or istream::get(char&) (puts the character read in the argument and returns something that converts to bool: true if the read succeeds, false if it fails).
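
As a rough illustration of the two get() overloads described above, here is a minimal sketch. The helper names (read_with_get, read_with_get_ref, read_with_extractor) are made up for this example and are not part of the standard library. The third helper reproduces the formatted extraction from the question for comparison. Opening a file in binary mode only disables newline translation; it does not stop >> from skipping whitespace.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, an in-memory stand-in for a file
#include <string>
#include <vector>

// Hypothetical helper: read every byte with the int-returning get().
std::vector<unsigned char> read_with_get(std::istream& in) {
    std::vector<unsigned char> bytes;
    const int eof = std::char_traits<char>::eof();
    for (int c = in.get(); c != eof; c = in.get()) {
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;
}

// Hypothetical helper: the same loop written with get(char&).
std::vector<unsigned char> read_with_get_ref(std::istream& in) {
    std::vector<unsigned char> bytes;
    char c;
    while (in.get(c)) {  // converts to false once a read fails
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;
}

// For comparison only: formatted extraction, as in the question.
// Whitespace bytes such as 0x0D and 0x0A are skipped by default.
std::vector<unsigned char> read_with_extractor(std::istream& in) {
    std::vector<unsigned char> bytes;
    unsigned char c;
    while (in >> c) {
        bytes.push_back(c);
    }
    return bytes;
}
```

Given a std::istringstream holding the four bytes 'A', 0x0D, 0x0A, 'B', the first two helpers return all four bytes, while the extractor version returns only 'A' and 'B'. The same helpers accept a std::ifstream opened with std::ios_base::binary.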

Answer by stefaanv

There is a read() member function in which you can specify the number of bytes.
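
A minimal sketch of how read() might be used, assuming a hypothetical helper named read_in_chunks and a caller-chosen chunk size (neither comes from the answer). After each call, gcount() reports how many bytes were actually read, which matters for the final, possibly short, chunk.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, an in-memory stand-in for a file
#include <string>
#include <vector>

// Hypothetical helper: read the whole stream, up to chunk_size bytes per call.
std::vector<unsigned char> read_in_chunks(std::istream& in, std::size_t chunk_size) {
    std::vector<unsigned char> bytes;
    if (chunk_size == 0) {
        return bytes;  // nothing sensible to do with an empty buffer
    }
    std::vector<char> buffer(chunk_size);
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // The last read may be short; gcount() gives the real byte count.
        const std::streamsize got = in.gcount();
        bytes.insert(bytes.end(), buffer.begin(), buffer.begin() + got);
    }
    return bytes;
}
```

With chunk_size set to 1 this reads byte by byte. Because read() is an unformatted input function, bytes such as 0x0D and 0x0A are returned unchanged.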

Answer by Lightness Races in Orbit

Why are you using formatted extraction, rather than .read()?

Answer by Serge Dundich

source.get()

will give you a single byte. It is an unformatted input function. operator>> is a formatted input function, which may skip whitespace characters.

Answer by Robᵩ

As others mentioned, you should use istream::read(). But, if you must use formatted extraction, consider std::noskipws.
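
If formatted extraction has to stay, the std::noskipws suggestion might look like the sketch below. The helper name read_with_noskipws is an assumption made for this example. Note that the manipulator changes a flag on the stream itself, so the setting persists for later extractions until std::skipws is applied.

```cpp
#include <ios>
#include <istream>
#include <sstream>  // std::istringstream, an in-memory stand-in for a file
#include <string>
#include <vector>

// Hypothetical helper: formatted extraction with whitespace skipping turned off.
std::vector<unsigned char> read_with_noskipws(std::istream& in) {
    std::vector<unsigned char> bytes;
    in >> std::noskipws;  // stays in effect on this stream after the call
    unsigned char c;
    while (in >> c) {
        bytes.push_back(c);
    }
    return bytes;
}
```

With the flag cleared, whitespace bytes such as 0x0D, 0x0A, and 0x20 are extracted like any other character instead of being skipped.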
