C++: How to read file content into istringstream?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/132358/

Time: 2020-08-27 13:05:59  Source: igfitidea

How to read file content into istringstream?

c++ optimization memory stream stringstream

Asked by Marcos Bento

To improve performance when reading from a file, I'm trying to read the entire content of a big (several MB) file into memory and then use an istringstream to access the information.

My question is: what is the best way to read this information and "import it" into the string stream? A problem with this approach (see below) is that when creating the string stream the buffer gets copied, and memory usage doubles.

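#include <fstream>
#include <sstream>
#include <string>

using namespace std;

int main() {
  // placeholder path: the original snippet used sFilename without defining it
  const string sFilename = "myFile";

  ifstream is;
  is.open (sFilename.c_str(), ios::binary );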

  // get length of file:
  is.seekg (0, std::ios::end);
  long length = is.tellg();
  is.seekg (0, std::ios::beg);

  // allocate memory:
  char *buffer = new char [length];

  // read data as a block:
  is.read (buffer,length);

  // create string stream of memory contents
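  // NOTE: this ends up copying the buffer!!!
  // The length is passed explicitly because buffer is not null-terminated;
  // the one-argument form "string( buffer )" here would also be parsed as a
  // function declaration rather than an object definition.
  istringstream iss( string( buffer, length ) );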

  // delete temporary buffer
  delete [] buffer;

  // close filestream
  is.close();

  /* ==================================
   * Use iss to access data
   */

}

Answer by Luc Touraille

std::ifstream has a method rdbuf() that returns a pointer to a filebuf. You can then "push" this filebuf into your stringstream:

#include <fstream>
#include <sstream>

int main()
{
    std::ifstream file( "myFile" );

    if ( file )
    {
        std::stringstream buffer;

        buffer << file.rdbuf();

        file.close();

        // operations on the buffer...
    }
}

EDIT: As Martin York remarks in the comments, this might not be the fastest solution, since the stringstream's operator<< will read the filebuf character by character. You might want to check his answer, where he uses the ifstream's read method as you used to do, and then sets the stringstream buffer to point to the previously allocated memory.
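
As a rough illustration of what "operations on the buffer" might look like once the stringstream has been filled through rdbuf(), here is a minimal sketch. The helper names slurp and sumIntegers are assumptions made for this example and do not come from the answer. Note that inserting an empty filebuf sets failbit on the target stream, so an empty file may need separate handling.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the answer): copies the whole file into
// `out` using the rdbuf() technique. Returns false if the file cannot be
// opened.
bool slurp(const std::string& path, std::stringstream& out)
{
    std::ifstream file(path.c_str(), std::ios::binary);
    if (!file)
        return false;

    out << file.rdbuf();   // stream the entire filebuf into the stringstream
    return true;
}

// Hypothetical consumer: sums whitespace-separated integers read from any
// input stream, for example the stringstream filled by slurp().
long sumIntegers(std::istream& in)
{
    long total = 0;
    long value = 0;
    while (in >> value)
        total += value;
    return total;
}
```

After a successful slurp(path, buffer), the stringstream can be passed to any function that takes a std::istream&, such as sumIntegers(buffer), without touching the file again.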

Answer by Martin York

OK. I am not saying this will be quicker than reading from the file.

But this is a method where you create the buffer once and, after the data is read into it, use it directly as the source for the stringstream.

N.B. It is worth mentioning that std::ifstream is buffered. It reads data from the file in (relatively large) chunks. Stream operations are performed against the buffer, returning to the file for another read only when more data is needed. So before sucking all the data into memory, please verify that this is a bottleneck.

#include <fstream>
#include <sstream>
#include <vector>

int main()
{
    std::ifstream       file("Plop");
    if (file)
    {
        /*
         * Get the size of the file
         */
        file.seekg(0,std::ios::end);
        std::streampos          length = file.tellg();
        file.seekg(0,std::ios::beg);

        /*
         * Use a vector as the buffer.
         * It is exception safe and will be tidied up correctly.
         * This constructor creates a buffer of the correct length.
         *
         * Then read the whole file into the buffer.
         */
        std::vector<char>       buffer(length);
        file.read(&buffer[0],length);

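        /*
         * Create your string stream.
         * Get the stringbuffer from the stream and set the vector as its source.
         * NOTE: the effect of pubsetbuf() on a stringbuf is implementation-defined;
         * some standard libraries ignore the call, so check that yours honours it.
         */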
        std::stringstream       localStream;
        localStream.rdbuf()->pubsetbuf(&buffer[0],length);

        /*
         * Note the buffer is NOT copied, if it goes out of scope
         * the stream will be reading from released memory.
         */
    }
}
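
Because what pubsetbuf() does on a stringbuf is left to the implementation, a different technique is sometimes used to reach the same goal: deriving a small read-only class from std::streambuf and pointing its get area at the existing memory. The sketch below is that swapped-in technique, not the code from this answer. The names MemoryBuf and countLines are assumptions made for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical helper class (not part of the standard library): a read-only
// stream buffer that exposes an existing block of memory without copying it.
class MemoryBuf : public std::streambuf
{
public:
    MemoryBuf(char* data, std::size_t size)
    {
        // Get area is [data, data + size), with the read position at the start.
        setg(data, data, data + size);
    }
};

// Hypothetical consumer: counts the lines readable from `buffer` through a
// MemoryBuf. The vector must stay alive for as long as the stream is used.
std::size_t countLines(std::vector<char>& buffer)
{
    MemoryBuf memory(buffer.data(), buffer.size());
    std::istream in(&memory);

    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line))
        ++lines;
    return lines;
}
```

This sketch supports sequential reads only. Seeking with seekg() would need seekoff() and seekpos() overrides, which are left out here to keep it short.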

Answer by KeithB

This seems like premature optimization to me. How much work is being done in the processing? Assuming a modernish desktop/server, and not an embedded system, copying a few MB of data during initialization is fairly cheap, especially compared to reading the file off disk in the first place. I would stick with what you have, measure the system when it is complete, and then decide if the potential performance gains would be worth it. Of course, if memory is tight, or this is in an inner loop or a program that gets called often (like once a second), that changes the balance.
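
One way to put a number on the copy before deciding is to time it in isolation. This is only a sketch under assumptions: millisecondsToCopy is a made-up helper, the data is a synthetic string rather than a real file, and a single run like this is a rough indication, not a careful benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns the milliseconds spent copying `size` bytes
// (size must be greater than zero) into an istringstream, which is the step
// the question is worried about.
double millisecondsToCopy(std::size_t size)
{
    const std::string data(size, 'x');   // stand-in for the file contents

    const auto start = std::chrono::steady_clock::now();
    std::istringstream iss(data);        // this constructor copies the string
    const auto stop = std::chrono::steady_clock::now();

    // Touch the stream so the copy cannot be optimised away entirely.
    if (iss.peek() != 'x')
        return -1.0;

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling millisecondsToCopy(4 * 1024 * 1024) and comparing the result with the time taken to read and process the real file gives a first idea of whether the copy matters at all.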

Answer by luke

Another thing to keep in mind is that file I/O is always going to be the slowest operation. Luc Touraille's solution is correct, but there are other options. Reading the entire file into memory at once will be much faster than separate reads.
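
A minimal sketch of reading the entire file with a single read() call, assuming the file fits comfortably in memory. The helper name readWholeFile is an assumption for this example, not something from the answers above.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads the whole file into `contents` with one read()
// call. Returns false if the file cannot be opened or read completely.
bool readWholeFile(const std::string& path, std::string& contents)
{
    // ios::ate opens the file positioned at its end, so tellg() is the length.
    std::ifstream file(path.c_str(), std::ios::binary | std::ios::ate);
    if (!file)
        return false;

    const std::streamoff size = file.tellg();
    if (size < 0)
        return false;
    file.seekg(0, std::ios::beg);

    contents.resize(static_cast<std::string::size_type>(size));
    if (size > 0)
        file.read(&contents[0], static_cast<std::streamsize>(size));

    return !file.fail();
}
```

The resulting string can then be handed to a std::istringstream, which makes one copy of it, or parsed in place if that copy is a concern.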
