Reading and writing a C++ vector to a file

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2469531/

Time: 2020-08-27 23:34:09  Source: igfitidea

c++ file-io vector

Asked by jcoder

For some graphics work I need to read in a large amount of data as quickly as possible and would ideally like to directly read and write the data structures to disk. Basically I have a load of 3d models in various file formats which take too long to load so I want to write them out in their "prepared" format as a cache that will load much faster on subsequent runs of the program.

Is it safe to do it like this? My worry is about reading directly into the vector's data. I've removed error checking, hard-coded 4 as the size of an int, and so on, so that I can give a short working example. I know it's bad code; my real question is whether it is safe in C++ to read a whole array of structures directly into a vector like this. I believe it is, but C++ has so many traps and so much undefined behaviour once you go low level and deal directly with raw memory like this.

I realise that number formats and sizes may change across platforms and compilers, but this data will only ever be read and written by the same compiled program, to cache data that may be needed on a later run of that same program.

#include <fstream>
#include <vector>

using namespace std;

struct Vertex
{
    float x, y, z;
};

typedef vector<Vertex> VertexList;

int main()
{
    // Create a list for testing
    VertexList list;
    Vertex v1 = {1.0f, 2.0f,   3.0f}; list.push_back(v1);
    Vertex v2 = {2.0f, 100.0f, 3.0f}; list.push_back(v2);
    Vertex v3 = {3.0f, 200.0f, 3.0f}; list.push_back(v3);
    Vertex v4 = {4.0f, 300.0f, 3.0f}; list.push_back(v4);

    // Write out a list to a disk file
    ofstream os ("data.dat", ios::binary);

    int size1 = list.size();
    os.write((const char*)&size1, 4);
    os.write((const char*)&list[0], size1 * sizeof(Vertex));
    os.close();


    // Read it back in
    VertexList list2;

    ifstream is("data.dat", ios::binary);
    int size2;
    is.read((char*)&size2, 4);
    list2.resize(size2);

     // Is it safe to read a whole array of structures directly into the vector?
    is.read((char*)&list2[0], size2 * sizeof(Vertex));

}

Answered by Peter Alexander

As Laurynas says, std::vector is guaranteed to be contiguous, so that should work, but it is potentially non-portable.

On most systems, sizeof(Vertex) will be 12, but it's not uncommon for the struct to be padded so that sizeof(Vertex) == 16. If you were to write the data on one system and then read that file on another, there's no guarantee that it will work correctly.
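
One way to make the padding concern visible at build time is a compile-time check. The following is a minimal sketch, assuming a C++11 compiler; the helper name vertex_bytes is hypothetical and not part of the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Vertex
{
    float x, y, z;
};

// Fails to compile if the compiler inserted padding into Vertex.
static_assert(sizeof(Vertex) == 3 * sizeof(float), "Vertex contains padding");

// Raw byte copies are only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<Vertex>::value,
              "Vertex is not trivially copyable");

// Number of bytes that n vertices occupy in the raw cache file
// (hypothetical helper, used only for illustration).
constexpr std::size_t vertex_bytes(std::size_t n)
{
    return n * sizeof(Vertex);
}
```

On a platform that pads the struct to 16 bytes, the first static_assert stops the build instead of letting the program silently write a different file layout.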

Answered by Emile Cormier

You might be interested in the Boost.Serialization library. It knows how to save/load STL containers to/from disk, among other things. It might be overkill for your simple example, but it might become more useful if you do other types of serialization in your program.

Here's some sample code that does what you're looking for:

#include <algorithm>
#include <cassert>
#include <fstream>
#include <vector>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/vector.hpp>

using namespace std;

struct Vertex
{
    float x, y, z;
};

bool operator==(const Vertex& lhs, const Vertex& rhs)
{
    return lhs.x==rhs.x && lhs.y==rhs.y && lhs.z==rhs.z;
}

namespace boost { namespace serialization {
    template<class Archive>
    void serialize(Archive & ar, Vertex& v, const unsigned int version)
    {
        ar & v.x; ar & v.y; ar & v.z;
    }
} }

typedef vector<Vertex> VertexList;

int main()
{
    // Create a list for testing
    const Vertex v[] = {
        {1.0f, 2.0f,   3.0f},
        {2.0f, 100.0f, 3.0f},
        {3.0f, 200.0f, 3.0f},
        {4.0f, 300.0f, 3.0f}
    };
    VertexList list(v, v + (sizeof(v) / sizeof(v[0])));

    // Write out a list to a disk file
    {
        ofstream os("data.dat", ios::binary);
        boost::archive::binary_oarchive oar(os);
        oar << list;
    }

    // Read it back in
    VertexList list2;

    {
        ifstream is("data.dat", ios::binary);
        boost::archive::binary_iarchive iar(is);
        iar >> list2;
    }

    // Check if vertex lists are equal
    assert(list == list2);

    return 0;
}

Note that I had to implement a serialize function for your Vertex in the boost::serialization namespace. This lets the serialization library know how to serialize Vertex members.

I've browsed through the boost::binary_oarchive source code and it seems that it reads/writes the raw vector array data directly from/to the stream buffer. So it should be pretty fast.

Answered by Laurynas Biveinis

std::vector is guaranteed to be contiguous in memory, so, yes.
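
The contiguity guarantee can be illustrated with a small sketch. It assumes C++11 (for data()), and the helper names is_contiguous and copy_raw are hypothetical; copy_raw uses memcpy as a stand-in for the file write and read in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Vertex
{
    float x, y, z;
};

// Returns true if element i sits exactly i Vertex-sized steps after the
// first element, i.e. the vector's storage behaves like a plain array.
bool is_contiguous(const std::vector<Vertex>& v)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        if (&v[i] != v.data() + i)
            return false;
    return true;
}

// Copies the raw element bytes of src into a vector that was resized first,
// mirroring the resize-then-read pattern from the question.
std::vector<Vertex> copy_raw(const std::vector<Vertex>& src)
{
    std::vector<Vertex> dst(src.size());
    if (!src.empty())
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(Vertex));
    return dst;
}
```

Because the elements form one array, a single block copy of size() * sizeof(Vertex) bytes covers all of them.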

Answered by mortonjt

I just ran into this exact same problem.

First off, these statements are broken:

os.write((const char*)&list[0], size1 * sizeof(Vertex));
is.read((char*)&list2[0], size2 * sizeof(Vertex));

There is other stuff in the Vector data structure, so this will make your new vector get filled up with garbage.

Solution:
When you are writing your vector to a file, don't worry about the size of your Vertex class; just write the entire vector object directly.

os.write((const char*)&list, sizeof(list));

And then you can read the entire vector into memory at once

is.seekg(0,ifstream::end);
long size2 = is.tellg();
is.seekg(0,ifstream::beg);
list2.resize(size2);
is.read((char*)&list2, size2);
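
To see what sizeof(list) measures in the snippets above, here is a quick sketch. It assumes only the standard library and C++11; the helper names vector_object_bytes and payload_bytes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex
{
    float x, y, z;
};

// sizeof on a vector object is a compile-time constant: it measures the
// vector's own bookkeeping members, not the elements it manages.
constexpr std::size_t vector_object_bytes()
{
    return sizeof(std::vector<Vertex>);
}

// Bytes occupied by the elements themselves, which live in separately
// allocated storage starting at &v[0].
std::size_t payload_bytes(const std::vector<Vertex>& v)
{
    return v.size() * sizeof(Vertex);
}
```

Since sizeof(list) does not grow with the element count, writing sizeof(list) bytes starting at &list stores the vector's internal bookkeeping rather than the vertices. The element bytes start at &list[0] and span list.size() * sizeof(Vertex) bytes.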

Answered by Void

Another alternative to explicitly reading and writing your vector<> from and to a file is to replace the underlying allocator with one that allocates memory from a memory mapped file. This would allow you to avoid an intermediate read/write related copy. However, this approach does have some overhead. Unless your file is very large it may not make sense for your particular case. Profile as usual to determine if this approach is a good fit.

There are also some caveats to this approach that are handled very well by the Boost.Interprocess library. Of particular interest to you may be its allocators and containers.
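
As a rough illustration of the memory-mapped idea, the sketch below maps a file directly with POSIX mmap. Note the assumptions: this is not the allocator-based approach described above and does not use Boost.Interprocess; it is a simpler stand-in that only works on POSIX systems, and the function names write_mapped and read_mapped are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct Vertex
{
    float x, y, z;
};

// Writes n vertices into path through a shared, writable mapping.
// Returns false on any failure.
bool write_mapped(const char* path, const Vertex* src, std::size_t n)
{
    const std::size_t bytes = n * sizeof(Vertex);
    if (bytes == 0) return false;              // mmap rejects zero-length mappings
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) { close(fd); return false; }
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                                 // the mapping stays valid after close
    if (p == MAP_FAILED) return false;
    std::memcpy(p, src, bytes);                // stores go to the file's pages
    return munmap(p, bytes) == 0;
}

// Maps a file holding n vertices read-only and copies vertex number
// index into out. Returns false on any failure, including a short file.
bool read_mapped(const char* path, std::size_t n, std::size_t index, Vertex& out)
{
    if (n == 0 || index >= n) return false;
    const std::size_t bytes = n * sizeof(Vertex);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || static_cast<std::size_t>(st.st_size) < bytes) {
        close(fd);
        return false;
    }
    void* p = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED) return false;
    std::memcpy(&out, static_cast<const char*>(p) + index * sizeof(Vertex),
                sizeof(Vertex));
    return munmap(p, bytes) == 0;
}
```

Only the pages that are actually touched get loaded, which is where the benefit for very large files would come from. As the answer says, profiling should decide whether it pays off.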

Answered by KeithB

If this is used for caching by the same code, I don't see any problem with this. I've used this same technique on multiple systems without a problem (all Unix based). As an extra precaution, you might want to write a struct with known values at the beginning of the file, and check that it reads ok. You might also want to record the size of the struct in the file. This will save a lot of debugging time in the future if the padding ever changes.
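
The precaution described above could look roughly like the following sketch. The header layout, the constants kMagic and kProbe, and the function names save_vertices and load_vertices are all assumptions made for illustration; C++11 is assumed as well.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <vector>

struct Vertex
{
    float x, y, z;
};

// Hypothetical header written at the start of the cache file.
struct CacheHeader
{
    std::uint32_t magic;        // arbitrary known value
    std::uint32_t vertex_size;  // sizeof(Vertex) when the file was written
    std::uint32_t count;        // number of vertices that follow
    float         probe;        // known float, detects a changed number format
};

const std::uint32_t kMagic = 0x56525458u;  // arbitrary value chosen for this sketch
const float kProbe = 1.5f;

bool save_vertices(const char* path, const std::vector<Vertex>& v)
{
    std::ofstream os(path, std::ios::binary);
    if (!os) return false;
    CacheHeader h = {kMagic, static_cast<std::uint32_t>(sizeof(Vertex)),
                     static_cast<std::uint32_t>(v.size()), kProbe};
    os.write(reinterpret_cast<const char*>(&h), sizeof(h));
    if (!v.empty())
        os.write(reinterpret_cast<const char*>(v.data()), v.size() * sizeof(Vertex));
    os.close();                 // flush before reporting success
    return !os.fail();
}

bool load_vertices(const char* path, std::vector<Vertex>& out)
{
    std::ifstream is(path, std::ios::binary);
    if (!is) return false;
    CacheHeader h;
    if (!is.read(reinterpret_cast<char*>(&h), sizeof(h))) return false;
    // Reject the cache if the known values or the struct size do not match.
    if (h.magic != kMagic || h.vertex_size != sizeof(Vertex) || h.probe != kProbe)
        return false;
    std::vector<Vertex> tmp(h.count);
    if (h.count != 0 &&
        !is.read(reinterpret_cast<char*>(tmp.data()), tmp.size() * sizeof(Vertex)))
        return false;
    out.swap(tmp);
    return true;
}
```

When load_vertices returns false, the caller can fall back to loading the original model files and rewrite the cache, so a padding change costs one slow run rather than a debugging session.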
