Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/11415850/
C++: how to serialize/deserialize objects without the use of libraries?
Asked by Winte Winte
I am trying to understand how serialization/deserialization works in C++ without the use of libraries. I started with simple objects, but when deserializing a vector I found that I can't read the vector back without having written its size first. Moreover, I don't know which file format to choose, because if other digits precede the vector's size I can't read it correctly. Furthermore, I want to do this with classes and map containers. My task is to serialize/deserialize an object like this:
struct PersonInfo
{
    unsigned int age_;
    string name_;
    enum { undef, man, woman } sex_;
};
struct Person : PersonInfo
{
    vector<Person> children_;
    map<string, PersonInfo> addrBook_;
};
Currently I know how to serialize simple objects like this:
vector<PersonInfo> vecPersonInfo;
vecPersonInfo.push_back(*personInfo);
vecPersonInfo.push_back(*oneMorePersonInfo);
ofstream file("file", ios::out | ios::binary);
if (!file) {
    cout << "can not open file";
} else {
    // Relies on an operator<< overload for PersonInfo
    vector<PersonInfo>::const_iterator iterator = vecPersonInfo.begin();
    for (; iterator != vecPersonInfo.end(); iterator++) {
        file << *iterator;
    }
}
Could you please suggest how I can do this for this complex object, or point me to a good tutorial that explains it clearly?
Answered by Vite Falcon
One pattern is to implement an abstract class that defines functions for serialization, and each class then defines what goes into the serializer and what comes out. An example would be:
class Serializable
{
public:
    Serializable(){}
    virtual ~Serializable(){}
    virtual void serialize(std::ostream& stream) = 0;
    virtual void deserialize(std::istream& stream) = 0;
};
You then implement the Serializable interface for the class/struct that you want to serialize:
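struct PersonInfo : public Serializable // Yes! It's possible
{
    unsigned int age_;
    string name_;
    enum Sex { undef, man, woman } sex_; // enum named here so it can be cast back when reading
    virtual void serialize(std::ostream& stream)
    {
        // Serialization code: fields are separated by spaces so they can be read back
        stream << age_ << ' ' << name_ << ' ' << sex_ << '\n';
    }
    virtual void deserialize(std::istream& stream)
    {
        // Deserialization code: an enum cannot be extracted directly, so read an int first.
        // Note: a name_ containing spaces still breaks this simple format.
        int sex = 0;
        stream >> age_ >> name_ >> sex;
        sex_ = static_cast<Sex>(sex);
    }
};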
The rest I believe you know. Here are a few hurdles to get past, though, which you can work through at your leisure:
- When you write a string with spaces in it to the stream and try to read it back, you will get only the first portion of it, and the rest of the string 'corrupts' the values read after that.
- How can you program it so that it's cross-platform (little-endian vs big-endian)?
- How can your program automatically detect which class to create when deserializing?
Clues:
- Use a custom serializer that has functions to write bool, int, float, strings, etc.
- Use a string to represent the object type being serialized and use a factory to create an instance of that object when deserializing.
- Use predefined macros to determine which platform your code is being compiled for.
- Always write files in a fixed endianness and make the platforms that use the other endianness adjust to that.
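The first and last clues can be sketched roughly as follows. This is only a minimal illustration, not the answerer's code: the helper names writeU32, readU32, writeString and readString are hypothetical, and little-endian is assumed as the fixed byte order. Because the bytes are produced with shifts, no platform macro is needed here, and because strings carry a length prefix, spaces inside them no longer corrupt the values that follow.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write a 32-bit unsigned integer as four little-endian
// bytes, independent of the byte order of the host machine.
inline void writeU32(std::ostream& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.put(static_cast<char>((value >> shift) & 0xFF));
}

// Hypothetical helper: read back four little-endian bytes.
inline std::uint32_t readU32(std::istream& in)
{
    std::uint32_t value = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const int byte = in.get();
        if (byte == std::char_traits<char>::eof())
            throw std::runtime_error("unexpected end of stream");
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(byte)) << shift;
    }
    return value;
}

// Strings are stored as a length prefix followed by the raw bytes,
// so embedded spaces and newlines survive a round trip.
inline void writeString(std::ostream& out, const std::string& text)
{
    writeU32(out, static_cast<std::uint32_t>(text.size()));
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
}

inline std::string readString(std::istream& in)
{
    const std::uint32_t size = readU32(in);
    std::string text(size, '\0');
    if (size > 0 && !in.read(&text[0], static_cast<std::streamsize>(size)))
        throw std::runtime_error("unexpected end of stream");
    return text;
}
```

When writing to a real file, the stream should be opened with ios::binary, as in the question's own snippet. A production version would also want to cap the length read in readString before allocating.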
Answered by Preet Kukreti
The most basic form is to define a "Serialisable" interface (abstract class) that defines virtual read/write methods. You also define a "Stream" interface that provides a common API for basic primitive types (e.g. reading/writing of ints, floats, bytes, chars, seek/reset) and maybe for some compound types (arrays of values e.g. for strings, vectors, etc.) which operates on a stream. You can use the C++ IOStreams if it suits you.
You also will need to have some id system for a factory to create the corresponding class when loading/deserialising, and for referencing when serializing complex types so that each logical part is tagged/header-ed with proper structure/length information when necessary.
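One possible shape for such an id system is a string tag written in front of each object, plus a factory that maps the tag back to a constructor. The sketch below is an assumption-laden illustration: Factory, registerType, typeName, save, load and the Age type are all hypothetical names, and the tag is assumed to contain no whitespace because it is read with operator>>.

```cpp
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

class Serializable
{
public:
    virtual ~Serializable() {}
    virtual std::string typeName() const = 0;  // hypothetical addition: the type tag
    virtual void serialize(std::ostream& stream) const = 0;
    virtual void deserialize(std::istream& stream) = 0;
};

// Maps a type tag to a function that creates an empty object of that type.
class Factory
{
public:
    using Creator = std::function<std::unique_ptr<Serializable>()>;

    void registerType(const std::string& name, Creator creator)
    {
        creators_[name] = std::move(creator);
    }

    std::unique_ptr<Serializable> create(const std::string& name) const
    {
        const auto it = creators_.find(name);
        if (it == creators_.end())
            throw std::runtime_error("unknown type: " + name);
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Example type, used only for illustration.
struct Age : Serializable
{
    unsigned int value = 0;
    std::string typeName() const override { return "Age"; }
    void serialize(std::ostream& out) const override { out << value << '\n'; }
    void deserialize(std::istream& in) override { in >> value; }
};

// Write the tag first, then the object's own data.
inline void save(std::ostream& out, const Serializable& object)
{
    out << object.typeName() << '\n';
    object.serialize(out);
}

// Read the tag, ask the factory for a matching object, then let it read itself.
inline std::unique_ptr<Serializable> load(std::istream& in, const Factory& factory)
{
    std::string name;
    if (!(in >> name))
        throw std::runtime_error("missing type tag");
    std::unique_ptr<Serializable> object = factory.create(name);
    object->deserialize(in);
    return object;
}
```

Each serializable type is registered once, for example with factory.registerType("Age", [] { return std::unique_ptr<Serializable>(new Age()); }), after which load can rebuild objects without the caller knowing their concrete type.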
Then you can create concrete Stream classes for each medium (like Text File, Binary File, In Memory, Network, etc).
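As a rough sketch of this layering, assuming a hypothetical Stream base class with two byte-level virtual functions: the primitive helpers are written once on top of them, and each medium only supplies the byte transport. MemoryStream and StdStream are illustrative names, and little-endian is an assumed byte order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical "Stream" interface: media implement the two byte-level
// functions, and typed helpers are built on top of them once.
class Stream
{
public:
    virtual ~Stream() {}
    virtual void writeBytes(const char* data, std::size_t size) = 0;
    virtual void readBytes(char* data, std::size_t size) = 0;

    // 16-bit value stored low byte first (little-endian by assumption).
    void writeU16(std::uint16_t value)
    {
        const char bytes[2] = { static_cast<char>(value & 0xFF),
                                static_cast<char>((value >> 8) & 0xFF) };
        writeBytes(bytes, 2);
    }

    std::uint16_t readU16()
    {
        char bytes[2];
        readBytes(bytes, 2);
        return static_cast<std::uint16_t>(
            static_cast<unsigned char>(bytes[0]) |
            (static_cast<unsigned char>(bytes[1]) << 8));
    }
};

// In-memory medium: appends to a buffer and reads from a cursor.
class MemoryStream : public Stream
{
public:
    void writeBytes(const char* data, std::size_t size) override
    {
        buffer_.insert(buffer_.end(), data, data + size);
    }

    void readBytes(char* data, std::size_t size) override
    {
        if (size > buffer_.size() - readPos_)
            throw std::runtime_error("read past end of buffer");
        std::copy(buffer_.data() + readPos_, buffer_.data() + readPos_ + size, data);
        readPos_ += size;
    }

private:
    std::vector<char> buffer_;
    std::size_t readPos_ = 0;
};

// Adapter over any std::iostream, e.g. std::fstream or std::stringstream.
class StdStream : public Stream
{
public:
    explicit StdStream(std::iostream& io) : io_(io) {}

    void writeBytes(const char* data, std::size_t size) override
    {
        io_.write(data, static_cast<std::streamsize>(size));
    }

    void readBytes(char* data, std::size_t size) override
    {
        if (!io_.read(data, static_cast<std::streamsize>(size)))
            throw std::runtime_error("read failed");
    }

private:
    std::iostream& io_;
};
```

Code that serializes objects then only depends on Stream, so the same serialize function can target a file, a memory buffer or a socket wrapper.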
Each class you want to be serializable then has to inherit the Serializable interface and implement the details (recursively leveraging serializable interfaces defined for other types if a compound/complex class).
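Applied to the Person type from the question, the recursion might look like the sketch below. It assumes a simple text format in which every string and container is preceded by its size; writeString and readString are hypothetical helpers, the enum is given the name Sex so it can be cast back, and error handling is left out for brevity. A vector<Person> member inside Person itself is formally supported from C++17 onwards.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers: a string is stored as "<length> <bytes>".
inline void writeString(std::ostream& out, const std::string& s)
{
    out << s.size() << ' ' << s;
}

inline std::string readString(std::istream& in)
{
    std::size_t size = 0;
    in >> size;
    in.get();  // skip the single separator after the length
    std::string s(size, '\0');
    if (size > 0) in.read(&s[0], static_cast<std::streamsize>(size));
    return s;
}

class Serializable
{
public:
    virtual ~Serializable() {}
    virtual void serialize(std::ostream& stream) const = 0;
    virtual void deserialize(std::istream& stream) = 0;
};

struct PersonInfo : Serializable
{
    enum Sex { undef, man, woman };
    unsigned int age_ = 0;
    std::string name_;
    Sex sex_ = undef;

    void serialize(std::ostream& out) const override
    {
        out << age_ << ' ';
        writeString(out, name_);
        out << ' ' << static_cast<int>(sex_) << ' ';
    }

    void deserialize(std::istream& in) override
    {
        int sex = 0;
        in >> age_;
        name_ = readString(in);
        in >> sex;
        sex_ = static_cast<Sex>(sex);
    }
};

struct Person : PersonInfo
{
    std::vector<Person> children_;
    std::map<std::string, PersonInfo> addrBook_;

    void serialize(std::ostream& out) const override
    {
        PersonInfo::serialize(out);          // base part first
        out << children_.size() << ' ';      // size prefix for the vector
        for (const Person& child : children_) child.serialize(out);
        out << addrBook_.size() << ' ';      // size prefix for the map
        for (const auto& entry : addrBook_) {
            writeString(out, entry.first);
            out << ' ';
            entry.second.serialize(out);
        }
    }

    void deserialize(std::istream& in) override
    {
        PersonInfo::deserialize(in);
        std::size_t childCount = 0;
        in >> childCount;
        children_.clear();
        for (std::size_t i = 0; i < childCount; ++i) {
            Person child;
            child.deserialize(in);           // recursion handles grandchildren
            children_.push_back(child);
        }
        std::size_t entryCount = 0;
        in >> entryCount;
        addrBook_.clear();
        for (std::size_t i = 0; i < entryCount; ++i) {
            const std::string key = readString(in);
            PersonInfo info;
            info.deserialize(in);
            addrBook_[key] = info;
        }
    }
};
```

Writing the element count before the elements is what lets the reader know how many children and address book entries to expect, which is the problem described in the question.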
This is of course a naive and "intrusive" way of adding serialisation (where you must modify the participating classes). You can then use template or preprocessor tricks to make it less intrusive. See Boost or protocol buffers, or any other library for ideas on how this might look in code.
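One way the template trick could look, as a small sketch rather than what Boost or protocol buffers actually do: serialization lives in free function overloads (the names save and load are hypothetical), containers are handled once by templates, and a user type such as the illustrative Point below gets its overloads added next to it without being modified.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Overloads for built-in and standard types, declared before the templates
// so the templates can find them.
inline void save(std::ostream& out, int value) { out << value << ' '; }
inline void load(std::istream& in, int& value) { in >> value; }

// Strings are stored as "<length> <bytes> " so spaces survive.
inline void save(std::ostream& out, const std::string& s)
{
    out << s.size() << ' ' << s << ' ';
}

inline void load(std::istream& in, std::string& s)
{
    std::size_t size = 0;
    in >> size;
    in.get();  // skip the single separator after the length
    s.assign(size, '\0');
    if (size > 0) in.read(&s[0], static_cast<std::streamsize>(size));
}

// Any vector whose element type has save/load overloads is handled here.
template <typename T>
void save(std::ostream& out, const std::vector<T>& items)
{
    out << items.size() << ' ';
    for (const T& item : items) save(out, item);
}

template <typename T>
void load(std::istream& in, std::vector<T>& items)
{
    std::size_t count = 0;
    in >> count;
    items.clear();
    for (std::size_t i = 0; i < count; ++i) {
        T item{};
        load(in, item);
        items.push_back(item);
    }
}

// A user type that knows nothing about serialization...
struct Point
{
    int x;
    int y;
};

// ...and the overloads that add it from the outside.
inline void save(std::ostream& out, const Point& p) { save(out, p.x); save(out, p.y); }
inline void load(std::istream& in, Point& p) { load(in, p.x); load(in, p.y); }
```

The trade-off is that free functions can only reach public members, so types with private state still need a friend declaration or accessor functions.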
Are you really sure you want to roll your own? It can get really messy, especially when you have pointers between objects (including cycles), which you also need to fix up/translate at some point before the loaded/deserialised data is correct for the current run.