C - serialization techniques

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/6002528/

Date: 2020-09-02 08:39:40  Source: igfitidea


Tags: c, serialization

Asked by ryyst

I'm writing some code to serialize some data to send it over the network. Currently, I use this primitive procedure:


  1. create a void* buffer
  2. apply any byte ordering operations such as the hton family on the data I want to send over the network
  3. use memcpy to copy the memory into the buffer
  4. send the memory over the network
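The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct Packet`, its two fields, `PACKET_WIRE_SIZE`, and `pack_packet` are hypothetical names, not part of any real protocol, and the actual send (step 4) is left to the caller.

```c
#include <arpa/inet.h>  /* htonl, htons */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message; the field layout is an assumption for illustration. */
struct Packet {
    uint32_t id;
    uint16_t flags;
};

/* Wire size: 4 bytes for id plus 2 bytes for flags, with no padding. */
#define PACKET_WIRE_SIZE ((size_t)6)

/* Steps 1-3: fill a caller-supplied buffer.
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t pack_packet(const struct Packet *p, void *buffer, size_t capacity) {
    assert(p != NULL && buffer != NULL);
    if (capacity < PACKET_WIRE_SIZE) {
        return 0;
    }

    unsigned char *out = buffer;
    uint32_t id = htonl(p->id);        /* step 2: host to network byte order */
    uint16_t flags = htons(p->flags);

    memcpy(out, &id, sizeof id);       /* step 3: copy into the buffer */
    memcpy(out + sizeof id, &flags, sizeof flags);

    return PACKET_WIRE_SIZE;           /* step 4 would send this many bytes */
}
```

Every new struct needs its own hand-written function of this kind, which is where the bloat described below comes from.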

The problem is that with various data structures (which often contain void* data so you don't know whether you need to care about byte ordering) the code becomes really bloated with serialization code that's very specific to each data structure and can't be reused at all.


What are some good serialization techniques for C that make this easier / less ugly?


-


Note: I'm bound to a specific protocol so I cannot freely choose how to serialize my data.


Answered by jstanley

For each data structure, have a serialize_X function (where X is the struct name) which takes a pointer to an X and a pointer to an opaque buffer structure and calls the appropriate serializing functions. You should supply some primitives such as serialize_int which write to the buffer and update the output index. Before writing any data, the primitives will have to call something like reserve_space(N), where N is the number of bytes required. reserve_space() will realloc the void* buffer to make it at least as big as its current size plus N bytes. To make this possible, the buffer structure will need to contain a pointer to the actual data, the index to write the next byte to (the output index), and the size that is allocated for the data. With this system, all of your serialize_X functions should be pretty straightforward, for example:


struct X {
    int n, m;
    char *string;
};

void serialize_X(struct X *x, struct Buffer *output) {
    serialize_int(x->n, output);
    serialize_int(x->m, output);
    serialize_string(x->string, output);
}

And the framework code will be something like:


#include <stdlib.h>

#define INITIAL_SIZE 32

struct Buffer {
    void *data;   /* the serialized bytes */
    int next;     /* index of the next byte to write (output index) */
    size_t size;  /* number of bytes allocated for data */
};

struct Buffer *new_buffer(void) {
    struct Buffer *b = malloc(sizeof(struct Buffer));

    b->data = malloc(INITIAL_SIZE);
    b->size = INITIAL_SIZE;
    b->next = 0;

    return b;
}

void reserve_space(struct Buffer *b, size_t bytes) {
    /* keep doubling until the request fits; doubling enforces O(lg N) reallocs */
    while((b->next + bytes) > b->size) {
        b->data = realloc(b->data, b->size * 2);
        b->size *= 2;
    }
}

From this, it should be pretty simple to implement all of the serialize_() functions you need.


EDIT: For example:


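#include <arpa/inet.h>  /* htonl */
#include <stdint.h>
#include <string.h>

void serialize_int(int x, struct Buffer *b) {
    /* htonl() works on 32-bit values, so go through uint32_t
       instead of assuming int == long */
    uint32_t net = htonl((uint32_t)x);

    reserve_space(b, sizeof(net));

    memcpy(((char *)b->data) + b->next, &net, sizeof(net));
    b->next += sizeof(net);
}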

EDIT: Also note that my code has some potential bugs. The size of the buffer array is stored in a size_t but the index is an int (I'm not sure if size_t is considered a reasonable type for an index). Also, there is no provision for error handling and no function to free the Buffer after you're done so you'll have to do this yourself. I was just giving a demonstration of the basic architecture that I would use.

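The gaps mentioned in the edit above (index type, error handling, freeing the buffer) and the serialize_string primitive used earlier could be filled in along these lines. This is a sketch, not the answer's actual code: `free_buffer`, `serialize_u32`, the int return codes, and the string wire format (a 32-bit big-endian length followed by the bytes, without a terminating NUL) are all assumptions chosen for illustration; a real protocol would dictate its own format.

```c
#include <arpa/inet.h>  /* htonl */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_SIZE 32

struct Buffer {
    unsigned char *data;
    size_t next;  /* index of the next byte to write */
    size_t size;  /* bytes allocated for data */
};

/* Returns NULL if either allocation fails. */
struct Buffer *new_buffer(void) {
    struct Buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->data = malloc(INITIAL_SIZE);
    if (b->data == NULL) { free(b); return NULL; }
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

void free_buffer(struct Buffer *b) {
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}

/* Doubles the capacity until `bytes` more fit. Returns 0 on success,
   -1 on allocation failure (the buffer is left unchanged in that case).
   Overflow of the doubled size is not checked in this sketch. */
int reserve_space(struct Buffer *b, size_t bytes) {
    assert(b != NULL);
    size_t new_size = b->size;
    while (b->next + bytes > new_size) {
        new_size *= 2;
    }
    if (new_size != b->size) {
        unsigned char *p = realloc(b->data, new_size);
        if (p == NULL) return -1;
        b->data = p;
        b->size = new_size;
    }
    return 0;
}

/* Writes a 32-bit value in network byte order. */
int serialize_u32(uint32_t x, struct Buffer *b) {
    uint32_t net = htonl(x);
    if (reserve_space(b, sizeof net) != 0) return -1;
    memcpy(b->data + b->next, &net, sizeof net);
    b->next += sizeof net;
    return 0;
}

/* Assumed wire format: 32-bit big-endian length, then the bytes, no NUL. */
int serialize_string(const char *s, struct Buffer *b) {
    size_t len = strlen(s);
    if (len > UINT32_MAX) return -1;
    if (serialize_u32((uint32_t)len, b) != 0) return -1;
    if (reserve_space(b, len) != 0) return -1;
    memcpy(b->data + b->next, s, len);
    b->next += len;
    return 0;
}
```

With return codes like these, a serialize_X function can stop at the first failing primitive and let the caller call free_buffer.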

Answered by Assaf Lavie

I would say definitely don't try to implement serialization yourself. It's been done a zillion times and you should use an existing solution. e.g. protobufs: https://github.com/protobuf-c/protobuf-c


It also has the advantage of being compatible with many other programming languages.


Answered by Bernardo Ramos

I suggest using a library.


As I was not happy with the existing ones, I created the Binn library to make our lives easier.


Here is an example of using it:


  binn *obj;

  // create a new object
  obj = binn_object();

  // add values to it
  binn_object_set_int32(obj, "id", 123);
  binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
  binn_object_set_double(obj, "price", 12.50);
  binn_object_set_blob(obj, "picture", picptr, piclen);

  // send over the network
  send(sock, binn_ptr(obj), binn_size(obj), 0);

  // release the buffer
  binn_free(obj);

Answered by Charlie Martin

It would help if we knew what the protocol constraints are, but in general your options are really pretty limited. If the data are such that you can make a union of each struct with a byte array of sizeof(struct) bytes, it might simplify things, but from your description it sounds like you have a more essential problem: if you're transferring pointers (you mention void * data) then those pointers are very unlikely to be valid on the receiving machine. Why would the data happen to appear at the same place in memory?

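A minimal sketch of the union idea, assuming a pointer-free struct made of fixed-width fields. `struct Reading`, `union ReadingBytes`, and `reading_to_bytes` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer-free record: fixed-width fields only. */
struct Reading {
    uint32_t sensor_id;
    uint32_t value;
};

/* View the same storage either as the struct or as raw bytes. */
union ReadingBytes {
    struct Reading reading;
    unsigned char bytes[sizeof(struct Reading)];
};

/* Copies the raw bytes of *r into out.
   Returns the number of bytes written, or 0 if out is too small. */
size_t reading_to_bytes(const struct Reading *r,
                        unsigned char *out, size_t capacity) {
    assert(r != NULL && out != NULL);
    union ReadingBytes u;
    if (capacity < sizeof u.bytes) {
        return 0;
    }
    u.reading = *r;                        /* store through the struct view */
    memcpy(out, u.bytes, sizeof u.bytes);  /* read back through the byte view */
    return sizeof u.bytes;
}
```

The bytes produced this way are in host byte order and include any padding the compiler adds, so both ends must agree on layout and endianness. If the struct held a char * or void * member, only the address would be copied, not the data it points to, which is exactly the problem described above.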