What is the best way to handle versioning using JSON protocol?

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/10042742/

Time: 2020-09-03 18:16:41  Source: igfitidea

What is the best way to handle versioning using JSON protocol?

json serialization binary versioning

Asked by Ted

I normally write all parts of the code in C#, and when writing serialized protocols I use FastSerializer, which serializes and deserializes classes quickly and efficiently. It is also very easy to use, and "versioning", i.e. handling different versions of the serialization, is fairly straightforward. What I normally use looks like this:

public override void DeserializeOwnedData(SerializationReader reader, object context)
{
    base.DeserializeOwnedData(reader, context);
    byte serializeVersion = reader.ReadByte(); // used to keep what version we are using

    this.CustomerNumber = reader.ReadString();
    this.HomeAddress = reader.ReadString();
    this.ZipCode = reader.ReadString();
    this.HomeCity = reader.ReadString();
    if (serializeVersion > 0)
        this.HomeAddressObj = reader.ReadUInt32();
    if (serializeVersion > 1)
        this.County = reader.ReadString();
    if (serializeVersion > 2)
        this.Muni = reader.ReadString();
    if (serializeVersion > 3)
        this._AvailableCustomers = reader.ReadList<uint>();
}

and

public override void SerializeOwnedData(SerializationWriter writer, object context)
{
    base.SerializeOwnedData(writer, context);
    byte serializeVersion = 4; // current format version; the reader reads this byte first
    writer.Write(serializeVersion);

    writer.Write(CustomerNumber);
    // NOTE: PopulationRegistryNumber and CustomerCards are written here but are not
    // read back in the DeserializeOwnedData excerpt above; reads and writes must
    // happen in the same order for the binary format to round-trip.
    writer.Write(PopulationRegistryNumber);
    writer.Write(HomeAddress);
    writer.Write(ZipCode);
    writer.Write(HomeCity);
    if (CustomerCards == null)
        CustomerCards = new List<uint>();
    writer.Write(CustomerCards);
    writer.Write(HomeAddressObj);

    writer.Write(County);

    // v 2
    writer.Write(Muni);

    // v 4
    if (_AvailableCustomers == null)
        _AvailableCustomers = new List<uint>();
    writer.Write(_AvailableCustomers);
}

So it's easy to add new things, or change the serialization completely if one chooses to.

However, I now want to use JSON, for reasons not relevant here =) I am currently using DataContractJsonSerializer, and I am looking for a way to get the same flexibility I have with the FastSerializer above.

So the question is: what is the best way to create a JSON protocol/serialization while still being able to control the serialization in detail as above, so that I do not break the serialization just because another machine hasn't yet updated its version?

Answered by monsur

The key to versioning JSON is to always add new properties, and never remove or rename existing properties. This is similar to how protocol buffers handle versioning.

For example, if you started with the following JSON:

{
  "version": "1.0",
  "foo": true
}

And you want to rename the "foo" property to "bar", don't just rename it. Instead, add a new property:

{
  "version": "1.1",
  "foo": true,
  "bar": true
}

Since you are never removing properties, clients based on older versions will continue to work. The downside of this method is that the JSON can get bloated over time, and you have to continue maintaining old properties.
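
As a rough sketch of what "never remove, only add" means for a newer reader: it can prefer the new name and fall back to the old one. This assumes .NET 6 or later and the built-in System.Text.Json, which the answer does not mention; FlagReader and ReadFlag are made-up names for illustration.

```csharp
using System;
using System.Text.Json;

// The two payloads from the answer above; a 1.1-aware reader handles either one.
string v10 = "{ \"version\": \"1.0\", \"foo\": true }";
string v11 = "{ \"version\": \"1.1\", \"foo\": true, \"bar\": true }";

Console.WriteLine(FlagReader.ReadFlag(v10)); // only "foo" is present, so it is used
Console.WriteLine(FlagReader.ReadFlag(v11)); // "bar" is present, so it wins

static class FlagReader
{
    // Hypothetical helper: prefer the new property name, fall back to the old one.
    public static bool ReadFlag(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;

        if (root.TryGetProperty("bar", out JsonElement bar))
            return bar.GetBoolean();
        if (root.TryGetProperty("foo", out JsonElement foo))
            return foo.GetBoolean();

        throw new JsonException("Neither \"bar\" nor \"foo\" is present.");
    }
}
```

An old client that only knows "foo" keeps working against both payloads, because "foo" is still there.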

It is also important to clearly define your "edge" cases to your clients. Suppose you have an array property called "fooList". The "fooList" property could take on the following possible values: does not exist/undefined (the property is not physically present in the JSON object, or it exists and is set to "undefined"), null, an empty list, or a list with one or more values. It is important that clients understand how to behave, especially in the undefined/null/empty cases.
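
To make the four cases concrete, here is a small sketch that tells them apart. It assumes .NET 6 or later and System.Text.Json; FooListState and FooListInspector are invented names. Strict JSON has no "undefined" literal, so the sketch treats an absent property as that case.

```csharp
using System;
using System.Text.Json;

foreach (string json in new[] { "{}", "{\"fooList\":null}", "{\"fooList\":[]}", "{\"fooList\":[1,2]}" })
    Console.WriteLine($"{json} -> {FooListInspector.Classify(json)}");

enum FooListState { Missing, Null, Empty, HasValues }

static class FooListInspector
{
    // Hypothetical helper: reports which of the four "fooList" cases a payload is in.
    public static FooListState Classify(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);

        // Case 1: the property is not present in the object at all.
        if (!doc.RootElement.TryGetProperty("fooList", out JsonElement list))
            return FooListState.Missing;

        // Case 2: the property is present but explicitly null.
        if (list.ValueKind == JsonValueKind.Null)
            return FooListState.Null;

        if (list.ValueKind != JsonValueKind.Array)
            throw new JsonException("\"fooList\" must be an array or null.");

        // Cases 3 and 4: an empty array versus one with values.
        return list.GetArrayLength() == 0 ? FooListState.Empty : FooListState.HasValues;
    }
}
```

What a client should do in each state is a protocol decision; the point is only that the states are distinguishable and should be documented.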

I would also recommend reading up on how semantic versioning works. If you introduce a semantic versioning scheme to your version numbers, then backwards compatible changes can be made on a minor version boundary, while breaking changes can be made on a major version boundary (both clients and servers would have to agree on the same major version). While this isn't a property of the JSON itself, it is useful for communicating the types of changes a client should expect when the version changes.
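
A minimal sketch of the "same major version" rule, assuming the payload carries a "version" string like the examples earlier in this answer, and assuming .NET 6 or later with System.Text.Json. VersionGate and the supported version 1.0 are illustrative choices, not part of the answer.

```csharp
using System;
using System.Text.Json;

Console.WriteLine(VersionGate.CanRead("{ \"version\": \"1.1\", \"foo\": true }")); // same major as the reader
Console.WriteLine(VersionGate.CanRead("{ \"version\": \"2.0\", \"foo\": true }")); // different major

static class VersionGate
{
    // Assumption: this reader was built against protocol version 1.0.
    private static readonly Version Supported = new Version(1, 0);

    // Accepts a payload when its major version matches; the minor version is ignored.
    public static bool CanRead(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var text = doc.RootElement.GetProperty("version").GetString();

        // System.Version parses "major.minor" strings such as "1.1";
        // a bare "1" is rejected, so TryParse returns false for it.
        return Version.TryParse(text, out var payload)
            && payload.Major == Supported.Major;
    }
}
```

GetProperty throws if "version" is absent, which fits a protocol where the version field is mandatory.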

Answered by shashankaholic

Google's Java-based gson library has excellent versioning support for JSON. It could prove very handy if you are thinking of going the Java way.

There is a nice and easy tutorial here.

Answered by Adrian Salazar

Don't use DataContractJsonSerializer. As the name says, the objects that are processed through this class will have to:

a) Be marked with [DataContract] and [DataMember] attributes.

b) Be strictly compliant with the defined "Contract", that is, nothing less and nothing more than what is defined. Any extra or missing [DataMember] will make the deserialization throw an exception.

If you want to be flexible enough, then use the JavaScriptSerializer if you want to go for the cheap option... or use this library:

http://json.codeplex.com/

This will give you enough control over your JSON serialization.

Imagine you have an object in its early days.

public class Customer
{ 
    public string Name;

    public string LastName;
}

Once serialized it will look like this:

{ "Name": "John", "LastName": "Doe" }

If you change your object definition to add or remove fields, the deserialization will still go smoothly if you use, for example, JavaScriptSerializer.

public class Customer
{ 
    public string Name;

    public string LastName;

    public int Age;
}

If you try to deserialize the previous JSON into this new class, no error will be thrown. The thing is that your new fields will be set to their defaults. In this example, "Age" will be set to zero.

You can include, as one of your own conventions, a field present in all your objects that contains the version number. That way you can tell the difference between an empty field and a version inconsistency.

So let's say you have your class Customer v1 serialized:

{ "Version": 1, "LastName": "Doe", "Name": "John" }

If you deserialize it into a Customer v2 instance, you will have:

{ "Version": 1, "LastName": "Doe", "Name": "John", "Age": 0 }

You can then detect which fields in your object are reliable and which are not. In this case you know that your v2 object instance is coming from a v1 object instance, so the field Age should not be trusted.

I would also suggest using a custom attribute, e.g. "MinVersion", and marking each field with the minimum supported version number, so you get something like this:

public class Customer
{ 
    [MinVersion(1)]
    public int Version;

    [MinVersion(1)]
    public string Name;

    [MinVersion(1)]
    public string LastName;

    [MinVersion(2)]
    public int Age;
}

Then later you can access this meta-data and do whatever you might need with that.
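
One way this could look, as a sketch only: the answer names the "MinVersion" attribute but does not define it, so the attribute class and the VersionInfo.UntrustedFields helper below are my assumptions. Plain reflection is enough; no JSON library is involved. Written for .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Which Customer fields can a version-1 payload not have filled in?
Console.WriteLine(string.Join(", ", VersionInfo.UntrustedFields(typeof(Customer), 1)));

// Assumed definition of the attribute the answer only names.
[AttributeUsage(AttributeTargets.Field)]
sealed class MinVersionAttribute : Attribute
{
    public int Version { get; }
    public MinVersionAttribute(int version) => Version = version;
}

class Customer
{
    [MinVersion(1)] public int Version;
    [MinVersion(1)] public string Name;
    [MinVersion(1)] public string LastName;
    [MinVersion(2)] public int Age;
}

static class VersionInfo
{
    // Hypothetical helper: names of the fields introduced after the payload's version.
    // Fields without the attribute are treated as present since version 0.
    public static IEnumerable<string> UntrustedFields(Type type, int payloadVersion) =>
        type.GetFields(BindingFlags.Public | BindingFlags.Instance)
            .Where(f => (f.GetCustomAttribute<MinVersionAttribute>()?.Version ?? 0) > payloadVersion)
            .Select(f => f.Name);
}
```

After deserializing, you would read the object's Version field and pass it as payloadVersion to learn which defaults not to trust.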

Answered by Lie Ryan

It doesn't matter what serialization protocol you use; the techniques for versioning APIs are generally the same.

Generally you need:

  1. a way for the consumer to communicate to the producer the API version it accepts (though this is not always possible)
  2. a way for the producer to embed versioning information in the serialized data
  3. a backward compatible strategy to handle unknown fields

In a web API, the API version that the consumer accepts is generally embedded in the Accept header (e.g. Accept: application/vnd.myapp-v1+json, application/vnd.myapp-v2+json means the consumer can handle either version 1 or version 2 of your API) or, less commonly, in the URL (e.g. https://api.twitter.com/1/statuses/user_timeline.json). This is generally used for major versions (i.e. backward incompatible changes). If the server and the client do not have a matching version in the Accept header, then the communication fails (or proceeds on a best-effort basis, or falls back to a default baseline protocol, depending on the nature of the application).
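
A sketch of the producer side of that negotiation, written for .NET 6 or later. The set of supported versions and the Negotiator name are assumptions, and for brevity the parser ignores q-values and any media types other than the vnd.myapp pattern from the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

Console.WriteLine(Negotiator.PickVersion(
    "application/vnd.myapp-v1+json, application/vnd.myapp-v2+json"));

static class Negotiator
{
    // Assumption: the major versions this producer is able to emit.
    private static readonly int[] Supported = { 1, 2, 3 };

    // Returns the highest major version both sides support, or null when there is none.
    public static int? PickVersion(string acceptHeader)
    {
        var requested = Regex.Matches(acceptHeader, @"application/vnd\.myapp-v(\d+)\+json")
            .Cast<Match>()
            .Select(m => int.Parse(m.Groups[1].Value));

        var common = requested.Intersect(Supported).ToList();
        return common.Count == 0 ? (int?)null : common.Max();
    }
}
```

A null result is where the application decides between failing, best effort, or a baseline protocol.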

The producer then generates serialized data in one of the requested versions and embeds this version info in the serialized data (e.g. as a field named version). The consumer should use the version information embedded in the data to determine how to parse the serialized data. The version information in the data should also contain the minor version (i.e. for backward compatible changes). Generally, consumers should be able to ignore the minor version information and still process the data correctly, although understanding the minor version may allow the client to make additional assumptions about how the data should be processed.

A common strategy for handling unknown fields is similar to how HTML and CSS are parsed. When the consumer sees an unknown field, it should ignore it, and when the data is missing a field that the consumer is expecting, it should use a default value. Depending on the nature of the communication, you may also want to specify some fields as mandatory (i.e. a missing field is considered a fatal error). Fields added within minor versions should always be optional. A minor version can add optional fields or change field semantics as long as the change is backward compatible, while a major version can delete fields, add mandatory fields, or change field semantics in a backward incompatible manner.
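
The ignore/default/mandatory rules can be sketched like this, assuming .NET 6 or later and System.Text.Json (not something this answer prescribes). The "Nickname" property, the -1 default for Age, and the choice of Name as the mandatory field are all made up for the example.

```csharp
using System;
using System.Text.Json;

var c = CustomerReader.Read("{ \"Name\": \"John\", \"LastName\": \"Doe\", \"Nickname\": \"JD\" }");
Console.WriteLine($"{c.Name} {c.LastName}, age {c.Age}");

class Customer
{
    public string Name { get; set; }      // treated as mandatory by CustomerReader
    public string LastName { get; set; }
    public int Age { get; set; } = -1;    // optional; -1 is an assumed "unknown" default
}

static class CustomerReader
{
    public static Customer Read(string json)
    {
        // By default System.Text.Json skips properties the class does not declare
        // ("Nickname" here), and properties absent from the JSON keep whatever
        // value the class initialized them with.
        Customer customer = JsonSerializer.Deserialize<Customer>(json)
            ?? throw new JsonException("Payload was JSON null.");

        // Mandatory field: a missing (or null) Name is treated as a fatal error.
        if (customer.Name == null)
            throw new JsonException("Mandatory field \"Name\" is missing.");

        return customer;
    }
}
```

Using a sentinel default such as -1 instead of the type's zero value makes "not sent" distinguishable from a real 0.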

In an extensible serialization format (like JSON or XML), the data should be self-descriptive; in other words, the field names should always be stored together with the data. You should not rely on specific data being available at specific positions.
