Is it worth the effort to try to reduce JSON size?

Note: this page is a translation mirror of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license and attribute it to the original authors (not the translator). Original question: http://stackoverflow.com/questions/11160941/



Tags: json, performance, http, bandwidth

Asked by Attila O.

I am submitting a relatively large amount of data from a mobile application (up to 1000 JSON objects), which I would normally encode like this:


[{
    "id": 12,
    "score": 34,
    "interval": 5678,
    "sub": 9012
}, {
    "id": ...
}, ...]

I could make the payload smaller by submitting an array of arrays instead:


[[12, 34, 5678, 9012], [...], ...]

to save some space on the property names, and recreate the objects on the server (as the schema is fixed, or at least it is a contract between the server and the client).

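A minimal sketch of that transformation in Python (assuming the field order is fixed by the client/server contract; the FIELDS tuple below is my own naming, not from the question):

import json

# Field order agreed between client and server (the "contract").
FIELDS = ("id", "score", "interval", "sub")

def pack(records):
    # Turn a list of objects into a list of positional arrays.
    return [[r[f] for f in FIELDS] for r in records]

def unpack(rows):
    # Rebuild the original objects on the server.
    return [dict(zip(FIELDS, row)) for row in rows]

records = [{"id": 12, "score": 34, "interval": 5678, "sub": 9012}]
payload = json.dumps(pack(records))            # '[[12, 34, 5678, 9012]]'
assert unpack(json.loads(payload)) == records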

The payload is then submitted in a POST request, most likely over a 3G connection (or possibly Wi-Fi).


It looks like I am saving some bandwidth by using nested arrays, but I'm not sure it is noticeable when gzip is applied, and I'm not sure how to precisely and objectively measure the difference.


On the other hand, the nested arrays don't feel like a good idea: they are less readable and thus harder to spot errors in while debugging. Also, since we're flushing readability down the toilet anyway, we could just flatten the array: each child array has a fixed number of elements, so the server could just slice it up and reconstruct the objects again.

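A rough sketch of that flattened variant, under the same assumed four-field schema; the server slices the flat list into fixed-size chunks:

FIELDS = ("id", "score", "interval", "sub")

def flatten(records):
    # One flat list: [12, 34, 5678, 9012, 98, 76, ...]
    return [r[f] for r in records for f in FIELDS]

def reconstruct(flat):
    n = len(FIELDS)
    # Slice the flat list into chunks of n values, one object per chunk.
    return [dict(zip(FIELDS, flat[i:i + n])) for i in range(0, len(flat), n)]

records = [{"id": 12, "score": 34, "interval": 5678, "sub": 9012},
           {"id": 98, "score": 76, "interval": 5432, "sub": 1098}]
assert reconstruct(flatten(records)) == records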

Any further reading material on this topic is much appreciated.


Accepted answer by John Gietzen

JSONH, aka hpack (https://github.com/WebReflection/JSONH), does something very similar to your example:


[{
    "id": 12,
    "score": 34,
    "interval": 5678,
    "sub": 9012
}, {
    "id": 98,
    "score": 76,
    "interval": 5432,
    "sub": 1098
}, ...]

Would turn into:


[["id","score","interval","sub"],12,34,5678,9012,98,76,5432,1098,...]

Answered by artoonie

JSON is meant for readability. You could have an intermediate format if you're concerned about space. Create a serialize/deserialize function which takes a JSON file and creates a compressed binary storing your data as compactly as is reasonable, then read that format on the other end of the line.


See http://en.wikipedia.org/wiki/Json. First sentence: "JSON ... is a lightweight text-based open standard designed for human-readable data interchange."


Essentially, my point is that humans would always see the JSON, and machines would primarily see the binary. You get the best of both worlds: readability and small data transfer (at the cost of a tiny amount of computation).

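As a hedged illustration of such an intermediate format (one possible design, not a standard one): with the fixed four-integer schema from the question, Python's struct module can pack each record into 16 bytes.

import json
import struct

FIELDS = ("id", "score", "interval", "sub")
RECORD = struct.Struct("<iiii")   # four little-endian 32-bit ints = 16 bytes

def to_binary(json_text):
    # JSON text -> compact fixed-width binary records.
    records = json.loads(json_text)
    return b"".join(RECORD.pack(*(r[f] for f in FIELDS)) for r in records)

def from_binary(blob):
    # Binary records -> the original list of objects.
    return [dict(zip(FIELDS, RECORD.unpack_from(blob, off)))
            for off in range(0, len(blob), RECORD.size)]

text = '[{"id": 12, "score": 34, "interval": 5678, "sub": 9012}]'
blob = to_binary(text)            # 16 bytes instead of ~50 characters of JSON
assert from_binary(blob) == json.loads(text)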

Answered by usr

Gzip will replace the recurring parts of your message with small back-references to their first occurrence. The algorithm is pretty "dumb", but for this kind of repetitive data it is great. I think you won't see noticeable decreases in over-the-wire size, because your object "structure" is effectively sent only once.


You can roughly test this by zipping two sample JSONs, or by capturing an HTTP request with Fiddler, which can show the compressed and uncompressed sizes.

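A quick way to run that test with Python's standard library (the sample payload below is made up; real numbers will vary with your data):

import gzip
import json

objects = [{"id": i, "score": i * 2, "interval": i * 3, "sub": i * 4}
           for i in range(1000)]
arrays = [[o["id"], o["score"], o["interval"], o["sub"]] for o in objects]

for label, payload in (("objects", objects), ("arrays", arrays)):
    raw = json.dumps(payload).encode("utf-8")
    print(f"{label}: raw={len(raw)} bytes, gzipped={len(gzip.compress(raw))} bytes")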

Answered by ArjunShankar

Since you're using this on a mobile device (you mention 3G), you might actually want to care about size, not readability. Moreover, do you frequently expect to read what is being transmitted over the wire?


This is a suggestion for an alternate form.


ProtoBuf is one option. Google uses it internally, and there is a ProtoBuf 'compiler' which can read .proto files (containing a message description) and generate Java/C++/Python serializers/deserializers, which use a binary form for transmission over the wire. You simply use the generated classes on both ends, and forget about what the object looks like when transmitted over the wire. There is also an Obj-C port maintained externally.


Here is a comparison of ProtoBuf against XML, on the ProtoBuf website (I know XML is not what you use, but still).


Finally, here is a Python tutorial.

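As a sketch of what that could look like in Python, assuming a hypothetical scores.proto compiled with protoc --python_out=. (the message and field names here are my own invention, matched to the question's schema):

# Assumed contents of scores.proto:
#
#   syntax = "proto3";
#   message Record {
#     int32 id       = 1;
#     int32 score    = 2;
#     int32 interval = 3;
#     int32 sub      = 4;
#   }
#   message Batch {
#     repeated Record records = 1;
#   }

import scores_pb2                 # generated by protoc from scores.proto

batch = scores_pb2.Batch()
r = batch.records.add()           # append one Record to the repeated field
r.id, r.score, r.interval, r.sub = 12, 34, 5678, 9012

wire = batch.SerializeToString()  # compact binary body for the POST request

received = scores_pb2.Batch()
received.ParseFromString(wire)    # the server rebuilds the same objects
assert received.records[0].interval == 5678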

Answered by Diego Menta

Although this is an old question, I'd like to add a few words.


In my experience, large differences in raw JSON size amount to very little after compression. I prefer to keep it human-readable.


In real-case numbers: a JSON file of 1.29 MB and its optimized version of 145 KB compressed down to 32 KB and 9 KB, respectively.


Except in extreme conditions, I think these differences are negligible, while the cost in readability is huge.


A:


{
  "Code": "FCEB97B6",
  "Date": "\/Date(1437706800000)\/",
  "TotalQuantity": 1,
  "Items": [
    {
      "CapsulesQuantity": 0,
      "Quantity": 1,
      "CurrentItem": {
        "ItemId": "SHIELD_AXA",
        "Order": 30,
        "GroupId": "G_MODS",
        "TypeId": "T_SHIELDS",
        "Level": 0,
        "Rarity": "R4",
        "UniqueId": null,
        "Name": "AXA Shield"
      }
    }
  ],
  "FormattedDate": "2015-Jul.-24"
}

B:


{
  "fDate": "2016-Mar.-01",
  "totCaps": 9,
  "totIts": 14,
  "rDays": 1,
  "avg": "1,56",
  "cells": {
    "00": {
      "30": 1
    },
    "03": {
      "30": 1
    },
    "08": {
      "25": 1
    },
    "09": {
      "26": 3
    },
    "12": {
      "39": 1
    },
    "14": {
      "33": 1
    },
    "17": {
      "40": 3
    },
    "19": {
      "41": 2
    },
    "20": {
      "41": 1
    }
  }
}

These are fragments of the two files.
