Fastest possible Javascript object serialization with Google V8
Original question: http://stackoverflow.com/questions/6218524/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Question
I need to serialize moderately complex objects with anywhere from one to hundreds of mixed-type properties.
JSON was used originally, then I switched to BSON, which is marginally faster.
Encoding 10000 sample objects:
- JSON: 1807 ms
- BSON: 1687 ms
- MessagePack: 2644 ms (JS, modified for BinaryF)
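For reference, a timing loop of the kind that could produce figures like these might look as follows. This is only a sketch for Node.js: `makeSample` and `benchmark` are hypothetical names, and the sample object is invented here because the question does not show its real objects.

```javascript
// Hypothetical sample object; the question's real objects are not shown.
function makeSample(i) {
  return {
    id: i,
    name: 'object-' + i,
    active: i % 2 === 0,
    score: i * 1.5,
    tags: ['a', 'b', 'c'],
    nested: { created: 1300000000000 + i, note: 'some text' }
  };
}

// Time how long `encode` takes over `count` objects, in milliseconds.
function benchmark(encode, count) {
  const samples = [];
  for (let i = 0; i < count; i++) samples.push(makeSample(i));

  let size = 0; // accumulate output size so the encoding work is actually used
  const start = process.hrtime.bigint();
  for (const sample of samples) size += encode(sample).length;
  const elapsedNs = process.hrtime.bigint() - start;

  return { ms: Number(elapsedNs) / 1e6, size };
}

const result = benchmark((obj) => JSON.stringify(obj), 10000);
console.log('JSON: ' + result.ms.toFixed(1) + ' ms, output length ' + result.size);
```

Any other encoder that returns a string or Buffer can be passed in place of the `JSON.stringify` wrapper, which keeps the comparison between formats on equal footing.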
I want an order-of-magnitude speedup; serialization is having a ridiculously bad impact on the rest of the system.
Part of the motivation for moving to BSON is the requirement to encode binary data, so JSON is (now) unsuitable. And because JSON simply skips the binary data present in the objects, it is "cheating" in those benchmarks.
Profiled BSON performance hot-spots:
- (unavoidable?) conversion of UTF16 V8 JS strings to UTF8.
- malloc and string ops inside the BSON library
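Both hot-spots can be illustrated from the JavaScript side. The sketch below, assuming Node.js `Buffer`, shows that the UTF-8 size of a string differs from its UTF-16 size (so an encoder has to scan the string), and that writing into one reused buffer avoids a fresh allocation per string. `scratch` and `writeUtf8` are hypothetical names, not part of any BSON library.

```javascript
const text = 'h\u00e9llo'; // "héllo": 5 UTF-16 code units in V8

// The encoded size depends on the target encoding.
const utf8Size = Buffer.byteLength(text, 'utf8');     // U+00E9 takes 2 bytes
const utf16Size = Buffer.byteLength(text, 'utf16le'); // 2 bytes per code unit

// Reuse one scratch buffer across calls instead of allocating per string.
const scratch = Buffer.alloc(1024);

function writeUtf8(str, offset) {
  // buf.write returns the number of bytes written
  return scratch.write(str, offset, 'utf8');
}

const written = writeUtf8(text, 0);
console.log(utf8Size, utf16Size, written); // 6 10 6
```

Whether a similar reuse strategy is practical inside the C++ BSON encoder depends on that library's internals, which the question does not describe.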
The BSON encoder is based on the Mongo BSON library.
A native V8 binary serializer might be wonderful, yet as JSON is native and quick to serialize, I fear even that might not provide the answer. Perhaps my best bet is to optimize the heck out of the BSON library, or write my own, and figure out a far more efficient way to pull strings out of V8. One tactic might be to add UTF16 support to BSON.
So I'm here for ideas, and perhaps a sanity check.
Edit
Added MessagePack benchmark. This was modified from the original JS to use BinaryF.
The C++ MessagePack library may offer further improvements; I may benchmark it in isolation to compare directly with the BSON library.
Accepted answer by deft_code
For serialization / deserialization, protobuf is pretty tough to beat. I don't know if you can switch out the transport protocol, but if you can, protobuf should definitely be considered.
Take a look at all the answers to Protocol Buffers versus JSON or BSON.
The accepted answer there chooses thrift. It is, however, slower than protobuf. I suspect it was chosen for ease of use (with Java), not speed. These Java benchmarks are very telling.
Of note:
- MongoDB-BSON 45042
- protobuf 6539
- protostuff/protobuf 3318
The benchmarks are Java, but I'd imagine that you can achieve speeds near the protostuff implementation of protobuf, i.e. 13.5 times faster than BSON. Worst case (if for some reason Java is just better for serialization), you can do no worse than the plain unoptimized protobuf implementation, which runs 6.8 times faster.
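The two speedup figures follow directly from the benchmark numbers listed above. A small sketch of the arithmetic, assuming the quoted values are timings where lower is faster (`timings` and `speedup` are names made up for this example):

```javascript
// Figures quoted in the answer (assumed to be timings; lower is faster).
const timings = { bson: 45042, protobuf: 6539, protostuff: 3318 };

// Truncate to one decimal place, which reproduces the answer's 13.5 and 6.8.
function speedup(baseline, candidate) {
  return Math.floor((baseline / candidate) * 10) / 10;
}

console.log(speedup(timings.bson, timings.protostuff)); // 13.5
console.log(speedup(timings.bson, timings.protobuf));   // 6.8
```

The untruncated ratios are roughly 13.58 and 6.89, so the answer's figures appear to be rounded down.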
Answer by Luke Bennett
Take a look at MessagePack. It's compatible with JSON. From the docs:
Fast and Compact Serialization
MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.
Typical small integer (like flags or error code) is saved only in 1 byte, and typical short string only needs 1 byte except the length of the string itself. [1,2,3] (3 elements array) is serialized in 4 bytes using MessagePack as follows:
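The byte layout the docs refer to can be reproduced with a tiny hand-written encoder. This is a minimal sketch, not the MessagePack library itself: `encodeTiny` is a hypothetical helper that handles only integers from 0 to 127 (positive fixint) and arrays of at most 15 items (fixarray), which is enough for the [1,2,3] example.

```javascript
// Encodes only integers 0..127 and arrays of up to 15 such values.
function encodeTiny(value) {
  if (Array.isArray(value)) {
    if (value.length > 15) throw new RangeError('fixarray holds at most 15 items');
    const parts = [Buffer.from([0x90 | value.length])]; // fixarray header byte
    for (const item of value) parts.push(encodeTiny(item));
    return Buffer.concat(parts);
  }
  if (Number.isInteger(value) && value >= 0 && value <= 0x7f) {
    return Buffer.from([value]); // positive fixint: the byte is the value
  }
  throw new TypeError('unsupported value in this sketch');
}

console.log(encodeTiny([1, 2, 3]).toString('hex')); // 93010203
```

The result is one header byte (0x93, an array of three elements) followed by one byte per integer, for 4 bytes in total, compared with 7 characters for the JSON text `[1,2,3]`.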
Answer by Wavey
If you are more interested in deserialization speed, take a look at the JBB (Javascript Binary Bundles) library. It is faster than BSON or MsgPack.
From the wiki page "JBB vs BSON vs MsgPack":
...
- JBB is about 70% faster than Binary-JSON (BSON) and about 30% faster than MsgPack on decoding speed, even with one negative test-case (#3).
- JBB creates files that (even their compressed versions) are about 61% smaller than Binary-JSON (BSON) and about 55% smaller than MsgPack.
...
Unfortunately, it's not a streaming format, meaning that you must pre-process your data offline. However, there is a plan to convert it into a streaming format (check the milestones).