Performance comparison of Thrift, Protocol Buffers, JSON, EJB, other?
Original question: http://stackoverflow.com/questions/296650/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original address, and attribute it to the original authors (not me): Stack Overflow
Asked by Parand
We're looking into transport/protocol solutions and were about to do various performance tests, so I thought I'd check with the community if they've already done this:
Has anyone done server performance tests for simple echo services as well as serialization/deserialization for various message sizes comparing EJB3, Thrift, and Protocol Buffers on Linux?
The primary languages will be Java, C/C++, Python, and PHP.
Update: I'm still very interested in this; if anyone has done any further benchmarks, please let me know. Also, there is a very interesting benchmark showing compressed JSON performing similarly to or better than Thrift and Protocol Buffers, so I'm adding JSON to this question as well.
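To make the "compressed JSON" point concrete, here is a minimal sketch of how one might compare the raw and gzipped size of a JSON payload using only the JDK. The payload shape and the names `CompressedJsonSize`, `buildJson` and `gzip` are assumptions made for illustration; this is not the benchmark referred to above, and it measures size only, not speed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressedJsonSize {
    // Builds a JSON array of n similar records (a made-up payload for illustration).
    static String buildJson(int n) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"user").append(i)
              .append("\",\"active\":true}");
        }
        return sb.append(']').toString();
    }

    // Compresses the bytes with gzip; closing the stream flushes the gzip trailer.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = buildJson(1000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw JSON bytes:     " + raw.length);
        System.out.println("gzipped JSON bytes: " + packed.length);
    }
}
```

Repetitive field names compress well, which is one reason compressed JSON can end up close to binary formats in size. The CPU cost of compressing and decompressing would still have to be measured separately.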
Accepted answer by Eishay Smith
The latest comparison is available at the thrift-protobuf-compare project wiki. It includes many other serialization libraries.
Answer by user38123
You may be interested in this question: "Biggest differences of Thrift vs Protocol Buffers?"
Answer by Jon Skeet
One of the things near the top of my "to-do" list for PBs is to port Google's internal Protocol Buffer performance benchmark - it's mostly a case of taking confidential message formats and turning them into entirely bland ones, and then doing the same for the data.
When that's been done, I'd imagine you could build the same messages in Thrift and then compare the performance.
In other words, I don't have the data for you yet - but hopefully in the next couple of weeks...
Answer by Vladimir Dyuzhev
If raw network performance is the target, then nothing beats IIOP (see RMI/IIOP). Smallest possible footprint -- only binary data, no markup at all. Serialization/deserialization is very fast too.
Since it's IIOP (that is, CORBA), almost all languages have bindings.
But I presume performance is not the only requirement, right?
Answer by eishay
I'm in the process of writing some code in an open source project named thrift-protobuf-compare, which compares protobuf and Thrift. For now it covers a few serialization aspects, but I intend to cover more. The results (for Thrift and Protobuf) are discussed in my blog; I'll add more when I get to it. You may look at the code to compare the API, description language and generated code. I'll be happy to have contributions to achieve a more rounded comparison.
Answer by StaxMan
I did test the performance of PB against a number of other data formats (xml, json, default object serialization, hessian, one proprietary one) and libraries (jaxb, fast infoset, hand-written) for a data binding task (both reading and writing), but Thrift's format(s) were not included. Performance for formats with multiple converters (like xml) had very high variance, from very slow to pretty-darn-fast. The correlation between the authors' claims and the measured performance was rather weak, especially for the packages that made the wildest claims.
For what it is worth, I found PB performance to be a bit over-hyped (usually not by its authors, but by others who only know who wrote it). With default settings it did not beat the fastest textual xml alternative. With optimized mode (why is this not the default?), it was a bit faster, comparable with the fastest JSON package. Hessian was rather fast, as was textual json. The proprietary binary format (no name here, it was company internal) was the slowest. Java object serialization was fast for larger messages, less so for small objects (i.e. high fixed per-operation overhead). With PB the message size was compact, but given all the trade-offs you have to make (data is not self-descriptive: if you lose the schema, you lose data; there are indexes of course, and value types, but you would have to reverse-engineer from those back to field names if you wanted to), I personally would only choose it for specific use cases -- size-sensitive, closely coupled systems where the interface/format never (or very very rarely) changes.
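To illustrate the "indexes and value types, but no field names" remark, here is a small hand-rolled sketch of how a single integer field is laid out in the Protocol Buffers wire format: a tag holding the field number and wire type, followed by a varint value. It deliberately does not use the protobuf library; the class and method names (`WireFormatSketch`, `encodeVarintField`, `hex`) are made up for this example.

```java
import java.io.ByteArrayOutputStream;

public class WireFormatSketch {
    // Writes a base-128 varint: 7 bits per byte, high bit set on all but the last byte.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint field: tag = (fieldNumber << 3) | wireType, where wire type 0 means varint.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Field number 1 with value 150 prints "08 96 01".
        System.out.println(hex(encodeVarintField(1, 150)));
    }
}
```

The three bytes carry only "field 1, varint, 150". Whether field 1 is an age, a port or a count is recorded in the schema alone, which is why losing the schema effectively means losing the meaning of the data.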
My opinion on this is that (a) implementation often matters more than specification (of the data format), and (b) end-to-end, the differences between best-of-breed implementations (for different formats) are usually not big enough to dictate the choice. That is, you may be better off choosing the format+API/lib/framework you like using most (or that has the best tool support), finding the best implementation, and seeing if that works fast enough. If (and only if!) not, consider the next best alternative.
ps. Not sure what EJB3 would be here. Maybe just plain Java serialization?
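As a rough starting point for the "see if that works fast enough" check, and to show the fixed overhead of plain Java serialization on small objects mentioned above, here is a minimal sketch using only the JDK. The `Point` class, the method names and the round count are assumptions for illustration; the timing loop has no JIT warm-up, so its number is indicative at best and is not a benchmark result.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SmallObjectOverhead {
    // Hypothetical tiny message: an int and a long, i.e. 12 bytes of actual payload.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final long y;
        Point(int x, long y) { this.x = x; this.y = y; }
    }

    // Default Java serialization: stream header and class descriptor, then the fields.
    static byte[] javaSerialize(Point p) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Hand-written encoding: just the two field values, no metadata.
    static byte[] rawFields(Point p) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(out)) {
            dos.writeInt(p.x);
            dos.writeLong(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Point p = new Point(7, 42L);
        System.out.println("raw field bytes:       " + rawFields(p).length);
        System.out.println("Java serialized bytes: " + javaSerialize(p).length);

        // Crude timing loop; summing the lengths keeps the work from being optimized away.
        int rounds = 100_000;
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            total += javaSerialize(p).length;
        }
        long elapsedNs = System.nanoTime() - start;
        System.out.println("avg ns per writeObject: " + (elapsedNs / rounds) + " (bytes written: " + total + ")");
    }
}
```

For anything beyond a quick sanity check, a dedicated harness such as JMH would give more trustworthy numbers than a hand-written loop.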
Answer by michaelok
To back up Vladimir's point about IIOP, here's an interesting performance test that should give some additional info beyond the Google benchmarks, since it compares Thrift and CORBA. (Performance_TIDorb_vs_Thrift_morfeo.pdf // link no longer valid) To quote from the study:
- Thrift is very efficient with small data (basic types as operation arguments)
- Thrift's transports are not as efficient as CORBA with medium and large data (struct and complex types > 1 kilobyte).
Another odd limitation, not having to do with performance, is that Thrift is limited to returning only several values as a struct - although this, like performance, can surely be improved.
It is interesting, and nice, that the Thrift IDL closely matches the CORBA IDL. I haven't used Thrift, but it looks interesting, especially for smaller messages, and one of its design goals was a less cumbersome install, so these are other advantages of Thrift. That said, although CORBA has a bad rap, there are many excellent implementations out there, such as omniORB, which has bindings for Python and is easy to install and use.
Edited: The Thrift and CORBA link is no longer valid, but I did find another useful paper from CERN. They evaluated replacements for their CORBA system and, while they evaluated Thrift, they eventually went with ZeroMQ. While Thrift was the fastest in their performance tests, at 9000 msg/sec vs. 8000 (ZeroMQ) and 7000+ RDA (CORBA-based), they chose not to test Thrift further because of other issues, notably:
It is still an immature product with a buggy implementation
Answer by michaelok
I have done a study of spring-boot, mappers (manual, Dozer and MapStruct), Thrift, REST, SOAP and Protocol Buffers integration for my job.
The server side: https://github.com/vlachenal/webservices-bench
The client side: https://github.com/vlachenal/webservices-bench-client
It is not finished and has only been run on my personal computers (I have to ask for servers to complete the tests), but the results can be consulted at:
- Laptop: https://github.com/vlachenal/webservices-bench/blob/master/results.md
- Desktop: https://github.com/vlachenal/webservices-bench/blob/master/results-desktop.md
In conclusion:
- Thrift offers the best performance and is easy to use
- A RESTful web service with JSON content type is pretty close to Thrift in performance, is "browser ready to use" and is quite elegant (from my point of view)
- SOAP has very poor performance but offers the best data control
- Protocol Buffers has good performance ... until there are 3 simultaneous calls ... and I don't know why. It is very difficult to use: I have given up (for now) on making it work with MapStruct, and I haven't tried with Dozer.
The projects can be completed through pull requests (either fixes or additional results).