存储 PHP 数组的首选方法(json_encode 与序列化)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/804045/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me):
StackOverFlow
Preferred method to store PHP arrays (json_encode vs serialize)
提问 by KyleFarris
I need to store a multi-dimensional associative array of data in a flat file for caching purposes. I might occasionally come across the need to convert it to JSON for use in my web app but the vast majority of the time I will be using the array directly in PHP.
我需要将多维关联数据数组存储在平面文件中以用于缓存目的。我可能偶尔会遇到需要将其转换为 JSON 以在我的 Web 应用程序中使用的情况,但绝大多数时间我将直接在 PHP 中使用该数组。
Would it be more efficient to store the array as JSON or as a PHP serialized array in this text file? I've looked around and it seems that in the newest versions of PHP (5.3), json_decode is actually faster than unserialize.
在此文本文件中,将数组存储为 JSON 还是 PHP 序列化数组更高效?我查了一下,似乎在最新版本的 PHP (5.3) 中,json_decode 实际上比 unserialize 更快。
I'm currently leaning towards storing the array as JSON as I feel it's easier for a human to read if necessary, it can be used in both PHP and JavaScript with very little effort, and from what I've read, it might even be faster to decode (not sure about encoding, though).
我目前倾向于将数组存储为 JSON,因为我觉得必要时它更便于人工阅读,而且只需很少的工作量就能同时在 PHP 和 JavaScript 中使用;从我读到的内容来看,它的解码速度甚至可能更快(不过编码方面不确定)。
Does anyone know of any pitfalls? Anyone have good benchmarks to show the performance benefits of either method?
有谁知道其中有什么陷阱?有没有人能提供好的基准测试来展示这两种方法各自的性能优势?
采纳答案 by Peter Bailey
Depends on your priorities.
取决于您的优先事项。
If performance is your absolute driving characteristic, then by all means use the fastest one. Just make sure you have a full understanding of the differences before you make a choice.
如果性能是您绝对的首要考量,那就尽管使用最快的那个。在做出选择之前,请确保您充分了解两者的差异。
- Unlike serialize() you need to add an extra parameter to keep UTF-8 characters untouched: json_encode($array, JSON_UNESCAPED_UNICODE) (otherwise it converts UTF-8 characters to Unicode escape sequences).
- JSON will have no memory of what the object's original class was (they are always restored as instances of stdClass).
- You can't leverage __sleep() and __wakeup() with JSON.
- By default, only public properties are serialized with JSON (in PHP >= 5.4 you can implement JsonSerializable to change this behavior).
- JSON is more portable.
- 与 serialize() 不同,你需要添加一个额外的参数来保持 UTF-8 字符不被转义:json_encode($array, JSON_UNESCAPED_UNICODE)(否则它会将 UTF-8 字符转换为 Unicode 转义序列)。
- JSON 不会记住对象的原始类是什么(它们总是作为 stdClass 的实例恢复)。
- 使用 JSON 时你无法利用 __sleep() 和 __wakeup()。
- 默认情况下,JSON 只序列化公共属性(在 PHP >= 5.4 中你可以实现 JsonSerializable 来改变这种行为)。
- JSON 更具可移植性。
And there's probably a few other differences I can't think of at the moment.
可能还有其他一些我暂时想不出来的区别。
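To make the first two bullet points concrete, here is a minimal sketch with made-up sample data: JSON_UNESCAPED_UNICODE (PHP 5.4+) keeps multibyte characters intact, and the JSON round trip forgets the original class while serialize() preserves it.

<?php

// Without JSON_UNESCAPED_UNICODE, multibyte characters become \uXXXX escape sequences
$data = array('city' => '北京');
echo json_encode($data), "\n";                          // {"city":"\u5317\u4eac"}
echo json_encode($data, JSON_UNESCAPED_UNICODE), "\n";  // {"city":"北京"}

// JSON forgets the original class; serialize() does not
class Point {
    public $x = 1;
    public $y = 2;
}

$p = new Point();

$fromJson = json_decode(json_encode($p));
var_dump(get_class($fromJson));        // string(8) "stdClass"

$fromSerialize = unserialize(serialize($p));
var_dump(get_class($fromSerialize));   // string(5) "Point"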
A simple speed test to compare the two
一个简单的速度测试来比较两者
<?php

ini_set('display_errors', 1);
error_reporting(E_ALL);

// Make a big, honkin test array
// You may need to adjust this depth to avoid memory limit errors
$testArray = fillArray(0, 5);

// Time json encoding
$start = microtime(true);
json_encode($testArray);
$jsonTime = microtime(true) - $start;
echo "JSON encoded in $jsonTime seconds\n";

// Time serialization
$start = microtime(true);
serialize($testArray);
$serializeTime = microtime(true) - $start;
echo "PHP serialized in $serializeTime seconds\n";

// Compare them
if ($jsonTime < $serializeTime) {
    printf("json_encode() was roughly %01.2f%% faster than serialize()\n", ($serializeTime / $jsonTime - 1) * 100);
} else if ($serializeTime < $jsonTime) {
    printf("serialize() was roughly %01.2f%% faster than json_encode()\n", ($jsonTime / $serializeTime - 1) * 100);
} else {
    echo "Impossible!\n";
}

function fillArray($depth, $max) {
    static $seed;
    if (is_null($seed)) {
        $seed = array('a', 2, 'c', 4, 'e', 6, 'g', 8, 'i', 10);
    }
    if ($depth < $max) {
        $node = array();
        foreach ($seed as $key) {
            $node[$key] = fillArray($depth + 1, $max);
        }
        return $node;
    }
    return 'empty';
}
回答 by Greg
JSON is simpler and faster than PHP's serialization format and should be used unless:
JSON 比 PHP 的序列化格式更简单、更快,应该优先使用,除非:
- You're storing deeply nested arrays: json_decode(): "This function will return false if the JSON encoded data is deeper than 127 elements."
- You're storing objects that need to be unserialized as the correct class
- You're interacting with old PHP versions that don't support json_decode
- 您正在存储深度嵌套的数组:json_decode():"如果 JSON 编码数据的深度超过 127 个元素,则此函数将返回 false。"
- 您存储的对象需要被反序列化为正确的类
- 您正在与不支持 json_decode 的旧版 PHP 交互
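For reference, the nesting limit quoted above is configurable in later PHP versions: json_decode() takes an $assoc flag and a $depth parameter, and json_last_error() reports JSON_ERROR_DEPTH when the limit is hit. A small sketch, using an artificially deep sample string:

<?php

// Build a JSON string nested deeper than the limit passed below
$json = str_repeat('[', 600) . '1' . str_repeat(']', 600);

// Signature: json_decode($json, $assoc = false, $depth = 512, $options = 0)
$result = json_decode($json, true, 512);

if ($result === null && json_last_error() === JSON_ERROR_DEPTH) {
    echo "Nesting exceeds the allowed depth\n";
}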
回答 by Taco
I've written a blog post about this subject: "Cache a large array: JSON, serialize or var_export?". In this post it is shown that serialize is the best choice for small to large sized arrays. For very large arrays (> 70MB) JSON is the better choice.
我写了一篇关于这个主题的博文:"缓存一个大数组:JSON、serialize 还是 var_export?"。文章表明,对于小型到大型的数组,serialize 是最佳选择;对于非常大的数组(> 70MB),JSON 是更好的选择。
回答 by David Goodwin
You might also be interested in https://github.com/phadej/igbinary - which provides a different serialization 'engine' for PHP.
您可能还会对 https://github.com/phadej/igbinary 感兴趣,它为 PHP 提供了一个不同的序列化"引擎"。
My random/arbitrary 'performance' figures, using PHP 5.3.5 on a 64bit platform show :
我的随机/任意“性能”数字,在 64 位平台上使用 PHP 5.3.5 显示:
JSON :
JSON :
- JSON encoded in 2.180496931076 seconds
- JSON decoded in 9.8368630409241 seconds
- serialized "String" size : 13993
- 以 2.180496931076 秒编码的 JSON
- JSON 在 9.8368630409241 秒内解码
- 序列化的“字符串”大小:13993
Native PHP :
原生 PHP :
- PHP serialized in 2.9125759601593 seconds
- PHP unserialized in 6.4348418712616 seconds
- serialized "String" size : 20769
- PHP 在 2.9125759601593 秒内序列化
- PHP 在 6.4348418712616 秒内反序列化
- 序列化的“字符串”大小:20769
Igbinary :
Igbinary :
- WIN igbinary serialized in 1.6099879741669 seconds
- WIN igbinary unserialized in 4.7737920284271 seconds
- WIN serialized "String" size : 4467
- WIN igbinary 在 1.6099879741669 秒内序列化
- WIN igbinary 在 4.7737920284271 秒内反序列化
- WIN 序列化的"字符串"大小:4467
So, it's quicker to igbinary_serialize() and igbinary_unserialize() and uses less disk space.
因此,igbinary_serialize() 和 igbinary_unserialize() 更快,并且使用更少的磁盘空间。
I used the fillArray(0, 3) code as above, but made the array keys longer strings.
我使用了上面的 fillArray(0, 3) 代码,但把数组的键改成了更长的字符串。
igbinary can store the same data types as PHP's native serialize can (So no problem with objects etc) and you can tell PHP5.3 to use it for session handling if you so wish.
igbinary 可以存储与 PHP 的本机序列化相同的数据类型(因此对象等没有问题),如果您愿意,您可以告诉 PHP5.3 使用它进行会话处理。
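A rough sketch of how igbinary is typically used, assuming the extension is installed: the functions mirror the native pair, and the session handler can be switched through ini settings.

<?php

// Drop-in replacements for serialize()/unserialize(), producing a compact binary string
$data = array('user' => 42, 'roles' => array('admin', 'editor'));

$binary   = igbinary_serialize($data);
$restored = igbinary_unserialize($binary);

var_dump($restored === $data);  // bool(true)

// php.ini, to let igbinary handle session data as well:
//   session.serialize_handler = igbinary
//   igbinary.compact_strings  = On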
See also http://ilia.ws/files/zendcon_2010_hidden_features.pdf - specifically slides 14/15/16
另见 http://ilia.ws/files/zendcon_2010_hidden_features.pdf,特别是第 14/15/16 页幻灯片
回答 by Blunk
I just tested serialize and json encode and decode, plus the size the stored string will take.
我刚刚测试了 serialize 和 json 的编码与解码,以及存储字符串所占的大小。
JSON encoded in 0.067085981369 seconds. Size (1277772)
PHP serialized in 0.12110209465 seconds. Size (1955548)
JSON decoded in 0.22470498085 seconds
PHP unserialized in 0.211947917938 seconds
json_encode() was roughly 80.52% faster than serialize()
unserialize() was roughly 6.02% faster than json_decode()
JSON string was roughly 53.04% smaller than Serialized string
We can conclude that JSON encodes faster and results in a smaller string, but unserialize is faster to decode the string.
我们可以得出结论:JSON 编码速度更快且产生的字符串更小,但 unserialize 解码字符串的速度更快。
回答 by Jordan S. Jones
If you are caching information that you will ultimately want to "include" at a later point in time, you may want to try using var_export. That way you only take the hit in the "serialize" and not in the "unserialize".
如果您正在缓存的信息最终会在稍后某个时间点被"include"进来,您可能想试试 var_export。这样您只需在"序列化"这一步付出开销,而不用在"反序列化"时付出。
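A minimal sketch of that var_export() approach (the cache path and data below are hypothetical): dump the array once as PHP source, then include it back and let PHP (plus the opcode cache) do the "unserialize" step.

<?php

$cacheFile = __DIR__ . '/cache.php';  // hypothetical cache path
$data = array('foo' => 'bar', 'items' => array(1, 2, 3));

// "Serialize": write the array out as PHP source code that returns the value
file_put_contents($cacheFile, '<?php return ' . var_export($data, true) . ';');

// "Unserialize": simply include the file and take its return value
$cached = include $cacheFile;

var_dump($cached === $data);  // bool(true)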
回答 by Jeff Whiting
I augmented the test to include unserialization performance. Here are the numbers I got.
我增加了测试以包括反序列化性能。这是我得到的数字。
Serialize
JSON encoded in 2.5738489627838 seconds
PHP serialized in 5.2861361503601 seconds
Serialize: json_encode() was roughly 105.38% faster than serialize()
Unserialize
JSON decoded in 10.915472984314 seconds
PHP unserialized in 7.6223039627075 seconds
Unserialize: unserialize() was roughly 43.20% faster than json_decode()
So JSON seems to be faster for encoding but slower for decoding. It could depend on your application and what you expect to do the most.
因此,JSON 的编码速度似乎更快,但解码更慢。具体取决于您的应用程序,以及您预计哪种操作最多。
回答 by soyuka
Really nice topic and after reading the few answers, I want to share my experiments on the subject.
非常好的话题,在阅读了几个答案后,我想分享我在该主题上的实验。
I got a use case where some "huge" table needs to be queried almost every time I talk to the database (don't ask why, just a fact). The database caching system isn't appropriate as it won't cache the different requests, so I thought about PHP caching systems.
我有一个用例:几乎每次访问数据库都需要查询某个"巨大"的表(别问为什么,这就是事实)。数据库缓存系统并不合适,因为它不会缓存这些不同的请求,所以我考虑了 PHP 层面的缓存方案。
I tried apcu but it didn't fit the needs; memory isn't reliable enough in this case. The next step was to cache into a file with serialization.
我试过 apcu,但它不满足需求,在这种情况下内存不够可靠。下一步是通过序列化缓存到文件中。
Table has 14355 entries with 18 columns, those are my tests and stats on reading the serialized cache:
表有 14355 个条目,有 18 列,这些是我在读取序列化缓存时的测试和统计信息:
JSON:
JSON:
As you all said, the major inconvenience with json_encode/json_decode is that it transforms everything into stdClass instances (or objects). If you need to loop over it, transforming it to an array is what you'll probably do, and yes, it increases the transformation time.
正如大家所说,json_encode/json_decode 的主要不便之处在于它会把所有内容都转换成 stdClass 实例(或对象)。如果您需要遍历它,您很可能会把它转换成数组,是的,这会增加转换时间。
average time: 780.2 ms; memory use: 41.5MB; cache file size: 3.8MB
平均时间:780.2 毫秒;内存使用:41.5MB;缓存文件大小:3.8MB
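Worth noting: if the decoded data only needs to be looped over, the stdClass-to-array conversion can be avoided by passing true as the second argument to json_decode(). A small sketch with made-up data:

<?php

$json = json_encode(array('rows' => array(array('id' => 1), array('id' => 2))));

// Second argument true: decode straight to associative arrays, no stdClass detour
$decoded = json_decode($json, true);

foreach ($decoded['rows'] as $row) {
    echo $row['id'], "\n";
}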
Msgpack
Msgpack
@hutch mentions msgpack. Pretty website. Let's give it a try shall we?
@hutch 提到了msgpack。漂亮的网站。让我们试一试好吗?
average time: 497 ms; memory use: 32MB; cache file size: 2.8MB
平均时间:497 毫秒;内存使用:32MB;缓存文件大小:2.8MB
That's better, but requires a new extension; compiling sometimes scares people off...
那更好,但需要一个新的扩展;而编译有时会让人望而却步……
IgBinary
IgBinary
@GingerDog mentions igbinary. Note that I've set igbinary.compact_strings=Off because I care more about read performance than file size.
@GingerDog 提到了 igbinary。请注意,我设置了 igbinary.compact_strings=Off,因为相比文件大小,我更关心读取性能。
average time: 411.4 ms; memory use: 36.75MB; cache file size: 3.3MB
平均时间:411.4 毫秒;内存使用:36.75MB;缓存文件大小:3.3MB
Better than msgpack. Still, this one requires compiling too.
比 msgpack 好。不过,这个同样需要编译。
serialize/unserialize
serialize/unserialize
average time: 477.2 ms; memory use: 36.25MB; cache file size: 5.9MB
平均时间:477.2 毫秒;内存使用:36.25MB;缓存文件大小:5.9MB
Better performance than JSON: the bigger the array is, the slower json_decode is, but you already knew that.
性能比 JSON 好:数组越大,json_decode 就越慢,但这一点你已经知道了。
Those external extensions narrow down the file size and seem great on paper. Numbers don't lie*. What's the point of compiling an extension if you get almost the same results that you'd have with a standard PHP function?
这些外部扩展能缩小文件大小,纸面上看起来很棒。数字不会说谎*。但如果你得到的结果和标准 PHP 函数几乎一样,那编译一个扩展还有什么意义呢?
We can also deduce that depending on your needs, you will choose something different than someone else:
我们还可以推断出,根据您的需求,您会选择与其他人不同的东西:
- IgBinary is really nice and performs better than MsgPack
- Msgpack is better at compressing your data (note that I didn't try the igbinary compact.string option).
- Don't want to compile? Use standards.
- IgBinary 非常好,性能比 MsgPack 好
- Msgpack 更擅长压缩您的数据(请注意,我没有尝试 igbinary compact.string 选项)。
- 不想编译?使用标准。
That's it, another serialization-method comparison to help you choose!
就是这样,又一份序列化方法的对比,帮助您做出选择!
*Tested with PHPUnit 3.7.31, PHP 5.5.10 - only decoding, with a standard hard drive and an old dual-core CPU - average numbers over 10 runs of the same use case; your stats might be different
*使用 PHPUnit 3.7.31、php 5.5.10 测试 - 仅使用标准硬盘和旧双核 CPU 进行解码 - 10 次相同用例测试的平均数字,您的统计数据可能不同
回答 by urraka
Seems like serialize is the one I'm going to use for 2 reasons:
似乎 serialize 是我要使用的一个,原因有两个:
Someone pointed out that unserialize is faster than json_decode and a 'read' case sounds more probable than a 'write' case.
I've had trouble with json_encode when having strings with invalid UTF-8 characters. When that happens the string ends up being empty causing loss of information.
有人指出反序列化比 json_decode 更快,并且“读取”案例听起来比“写入”案例更有可能。
当字符串包含无效的 UTF-8 字符时,我在使用 json_encode 时遇到了麻烦。当这种情况发生时,字符串最终为空,导致信息丢失。
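A sketch of guarding against that failure mode (the broken byte sequence below is arbitrary): on PHP 5.5+ json_encode() returns false here and json_last_error() reports JSON_ERROR_UTF8; on PHP 7.2+ the JSON_INVALID_UTF8_SUBSTITUTE flag can replace the bad bytes instead of losing the whole string.

<?php

$broken = "valid text \xB1\x31 more text";  // contains an invalid UTF-8 byte

$json = json_encode(array('note' => $broken));
if ($json === false && json_last_error() === JSON_ERROR_UTF8) {
    // PHP 7.2+: substitute malformed bytes with U+FFFD instead of failing
    $json = json_encode(array('note' => $broken), JSON_INVALID_UTF8_SUBSTITUTE);
}

var_dump($json);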
回答 by Jelmer
I made a small benchmark as well. My results were the same. But I need the decode performance. Where I noticed, like a few people above said as well, unserialize is faster than json_decode. unserialize takes roughly 60-70% of the json_decode time. So the conclusion is fairly simple:
When you need performance in encoding, use json_encode; when you need performance in decoding, use unserialize. Because you can not merge the two functions, you have to make a choice based on where you need more performance.
我也做了一个小基准测试,结果是一样的。但我需要的是解码性能。正如上面几位所说,我注意到 unserialize 比 json_decode 快,unserialize 大约只需要 json_decode 60-70% 的时间。所以结论很简单:当你需要编码性能时,用 json_encode;当你需要解码性能时,用 unserialize。因为这两个函数无法合并,你必须根据更需要哪方面的性能做出选择。
My benchmark in pseudocode:
我的伪代码基准:
- Define array $arr with a few random keys and values
- for x < 100; x++; serialize and json_encode a array_rand of $arr
- for y < 1000; y++; json_decode the json encoded string - calc time
- for y < 1000; y++; unserialize the serialized string - calc time
- echo the result which was faster
- 使用一些随机键和值定义数组 $arr
- 对于 x < 100;x++; 序列化和json_encode $arr 的array_rand
- 对于 y < 1000;y++; json_decode json 编码的字符串 - 计算时间
- 对于 y < 1000;y++; 反序列化序列化的字符串 - 计算时间
- 回应更快的结果
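A runnable version of that pseudocode, under the same assumptions (a sample array with random values, random subsets, 1000 decode iterations per round; the exact sizes are arbitrary):

<?php

// Sample array with a few random keys and values
$arr = array();
for ($i = 0; $i < 50; $i++) {
    $arr['key' . $i] = array('value' => mt_rand(), 'label' => str_repeat('x', mt_rand(5, 40)));
}

$jsonWins = $serializeWins = 0;

for ($x = 0; $x < 100; $x++) {
    // Take a random slice of the array and prepare both encodings
    $subset     = array_intersect_key($arr, array_flip(array_rand($arr, 10)));
    $json       = json_encode($subset);
    $serialized = serialize($subset);

    $start = microtime(true);
    for ($y = 0; $y < 1000; $y++) { json_decode($json, true); }
    $jsonTime = microtime(true) - $start;

    $start = microtime(true);
    for ($y = 0; $y < 1000; $y++) { unserialize($serialized); }
    $serializeTime = microtime(true) - $start;

    if ($jsonTime < $serializeTime) {
        $jsonWins++;
    } else {
        $serializeWins++;
    }
}

echo "json_decode won $jsonWins rounds, unserialize won $serializeWins rounds\n";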
On average: unserialize won 96 times versus 4 for json_decode, with an average of roughly 1.5 ms against 2.5 ms.
平均而言:unserialize 赢了 96 次,json_decode 只赢了 4 次;平均耗时大约是 1.5 毫秒对 2.5 毫秒。

