serialize a large array in PHP?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/1256949/

Date: 2020-08-25 01:46:02  Source: igfitidea

Tags: php, serialization

Asked by JasonDavis

I am curious: is there a size limit on serialize() in PHP? Would it be possible to serialize an array with 5,000 keys and values so it can be stored in a cache?

I am hoping to cache a user's friend list on a social network site. The cache will need to be updated fairly often, but it will also need to be read on almost every page load.

On a single-server setup, I am assuming APC would be better than memcache for this.
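As a quick sanity check of the premise, a 5,000-key array survives a serialize()/unserialize() round-trip just fine; a minimal sketch (the key/value names here are invented for illustration):

```php
// Build a hypothetical 5,000-entry friend list, then round-trip it.
$friends = array();
for ($i = 0; $i < 5000; $i++) {
    $friends['user_' . $i] = 'friend_' . $i;
}

$cached   = serialize($friends);    // this string is what you would hand to APC/memcache
$restored = unserialize($cached);

var_dump(count($restored));         // int(5000)
var_dump($restored === $friends);   // bool(true)
```
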

Answered by Pascal MARTIN

As quite a few other people have already answered, just for fun, here's a very quick benchmark (dare I call it that?); consider the following code:

$num = 1;

$list = array_fill(0, 5000, str_repeat('1234567890', $num));

$before = microtime(true);
for ($i=0 ; $i<10000 ; $i++) {
    $str = serialize($list);
}
$after = microtime(true);

var_dump($after-$before);
var_dump(memory_get_peak_usage());

I'm running this on PHP 5.2.6 (the one bundled with Ubuntu jaunty).
And, yes, there are only values; no keys; and the values are quite simple: no objects, no sub-arrays, nothing but strings.

For $num = 1, you get:

float(11.8147978783)
int(1702688)

For $num = 10, you get:

float(13.1230671406)
int(2612104)

And, for $num = 100, you get:

float(63.2925770283)
int(11621760)

So, it seems the bigger each element of the array is, the longer it takes (which seems fair, actually). But, for elements 100 times bigger, it doesn't take 100 times longer...


Now, with an array of 50000 elements instead of 5000, this part of the code is changed:

$list = array_fill(0, 50000, str_repeat('1234567890', $num));

With $num = 1, you get:

float(158.236332178)
int(15750752)

Considering the time it took for 1, I won't be running this for $num = 10 or $num = 100...


Yes, of course, in a real situation you wouldn't be doing this 10000 times; so let's try with only 10 iterations of the for loop.

For $num = 1:

float(0.206310987473)
int(15750752)

For $num = 10:

float(0.272629022598)
int(24849832)

And for $num = 100:

float(0.895547151566)
int(114949792)

Yeah, that's almost 1 second -- and quite a bit of memory used ^^
(No, this is not a production server: I have a pretty high memory_limit on this development machine ^^)


So, in the end, to sum up those numbers -- and, yes, you can make numbers say whatever you want them to -- I wouldn't say there is a "limit" as in "hardcoded" in PHP, but you'll end up facing one of these:

  • max_execution_time (generally, on a webserver, it's never more than 30 seconds)
  • memory_limit (on a webserver, it's generally not much more than 32MB)
  • the load your webserver will have: while one of those big serialize loops was running, it took up one of my CPUs; if you have quite a few users on the same page at the same time, I let you imagine what that will give ;-)
  • the patience of your users ^^
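The first two thresholds can be inspected from inside a script; a quick sketch (the values reported depend entirely on your php.ini, so no particular output is implied):

```php
// Report the runtime limits most likely to bite during a big serialize() loop
echo 'max_execution_time: ', ini_get('max_execution_time'), "\n";
echo 'memory_limit:       ', ini_get('memory_limit'), "\n";
echo 'peak memory so far: ', memory_get_peak_usage(), " bytes\n";
```
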

But, unless you are really serializing long arrays of big data, I am not sure it will matter that much...
And you must take into consideration the amount of time/CPU load that using that cache might help you gain ;-)

Still, the best way to know would be to test by yourself, with real data ;-)


And you might also want to take a look at what Xdebug can do when it comes to profiling: this kind of situation is one of those it is useful for!

Answered by Byron Whitlock

The serialize() function is only limited by available memory.

Answered by zombat

There's no limit enforced by PHP. serialize() returns a bytestream representation (a string) of the serialized structure, so you would just get a large string.
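To see what that bytestream looks like, here's a tiny example; the format encodes each type and length inline, so the string simply grows with the data:

```php
$arr = array('a' => 1, 'b' => array(2, 3));
$str = serialize($arr);

echo $str, "\n";   // a:2:{s:1:"a";i:1;s:1:"b";a:2:{i:0;i:2;i:1;i:3;}}
echo strlen($str), " bytes\n";
```
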

Answered by Andrew Moore

There is no limit, but remember that serialization and unserialization have a cost.

Unserialization is extremely costly.

A less costly way of caching that data would be via var_export(), as such (since PHP 5.1.0, it works on objects):

$largeArray = array(1,2,3,'hello'=>'world',4);

file_put_contents('cache.php', "<?php\nreturn ".
                                var_export($largeArray, true).
                                ';');

You can then simply retrieve the array by doing the following:

$largeArray = include('cache.php');

Resources are usually not cache-able.

Unfortunately, if you have circular references in your array, you'll need to use serialize().
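A small demonstration of that limitation; the self-reference below would make var_export() emit a warning and produce broken output, while serialize() records the circular reference and round-trips it:

```php
$a = array('name' => 'node');
$a['self'] = &$a;                      // circular reference back to the array itself

$copy = unserialize(serialize($a));    // serialize() stores the cycle as an R: reference

var_dump($copy['self']['name']);       // string(4) "node"
```
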

Answered by Paul Dixon

The only practical limit is your available memory, since serialization involves creating a string in memory.

Answered by EffectiX

As suggested by Thinker above:

You could use

$string = json_encode($your_array_here);

and to decode it:

$array = json_decode($string, true);

This returns an array. It works well even if the encoded array was multilevel.
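A quick round-trip check with a nested array (the data here is invented for illustration):

```php
$orig = array('user' => 42, 'friends' => array(7, 9, array('id' => 11)));

$json = json_encode($orig);        // {"user":42,"friends":[7,9,{"id":11}]}
$back = json_decode($json, true);  // true => decode JSON objects as assoc arrays

var_dump($back === $orig);         // bool(true)
```
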

Answered by Justin

Ok... more numbers! (PHP 5.3.0 on OS X, no opcode cache)

@Pascal's code on my machine, for $num = 1 at 10k iterations, produces:

float(18.884856939316)
int(1075900)

I add unserialize() to the above, like so:

$num = 1;

$list = array_fill(0, 5000, str_repeat('1234567890', $num));

$before = microtime(true);
for ($i=0 ; $i<10000 ; $i++) {
    $str = serialize($list);
    $list = unserialize($str);
}
$after = microtime(true);

var_dump($after-$before);
var_dump(memory_get_peak_usage());

produces:

float(50.204112052917)
int(1606768) 

I assume the extra 600k or so is the serialized string.

I was curious about var_export() and its include/eval counterpart. Using $str = var_export($list, true); instead of serialize() in the original produces:

float(57.064643859863)
int(1066440)

so just a little less memory (at least for this simple example), but way more time already.

Adding in eval('$list = '.$str.';'); instead of unserialize() in the above produces:

float(126.62566018105)
int(2944144)

Indicating there's probably a memory leak somewhere when doing eval() :-/

So again, these aren't great benchmarks (I really should isolate the eval/unserialize by putting the string in a local var or something, but I'm being lazy), but they show the associated trends. var_export() seems slow.

Answered by Luke

I've just come across an instance where I thought I was hitting an upper limit of serialisation.

I'm persisting serialised objects to a database using a MySQL TEXT field.

The limit for single-byte characters is 65,535, so whilst I can serialize much larger objects than that with PHP, it's impossible to unserialize them, as they are truncated by the limit of the TEXT field.
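One way to avoid that silent truncation is to check the payload length before writing; a hypothetical guard (65,535 is the single-byte capacity of a MySQL TEXT column; MEDIUMTEXT or LONGTEXT would raise the ceiling):

```php
// Hypothetical guard before persisting a serialized blob to a TEXT column.
function fitsInTextColumn($serialized) {
    return strlen($serialized) <= 65535;
}

$small = serialize(array_fill(0, 10, 'x'));
$big   = serialize(str_repeat('x', 70000));   // ~70k bytes, over the TEXT limit

var_dump(fitsInTextColumn($small));   // bool(true)
var_dump(fitsInTextColumn($big));     // bool(false)
```
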

Answered by Alix Axel

Nope, there is no limit, and this:

set_time_limit(0);
ini_set('memory_limit', -1);

unserialize('s:2000000000:"a";');

is why you should have safe_mode = On or an extension like Suhosin installed; otherwise it will eat up all the memory on your system.

Answered by Thinker

I think json_encode() is better than serialize. It has a drawback, in that associative arrays and objects are not distinguished, but the string result is smaller and easier for a human to read, and therefore also easier to debug and edit.
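That drawback is easy to see in a couple of lines: an object encoded with json_encode() comes back as an array on decode (in assoc mode), whereas serialize() preserves the class:

```php
$obj = new stdClass();
$obj->name = 'alice';

$viaJson      = json_decode(json_encode($obj), true);  // assoc-array mode loses the class
$viaSerialize = unserialize(serialize($obj));          // class is preserved

var_dump(is_array($viaJson));                 // bool(true)
var_dump($viaSerialize instanceof stdClass);  // bool(true)
```
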