在 PHP 中强制释放内存

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/2461762/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow


Force freeing memory in PHP

php garbage-collection

提问by DBa

In a PHP program, I sequentially read a bunch of files (with file_get_contents), gzdecode them, json_decode the result, analyze the contents, throw most of it away, and store about 1% in an array.

在一个 PHP 程序中,我依次读取一堆文件(使用 file_get_contents),对它们进行 gzdecode,再对结果进行 json_decode,分析内容,丢弃大部分内容,并将大约 1% 存储在一个数组中。

Unfortunately, with each iteration (I traverse over an array containing the filenames), there seems to be some memory lost (according to memory_get_peak_usage, about 2-10 MB each time). I have double- and triple-checked my code; I am not storing unneeded data in the loop (and the needed data hardly exceeds about 10MB overall), but I am frequently rewriting (actually, strings in an array). Apparently, PHP does not free the memory correctly, thus using more and more RAM until it hits the limit.

不幸的是,每次迭代(我遍历包含文件名的数组)时,似乎都会丢失一些内存(根据memory_get_peak_usage,每次大约 2-10 MB)。我对我的代码进行了两次和三次检查;我没有在循环中存储不需要的数据(并且需要的数据总体上几乎不超过 10MB),但我经常重写(实际上是数组中的字符串)。显然,PHP 没有正确释放内存,因此使用越来越多的 RAM,直到达到限制。

Is there any way to do a forced garbage collection? Or, at least, to find out where the memory is used?

有没有办法进行强制垃圾收集?或者,至少,找出内存在哪里使用?
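
A minimal sketch of the loop described above, with assumed names: $files is the array of file names and keep_interesting_part() is a hypothetical helper standing in for the analysis step.

$kept = array();
foreach ($files as $file) {
    $raw     = file_get_contents($file);          // read the compressed file
    $json    = gzdecode($raw);                    // decompress it
    $decoded = json_decode($json, true);          // decode to an associative array
    $kept[]  = keep_interesting_part($decoded);   // keep only the ~1% that matters (hypothetical helper)
    unset($raw, $json, $decoded);                 // drop the large intermediates before the next iteration
}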

回答by James Lyons

it has to do with memory fragmentation.

它与内存碎片有关。

Consider two strings concatenated into one. Each original must remain until the output is created, and the output is longer than either input.
Therefore, a new allocation must be made to store the result of such a concatenation. The original strings are freed, but they are small blocks of memory.
In a case like 'str1' . 'str2' . 'str3' . 'str4', several temporaries are created at each '.' -- and none of them fit in the space that has been freed up. The strings are likely not laid out in contiguous memory (that is, each string is contiguous, but the various strings are not laid end to end) due to other uses of the memory. So freeing a string creates a problem, because the space can't be reused effectively. You grow with each temporary you create, and you never reuse anything.

考虑连接为一个字符串的两个字符串。在创建输出之前,每个原件都必须保留。输出比任一输入都长。
因此,必须进行新的分配来存储这种串联的结果。原始字符串被释放,但它们是小块内存。
在'str1' . 'str2' . 'str3' . 'str4'这样的表达式中,每个 . 运算符处都会创建临时字符串——而且它们都放不进刚释放出来的空间。由于内存的其他用途,这些字符串很可能没有排布在连续内存中(也就是说,每个字符串自身是连续的,但不同字符串之间并非首尾相连)。因此释放字符串会造成问题,因为这些空间无法被有效重用。于是你创建的每个临时字符串都会让内存继续增长,而且永远不会重复使用任何空间。

Using the array-based implode, you create only one output -- exactly the length you require -- and perform only one additional allocation. So it's much more memory efficient, and it doesn't suffer from concatenation fragmentation. The same is true of Python. If you need to concatenate strings, anything beyond a single concatenation should always be array based:

使用基于数组的 implode,您只需创建 1 个输出——正好是您需要的长度,并且只执行 1 次额外分配。因此它的内存效率更高,也不会受到连接碎片的影响。Python 也是如此。如果您需要连接字符串,超过 1 次的连接应该始终基于数组:

''.join(['str1','str2','str3'])

in python

在 Python 中

implode('', array('str1', 'str2', 'str3'))

in PHP

在 PHP 中

sprintf equivalents are also fine.

sprintf 等价物也很好。
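
A rough illustration of the difference when building a CSV-style line (the variables $a, $b and $c are made up for the example):

// Repeated concatenation: each '.' creates a new temporary string.
$line = $a . ';' . $b . ';' . $c . "\n";

// Array-based: collect the pieces and allocate the final string once.
$line = implode(';', array($a, $b, $c)) . "\n";

// A sprintf equivalent also builds the result in one go.
$line = sprintf("%s;%s;%s\n", $a, $b, $c);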

The memory reported by memory_get_peak_usage is basically always the "last" bit of memory in the virtual map it had to use. So, since that is always growing, it reports rapid growth, as each allocation falls "at the end" of the currently used memory block.

memory_get_peak_usage 报告的内存基本上总是它必须使用的虚拟映射中的“最后”内存位。因此,由于它一直在增长,因此它报告了快速增长。由于每次分配都落在当前使用的内存块的“末尾”。

回答by Mo.

In PHP >= 5.3.0, you can call gc_collect_cycles() to force a GC pass.

在 PHP >= 5.3.0 中,可以调用 gc_collect_cycles() 来强制执行一次 GC。

Note: You need to have zend.enable_gc enabled in your php.ini, or call gc_enable() to activate the circular reference collector.

注意:你需要在 php.ini 中启用 zend.enable_gc,或者调用 gc_enable() 来激活循环引用收集器。
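
A minimal sketch of how this could be used in the loop from the question ($files and the processing step are assumed):

gc_enable(); // activate the circular reference collector if it is not already enabled

foreach ($files as $file) {
    // ... read, decode and analyze $file ...
    gc_collect_cycles(); // collect unreachable cycles before the next iteration
}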

回答by DBa

Found the solution: it was a string concatenation. I was generating the input line by line by concatenating some variables (the output is a CSV file). However, PHP seems not to free the memory used for the old copy of the string, thus effectively clobbering RAM with unused data. Switching to an array-based approach (and imploding it with commas just before fputs-ing it to the outfile) circumvented this behavior.

找到了解决方案:它是一个字符串连接。我通过连接一些变量(输出是一个 CSV 文件)逐行生成输入。但是,PHP 似乎没有释放用于字符串旧副本的内存,从而有效地用未使用的数据破坏 RAM。切换到基于数组的方法(并在 fputs-ing 到输出文件之前用逗号将其内爆)绕过了这种行为。
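
A sketch of the kind of change described here; the field variables and the $out file handle are made up for the example.

// Before: building the line by concatenation leaves old copies behind.
$line = $id . ',' . $name . ',' . $score . "\n";
fputs($out, $line);

// After: collect the fields in an array and implode once per line.
fputs($out, implode(',', array($id, $name, $score)) . "\n");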

For some reason - not obvious to me - PHP reported the increased memory usage during json_decode calls, which misled me into assuming that the json_decode function was the problem.

出于某种原因——对我来说并不明显——PHP 报告了 json_decode 调用期间内存使用量的增加,这使我误认为 json_decode 函数是问题所在。

回答by Mike B

I've found that PHP's internal memory manager is most likely to be invoked upon completion of a function. Knowing that, I've refactored code in a loop like so:

我发现 PHP 的内部内存管理器最有可能在函数完成时被调用。知道了这一点,我在循环中重构了代码,如下所示:

while (condition) {
  // do
  // cool
  // stuff
}

to

while (condition) {
  do_cool_stuff();
}

function do_cool_stuff() {
  // do
  // cool
  // stuff
}


EDIT

编辑

I ran this quick benchmark and did not see an increase in memory usage. This leads me to believe the leak is not in json_decode().

我运行了这个快速基准测试,并没有看到内存使用量增加。这让我相信泄漏不在json_decode()

for($x=0;$x<10000000;$x++)
{
  do_something_cool();
}

function do_something_cool() {
  $json = '{"a":1,"b":2,"c":3,"d":4,"e":5}';
  $result = json_decode($json);
  echo memory_get_peak_usage() . PHP_EOL;
}

回答by dkellner

I had the same problem ...

我有同样的问题 ...

I was writing from a db query into csv files. I always allocated one $row, then reassigned it in the next step. I kept running out of memory. Unsetting $row didn't help; putting a 5MB string into $row first (to avoid fragmentation) didn't help; creating an array of $row-s (loading many rows into it + unsetting the whole thing at every 5000th step) didn't help. But it was not the end, to quote a classic.

我正在从 db 查询写入 csv 文件。我总是分配一个 $row,然后在下一步中重新分配它。不断耗尽内存。取消设置 $row 没有帮助;首先将 5MB 字符串放入 $row(以避免碎片)没有帮助;创建一个 $row-s 数组(将许多行加载到其中 + 在每 5000 步中取消设置整个内容)没有帮助。但这不是结束,引用经典。

When I made a separate function that opened the file, transferred 100,000 lines (just enough not to eat up the whole memory) and closed the file, and THEN made subsequent calls to this function (appending to the existing file), I found that on every function exit, PHP removed the garbage. It was a local-variable-space thing.

当我编写了一个单独的函数来打开文件、传输 100,000 行(刚好不会耗尽全部内存)并关闭文件,然后再对该函数进行后续调用(追加到现有文件)时,我发现每次函数退出,PHP 都会清理垃圾。这是局部变量作用域的问题。

So here's the magic:

所以这就是魔法:

When a function exits, it frees all local variables.

当函数退出时,它释放所有局部变量。

If you do the job in smaller portions, like 0 to 1000 in the first function call, then 1001 to 2000 and so on, then every time the function returns, your memory will be regained. Garbage collection is very likely to happen on return from a function. (If it's a relatively slow function eating a lot of memory, we can safely assume it always happens.)

如果你把工作分成较小的部分来做,比如第一次函数调用处理 0 到 1000,下一次处理 1001 到 2000,依此类推,那么每次函数返回时,内存都会被回收。垃圾收集很可能在函数返回时发生。(如果它是一个消耗大量内存、相对较慢的函数,我们可以有把握地认为它总是会发生。)

Side note: for reference-passed variables it will obviously not work; a function can only free its inside variables that would be lost anyway on return.

旁注:对于引用传递的变量,它显然不起作用;一个函数只能释放它的内部变量,这些变量在返回时无论如何都会丢失。
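
A rough sketch of this pattern; $db, $total, $outfile and fetch_rows() are made up and stand in for the real query and file handling.

// Each call processes one batch; its local variables are freed when it returns.
function export_batch($db, $outfile, $offset, $limit) {
    $fh = fopen($outfile, 'a');                             // append to the existing file
    foreach (fetch_rows($db, $offset, $limit) as $row) {    // hypothetical helper returning rows
        fputs($fh, implode(';', $row) . "\n");
    }
    fclose($fh);
}

for ($offset = 0; $offset < $total; $offset += 100000) {
    export_batch($db, $outfile, $offset, 100000);
}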

I hope this saves your day as it saved mine!

我希望这能像拯救我一样拯救你的一天!

回答by Andy

Call memory_get_peak_usage() after each statement, and ensure you unset() everything you can. If you are iterating with foreach(), use a reference variable to avoid making a copy of the original array.

在每条语句之后调用 memory_get_peak_usage(),并确保对所有能释放的变量调用 unset()。如果您使用 foreach() 进行迭代,请使用引用变量以避免复制原始数组。

foreach( $x as &$y)
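
A small example of the by-reference loop; process() is a hypothetical helper, and the reference should be unset after the loop:

foreach ($x as &$y) {
    $y = process($y);   // work on the element in place instead of copying it
}
unset($y);              // break the lingering reference to the last element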

If PHP is actually leaking memory a forced garbage collection won't make any difference.

如果 PHP 确实在泄漏内存,则强制垃圾收集不会有任何区别。

There's a good article on PHP memory leaks and their detection at IBM

IBM有一篇关于 PHP 内存泄漏及其检测的好文章

回答by symcbean

I was going to say that I wouldn't necessarily expect gc_collect_cycles() to solve the problem - since presumably the files are no longer mapped to zvars. But did you check that gc_enable was called before loading any files?

我想说我不一定希望 gc_collect_cycles() 解决这个问题——因为大概这些文件不再映射到 zvar。但是在加载任何文件之前,您是否检查过 gc_enable 是否被调用?

I've noticed that PHP seems to gobble up memory when doing includes - much more than is required for the source and the tokenized file - this may be a similar problem. I'm not saying that this is a bug though.

我注意到 PHP 在执行包含时似乎会占用内存 - 比源文件和标记化文件所需的要多得多 - 这可能是一个类似的问题。我并不是说这是一个错误。

I believe one workaround would be not to use file_get_contents, but rather fopen()...fgets()...fclose(), so that the whole file is not mapped into memory in one go. But you'd need to try it to confirm.

我认为一种解决方法是不使用 file_get_contents,而是使用 fopen()...fgets()...fclose(),这样就不会一次性将整个文件映射到内存中。但你需要实际试一下来确认。
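
A minimal sketch of reading line by line instead of loading the whole file (the path is made up and error handling is kept short); for the gzipped files from the question, the zlib equivalents gzopen()/gzgets()/gzclose() follow the same pattern.

$fh = fopen('/path/to/input.txt', 'r');   // hypothetical file
if ($fh !== false) {
    while (($line = fgets($fh)) !== false) {
        // handle one line at a time instead of holding the whole file in memory
    }
    fclose($fh);
}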

HTH

HTH

C.

C。

回答by kvz

There recently was a similar issue with System_Daemon. Today I isolated my problem to file_get_contents.

最近 System_Daemon 也出现过一个非常类似的问题。今天我把我的问题定位到了 file_get_contents。

Could you try using fread instead? I think this may solve your problem. If it does, it's probably time to file a bug report over at PHP.

你可以试试改用 fread 吗?我认为这可能会解决你的问题。如果确实如此,可能就该向 PHP 提交一个错误报告了。
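
A sketch of swapping file_get_contents() for fread() (the path is made up; filesize() is used to read the whole file in one call):

$path = '/path/to/input.gz';                 // hypothetical file
$fh = fopen($path, 'rb');
if ($fh !== false) {
    $data = fread($fh, filesize($path));     // read the whole file with fread instead
    fclose($fh);
    // ... gzdecode($data), json_decode(...), and so on, as in the question
}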