PHP: PDO/MySQL memory consumption with a large result set

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6895098/

Date: 2020-08-26 01:33:57  Source: igfitidea

PDO/MySQL memory consumption with large result set

Tags: php, mysql, pdo

Asked by Shabbyrobe

I'm having a strange time dealing with selecting from a table with about 30,000 rows.

It seems my script is using an outrageous amount of memory for what is a simple, forward only walk over a query result.

Please note that this example is a somewhat contrived, absolute bare minimum example which bears very little resemblance to the real code and it cannot be replaced with a simple database aggregation. It is intended to illustrate the point that each row does not need to be retained on each iteration.

<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
    PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION,
));
$stmt = $pdo->prepare('SELECT * FROM round');
$stmt->execute();

function do_stuff($row) {}

$c = 0;
while ($row = $stmt->fetch()) {
    // do something with the object that doesn't involve keeping 
    // it around and can't be done in SQL
    do_stuff($row);
    $row = null;
    ++$c;
}

var_dump($c);
var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());

This outputs:

int(39508)
int(43005064)
int(43018120)

I don't understand why 40 meg of memory is used when hardly any data needs to be held at any one time. I have already worked out I can reduce the memory by a factor of about 6 by replacing "SELECT *" with "SELECT home, away", however I consider even this usage to be insanely high and the table is only going to get bigger.

Is there a setting I'm missing, or is there some limitation in PDO that I should be aware of? I'm happy to get rid of PDO in favour of mysqli if it can not support this, so if that's my only option, how would I perform this using mysqli instead?

Answered by Shabbyrobe

After creating the connection, you need to set PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false:

<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
    PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION,
));
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

// snip

var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());

This outputs:

int(39508)
int(653920)
int(668136)

Regardless of the result size, the memory usage remains pretty much static.
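For the mysqli half of the question, the equivalent is an unbuffered result set via MYSQLI_USE_RESULT. A sketch (the database name 'mydb' is a placeholder; the rest follows the question's connection details):

```php
<?php
// Unbuffered query with mysqli: rows are streamed from the server
// as they are fetched, instead of being copied into PHP memory up front.
$mysqli = new mysqli('127.0.0.1', 'foo', 'bar', 'mydb'); // 'mydb' is a placeholder

$result = $mysqli->query('SELECT * FROM round', MYSQLI_USE_RESULT);

$c = 0;
while ($row = $result->fetch_assoc()) {
    do_stuff($row);
    ++$c;
}

// With MYSQLI_USE_RESULT the result must be freed (all rows retrieved)
// before another query can be issued on this connection.
$result->free();
```

Note the trade-off: while an unbuffered result is open, the connection is tied up and cannot run further queries.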

Answered by Dan Grossman

The whole result set (all 30,000 rows) is buffered into memory before you can start looking at it.

You should be letting the database do the aggregation and only asking it for the two numbers you need.

SELECT SUM(home) AS home, SUM(away) AS away, COUNT(*) AS c FROM round
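If aggregation is an option, the whole loop collapses to a single fetch. A sketch, reusing the $pdo connection from the question:

```php
<?php
// Let MySQL do the summing and pull back a single row of totals.
$stmt = $pdo->query('SELECT SUM(home) AS home, SUM(away) AS away, COUNT(*) AS c FROM round');
$totals = $stmt->fetch(PDO::FETCH_ASSOC);
// $totals['home'], $totals['away'] and $totals['c'] now hold the aggregates.
```

(The asker states their real code cannot be reduced to an aggregate, but where it can, this keeps all 30,000 rows on the server.)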

Answered by Jake

The reality of the situation is that if you fetch all rows at once and expect to iterate over all of them in PHP, they will all exist in memory.

If you really don't think using SQL-powered expressions and aggregation is the solution, you could consider limiting/chunking your data processing. Instead of fetching all rows at once, do something like:

1)  Fetch 5,000 rows
2)  Aggregate/Calculate intermediary results
3)  unset variables to free memory
4)  Back to step 1 (fetch next set of rows)
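The steps above might be sketched like this (the chunk size, the $total accumulator, and the choice of columns are illustrative only):

```php
<?php
$chunk  = 5000; // step 1: fetch 5,000 rows at a time
$offset = 0;
$total  = 0;    // illustrative intermediary result

do {
    $rows = $pdo
        ->query(sprintf('SELECT home, away FROM round LIMIT %d OFFSET %d', $chunk, $offset))
        ->fetchAll(PDO::FETCH_ASSOC);
    $fetched = count($rows);

    foreach ($rows as $row) {
        $total += $row['home'] + $row['away']; // step 2: aggregate/calculate
    }

    unset($rows);      // step 3: free memory
    $offset += $chunk; // step 4: move on to the next set of rows
} while ($fetched === $chunk); // a short final chunk means we're done
```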

Just an idea...

Answered by Jarod Elliott

I haven't done this before in PHP, but you may consider fetching the rows using a scrollable cursor - see the fetch documentation for an example.
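Adapting the example from the PDO fetch documentation to this query might look like the following (untested here, and not every driver honours scrollable cursors):

```php
<?php
// Ask for a scrollable cursor when preparing the statement.
$stmt = $pdo->prepare('SELECT * FROM round',
    array(PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL));
$stmt->execute();

// Walk the cursor forward one row at a time.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC, PDO::FETCH_ORI_NEXT)) {
    do_stuff($row);
}
```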

Instead of returning all the results of your query at once back to your PHP script, it holds the results on the server side and you use a cursor to iterate through them getting one at a time.

Whilst I have not tested this, it is bound to have other drawbacks such as utilising more server resources and most likely reduced performance due to additional communication with the server.

Altering the fetch style may also have an impact, as by default the documentation indicates it will store each row as both an associative array and a numerically indexed array, which is bound to increase memory usage.
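For example, requesting only the associative form instead of the default PDO::FETCH_BOTH:

```php
<?php
// FETCH_BOTH (the default) effectively stores each row twice;
// asking for a single representation reduces the per-row cost.
$stmt->setFetchMode(PDO::FETCH_ASSOC);

while ($row = $stmt->fetch()) {
    do_stuff($row);
}
```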

As others have suggested, reducing the number of results in the first place is most likely a better option if possible.

Answered by Alix Axel

Another option would be to do something like:

$i = $c = 0;
$query = 'SELECT home, away FROM round LIMIT 2048 OFFSET %u;';

// codeThatFetches() stands in for whatever function runs the query
// and returns the matching rows as an array.
while (count($rows = codeThatFetches(sprintf($query, $i++ * 2048))) > 0)
{
    $c += count($rows);

    foreach ($rows as $row)
    {
        do_stuff($row);
    }
}