Laravel 游标和 Laravel 块方法有什么区别?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/45464676/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
What is the difference between laravel cursor and laravel chunk method?
提问by Suraj
I would like to know what is the difference between laravel chunk and laravel cursor method. Which method is more suitable to use? What will be the use cases for both of them? I know that you should use cursor to save memory but how it actually works in the backend?
我想知道laravel chunk 和laravel cursor 方法有什么区别。哪种方法更适合使用?两者的用例是什么?我知道您应该使用游标来节省内存,但它在后端实际上是如何工作的?
A detailed explanation with example would be useful because I have searched on stackoverflow and other sites but I didn't found much information.
带有示例的详细解释会很有用,因为我已经在 stackoverflow 和其他站点上进行了搜索,但没有找到太多信息。
Here is the code snippet's from the laravel documentation.
这是来自 laravel 文档的代码片段。
Chunking Results
分块结果
Flight::chunk(200, function ($flights) {
foreach ($flights as $flight) {
//
}
});
Using Cursors
使用光标
foreach (Flight::where('foo', 'bar')->cursor() as $flight) {
//
}
回答by Oluwatobi Samuel Omisakin
Indeed This question might attract some opinionated answer, however the simple answer is here in Laravel Docs
事实上,这个问题可能会吸引一些固执的答案,但简单的答案在Laravel Docs 中
Just for reference:
仅供参考:
This is chunk:
这是块:
If you need to process thousands of Eloquent records, use the
chunk
command. Thechunk
method will retrieve a "chunk" of Eloquent models, feeding them to a givenClosure
for processing. Using thechunk
method will conserve memory when working with large result sets:
如果您需要处理数千条 Eloquent 记录,请使用该
chunk
命令。该chunk
方法将检索 Eloquent 模型的“块”,将它们提供给给定Closure
的处理。使用该chunk
方法将在处理大型结果集时节省内存:
This is Cursor:
这是光标:
The
cursor
method allows you to iterate through your database records using a cursor, which will only execute a single query. When processing large amounts of data, thecursor
method may be used to greatly reduce your memory usage:
该
cursor
方法允许您使用游标遍历数据库记录,该游标只会执行一个查询。在处理大量数据时,cursor
可以使用该方法来大大减少你的内存使用:
Chunk retrieves the records from the database, and load it into memory while setting a cursor on the last record retrieved so there is no clash.
Chunk 从数据库中检索记录,并将其加载到内存中,同时在检索到的最后一条记录上设置游标,以免发生冲突。
So the advantage here is if you want to reformat the largerecord before they are sent out, or you want to perform an operation on an nth number of records per time then this is useful. An example is if you are building a view out/excel sheet, so you can take the record in counts till they are done so that all of them are not loaded into the memory at once and thereby hitting the memory limit.
所以这里的优点是如果你想在发送之前重新格式化大记录,或者你想每次对第 n 个记录执行操作,那么这很有用。一个例子是,如果您正在构建一个视图/excel 表,那么您可以以计数方式记录记录,直到它们完成,这样所有的记录都不会一次加载到内存中,从而达到内存限制。
Cursor uses PHP Generators, you can check the php generatorspage however here is an interesting caption:
Cursor 使用 PHP 生成器,您可以查看php 生成器页面,但这里有一个有趣的标题:
A generator allows you to write code that uses foreachto iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate. Instead, you can write a generator function, which is the same as a normal function, except that instead of returning once, a generator can yieldas many times as it needs to in order to provide the values to be iterated over.
生成器允许您编写使用foreach迭代一组数据的代码,而无需在内存中构建数组,这可能会导致您超出内存限制,或者需要大量的处理时间来生成。相反,你可以写一个发电机的功能,这是同作为一个正常的功能,只是不用返回荷兰国际集团一次,发电机可以产生,因为它需要以提供要遍历值多次。
While I cannot guarantee that I understand fully the concept of Cursor, but for Chunk, chunk runs the query at every record size, retrieving it, and passing it into the closure for further works on the records.
虽然我不能保证我完全理解 Cursor 的概念,但对于 Chunk,chunk 会在每个记录大小上运行查询,检索它,并将其传递到闭包中以进一步处理记录。
Hope this is useful.
希望这是有用的。
回答by mohammad asghari
We have a comparison: chunk() vs cursor()
我们有一个比较: chunk() vs cursor()
- cursor(): High Speed
- chunk(): Constant Memory Usage
- 游标():高速
- chunk():常量内存使用
10,000 records:
10,000 条记录:
+-------------+-----------+------------+
| | Time(sec) | Memory(MB) |
+-------------+-----------+------------+
| get() | 0.17 | 22 |
| chunk(100) | 0.38 | 10 |
| chunk(1000) | 0.17 | 12 |
| cursor() | 0.16 | 14 |
+-------------+-----------+------------+
100,000 records:
100,000 条记录:
+--------------+------------+------------+
| | Time(sec) | Memory(MB) |
+--------------+------------+------------+
| get() | 0.8 | 132 |
| chunk(100) | 19.9 | 10 |
| chunk(1000) | 2.3 | 12 |
| chunk(10000) | 1.1 | 34 |
| cursor() | 0.5 | 45 |
+--------------+------------+------------+
- TestData: users table of Laravel default migration
- Homestead 0.5.0
- PHP 7.0.12
- MySQL 5.7.16
- Laravel 5.3.22
- TestData:Laravel 默认迁移的用户表
- 家园 0.5.0
- PHP 7.0.12
- MySQL 5.7.16
- Laravel 5.3.22
回答by oraoto
chunk
is based on pagination, it maintains a page number, and do the looping for you.
chunk
基于分页,它维护一个页码,并为你做循环。
For example, DB::table('users')->select('*')->chunk(100, function($e) {})
will do multiple queries until the result set is smaller than the chunk size(100
):
例如,DB::table('users')->select('*')->chunk(100, function($e) {})
将执行多次查询,直到结果集小于块大小(100
):
select * from `users` limit 100 offset 0;
select * from `users` limit 100 offset 100;
select * from `users` limit 100 offset 200;
select * from `users` limit 100 offset 300;
select * from `users` limit 100 offset 400;
...
cursor
is based on PDOStatement::fetch
and Generator.
cursor
是基于PDOStatement::fetch
和生成器的。
$cursor = DB::table('users')->select('*')->cursor()
foreach ($cursor as $e) { }
will do issue a single query:
将发出单个查询:
select * from `users`
But the driver doesn't fetch the result set at once.
但是驱动程序不会立即获取结果集。
回答by u8329054
Cursor()
Cursor()
- only single query
- fetch result by call
PDOStatement::fetch()
- by default buffered query is used and fetch all result at once.
- turned only current row into eloquent model
- 只有一个查询
- 通过调用获取结果
PDOStatement::fetch()
- 默认情况下使用缓冲查询并一次获取所有结果。
- 仅将当前行转换为 eloquent 模型
Pros
优点
- minimize eloquent model memory overhead
- easy to manipulate
- 最小化 eloquent 模型内存开销
- 易于操作
Cons
缺点
- huge result leads to out of memory
- buffered or unbuffered is a trade-off
- 巨大的结果导致内存不足
- 缓冲或无缓冲是一种权衡
Chunk()
Chunk()
- chunk query in to queries with limit and offset
- fetch result by call
PDOStatement::fetchAll
- turned results into eloquent models batchly
- 块查询到具有限制和偏移量的查询
- 通过调用获取结果
PDOStatement::fetchAll
- 将结果批量转化为雄辩的模型
Pros
优点
- controllable used memory size
- 可控的已用内存大小
Cons
缺点
- turned results in to eloquent models batchly may cause some memory overhead
- queries and memory usage is a traid-off
- 将结果批量转化为 eloquent 模型可能会导致一些内存开销
- 查询和内存使用是一种权衡
TL;DR
TL; 博士
I used to think cursor()will do query each time and only keep one row result in memory. So when I saw @mohammad-asghari's comparison table I got really confused. It must be some bufferbehind the scenes.
我曾经认为cursor()每次都会做查询并且只在内存中保留一行结果。所以当我看到@mohammad-asghari 的比较表时,我真的很困惑。它一定是幕后的一些缓冲区。
By tracking Laravel Code as below
通过跟踪 Laravel 代码如下
/**
* Run a select statement against the database and returns a generator.
*
* @param string $query
* @param array $bindings
* @param bool $useReadPdo
* @return \Generator
*/
public function cursor($query, $bindings = [], $useReadPdo = true)
{
$statement = $this->run($query, $bindings, function ($query, $bindings) use ($useReadPdo) {
if ($this->pretending()) {
return [];
}
// First we will create a statement for the query. Then, we will set the fetch
// mode and prepare the bindings for the query. Once that's done we will be
// ready to execute the query against the database and return the cursor.
$statement = $this->prepared($this->getPdoForSelect($useReadPdo)
->prepare($query));
$this->bindValues(
$statement, $this->prepareBindings($bindings)
);
// Next, we'll execute the query against the database and return the statement
// so we can return the cursor. The cursor will use a PHP generator to give
// back one row at a time without using a bunch of memory to render them.
$statement->execute();
return $statement;
});
while ($record = $statement->fetch()) {
yield $record;
}
}
I understood Laravel build this feature by wrap PDOStatement::fetch(). And by search buffer PDO fetchand MySQL, I found this document.
我了解 Laravel 通过包装PDOStatement::fetch()来构建此功能。通过搜索缓冲区 PDO fetch和MySQL,我找到了这个文档。
https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php
https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php
Queries are using the buffered mode by default. This means that query results are immediately transferred from the MySQL Server to PHP and then are kept in the memory of the PHP process.
查询默认使用缓冲模式。这意味着查询结果会立即从 MySQL 服务器传输到 PHP,然后保存在 PHP 进程的内存中。
so by doing PDOStatement::execute() we actually fetch whole result rowsat ones and stored in the memory, not only one row. So if the result is too huge, this will lead to out of memoryexception.
因此,通过执行 PDOStatement::execute() 我们实际上以一个为单位获取整个结果行并存储在内存中,而不仅仅是一行。所以如果结果太大,就会导致内存不足异常。
Though the Document shown we could use $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
to get rid of buffered query. But the drawback should be caution.
虽然显示的文档我们可以$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
用来摆脱缓冲查询。但缺点是要谨慎。
Unbuffered MySQL queries execute the query and then return a resource while the data is still waiting on the MySQL server for being fetched. This uses less memory on the PHP-side, but can increase the load on the server. Unless the full result set was fetched from the server no further queries can be sent over the same connection. Unbuffered queries can also be referred to as "use result".
无缓冲的 MySQL 查询执行查询,然后在数据仍在 MySQL 服务器上等待获取时返回资源。这在 PHP 端使用较少的内存,但会增加服务器的负载。除非从服务器获取完整的结果集,否则无法通过同一连接发送进一步的查询。无缓冲查询也可以称为“使用结果”。