Attribution: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/35648386/

Laravel: Working a large number of tasks into a queue

Tags: php, laravel, laravel-5

Asked by haakym

I'm building a web application using Laravel 5 that creates links to the web application that when visited show a form for a student's progress report. These links are sent by the web application to an email of a contact at the institution the student attends in order for the recipient to complete the progress report accessed by the link in the email.

The problem I face is when creating and sending links. I have some code which works fine with a couple of hundred students, however in real world use the application would be potentially creating and sending 3000+ or so links at one time. The code I've written simply can't handle such a large number in a timely manner and the application crashes. Strangely enough though I don't receive any timeout error via laravel (I'll need to double check the php logs).

Although I'm more than open to other suggestions, I believe the answer to the problem is to use queues. I already use a queue when sending email (see code), and I would like to move other parts of the code into queues too, but I'm a bit unsure how to do this!

Brief database schema

Student hasMany Link

Student hasMany InstitutionContact (limited to two by my application)

Link hasMany InstitutionContact (limited to two by my application)

Email manyToMany Link

What I am trying to accomplish

  • Get all the Students that require a new Link

  • Create a Link for each Student

  • Assign the Student's current InstitutionContacts to the Link's InstitutionContacts (a Student's institution contact can change, so I attach the InstitutionContact to the Link in case it needs to be resent)

  • Loop through all the newly created Links in order to group them by shared InstitutionContacts. This is so that an email is not sent per link (which could send many single-link emails to the same address); instead, links for the same email/contact should be grouped and sent together where applicable

  • Loop through all the Links grouped by email/contact and:

    1. Send an email including the Link's info (URL, student name etc.) to the designated InstitutionContact's email address
    2. Write a copy of the Email to the database
    3. Join the Email created in the previous step to the Link(s) that were sent in it (so the application can be used to search which link was sent in which email)

So the main challenge I'm facing is performing the task above with a large dataset. I have already considered creating and sending Links one by one via a queue, but this wouldn't allow me to group all the Links together by contact/email. As the task wouldn't be performed regularly, I would be open to running it as it is with more memory and time for the process; however, I didn't have much success when attempting this using set_time_limit(0); and ini_set('memory_limit','1056M'); before sending any links.

Any help would be really appreciated, thank you if you read this far!

Code

app\Http\Controllers\LinkController.php

public function storeAndSendMass(Request $request)
{
    $this->validate($request, [
        'student_id' => 'required|array',
        'subject'    => 'required|max:255',
        'body'       => 'required|max:5000',
    ]);

    $studentIds = $request->get('student_id');
    $subject    = $request->get('subject');
    $body       = $request->get('body');

    $students = $this->student
        ->with('institutionContacts')
        ->whereIn('id', $studentIds)
        ->where('is_active', 1)
        ->get();

    // create link, see Link.php below for method
    $newLinks = $this->link->createActiveLink($students);

    // send link to student's contact(s), see LinkEmailer.php below for method
    $this->linkEmailer->send($newLinks, ['subject' => $subject, 'body' => $body], 'mass');

    // return
    return response()->json([
        'message' => 'Creating and sending links'
    ]);
}

app\Models\Link.php

public function createActiveLink($students)
{
    $links = [];

    foreach ($students as $student) {
        $newLink = $this->create([
            'token'          => $student->id, // automatically hashed
            'status'         => 'active',
            'sacb_refno'     => $student->sacb_refno,
            'course_title'   => $student->course_title,
            'university_id'  => $student->university_id,
            'student_id'     => $student->id,
            'institution_id' => $student->institution_id,
            'course_id'      => $student->course_id,
        ]);

        $studentContacts = $student->institutionContacts;

        if ($studentContacts) {

            foreach ($studentContacts as $studentContact) {

                $newLink->contacts()->create([
                    'type'                   => $studentContact->pivot->type,
                    'institution_contact_id' => $studentContact->pivot->institution_contact_id
                ]);

                $newLink->save();
            }

        }

        $links[] = $newLink->load('student');
    }

    return $links;
}

app\Emails\LinkEmailer.php

namespace App\Emails;

use App\Emails\EmailComposer;

class LinkEmailer
{
    protected $emailComposer;

    public function __construct(EmailComposer $emailComposer)
    {
        $this->emailComposer = $emailComposer;
    }

    public function send($links, $emailDetails, $emailType)
    {        
        $contactsAndLinks = $this->arrangeContactsToLinks($links);

        foreach ($contactsAndLinks as $linksAndContact) {

            $emailData = array_merge($linksAndContact, $emailDetails);

            // send/queue email
            \Mail::queue('emails/queued/reports', $emailData, function ($message) use ($emailData) {
                $message
                    ->to($emailData['email'], $emailData['formal_name'])
                    ->subject($emailData['subject']);
            });

            // compose email message, returns text of the email
            $emailMessage = $this->emailComposer->composeMessage($emailData);

            // // create Email
            $email = \App\Models\Email::create([
                'to'      => $emailData['email'],
                'from'    => '[email protected]',
                'subject' => $emailData['subject'],
                'body'    => $emailMessage,
                'type'    => $emailType,
                'user'    => $_SERVER['REMOTE_USER']
            ]);

            foreach ($linksAndContact['links'] as $link) {
                $link->emails()->attach($email->id);
            }
        }
    }

    // group links by contact
    public function arrangeContactsToLinks($links)
    {
        $contactsForLinks = [];
        $assigned         = false;
        $match            = false;

        foreach ($links as $link) { // 1, n

            if ($link->contacts) {

                foreach ($link->contacts as $contact) { // 1, 2

                    if ($contactsForLinks) {

                        $assigned = false;

                        foreach ($contactsForLinks as $key => $contactLink) { // n
                            // assign links to existing email in array
                            if ($contactLink['email'] === $contact->institutionContact->email) {
                                $match = false;

                                // check link hasn't already been included
                                foreach ($contactsForLinks[$key]['links'] as $assignedLink) {
                                    if ($assignedLink === $link) {
                                        $match = true;
                                    }
                                }

                                // if there was no match add to list of links
                                if (!$match) {
                                    $contactsForLinks[$key]['links'][] = $link->load('student');
                                    $assigned = true;
                                    break;
                                }
                            }
                        }

                        if (!$assigned) {
                            $contactsForLinks[] = [
                                'email'                 => $contact->institutionContact->email,
                                'formal_name'           => $contact->institutionContact->formal_name,
                                'requires_id'           => $contact->institutionContact->institution->requires_id,
                                'requires_course_title' => $contact->institutionContact->institution->requires_course_title,
                                'links'                 => [$link->load('student')],
                            ];
                        }
                    } else {
                        $contactsForLinks[] = [
                            'email'                 => $contact->institutionContact->email,
                            'formal_name'           => $contact->institutionContact->formal_name,
                            'requires_id'           => $contact->institutionContact->institution->requires_id,
                            'requires_course_title' => $contact->institutionContact->institution->requires_course_title,
                            'links'                 => [$link->load('student')],
                        ];
                    }
                }
            }
        }

        return $contactsForLinks;
    }
}
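
The nested loops in arrangeContactsToLinks rescan the whole result array for every contact, so the work grows roughly quadratically with the number of links. A minimal sketch of the same grouping using an array keyed by email address follows. It uses plain arrays instead of Eloquent models so that it runs on its own; the function name groupLinksByContact and the array shapes are assumptions for illustration, not part of the original code.

```php
<?php

/**
 * Group links by contact email in a single pass.
 *
 * Each $link is assumed to look like:
 *   ['id' => 1, 'contacts' => [['email' => 'a@example.com', 'formal_name' => 'Dr A']]]
 */
function groupLinksByContact(array $links): array
{
    $grouped = [];

    foreach ($links as $link) {
        foreach ($link['contacts'] as $contact) {
            $email = $contact['email'];

            // First time this address is seen: start a new group
            if (!isset($grouped[$email])) {
                $grouped[$email] = [
                    'email'       => $email,
                    'formal_name' => $contact['formal_name'],
                    'links'       => [],
                ];
            }

            // Keying by link id prevents adding the same link twice
            $grouped[$email]['links'][$link['id']] = $link;
        }
    }

    // Drop the helper keys so the result is a plain list of groups
    return array_map(function (array $group) {
        $group['links'] = array_values($group['links']);
        return $group;
    }, array_values($grouped));
}

$links = [
    ['id' => 1, 'contacts' => [['email' => 'a@example.com', 'formal_name' => 'Dr A']]],
    ['id' => 2, 'contacts' => [
        ['email' => 'a@example.com', 'formal_name' => 'Dr A'],
        ['email' => 'b@example.com', 'formal_name' => 'Dr B'],
    ]],
];

$groups = groupLinksByContact($links);
echo count($groups), " groups\n";
```

Each link and contact is visited once, and the lookup by email is a hash access rather than a scan. With Eloquent models the same shape would apply, with $contact->institutionContact->email as the key.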

Edit 1

I've got this working now with set_time_limit(0); and ini_set('memory_limit','1056M');. It took 8 minutes to process 3000 students.

Edit 2

I'm running Laravel Framework version 5.1.6 (LTS), MySQL for DB.

Edit 3

Appreciate all the answers so far, thank you all. I am thinking that I may work the Link creation process into a queue which will have a related entity in the database called something like Batch. When that Batch of links has finished being created, I would group all the Links from that Batch and send them.

I could use the approach that @denis-mysenko suggested: have a sent_at field in the Links table and a scheduled process that checks for Links that haven't been sent and then sends them. However, with the Batch approach I can send the Batch of Links once they have all been created, whereas the sent_at approach, with a scheduled process looking for unsent Links, could send some links before all the links have been created.
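
The Batch idea can be sketched as a counter that knows how many Links it expects and reports when the last one has been created. This is a plain-PHP illustration under assumed names (LinkBatch, recordCreated, isComplete); in the application the counts would presumably live in a database table and be updated by the queued jobs.

```php
<?php

/**
 * Hypothetical in-memory stand-in for a Batch row in the database.
 */
class LinkBatch
{
    private $expected;
    private $linkIds = [];

    public function __construct(int $expected)
    {
        $this->expected = $expected;
    }

    /** Record one created link; returns true once every expected link has been recorded. */
    public function recordCreated(int $linkId): bool
    {
        // Keyed by id, so a retried job is not counted twice
        $this->linkIds[$linkId] = true;

        return $this->isComplete();
    }

    public function isComplete(): bool
    {
        return count($this->linkIds) >= $this->expected;
    }

    public function linkIds(): array
    {
        return array_keys($this->linkIds);
    }
}

$batch = new LinkBatch(3);
$sent  = [];

// 102 arrives twice, as a retried job might
foreach ([101, 102, 102, 103] as $linkId) {
    if ($batch->recordCreated($linkId)) {
        // Only now is it safe to group the links and send the emails
        $sent = $batch->linkIds();
    }
}

echo 'Sent links: ', implode(', ', $sent), "\n";
```

With several queue workers running at once, the completeness check would need to be atomic (for example inside a database transaction), which this single-process sketch does not show.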

Answer by Andrew Reborn

If you've tested your code with a small amount of data and it succeeds without crashing, it's clear that the problem is (as you said) the quite high number of records you're dealing with. Why don't you process your collection with the chunk method?

According to the Laravel docs:

If you need to process thousands of Eloquent records, use the chunk command. The chunk method will retrieve a "chunk" of Eloquent models, feeding them to a given Closure for processing. Using the chunk method will conserve memory when working with large result sets
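
To illustrate the idea behind chunking, here is a minimal plain-PHP sketch that processes a list of IDs in fixed-size slices, so the callback only ever handles one slice at a time. The helper processInChunks is hypothetical and only mimics the shape of the call; with Eloquent the equivalent would be something like Student::whereIn('id', $ids)->chunk(200, function ($students) { ... }).

```php
<?php

/**
 * Hypothetical helper: feed $items to $callback in slices of $size.
 */
function processInChunks(array $items, int $size, callable $callback): void
{
    foreach (array_chunk($items, $size) as $chunk) {
        $callback($chunk);
    }
}

$studentIds = range(1, 10);
$chunkSizes = [];

processInChunks($studentIds, 4, function (array $chunk) use (&$chunkSizes) {
    // Real code would create the Links for these students here
    $chunkSizes[] = count($chunk);
});

echo implode(', ', $chunkSizes), "\n"; // prints "4, 4, 2"
```

Note that array_chunk only slices an array that is already in memory; the memory saving in Eloquent comes from fetching each chunk with its own query, so only one chunk of models is hydrated at a time.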

Anyway, I think a queue is required in this kind of scenario. I believe that working on a large set of data within an HTTP request should be absolutely avoided due to the high risk of a request timeout. A queued process is not subject to the web request's execution time limit.

Why don't you use the queue AND the chunk method on your collection together? This will let you:

  • avoid timeout errors
  • use a reasonable amount of memory during the execution of the process (this depends on the amount of data passed to the chunk closure)
  • have priority control over this queued process and (eventually) other queued processes

The Laravel docs cover all you need: Eloquent - Retrieving multiple models (see the "Chunking results" chapter for more on how to save memory when dealing with a large data set) and Queues, for creating jobs and detaching the parts of your software that should not run under your web server, which avoids the risk of timeouts.
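
Combining the two suggestions could look like the sketch below: the request handler only splits the student IDs into chunks and pushes one small job per chunk, and a worker processes them later. SplQueue stands in for Laravel's queue here and the job array shape is an assumption, so treat this as an outline of the flow rather than Laravel code.

```php
<?php

// In-memory stand-in for a real queue backend
$queue = new SplQueue();

// "Controller" side: enqueue one job per chunk of IDs and return immediately
$studentIds = range(1, 3000);
foreach (array_chunk($studentIds, 500) as $chunk) {
    $queue->enqueue(['type' => 'create_links', 'student_ids' => $chunk]);
}
$jobsQueued = count($queue);

// "Worker" side: runs outside the web request, one job at a time
$processed = 0;
while (!$queue->isEmpty()) {
    $job = $queue->dequeue();
    // Real code would load these students and create their Links here
    $processed += count($job['student_ids']);
}

echo "$jobsQueued jobs, $processed students processed\n"; // prints "6 jobs, 3000 students processed"
```

Grouping the links by contact would still have to wait until every chunk has been processed, which is the problem the Batch idea in the question is meant to address.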

Answer by Denis Mysenko

I would propose to change the architecture. I think it's unnecessarily overcomplicated.

The controller could look like:

public function storeAndSendMass(Request $request, LinkEmailer $linkEmailer)
{
    $this->validate($request, [
        'student_id' => 'required|array',
        'subject'    => 'required|max:255',
        'body'       => 'required|max:5000',
    ]);

    $students = $this->student
        ->with('institutionContacts')
        ->whereIn('id', $request->input('student_id'))
        ->where('is_active', 1)
        ->get();

    // Don't use Link.php method at all
    foreach ($students as $student) {
        $student->links()->create([
            'token'          => $student->id, // automatically hashed
            'status'         => 'active',
            'sent_at'        => null,
            'email_body'     => $request->input('body'),
            'email_subject'  => $request->input('subject')
        ]);
    }

    return response()->json([
        'message' => 'Creating and sending links'
    ]);
}

Why keep so many fields in Link model that already exist in Student model and are accessible via student() relationship? You could just keep the status and the token (I assume it's part of the link?), as well as 'sent_at' timestamp. If links are usually sent only once, it's reasonable to keep the email body and subject there as well.

If a student's institution contacts are updated, the fresh data will be used at the time of email composition, because the links are not tied to institution contacts explicitly.

Then, I would create a command (let's say newLinkNotifier) that would run, for instance, every 10 minutes and look for links that haven't been sent yet ($links->whereNull('sent_at')), group them by email ($link->student->institutionContacts) and email content ($link->email_body, $link->email_subject) and create queued email jobs. A queue worker would then send those emails (or you could set the queue driver to 'sync' to send them right away from the command).

Since this command will run async, it doesn't really matter if it takes 5 minutes to finish. But in real life, it would take less than a minute for thousands and thousands of objects.

How to do the grouping? I would probably just rely on MySQL (if you are using it), it will do the job faster than PHP. Since all 3 fields are accessible from SQL tables (two directly, another from JOIN) - it's actually a pretty simple task.

In the end, your emailer's send() method will become as trivial as:

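public function send()
{
    $links = Link::whereNull('sent_at')->get();

    // In this example, we will group in PHP, not in MySQL.
    // $item->email is assumed to hold the contact's address for the link.
    $grouped = $links->groupBy(function ($item, $key) {
        return implode('.', [$item->email, $item->email_body, $item->email_subject]);
    });

    foreach ($grouped as $group) {
        // All items inside $group share the same email, email_body
        // and email_subject values, so read them from the first one.
        // Plain strings are captured because the closure is serialized for the queue.
        $first   = $group[0];
        $to      = $first->email;
        $subject = $first->email_subject;

        // The view data (second argument) must be an array
        Mail::queue('emails/queued/reports', ['body' => $first->email_body], function ($message) use ($to, $subject) {
            $message
                ->to($to)
                ->subject($subject);
        });

        // Mark the links as sent so the next run does not pick them up again
        foreach ($group as $link) {
            $link->sent_at = date('Y-m-d H:i:s');
            $link->save();
        }
    }
}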

This is not perfect yet, and I haven't tested this code - I wrote it right here, but it shows the proposed concept.
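
One detail worth checking in the grouping step is the composite key. Joining the fields with '.' can in principle make two different combinations produce the same key, because email addresses and subjects may contain dots themselves. A minimal sketch of a key that keeps the field boundaries explicit, using plain strings and an assumed helper name groupKey:

```php
<?php

/**
 * Hypothetical helper: build a grouping key from the three fields.
 * JSON encoding quotes and separates each field, so boundaries stay explicit.
 */
function groupKey(string $email, string $body, string $subject): string
{
    return json_encode([$email, $body, $subject]);
}

// Two different field combinations that look identical once joined with '.'
$a = ['jo@example.com', 'uk.Report', 'Due'];
$b = ['jo@example.com.uk', 'Report', 'Due'];

$sameWhenImploded = implode('.', $a) === implode('.', $b);
$sameWithGroupKey = groupKey(...$a) === groupKey(...$b);

var_dump($sameWhenImploded); // bool(true): the two groups would be merged
var_dump($sameWithGroupKey); // bool(false): the groups stay separate
```

Inside the groupBy callback this would amount to returning json_encode([...]) instead of implode('.', [...]).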

If you don't plan to get millions of entries - this is probably good enough. Otherwise, you could move link creation into a separate async job as well.

Answer by Captain Hypertext

Assuming you're running version 5.0, how about passing off that initial processing to the queue as well?

app\Http\Controllers\LinkController.php

// Accept request, validate, then collect $studentIds, $subject and $body

// Send this work straight to the queue
Bus::dispatch(
    new CreateActiveLinks($studentIds, $subject, $body)
);

// return
return response()->json([
    'message' => 'Creating and sending links. This will take a while.'
]);

app\Console\Commands\CreateActiveLinks.php (queued job)

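use App\Models\Student; // assumed namespace, matching App\Models\Email in the question
use Illuminate\Contracts\Bus\SelfHandling;
use Illuminate\Contracts\Queue\ShouldQueue;

class CreateActiveLinks extends Command implements SelfHandling, ShouldQueue {

    protected $studentIds;
    protected $subject;
    protected $body;

    /**
     * Create a new command instance.
     *
     * @param  array   $studentIds
     * @param  string  $subject
     * @param  string  $body
     * @return void
     */
    public function __construct($studentIds, $subject = '', $body = '')
    {
        $this->studentIds = $studentIds;
        $this->subject    = $subject;
        $this->body       = $body;
    }

    /**
     * This part is executed in the queue after the
     * user got their response
     *
     * @return void
     */
    public function handle()
    {
        $students = Student::with('institutionContacts')
            ->whereIn('id', $this->studentIds)
            ->where('is_active', 1)
            ->get();

        $newLinks = [];

        foreach ($students as $student) {
            // Process and create records...

            $newLinks[] = $newLink->load('student');
        }

        // Emailer job would run like normal. send() is an instance method,
        // so resolve LinkEmailer from the container instead of calling it statically.
        app('App\Emails\LinkEmailer')->send($newLinks, ['subject' => $this->subject, 'body' => $this->body], 'mass');

        // Notify user or something...
    }
}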

Queuing Commands in Laravel 5.0

From 5.1 onward, these are called Jobs and work a little differently.

This code is untested, and I don't have a firm grasp on your application structure, so please don't take it as gospel. It's just based on work I've done in my own application when faced with similar circumstances. Maybe this will at least give you some ideas. If you really have a lot of records, then adding the chunk() method to the query in the CreateActiveLinks class might be helpful.

Answer by Can Celik

I have found that creating an Event/Listener and then making the listener queued is a lot easier. All you have to do is create an Event and a Listener for your email process (LinkEmailer) and then implement the ShouldQueue interface on the listener, as mentioned in the documentation.

https://laravel.com/docs/5.1/events#queued-event-listeners
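
The shape of this approach, sketched in plain PHP: an event is fired, and any listener carrying a marker interface is deferred to a queue instead of running inline. The names below (QueuedListener, LinksCreated, SendLinkEmails, dispatchEvent) are made up for the illustration and are not Laravel classes; in Laravel the marker is the ShouldQueue contract and the framework does the deferring.

```php
<?php

// Hypothetical marker interface, playing the role ShouldQueue plays in Laravel
interface QueuedListener {}

class LinksCreated
{
    public $linkIds;

    public function __construct(array $linkIds)
    {
        $this->linkIds = $linkIds;
    }
}

class SendLinkEmails implements QueuedListener
{
    public $handled = [];

    public function handle(LinksCreated $event): void
    {
        // Real code would group the links and send the emails here
        $this->handled = $event->linkIds;
    }
}

/**
 * Run plain listeners inline; push marked listeners onto the queue.
 */
function dispatchEvent($event, array $listeners, SplQueue $queue): void
{
    foreach ($listeners as $listener) {
        if ($listener instanceof QueuedListener) {
            $queue->enqueue([$listener, $event]);
        } else {
            $listener->handle($event);
        }
    }
}

$queue    = new SplQueue();
$listener = new SendLinkEmails();

dispatchEvent(new LinksCreated([1, 2, 3]), [$listener], $queue);
$handledDuringRequest = $listener->handled; // still empty: nothing ran inline

// Later, in the worker process
while (!$queue->isEmpty()) {
    list($queuedListener, $queuedEvent) = $queue->dequeue();
    $queuedListener->handle($queuedEvent);
}

echo count($handledDuringRequest), ' handled inline, ', count($listener->handled), " handled by the worker\n";
```

The controller then only needs to fire the event; whether the email work runs inline or on a worker is decided by the listener, not by the calling code.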
