
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must likewise follow the CC BY-SA license and attribute it to the original authors (not the translator): StackOverflow. Original question: http://stackoverflow.com/questions/7730562/

Date: 2020-09-18 00:57:01  Source: igfitidea

How to set up Beanstalkd with PHP

Tags: php, bash, message-queue, daemon, beanstalkd

Asked by joshholat

Recently I've been researching the use of Beanstalkd with PHP. I've learned quite a bit but have a few questions about the setup on a server, etc.

Here is how I see it working:

  1. I install Beanstalkd and any dependencies (such as libevent) on my Ubuntu server. I then start the Beanstalkd daemon (which should basically run at all times).
  2. Somewhere in my website (such as when a user performs some actions, etc) tasks get added to various tubes within the Beanstalkd queue.
  3. I have a bash script (such as the following one) that runs as a daemon and basically executes a PHP script.

    #!/bin/sh
    php worker.php
    

4) The worker script would have something like this to execute the queued-up tasks:

while (true) {
  // Block until a job is available on the "test" tube
  $job = $this->pheanstalk->watch('test')->ignore('default')->reserve();
  // Jobs are queued as JSON; decode the payload
  $job_encoded = json_decode($job->getData(), false);
  $done_jobs[] = $job_encoded;
  $this->log('job: ' . print_r($job_encoded, true));
  // Remove the job from the queue once it has been handled
  $this->pheanstalk->delete($job);
}
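For reference, step 2 above (queuing a job from the website) might look like the following with the Pheanstalk client that the worker code implies. This is only a sketch: it assumes a v3-style Pheanstalk API, and the tube name, payload fields, and connection address are illustrative.

```php
<?php
// Sketch: producing a job from web code (assumes the Pheanstalk client, v3-style API).
require 'vendor/autoload.php';

$pheanstalk = new Pheanstalk\Pheanstalk('127.0.0.1');

// Queue an RSS-import task on the "test" tube as a JSON payload
$pheanstalk->useTube('test')->put(json_encode([
    'action'   => 'import_rss',
    'feed_url' => 'http://example.com/feed.xml',
    'user_id'  => 42,
]));
```

The worker then decodes exactly this JSON with json_decode() when it reserves the job.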

Now here are my questions based on the above setup (correct me if I'm wrong about any of that):

  1. Say I have the task of importing an RSS feed into a database or something. If 10 users do this at once, they'll all be queued up in the "test" tube. However, they'd then only be executed one at a time. Would it be better to have 10 different tubes all executing at the same time?

  2. If I do need more tubes, does that then also mean that I'd need 10 worker scripts? One for each tube, all running concurrently with basically the same code except for the string literal in the watch() function.

  3. If I run that script as a daemon, how does that work? Will it constantly be executing the worker.php script? That script loops until the queue is empty theoretically, so shouldn't it only be kicked off once? How does the daemon decide how often to execute worker.php? Is that just a setting?

Thanks!

Accepted answer by Alister Bulman

  1. If the worker isn't taking too long to fetch the feed, it will be fine. You can run multiple workers if required to process more than one at a time. I've got a system (currently using Amazon SQS, but I've done similar with BeanstalkD before), with up to 200 (or more) workers pulling from the queue.
  2. A single worker script (the same script running multiple times) should be fine - the script can watch multiple tubes at the same time, and the first job available will be reserved. You can also use the stats-job command to see where a particular $job came from (which tube), or put some meta-information into the message if you need to tell one type from another.
  3. A good example of running a worker is described here. I've also added supervisord (also, a useful post to get started) to easily start and keep a number of workers running per machine (I run shell scripts, as in the first link). I would limit the number of times the script loops, and also pass a timeout to reserve() so it waits a few seconds, or more, for the next job to become available rather than spinning out of control in a tight loop that never pauses - even if there is nothing to do.
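A minimal supervisord program entry for keeping several workers alive per machine might look like this. The paths, program name, and process count are illustrative, not taken from the answer:

```ini
[program:beanstalk-worker]
command=/usr/bin/php /var/www/worker.php
process_name=%(program_name)s_%(process_num)02d
numprocs=5
autostart=true
autorestart=true
stopasgroup=true
```

With numprocs=5, supervisord starts five copies of the worker and restarts each one whenever it exits, which pairs well with a worker that deliberately exits after a bounded number of loops.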
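Putting that advice together, the worker loop from the question could be reshaped along these lines. This is a sketch assuming a v3-style Pheanstalk API (where reserve() accepts a timeout and returns false when it expires); the timeout value and loop cap are arbitrary:

```php
<?php
// Sketch: bounded worker loop with a reserve timeout (assumes Pheanstalk, v3-style API).
require 'vendor/autoload.php';

$pheanstalk = new Pheanstalk\Pheanstalk('127.0.0.1');
$maxJobs = 100; // exit after this many iterations; the wrapper/supervisor restarts us

for ($i = 0; $i < $maxJobs; $i++) {
    // Wait up to 10 seconds for a job instead of blocking forever
    $job = $pheanstalk->watch('test')->ignore('default')->reserve(10);
    if ($job === false) {
        continue; // timed out with no job: loop again (or sleep/exit here)
    }
    $data = json_decode($job->getData(), false);
    // ... process $data ...
    $pheanstalk->delete($job);
}
```

Exiting after a bounded number of jobs also guards against slow PHP memory leaks in long-running processes.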

Addendum:

  1. The shell script would be run as many times as you need (the link shows how to have it re-run as required with exec $@). Whenever the PHP script exits, the wrapper re-runs it.
  2. Apparently there's a Django app to show some stats, but it's trivial enough to connect to the daemon, get a list of tubes, and then get the stats for each tube - or just counts.
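The "list of tubes, then stats for each" idea from point 2 might be sketched like this with Pheanstalk's listTubes() and statsTube() calls (again assuming a v3-style API and a daemon on the default port; the stat names come from the beanstalkd protocol):

```php
<?php
// Sketch: dumping per-tube job counts (assumes Pheanstalk, v3-style API).
require 'vendor/autoload.php';

$pheanstalk = new Pheanstalk\Pheanstalk('127.0.0.1');

foreach ($pheanstalk->listTubes() as $tube) {
    $stats = $pheanstalk->statsTube($tube);
    echo $tube, ': ',
         $stats['current-jobs-ready'], ' ready, ',
         $stats['current-jobs-reserved'], " reserved\n";
}
```

Running something like this from cron or a monitoring hook gives a rough dashboard without any extra app.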