Linux: How to restart Celery gracefully without delaying tasks
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/9642669/
How to restart Celery gracefully without delaying tasks
Asked by nitwit
We use Celery with our Django webapp to manage offline tasks; some of these tasks can run up to 120 seconds.
Whenever we make any code modifications, we need to restart Celery to have it reload the new Python code. Our current solution is to send a SIGTERM to the main Celery process (kill -s 15 `cat /var/run/celeryd.pid`), then to wait for it to die and restart it (python manage.py celeryd --pidfile=/var/run/celeryd.pid [...]).
Because of the long-running tasks, this usually means the shutdown will take a minute or two, during which no new tasks are processed, causing a noticeable delay to users currently on the site. I'm looking for a way to tell Celery to shutdown, but then immediately launch a new Celery instance to start running new tasks.
Things that didn't work:
- Sending SIGHUP to the main process: this caused Celery to attempt to "restart," by doing a warm shutdown and then relaunching itself. Not only does this take a long time, it doesn't even work, because apparently the new process launches before the old one dies, so the new one complains
ERROR: Pidfile (/var/run/celeryd.pid) already exists. Seems we're already running? (PID: 13214)
and dies immediately. (This looks like a bug in Celery itself; I've let them know about it.)
- Sending SIGTERM to the main process and then immediately launching a new instance: same issue with the pidfile.
- Disabling the pidfile entirely: without it, we have no way of telling which of the 30 Celery processes is the main process that needs to be sent a SIGTERM when we want it to do a warm shutdown. We also have no reliable way to check if the main process is still alive.
Answered by j_mcnally
Can you launch it with a custom pid file name? Possibly timestamped, and key off of that to know which PID to kill?
CELERYD_PID_FILE="/var/run/celery/%n_{timestamp}.pid"
^I don't know the timestamp syntax, but maybe you do, or you can find it?
Then use the current system time to kill off any old pids and launch a new one?
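A sketch of that idea as a small launch script. The directory, the `celeryd_` filename prefix, and the manage.py invocation are my assumptions for illustration, not a tested recipe:

```shell
#!/bin/sh
# Sketch: timestamped pidfiles, so the launcher can always tell old main
# processes apart from the one it is about to start.

new_pidfile() {            # print a fresh, timestamped pidfile path under $1
    echo "$1/celeryd_$(date +%s).pid"
}

kill_old_workers() {       # TERM every pid recorded under $1, remove the files
    for f in "$1"/celeryd_*.pid; do
        [ -e "$f" ] || continue                       # glob matched nothing
        kill -TERM "$(cat "$f")" 2>/dev/null || true  # pid may be gone already
        rm -f "$f"
    done
}

# Usage (not run here):
#   kill_old_workers /var/run/celery
#   python manage.py celeryd --pidfile="$(new_pidfile /var/run/celery)"
```

The old workers still take their warm-shutdown time to die, but the new instance can start accepting tasks immediately because its pidfile never collides with the old one.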
Answered by mher
celeryd has an --autoreload option. If enabled, the celery worker (main process) will detect changes in celery modules and restart all worker processes. In contrast to the SIGHUP signal, autoreload restarts each process independently when its currently executing task finishes. This means that while one worker process is restarting, the remaining processes can execute tasks.
http://celery.readthedocs.org/en/latest/userguide/workers.html#autoreloading
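For anyone wiring this into the generic celeryd init scripts, a hypothetical `/etc/default/celeryd` fragment would be enough. Note that --autoreload only exists in the Celery 3.x series (it was removed in Celery 4.0), so check your version first:

```shell
# /etc/default/celeryd (Celery 3.x only; --autoreload was removed in 4.0)
CELERYD_OPTS="--autoreload"
```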
Answered by Ivan Virabyan
I've recently fixed the bug with SIGHUP: https://github.com/celery/celery/pull/662
Answered by Régis B.
rm *.pyc
This causes the updated tasks to be reloaded. I discovered this trick recently; I just hope there are no nasty side effects.
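The bare `rm *.pyc` only clears the current directory; a recursive variant using standard `find` is shown below. Nothing Celery-specific here, and note it does not reload code already imported by a running worker — the workers still have to be restarted to pick up the fresh sources:

```shell
# Delete stale compiled bytecode under the project tree so the next
# worker (re)start imports fresh .py sources.
find . -name '*.pyc' -delete
```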
Answered by Debanshu Kundu
Well, you're using SIGHUP (1) for a warm shutdown of celery. I'm not sure it actually causes a warm shutdown. But SIGINT (2) would cause a warm shutdown. Try SIGINT in place of SIGHUP and then start celery manually in your script (I guess).
Answered by spac3_monkey
A little late, but this can be fixed by deleting the file called celerybeat.pid.
Worked for me.
Answered by lasthuman
I think you can try this:
kill -s HUP `cat /var/run/celeryd.pid`
python manage.py celeryd --pidfile=/var/run/celeryd.pid
HUP may recycle every free worker while letting the executing workers keep running until they finish their current tasks. Then you can safely start a new celery worker main process and workers; the old workers will kill themselves once their tasks have finished.
I've used this approach in our production environment and it seems safe so far. Hope this helps!
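The sequence above can be wrapped in a small script. The helper below only sends HUP when the pidfile actually points at a live process; the paths and the manage.py line are illustrative assumptions:

```shell
#!/bin/sh
# HUP the old main process if it is alive; then a new worker can be started.

hup_if_running() {          # send HUP to the pid stored in file $1, if alive
    pidfile=$1
    [ -e "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    kill -0 "$pid" 2>/dev/null || return 1   # process is gone (or not ours)
    kill -HUP "$pid"
}

# Usage (not run here):
#   hup_if_running /var/run/celeryd.pid
#   python manage.py celeryd --pidfile=/var/run/celeryd.pid
```

Checking liveness with `kill -0` first avoids HUPping a stale pid that may have been recycled by an unrelated process.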