Python: Running "unique" tasks with celery

Question

Asked by Luper Rouch

I use celery to update RSS feeds in my news aggregation site. I use one @task for each feed, and things seem to work nicely.

There's one detail I'm not sure I handle well, though: all feeds are updated once every minute with a @periodic_task, but what if a feed is still updating from the last periodic task when a new one is started? (For example, if the feed is really slow, or offline and the task is held in a retry loop.)

Currently I store tasks results and check their status like this:

import socket
from datetime import timedelta
from celery.decorators import task, periodic_task
from aggregator.models import Feed


_results = {}


@periodic_task(run_every=timedelta(minutes=1))
def fetch_articles():
    for feed in Feed.objects.all():
        if feed.pk in _results:
            if not _results[feed.pk].ready():
                # The task is not finished yet
                continue
        _results[feed.pk] = update_feed.delay(feed)


@task()
def update_feed(feed):
    try:
        feed.fetch_articles()
    except socket.error as exc:
        update_feed.retry(args=[feed], exc=exc)

Maybe there is a more sophisticated/robust way of achieving the same result using some celery mechanism that I missed?

Answer 1

Accepted answer by MattH

From the official documentation: Ensuring a task is only executed one at a time.
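
As far as I can tell, that recipe relies on a shared cache whose add() call is atomic: whoever manages to add the lock key runs the task, and everyone else skips. Below is a minimal sketch of the idea with one lock per feed. `InMemoryCache`, `feed_lock`, `LOCK_EXPIRE` and the `fetch` callback are hypothetical names made up for this illustration; the in-memory cache only works within a single process, so a real deployment would need a cache shared by all workers.

```python
import time
from contextlib import contextmanager


class InMemoryCache:
    """Hypothetical stand-in for a shared cache with an atomic add()."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout):
        """Store value only if key is absent or expired; return True on success."""
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._data[key] = (value, now + timeout)
        return True

    def delete(self, key):
        self._data.pop(key, None)


cache = InMemoryCache()
LOCK_EXPIRE = 60 * 5  # seconds; assumed upper bound on one feed update


@contextmanager
def feed_lock(feed_id):
    """Yield True if this caller holds the lock for feed_id, else False."""
    lock_id = "update-feed-lock-{}".format(feed_id)
    acquired = cache.add(lock_id, "locked", LOCK_EXPIRE)
    try:
        yield acquired
    finally:
        # Only the holder may release, otherwise we would free someone else's lock.
        if acquired:
            cache.delete(lock_id)


def update_feed(feed_id, fetch):
    """Run fetch() unless another update of the same feed is in progress."""
    with feed_lock(feed_id) as acquired:
        if not acquired:
            return "skipped"
        fetch()
        return "updated"
```

The expiry on the lock is what keeps a crashed worker from blocking a feed forever, so it should be longer than the slowest update you expect.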

Answer 2

Answered by SteveJ

Based on MattH's answer, you could use a decorator like this:

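import functools

from django.core.cache import cache  # assumed: Django's cache, as in the documentation recipe


def single_instance_task(timeout):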
    def task_exc(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock_id = "celery-single-instance-" + func.__name__
            acquire_lock = lambda: cache.add(lock_id, "true", timeout)
            release_lock = lambda: cache.delete(lock_id)
            if acquire_lock():
                try:
                    return func(*args, **kwargs)
                finally:
                    release_lock()
        return wrapper
    return task_exc

Then use it like so:

@periodic_task(run_every=timedelta(minutes=1))
@single_instance_task(60*10)
def fetch_articles():
    pass  # yada yada...
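
One thing to watch: the decorator above builds its lock id from the function name only, so a task such as update_feed(feed) would share a single lock across all feeds. Here is a sketch of a variant that includes the call arguments in the key. `single_instance_per_args` and `DictCache` are hypothetical names for this illustration; `DictCache` ignores the timeout and merely stands in for a cache with add()/delete(). Some real cache backends restrict key length and characters, so hashing the key may be necessary there.

```python
import functools


class DictCache:
    """Hypothetical stand-in exposing add()/delete(); the timeout is ignored."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


cache = DictCache()


def single_instance_per_args(timeout):
    """Allow one running instance per distinct set of call arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build the lock id from the function name plus its arguments.
            key_parts = [func.__name__] + [repr(a) for a in args]
            key_parts += ["{}={!r}".format(k, v) for k, v in sorted(kwargs.items())]
            lock_id = "celery-single-instance-" + "-".join(key_parts)
            if not cache.add(lock_id, "true", timeout):
                return None  # another run with the same arguments is active
            try:
                return func(*args, **kwargs)
            finally:
                cache.delete(lock_id)
        return wrapper
    return decorator
```

With this variant, update_feed(1) and update_feed(2) can run side by side, while a second update_feed(1) is skipped until the first one finishes.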

Answer 3

Answered by keithl8041

If you're looking for an example that doesn't use Django, then try this example (caveat: it uses Redis instead, which I was already using).

The decorator code is as follows (full credit to the author of the article, go read it):

import redis

REDIS_CLIENT = redis.Redis()

def only_one(function=None, key="", timeout=None):
    """Enforce only one celery task at a time."""

    def _dec(run_func):
        """Decorator."""

        def _caller(*args, **kwargs):
            """Caller."""
            ret_value = None
            have_lock = False
            lock = REDIS_CLIENT.lock(key, timeout=timeout)
            try:
                have_lock = lock.acquire(blocking=False)
                if have_lock:
                    ret_value = run_func(*args, **kwargs)
            finally:
                if have_lock:
                    lock.release()

            return ret_value

        return _caller

    return _dec(function) if function is not None else _dec
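
To see what the non-blocking acquire does when the lock is already taken, here is a small sketch that swaps the Redis lock for a `threading.Lock`, so it only protects within one process. `only_one_local`, `_LOCKS` and `single_task` are hypothetical names for this illustration and are not part of the article's code.

```python
import threading

_LOCKS = {}  # hypothetical registry: one threading.Lock per key, standing in for Redis


def only_one_local(key):
    """Skip the call (returning None) if another call with this key is running."""
    lock = _LOCKS.setdefault(key, threading.Lock())

    def _dec(run_func):
        def _caller(*args, **kwargs):
            # blocking=False returns immediately instead of waiting for the lock.
            have_lock = lock.acquire(blocking=False)
            if not have_lock:
                return None
            try:
                return run_func(*args, **kwargs)
            finally:
                lock.release()
        return _caller
    return _dec


started = threading.Event()
finish = threading.Event()


@only_one_local(key="SingleTask")
def single_task(label):
    started.set()            # signal that the lock is now held
    finish.wait(timeout=5)   # keep holding it until told to finish
    return label


results = {}


def run(label):
    results[label] = single_task(label)


first = threading.Thread(target=run, args=("first",))
first.start()
started.wait(timeout=5)
results["second"] = single_task("second")  # lock is held, so this call is skipped
finish.set()
first.join()
print(results)
```

The second call returns None without running the function body, which is the same "skip instead of queue" behaviour the Redis version gives across workers.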

Answer 4

Answered by user12397901

This solution works for celery running on a single host with concurrency greater than 1. Other kinds of dependency-free locks (that is, ones that don't need something like Redis), apart from file-based ones, don't work with concurrency greater than 1.

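from datetime import datetime, timedelta
from fcntl import flock, LOCK_EX, LOCK_NB
from hashlib import md5
from os.path import join
from time import sleep

from celery.task import PeriodicTask  # assumed import path (legacy class-based task API)


class Lock(object):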
    def __init__(self, filename):
        self.f = open(filename, 'w')

    def __enter__(self):
        try:
            flock(self.f.fileno(), LOCK_EX | LOCK_NB)
            return True
        except IOError:
            pass
        return False

    def __exit__(self, *args):
        self.f.close()


class SinglePeriodicTask(PeriodicTask):
    abstract = True
    run_every = timedelta(seconds=1)

    def __call__(self, *args, **kwargs):
        lock_filename = join('/tmp',
                             md5(self.name.encode('utf-8')).hexdigest())
        with Lock(lock_filename) as is_locked:
            if is_locked:
                super(SinglePeriodicTask, self).__call__(*args, **kwargs)
            else:
                print('already working')


class SearchTask(SinglePeriodicTask):
    restart_delay = timedelta(seconds=60)

    def run(self, *args, **kwargs):
        print(self.name, 'start', datetime.now())
        sleep(5)
        print(self.name, 'end', datetime.now())
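
The locking primitive used here can be tried on its own, outside celery. The sketch below is Unix-only (it needs `fcntl`); `try_lock` and the temporary lock path are hypothetical names for this illustration. It takes an exclusive non-blocking `flock`, shows that a second open of the same file cannot take it, and that closing the first file releases it.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive flock on path, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock raise instead of waiting when the lock is held.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# Use a fresh directory so no other process can already hold this lock.
lock_path = os.path.join(tempfile.mkdtemp(), "single-task-demo.lock")

first = try_lock(lock_path)
second = try_lock(lock_path)  # separate open(): conflicts with the first lock
first.close()                 # closing the file releases the lock
third = try_lock(lock_path)   # succeeds now that the lock is free
third.close()
```

Because the lock lives in the kernel and is tied to the open file, it disappears when the worker process dies, so no stale lock is left behind.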

Answer 5

Answered by vdboor

Using https://pypi.python.org/pypi/celery_once seems to do the job really nicely, including reporting errors and testing against some parameters for uniqueness.

You can do things like:

from celery_once import QueueOnce
from myapp.celery import app
from time import sleep

@app.task(base=QueueOnce, once=dict(keys=('customer_id',)))
def start_billing(customer_id, year, month):
    sleep(30)
    return "Done!"

which just needs the following settings in your project:

ONCE_REDIS_URL = 'redis://localhost:6379/0'
ONCE_DEFAULT_TIMEOUT = 60 * 60  # remove lock after 1 hour in case it was stale
