Some Celery tasks start but hang and never execute

Problem Description

I have a problem with Django and Celery where some registered tasks never get executed.

I have three tasks in my tasks.py file. Two of them, schedule_notification() and schedule_archive(), work fine: they execute at their predefined ETAs without issue.

With schedule_monitoring(), I can see the job being started in Celery Flower, but it never actually executes. It just sits there.

I have confirmed that I can run the command locally from the worker, so I'm not sure where the problem could be.

tasks.py (the failing task)

@task
def schedule_monitoring(job_id: str, action: str) -> str:
    salt = OSApi() # This is a wrapper around a REST API.
    job = Job.objects.get(pk=job_id)
    target = ('compound', f"G@hostname:{ job.network.gateway.host_name } and G@serial:{ job.network.gateway.serial_number }")

    policies = [
        'foo',
        'bar',
        'foobar',
        'barfoo'
    ]

    if action == 'start':
        salt.run(target, 'spectrum.add_to_collection', fun_args=['foo'])
        for policy in policies:
            salt.run(target, 'spectrum.refresh_policy', fun_args=[policy])

        create_activity("Informational", "MONITORING", "Started proactive monitoring for job.", job)
    elif action == 'stop':
        salt.run(target, 'spectrum.remove_from_collection', fun_args=['bar'])
        for policy in policies:
            salt.run(target, 'spectrum.refresh_policy', fun_args=[policy])

        create_activity("Informational", "MONITORING", "Stopped proactive monitoring for job.", job)
    else:
        raise NotImplementedError

    return f"Applying monitoring action: {action.upper()} to Job: {job.job_code}"

Celery Flower output (screenshot not reproduced)

Celery configuration

# Async
CELERY_BROKER_URL = os.environ.get('BROKER_URL', 'redis://localhost:6379')
CELERY_RESULT_BACKEND = os.environ.get('RESULT_BACKEND', 'redis://localhost:6379')
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True
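
The CELERY_-prefixed names suggest the standard Django integration, where a celery.py module loads them under a CELERY namespace. A minimal sketch of that setup, assuming a project package named proj (the same placeholder used in the worker commands in the solution below):

# proj/celery.py -- minimal app setup sketch; 'proj' is a placeholder name.
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

app = Celery('proj')
# Loads every CELERY_-prefixed setting above (CELERY_BROKER_URL -> broker_url, ...).
app.config_from_object('django.conf:settings', namespace='CELERY')
# Discovers tasks.py modules in installed apps, e.g. Operations/tasks.py.
app.autodiscover_tasks()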

Below is a successful run of the task on the worker that should be executing it:

>>> schedule_monitoring(job.pk, 'start')
'Applying monitoring action: START to Job: Test 1'
>>> schedule_monitoring(job.pk, 'stop')
'Applying monitoring action: STOP to Job: Test 1'
>>> exit()
Waiting up to 5 seconds.
Sent all pending logs.
root@9d045ff7dfc1:/app#
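
Note that calling schedule_monitoring(...) directly, as above, runs the function synchronously in the shell process and never touches the broker, so it only proves the task body works, not the delivery path. To exercise the worker itself, the task has to be sent through Celery, e.g. (a sketch mirroring the ETA-based scheduling described above):

from datetime import datetime, timedelta, timezone

# Enqueue via the broker with an ETA one minute out; a worker consuming
# the task's queue must accept and execute it.
schedule_monitoring.apply_async(
    args=[str(job.pk), 'start'],
    eta=datetime.now(timezone.utc) + timedelta(minutes=1),
)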

From a worker running at debug log level, all I see when the job starts is the following, and nothing interesting after that:

[2021-01-06 17:08:00,001: DEBUG/MainProcess] TaskPool: Apply <function _trace_task_ret at 0x7f6adbc29680> (args:('Operations.tasks.schedule_monitoring', '407e8a87-b3bf-4e8f-8a17-776a33ae5fea', {'lang': 'py', 'task': 'Operations.tasks.schedule_monitoring', 'id': '407e8a87-b3bf-4e8f-8a17-776a33ae5fea', 'shadow': None, 'eta': '2021-01-06T17:08:00+00:00', 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '407e8a87-b3bf-4e8f-8a17-776a33ae5fea', 'parent_id': None, 'argsrepr': "(UUID('11118a85-20f2-488d-9a12-b8d200ea7a74'), 'start')", 'kwargsrepr': '{}', 'origin': 'gen442@31a9de56d061', 'reply_to': '24a8dc4c-2e5c-32ce-aa3d-84392d7cbf41', 'correlation_id': '407e8a87-b3bf-4e8f-8a17-776a33ae5fea', 'hostname': 'celery@bc4bb7af894f', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}, 'args': ['11118a85-20f2-488d-9a12-b8d200ea7a74', 'start'], 'kwargs': {}}, b'[["11118a85-20f2-488d-9a12-b8d200ea7a74", "start"], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8') kwargs:{})
[2021-01-06 17:08:00,303: DEBUG/MainProcess] basic.qos: prefetch_count->32
[2021-01-06 17:08:00,305: DEBUG/MainProcess] Task accepted: Operations.tasks.schedule_monitoring[407e8a87-b3bf-4e8f-8a17-776a33ae5fea] pid:44
[2021-01-06 17:08:00,311: DEBUG/ForkPoolWorker-3] Resetting dropped connection: storage.googleapis.com
[2021-01-06 17:08:00,383: DEBUG/ForkPoolWorker-3] https://storage.googleapis.com:443 "GET /download/storage/v1/b/foo/o/bar?alt=media HTTP/1.1" 200 96
[2021-01-06 17:08:01,228: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:06,228: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:11,227: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:16,228: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:21,227: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:26,229: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2021-01-06 17:08:31,231: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]

Tags: python, django, asynchronous, celery

Solution


The solution I found was to create two queues in Celery: one to handle the scheduled tasks managed through Celery Beat, and another, with higher priority, for everything else.

After I created the separate queues, tasks started flowing and completing correctly; my guess is that the message bus or the workers were congested.

To create the additional queues, add the following to settings.py:

from kombu import Queue, Exchange

# Replace each pool process after it has executed 4 tasks.
CELERYD_MAX_TASKS_PER_CHILD = 4

CELERY_DEFAULT_QUEUE = 'scheduled'
CELERY_QUEUES = (
    # Default queue, for tasks scheduled through Celery Beat.
    Queue('scheduled', Exchange('scheduled'), routing_key='sched'),
    # Dedicated queue for the monitoring tasks.
    Queue('proactive_monitoring', Exchange('proactive_monitoring'), routing_key='prmon'),
)

Then, when registering your task functions, pass the queue you want them assigned to:

tasks.py:

@task(queue='proactive_monitoring')
def schedule_monitoring(job_id: str, action: str) -> str:
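
If decorating each task is inconvenient, the same assignment can be made centrally with task routing; a sketch using the same old-style setting names as above and the fully qualified task name shown in the worker log:

# settings.py -- route by task name instead of per-task decorators.
CELERY_ROUTES = {
    'Operations.tasks.schedule_monitoring': {'queue': 'proactive_monitoring'},
}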

Finally, make sure at least one worker is started for each queue. You can do this by passing the queue when starting the worker:

celery -A proj worker -l INFO -Q proactive_monitoring
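
And likewise one consuming the default queue, so the Beat-scheduled tasks keep running:

celery -A proj worker -l INFO -Q scheduled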

If you start multiple workers on localhost, you should differentiate at least the first two on each queue by giving them names with the -n attribute:

celery -A proj worker -l INFO -Q proactive_monitoring -n prmon_first_worker
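
Celery also expands %h to the hostname in node names, so a name like -n prmon_first_worker@%h stays unique when the same command is reused on other hosts.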

