Celery Deployment and Installation

Celery

Celery is middleware responsible for distributed task scheduling.

Installing Celery

  1. Set up the Python environment:
    yum install python-pip
  2. Install Celery:
    pip install -U Celery
  3. Install the extra dependencies Celery needs (choose according to your requirements):
    pip install 'celery[redis]'
    celery[redis]: for using Redis as a message transport or as a result backend.
    
    If the installation fails with connection timeouts, you can use the Douban mirror instead. For example:
    pip install -i https://pypi.douban.com/simple 'celery[redis]'
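To confirm the installation, you can print the installed version from Python:

python -c "import celery; print(celery.__version__)"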

Celery Architecture

A key reason Celery counts as a distributed task queue is that its workers can run spread across multiple hosts.

  1. Worker: the task execution unit. Workers execute tasks for Celery and run concurrently on the nodes of the distributed system.
  2. Task: covers both asynchronous and scheduled tasks. Asynchronous tasks are typically triggered from business logic and sent to the task queue; scheduled tasks are sent to the task queue periodically by the Celery Beat process.
  3. Broker: the message transport. It receives messages (tasks) from task producers and stores them in a queue.
  4. Backend: stores the results of task execution.
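To make these four roles concrete, here is a minimal sketch (the Redis URLs and file name are assumptions):

# tasks.py (sketch) - wiring the four roles together
from celery import Celery

app = Celery(
    'tasks',
    broker='redis://127.0.0.1:6379/0',   # Broker: queues messages from task producers
    backend='redis://127.0.0.1:6379/1',  # Backend: stores task results
)

@app.task
def add(x, y):  # Task: the unit of work placed on the queue
    return x + y

# Worker: started separately with `celery -A tasks worker --loglevel=info`;
# workers on multiple hosts can all consume from the same broker.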

Celery Version

[root@VM_0_2_centos home]# celery --version
4.4.2 (cliffs)

Creating a Celery Instance

Simple

Simple-Application

An instance of the Celery class is usually called the Celery application, or app for short.
The app is the entry point for everything you do with Celery, so other modules must be able to import it.

from celery import Celery

app = Celery('tasks', broker='redis://127.0.0.1:6379')

@app.task
def add(x, y):
    print('tasks add ...')
    return x + y
  1. app = Celery('tasks', broker='redis://127.0.0.1:6379')
  • The first argument, tasks, is the name of the current module. It must be supplied so that names can be generated automatically for tasks defined in the __main__ module.

  • The second argument, broker, is the URL of the message broker to use.
    Install and start Redis, and you can see its background process:

[root@VM_0_2_centos test001]# ps ax|grep redis
11508 ?        Sl     0:40 ./redis-server 0.0.0.0:6379
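Optionally, confirm that the broker is reachable by pinging it (assuming redis-cli is installed alongside the server); a healthy server replies PONG:

redis-cli -h 127.0.0.1 -p 6379 ping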

Simple-Running the Celery worker server

Start the Celery worker:

[root@VM_0_2_centos test001]# celery -A tasks worker --loglevel=info
/usr/lib64/python2.7/site-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

  uid=uid, euid=euid, gid=gid, egid=egid,

 -------------- celery@VM_0_2_centos v4.4.2 (cliffs)
--- ***** -----
-- ******* ---- Linux-3.10.0-693.el7.x86_64-x86_64-with-centos-7.7.1908-Core 2020-04-02 10:08:26
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         tasks:0x7fbb02a6a190
- ** ---------- .> transport:   redis://127.0.0.1:6379//
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery


[tasks]
  . tasks.add

[2020-04-02 10:08:26,747: INFO/MainProcess] Connected to redis://127.0.0.1:6379//
[2020-04-02 10:08:26,761: INFO/MainProcess] mingle: searching for neighbors
[2020-04-02 10:08:27,778: INFO/MainProcess] mingle: all alone
[2020-04-02 10:08:27,793: INFO/MainProcess] celery@VM_0_2_centos ready.

The worker started successfully!

Running the worker as a Supervisor-managed process: to be filled in later.
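As a rough placeholder until then, a minimal supervisord program section might look like the sketch below (the program name and user are assumptions; the directory is this example's project directory):

[program:celery-worker]
command=celery -A tasks worker --loglevel=info
directory=/home/test/test001   ; directory containing tasks.py
user=celery                    ; a non-root user, avoiding the superuser warning above
autostart=true
autorestart=true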

On configuring celery worker arguments, see https://blog.csdn.net/lanyang123456/article/details/77600568

Simple-Calling the task

To call a task, use the delay() method. delay() is a shorthand version of the apply_async() method.

Calling from the client:

>>> from tasks import add
>>> add.delay(4, 4)

The worker logs the following:

[2020-04-02 10:29:31,669: INFO/MainProcess] Connected to redis://127.0.0.1:6379/0
[2020-04-02 10:29:31,696: INFO/MainProcess] mingle: searching for neighbors
[2020-04-02 10:29:32,713: INFO/MainProcess] mingle: all alone
[2020-04-02 10:29:32,722: INFO/MainProcess] celery@VM_0_2_centos ready.
[2020-04-02 10:29:39,930: INFO/MainProcess] Received task: tasks.add[8bbabe07-0941-4369-83f7-afa7e0dd099a]
[2020-04-02 10:29:39,931: WARNING/ForkPoolWorker-1] tasks add ...
[2020-04-02 10:29:39,932: INFO/ForkPoolWorker-1] Task tasks.add[8bbabe07-0941-4369-83f7-afa7e0dd099a] succeeded in 0.000679905992001s: 8

Reference for apply_async and the other calling options: https://docs.celeryproject.org/en/stable/userguide/calling.html#guide-calling
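Since delay() is only shorthand, the same call can be written with apply_async(), which also accepts extra options such as countdown (a sketch based on the calling guide above):

from tasks import add

add.delay(4, 4)                         # shorthand form
add.apply_async((4, 4))                 # equivalent explicit form
add.apply_async((4, 4), countdown=10)   # start no earlier than 10 seconds from now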

Simple-Keeping Results

If you need to keep the tasks' results, configure a result backend.

In tasks.py, change app = Celery('tasks', broker='redis://127.0.0.1:6379') to app = Celery('tasks', backend='redis://127.0.0.1:6379/0', broker='redis://127.0.0.1:6379').

Restart the worker:
celery -A tasks worker --loglevel=info

Start the client and call the add method:

>>> from tasks import add
>>> res = add.delay(4, 4)
>>> res.ready()
True
>>> res.get()
8

This time the result has been stored.

Explanation of some of the result methods:

result.ready()                  # has the task finished?
result.get(timeout = 1)         # wait (up to 1 second) for the result
result.get(propagate = False)   # do not re-raise the task's exception
result.traceback                # traceback of a failed task
  1. result.get(timeout = 1): if the result is not ready within the timeout, a TimeoutError is raised.
>>> res = add.delay(2, 3)
>>> res.get(timeout = 15)
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/celery/result.py", line 228, in get
    on_message=on_message,
  File "/usr/lib64/python2.7/site-packages/celery/backends/asynchronous.py", line 200, in wait_for_pending
    for _ in self._wait_for_pending(result, **kwargs):
  File "/usr/lib64/python2.7/site-packages/celery/backends/asynchronous.py", line 272, in _wait_for_pending
    raise TimeoutError('The operation timed out.')
celery.exceptions.TimeoutError: The operation timed out.
  1. result.get(propagate = False): returns the exception info instead of raising it.
>>> res.get(propagate=False)
TypeError("unsupported operand type(s) for +: 'int' and 'str'",)
  1. result.traceback: retrieves the traceback of the failed task.
>>> from tasks import add
>>> res = add.delay(1, 2)
>>> res.traceback
'Traceback (most recent call last):\n  File "/usr/lib64/python2.7/site-packages/celery/app/trace.py", line 385, in trace_task\n    R = retval = fun(*args, **kwargs)\n  File "/usr/lib64/python2.7/site-packages/celery/app/trace.py", line 650, in __protected_call__\n    return self.run(*args, **kwargs)\n  File "/home/test/test001/tasks.py", line 11, in add\n    raise(\'timeout ...\')\nTypeError: exceptions must be old-style classes or derived from BaseException, not str\n'
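A result can also be looked up again later by task id, as long as the backend still holds it (a sketch; the id is just the one from the earlier log):

from celery.result import AsyncResult
from tasks import app

res = AsyncResult('8bbabe07-0941-4369-83f7-afa7e0dd099a', app=app)
print(res.state)         # e.g. 'PENDING', 'SUCCESS', or 'FAILURE'
if res.successful():
    print(res.get())     # fetch the stored return value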

1. Reference for the overall workflow above: https://docs.celeryproject.org/en/stable/getting-started/first-steps-with-celery.html#first-steps

Simple-Configuration

You can also set configuration options on the Celery app.
For example, change the task_serializer setting to configure the default serializer used for task payloads.

app.conf.task_serializer = 'json'  (I have not tried this one myself and am not too familiar with JSON serialization; leaving it as a TODO.)

If there are several settings to change, you can update them in one call:

app.conf.update(
    task_serializer = 'json',
    accept_content = ['json'],
    result_serializer = 'json',
    timezone = 'Europe/Oslo',
    enable_utc = True,
)

In a project, configuration usually lives in a dedicated file, conventionally named celeryconfig.py:

broker_url = 'redis://127.0.0.1:6379'
result_backend = 'redis://127.0.0.1:6379/0'

task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'Europe/Oslo'
enable_utc = True

So that the app picks up the configuration file, call the config_from_object() method on the app:

app.config_from_object('celeryconfig')

To check that the configuration file is valid and contains no syntax errors, you can try importing it as a module.

Use the command python -m celeryconfig to do so.

For example, if you type some random characters into celeryconfig.py, python -m celeryconfig reports:

[root@VM_0_2_centos test001]# python -m celeryconfig
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/test/test001/celeryconfig.py", line 4, in <module>
    asdfasdfas
NameError: name 'asdfasdfas' is not defined

If the file is fine, the command produces no output.


The directory layout of the current example project:

[root@VM_0_2_centos test001]# tree
.
|-- celeryconfig.py
|-- tasks.py

0 directories, 3 files

The contents of tasks.py:

from celery import Celery
import time 

# app = Celery('tasks', backend = 'redis://127.0.0.1:6379/0', \
#         broker='redis://127.0.0.1:6379')
app = Celery('tasks')                           # create the app for this module
app.config_from_object('celeryconfig')          # load settings from celeryconfig.py

@app.task
def add(x, y):
    print('tasks add ...')
    return x + y

The contents of celeryconfig.py:

BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'
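These are the older uppercase setting names; Celery 4 also accepts the newer lowercase names used earlier in this section, so an equivalent file would be:

broker_url = 'redis://127.0.0.1:6379'
result_backend = 'redis://127.0.0.1:6379/0'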
  1. Verify that the configuration file is correct:
    [root@VM_0_2_centos test001]# python -m celeryconfig

  2. Start the worker:
    celery -A tasks worker --loglevel=info

  3. Call the add method from the client:

>>> from tasks import add
>>> res = add.delay(1, 3)
>>> res.ready()
True
>>> res.get()
4

Other related configuration options include:

  1. Route a misbehaving task to a dedicated queue. **This appears to amount to giving it a lower priority.** TODO: confirm.
task_routes = {
    'tasks.add': 'low-priority',
}
  1. Rate-limit a task (here, how many instances may start per minute):
task_annotations = {
    'tasks.add': {'rate_limit': '10/m'}
}

Example:
The worker-side configuration file:

# backend and broker
BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'

CELERY_ANNOTATIONS = {
    'tasks.add': {'rate_limit': '3/m'}
}

Results on the client side:

>>> res0 = add.delay(1, 3)
>>> res1 = add.delay(1, 3)
>>> res2 = add.delay(1, 3)
>>> res3 = add.delay(1, 3)
>>> res0.ready()
True
>>> res1.ready()
True
>>> res2.ready()
False
>>> res3.ready()
False

I ran this test twice; each time tasks 2 and 3 were blocked while 0 and 1 executed. So is the limit "no more than 3 per minute"? Note that rate_limit throttles how quickly each worker may start instances of the task (here 3 per minute), rather than capping a total number of executions.


  1. Alternatively, the rate limit can be changed directly from the command line:
celery -A tasks control rate_limit tasks.add 10/m
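The same adjustment can also be broadcast from Python via the control API (a sketch, assuming the app from tasks.py):

from tasks import app

# Tell all running workers to apply a new rate limit for tasks.add
app.control.rate_limit('tasks.add', '10/m')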

References

1. Celery official documentation: http://docs.celeryproject.org/en/latest/index.html

2. Celery official installation guide: http://docs.celeryproject.org/en/latest/getting-started/introduction.html#installation

3. Tutorial on Jianshu: https://www.jianshu.com/p/620052aadbff

4. Tutorial on CSDN: https://blog.csdn.net/chenqiuge1984/article/details/80127446
