Celery is middleware for distributed task scheduling.
Environment setup: yum install python-pip
Install Celery: pip install -U Celery
Install Celery's optional dependencies as needed: pip install 'celery[redis]'
celery[redis]: for using Redis as a message transport or as a result backend.
If pip times out connecting, you can use the Douban mirror, e.g.: pip install -i https://pypi.douban.com/simple 'Celery[redis]'
A key reason Celery counts as a distributed task queue is that its workers can run spread across multiple hosts.
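For instance, if the broker URL in tasks.py points at a Redis instance every machine can reach (an assumption; the examples below use 127.0.0.1), each host just starts its own worker and they all pull from the same queue. A minimal sketch, with the worker names chosen only for illustration:
# on host A
celery -A tasks worker --loglevel=info -n workerA@%h
# on host B
celery -A tasks worker --loglevel=info -n workerB@%h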
[root@VM_0_2_centos home]# celery --version
4.4.2 (cliffs)
An instance of the Celery class is usually called the Celery application, or app for short. The app is the entry point for everything Celery does, so other modules must be able to import it.
from celery import Celery

app = Celery('tasks', broker='redis://127.0.0.1:6379')

@app.task
def add(x, y):
    print('tasks add ...')
    return x + y
app = Celery('tasks', broker='redis://127.0.0.1:6379')
The first argument, tasks, is the name of the current module. It is required so that names can be generated automatically for tasks defined in the __main__ module. The second argument, broker, is the URL of the message broker to use.
Install and start Redis; its background process is visible:
[root@VM_0_2_centos test001]# ps ax|grep redis
11508 ? Sl 0:40 ./redis-server 0.0.0.0:6379
Start the Celery worker (the server side):
[root@VM_0_2_centos test001]# celery -A tasks worker --loglevel=info
/usr/lib64/python2.7/site-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!
Please specify a different user using the --uid option.
User information: uid=0 euid=0 gid=0 egid=0
uid=uid, euid=euid, gid=gid, egid=egid,
-------------- celery@VM_0_2_centos v4.4.2 (cliffs)
--- ***** -----
-- ******* ---- Linux-3.10.0-693.el7.x86_64-x86_64-with-centos-7.7.1908-Core 2020-04-02 10:08:26
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: tasks:0x7fbb02a6a190
- ** ---------- .> transport: redis://127.0.0.1:6379//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. tasks.add
[2020-04-02 10:08:26,747: INFO/MainProcess] Connected to redis://127.0.0.1:6379//
[2020-04-02 10:08:26,761: INFO/MainProcess] mingle: searching for neighbors
[2020-04-02 10:08:27,778: INFO/MainProcess] mingle: all alone
[2020-04-02 10:08:27,793: INFO/MainProcess] celery@VM_0_2_centos ready.
The worker started successfully!
Turning this into a Supervisor-managed process: to be added; a rough sketch follows below.
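As a placeholder for that TODO, a minimal supervisord program section might look like this; the program name, directory, and user are assumptions, not tested settings:
[program:celery-tasks]
; assumed project directory and unprivileged user
command=celery -A tasks worker --loglevel=info
directory=/home/test/test001
user=celery
autostart=true
autorestart=true
stopwaitsecs=600    ; give in-flight tasks time to finish on shutdown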
Reference on celery worker command-line options: https://blog.csdn.net/lanyang123456/article/details/77600568
To invoke a task, use the delay() method; delay() is a shortcut for the more fully featured apply_async() method (an equivalence example follows the snippet below).
Calling it from the client:
>>> from tasks import add
>>> add.delay(4, 4)
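For comparison, the same call made through apply_async(), which additionally accepts execution options; the countdown value here is only illustrative:
>>> add.apply_async((4, 4))                # same as add.delay(4, 4)
>>> add.apply_async((4, 4), countdown=10)  # execute no sooner than 10 s from now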
The worker logs the call:
[2020-04-02 10:29:31,669: INFO/MainProcess] Connected to redis://127.0.0.1:6379/0
[2020-04-02 10:29:31,696: INFO/MainProcess] mingle: searching for neighbors
[2020-04-02 10:29:32,713: INFO/MainProcess] mingle: all alone
[2020-04-02 10:29:32,722: INFO/MainProcess] celery@VM_0_2_centos ready.
[2020-04-02 10:29:39,930: INFO/MainProcess] Received task: tasks.add[8bbabe07-0941-4369-83f7-afa7e0dd099a]
[2020-04-02 10:29:39,931: WARNING/ForkPoolWorker-1] tasks add ...
[2020-04-02 10:29:39,932: INFO/ForkPoolWorker-1] Task tasks.add[8bbabe07-0941-4369-83f7-afa7e0dd099a] succeeded in 0.000679905992001s: 8
Reference on calling tasks (apply_async etc.): https://docs.celeryproject.org/en/stable/userguide/calling.html#guide-calling
If you want the worker to store task results, configure a result backend.
In tasks.py, change app = Celery('tasks', broker='redis://127.0.0.1:6379') to app = Celery('tasks', backend='redis://127.0.0.1:6379/0', broker='redis://127.0.0.1:6379').
Restart the worker:
celery -A tasks worker --loglevel=info
From the client, call the add task:
>>> from tasks import add
>>> res = add.delay(4, 4)
>>> res.ready()
True
>>> res.get()
8
The result is now stored in the result backend.
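Since the result is keyed by task id, it can also be fetched later, even from a different process. A small sketch using AsyncResult:
>>> from tasks import add, app
>>> task_id = add.delay(4, 4).id          # persist this id somewhere
>>> from celery.result import AsyncResult
>>> AsyncResult(task_id, app=app).get()
8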
What some of the result methods mean:
result.ready()               # has the task finished?
result.get(timeout=1)        # wait up to timeout seconds for the result
result.get(propagate=False)  # return the exception instead of re-raising it
result.traceback             # traceback of a failed task
If result.get(timeout=...) waits longer than the timeout, it raises a TimeoutError:
>>> res = add.delay(2, 3)
>>> res.get(timeout = 15)
Traceback (most recent call last):
File "" , line 1, in <module>
File "/usr/lib64/python2.7/site-packages/celery/result.py", line 228, in get
on_message=on_message,
File "/usr/lib64/python2.7/site-packages/celery/backends/asynchronous.py", line 200, in wait_for_pending
for _ in self._wait_for_pending(result, **kwargs):
File "/usr/lib64/python2.7/site-packages/celery/backends/asynchronous.py", line 272, in _wait_for_pending
raise TimeoutError('The operation timed out.')
celery.exceptions.TimeoutError: The operation timed out.
result.get(propagate=False) returns the exception object instead of re-raising it:
>>> res.get(propagate=False)
TypeError("unsupported operand type(s) for +: 'int' and 'str'",)
result.traceback returns the traceback of a failed task (in this run the task body had been changed to raise, as the traceback below shows):
>>> from tasks import add
>>> res = add.delay(1, 2)
>>> res.traceback
'Traceback (most recent call last):\n File "/usr/lib64/python2.7/site-packages/celery/app/trace.py", line 385, in trace_task\n R = retval = fun(*args, **kwargs)\n File "/usr/lib64/python2.7/site-packages/celery/app/trace.py", line 650, in __protected_call__\n return self.run(*args, **kwargs)\n File "/home/test/test001/tasks.py", line 11, in add\n raise(\'timeout ...\')\nTypeError: exceptions must be old-style classes or derived from BaseException, not str\n'
1. Reference for the overall walkthrough above: https://docs.celeryproject.org/en/stable/getting-started/first-steps-with-celery.html#first-steps
Configuration options can also be set on Celery. For example, changing the task_serializer setting selects the default serializer used for task payloads.
app.conf.task_serializer = 'json'
I haven't tried this myself and am not that familiar with the JSON side of it, so I'm leaving it as a TODO.
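My untested understanding of what the setting implies: with the json serializer, task arguments must be JSON-serializable, so a call like the second one below should fail at serialization time (kombu wraps such failures in EncodeError):
>>> add.delay(1, 2)       # fine: ints are JSON-serializable
>>> add.delay({1, 2}, 3)  # should fail: a set is not JSON-serializable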
If several settings need to be changed at once, update() helps:
app.conf.update(
    task_serializer='json',
    accept_content=['json'],
    result_serializer='json',
    timezone='Europe/Oslo',
    enable_utc=True,
)
Usually a project keeps its configuration in a dedicated module, conventionally a celeryconfig.py file:
broker_url = 'redis://127.0.0.1:6379'
result_backend = 'redis://127.0.0.1:6379/0'
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'Europe/Oslo'
enable_utc = True
So that the app picks this file up automatically, call the config_from_object() method on it:
app.config_from_object('celeryconfig')
To confirm the configuration file is valid and contains no syntax errors, you can try importing it, e.g. with the command:
python -m celeryconfig
For example, scribble some garbage characters into celeryconfig.py; python -m celeryconfig then shows:
[root@VM_0_2_centos test001]# python -m celeryconfig
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/test/test001/celeryconfig.py", line 4, in <module>
asdfasdfas
NameError: name 'asdfasdfas' is not defined
If the file is fine, the command reports nothing.
The example project's directory layout:
[root@VM_0_2_centos test001]# tree
.
|-- celeryconfig.py
|-- tasks.py
0 directories, 3 files
tasks.py contains:
from celery import Celery
import time

# app = Celery('tasks', backend='redis://127.0.0.1:6379/0', \
#              broker='redis://127.0.0.1:6379')
app = Celery('tasks')                    # create the application
app.config_from_object('celeryconfig')  # load settings from celeryconfig.py

@app.task
def add(x, y):
    print('tasks add ...')
    return x + y
celeryconfig.py contains:
BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'
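These are the old-style uppercase setting names; Celery 4 also supports the newer lowercase names, so the same file could equally read:
broker_url = 'redis://127.0.0.1:6379'
result_backend = 'redis://127.0.0.1:6379/0'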
Verify that the configuration file is valid:
[root@VM_0_2_centos test001]# python -m celeryconfig
Start the worker:
celery -A tasks worker --loglevel=info
Call the add task from the client:
>>> from tasks import add
>>> res = add.delay(1, 3)
>>> res.ready()
True
>>> res.get()
4
Similar settings include:
- Route a misbehaving task to a dedicated queue (**this looks like giving it a lower priority** — to be filled in later; see the worker sketch after the snippet):
task_routes = {
    'tasks.add': 'low-priority',
}
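For this route to matter, some worker has to consume the low-priority queue, e.g. (the queue name is taken from the snippet above):
celery -A tasks worker --loglevel=info -Q low-priority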
- Rate-limit how frequently a task may execute:
task_annotations = {
    'tasks.add': {'rate_limit': '10/m'}
}
Example:
Server-side configuration file:
# backend and broker
BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'
CELERY_ANNOTATIONS = {
    'tasks.add': {'rate_limit': '3/m'}
}
Client-side results:
>>> res0 = add.delay(1, 3)
>>> res1 = add.delay(1, 3)
>>> res2 = add.delay(1, 3)
>>> res3 = add.delay(1, 3)
>>> res0.ready()
True
>>> res1.ready()
True
>>> res2.ready()
False
>>> res3.ready()
False
I ran this test twice; here calls 2 and 3 were blocked while 0 and 1 executed. Since rate_limit '3/m' lets the worker execute at most three instances of the task per minute, calls earlier in the same minute (including ones from a previous test run) count against the budget, which would explain why only two went through here.
- Or adjust the limit on the fly from the command line:
celery -A tasks control rate_limit tasks.add 10/m
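The equivalent from Python goes through the control API (a sketch, assuming the app object from tasks.py):
>>> from tasks import app
>>> app.control.rate_limit('tasks.add', '10/m')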
1. Celery official documentation: http://docs.celeryproject.org/en/latest/index.html
2. Celery installation guide: http://docs.celeryproject.org/en/latest/getting-started/introduction.html#installation
3. Tutorial on Jianshu: https://www.jianshu.com/p/620052aadbff
4. Tutorial on CSDN: https://blog.csdn.net/chenqiuge1984/article/details/80127446