While developing a web application, there comes a time when we need to run some tasks in the background, often asynchronously. For example, a user uploads photos and the app posts them to multiple social networks. We would definitely want to offload the uploading to background workers.
Django and Celery make background task processing a breeze. In this article, we shall see how to set up Django and Celery to start processing background tasks. We will use Redis to maintain our task queue.
How does it work?
We define some tasks in our application. These tasks are expected to run for a pretty long time.
We run the Celery workers. Celery knows how to find and load these tasks. The workers then wait for jobs.
We add some jobs to the workers' queue from our web app. The workers now have something to work on, so they start taking jobs from the queue and processing them.
We can query the status of the jobs from our web app to know what's happening.
Celery's easy-to-use Python API makes all of this really simple. You don't need any specialised knowledge of Redis.
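The flow described above can be sketched with nothing but the standard library. This is only an in-process illustration of the queue-and-worker idea, not how Celery is implemented: `job_queue`, `results`, `enqueue` and `upload_photo` are hypothetical names, and a thread plus an in-memory dictionary stand in for the worker process and Redis.

```python
import queue
import threading
import uuid

# Hypothetical in-memory stand-ins for the Redis queue and result store.
job_queue = queue.Queue()
results = {}


def upload_photo(name):
    """A stand-in for a slow task."""
    return "uploaded " + name


def worker():
    """Take jobs off the queue until a None sentinel arrives."""
    while True:
        job = job_queue.get()
        if job is None:
            break
        job_id, func, args = job
        results[job_id] = {"status": "PROGRESS", "result": None}
        results[job_id] = {"status": "SUCCESS", "result": func(*args)}


def enqueue(func, *args):
    """Add a job to the queue and return an id for later status queries."""
    job_id = str(uuid.uuid4())
    results[job_id] = {"status": "PENDING", "result": None}
    job_queue.put((job_id, func, args))
    return job_id


thread = threading.Thread(target=worker)
thread.start()
job_id = enqueue(upload_photo, "cat.jpg")
job_queue.put(None)  # tell the worker to stop after the queued job
thread.join()
print(results[job_id])
```

The web app only ever calls `enqueue` and looks up the returned id, which is the same division of labour we get from Celery, with the difference that Celery's workers run as separate processes.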
Setting Up
Let’s first install the Redis server:
sudo apt-get install redis-server
The version in the official Ubuntu repository is quite old. You can install a more recent version from a third-party PPA.
Install Celery with Redis support:
pip install celery-with-redis
And then install django-celery package:
pip install django-celery
Configuration
Add “djcelery” to your installed apps list:
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'app',
    'djcelery',  # Must be added to the INSTALLED_APPS
    'south',
)
Modify your main app's settings.py file to add the Celery-specific settings:
import djcelery
djcelery.setup_loader()

BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
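Because the settings select the JSON serializer, task arguments and return values have to survive JSON encoding. The sketch below uses only the standard `json` module to show which values would be safe to pass; the argument names (`message`, `photo_ids`, `uploaded_at`) are made up for illustration.

```python
import json
from datetime import datetime

# Hypothetical arguments for a task call.
good_args = {"message": "hello world!", "photo_ids": [1, 2, 3]}
bad_args = {"uploaded_at": datetime(2014, 1, 1, 12, 0)}

# Plain strings, numbers, lists and dicts survive the round trip.
encoded = json.dumps(good_args)
assert json.loads(encoded) == good_args

# A datetime object does not: json.dumps raises TypeError.
try:
    json.dumps(bad_args)
    serializable = True
except TypeError:
    serializable = False

# Converting to an ISO 8601 string first makes the argument safe to send.
safe_args = {"uploaded_at": bad_args["uploaded_at"].isoformat()}
safe_encoded = json.dumps(safe_args)
print(serializable, safe_encoded)
```

In practice this means passing identifiers and plain values to a task (a photo id rather than a model instance, for example) and looking the object up again inside the task.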
Now, inside your main application directory (the directory in which settings.py is located), create a file named “celery.py” with these contents:
from __future__ import absolute_import

import os

from celery import Celery
from django.conf import settings

# Point Django at the project's settings module before configuring Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

app = Celery('project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
The code above does a few things:
It creates our own Celery instance.
It asks the Celery instance to load the necessary configuration from our project's settings file.
It makes the instance auto-discover tasks from our INSTALLED_APPS.
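To make the auto-discovery step less magical, here is a rough sketch of the idea: for every installed app, try to import a submodule named "tasks" and skip the apps that don't have one. This is an illustration under that assumption, not Celery's actual implementation, and `discover_task_modules` is a hypothetical helper name.

```python
import importlib


def discover_task_modules(packages, related_name="tasks"):
    """Import '<package>.<related_name>' for each package that has one."""
    found = []
    for package in packages:
        module_name = package + "." + related_name
        try:
            importlib.import_module(module_name)
        except ImportError:
            continue  # this package has no such submodule; skip it
        found.append(module_name)
    return found


# Standard-library packages stand in for INSTALLED_APPS here:
# 'json' has a 'decoder' submodule, 'email' does not.
print(discover_task_modules(["json", "email"], related_name="decoder"))
```

One caveat of this simple version is that an ImportError raised inside a tasks module would also be swallowed silently, so a real implementation needs to be more careful about which errors it ignores.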
Also, let's modify the “__init__.py” file in the same directory to make the Celery app more easily available:
from __future__ import absolute_import

from .celery import app as celery_app
This allows us to use the same app instance for shared tasks across reusable Django apps.
Defining Tasks
Now let’s create a tasks.py file in one of our INSTALLED_APPS and add these contents:
from time import sleep

from project import celery_app


@celery_app.task()
def UploadTask(message):
    # Update the state. The metadata is available in the task.info dictionary.
    # The metadata is useful for storing information relevant to the task.
    # Here we are storing the upload progress in the metadata.
    UploadTask.update_state(state='PROGRESS', meta={'progress': 0})
    sleep(30)
    UploadTask.update_state(state='PROGRESS', meta={'progress': 30})
    sleep(30)
    return message


def get_task_status(task_id):
    # If you have a task_id, this is how you query that task
    task = UploadTask.AsyncResult(task_id)

    status = task.status
    progress = 0

    if status == u'SUCCESS':
        progress = 100
    elif status == u'FAILURE':
        progress = 0
    elif status == 'PROGRESS':
        progress = task.info['progress']

    return {'status': status, 'progress': progress}
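The status-to-progress mapping inside get_task_status can also be written as a small pure function, which makes it easy to check without a running worker. This is a sketch rather than part of the original code: `progress_from_state` is a hypothetical name, and it assumes the metadata is only a dictionary while the task is in our custom PROGRESS state.

```python
def progress_from_state(status, info=None):
    """Map a task state and its metadata to a percentage (0-100)."""
    if status == "SUCCESS":
        return 100
    if status == "PROGRESS" and isinstance(info, dict):
        # Fall back to 0 if the worker stored no 'progress' key.
        return info.get("progress", 0)
    # PENDING, FAILURE and any other state report no progress.
    return 0


print(progress_from_state("PROGRESS", {"progress": 30}))
print(progress_from_state("PENDING"))
```

Checking the type of the metadata guards against states where it is not a dictionary at all, such as a failed task whose stored result is an exception.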
Now that we have defined our own Celery app and our tasks, it's time to launch the workers and start adding tasks.
Processing Tasks
Before we can start processing tasks, we have to launch the celery daemon first. This is how we do it:
celery worker --app=project.celery:app --loglevel=INFO
Here, we tell Celery to use the instance we defined and configured earlier. “project” is the main app, the package that contains our settings.py along with celery.py. “app” is the name of the variable that holds the Celery instance.
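The "module:variable" notation used in the command can be illustrated with a few lines of Python. The helper below is a hypothetical sketch of how such a string could be resolved to an object; it is not Celery's own loader, and a standard-library target is used so the example runs without a Django project.

```python
import importlib


def resolve(target):
    """Resolve a 'package.module:attribute' string to a Python object."""
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# A standard-library target stands in for 'project.celery:app'.
join = resolve("os.path:join")
print(join("project", "celery.py"))
```

Read this way, project.celery:app simply means "import the module project.celery and take the variable named app from it".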
Now let’s use the Django shell to add and query jobs:
$ python manage.py shell

[snipped]

>>> from app.tasks import *

# Please notice the "delay" method, which is a handy shortcut to apply_async.
# It allows us to call the task with exactly the same parameters
# as the original function. If you need more custom options, use apply_async.

>>> t = UploadTask.delay("hello world!")

# t is now an AsyncResult object. t.id is the task id for the task.
# You can use t directly to query the task, e.g. t.status

>>> get_task_status(t.id)
{'status': u'PROGRESS', 'progress': 0}

# (After a 35 second delay)

>>> get_task_status(t.id)
{'status': u'PROGRESS', 'progress': 30}

# (After waiting for another 35 seconds or so)

>>> get_task_status(t.id)
{'status': u'SUCCESS', 'progress': 100}
So, as we can see, our task was processed by Celery, and we could easily query its status. We would generally use the metadata to store any task-related information.
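Instead of calling get_task_status by hand every few seconds, the polling can be wrapped in a small loop. The following is a minimal sketch, assuming a status function that returns a dictionary with a 'status' key like ours does; `wait_for_task` and its `interval` and `timeout` parameters are hypothetical, and a fake status function replays the states from the shell session so the example runs without a worker.

```python
import time


def wait_for_task(get_status, task_id, interval=1.0, timeout=120.0):
    """Poll get_status(task_id) until the task succeeds, fails or times out."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_status(task_id)
        if state["status"] in ("SUCCESS", "FAILURE"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError("task %s did not finish in time" % task_id)
        time.sleep(interval)


# A fake status function replays the states seen in the shell session,
# so the sketch runs without a worker.
fake_states = iter([
    {"status": "PROGRESS", "progress": 0},
    {"status": "PROGRESS", "progress": 30},
    {"status": "SUCCESS", "progress": 100},
])
final = wait_for_task(lambda task_id: next(fake_states), "hypothetical-id", interval=0.01)
print(final)
```

In the Django shell, the equivalent call would be wait_for_task(get_task_status, t.id). In a real web app it is usually better to let the browser poll a status view than to block a request in a loop like this.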