Capistrano: Automating Application Deployment

Application deployment is one of those things that becomes more and more complicated as the scale of your application increases. With just a single box running your database and your application, it’s quite simple. But when you start putting your database on a different server, and then separating your web servers from your app servers, and eventually splitting your database into master and slave servers… It can get to where you almost don’t want to deploy your application any more.

Capistrano is a standalone utility that can also integrate nicely with Rails. You simply provide Capistrano with a deployment “recipe” that describes your various servers and their roles, and voila! You magically have single-command deployment. It even allows you to roll a bad version out of production and revert back to the previous release.

It should be stated that the concepts that Capistrano uses and encourages are not specific to Rails, or even Ruby. They are common-sense practices that are applicable to a variety of environments. In fact, you’ll find that there is very little that is Rails-specific about Capistrano, aside from the fact that it is in Rails that it found its genesis. No matter where you are or what environment you are using, Capistrano can probably help ease your deployment pains.

Of course, we hope you’re using Ruby on Rails…

Introduction

What is Capistrano?

To say that Capistrano is a utility for deploying web applications would be akin to saying that computers are machines that let you type school papers. It’s a gross understatement. Capistrano is actually capable of doing far, far more than just deploying web apps. However, because deployment of web apps is what Capistrano was originally created for, this manual will focus on that, and then spend a little time at the end showing some of the other possibilities.

Historically, Capistrano was originally called SwitchTower. The name was changed in March 2006 in response to a trademark conflict.

What can it do?

Ultimately, Capistrano is a utility that can execute commands in parallel on multiple servers. It allows you to define tasks, which can include commands that are executed on the servers. You can also define roles for your servers, and then specify that certain tasks apply only to certain roles.

Capistrano is very configurable. The default configuration includes a set of basic tasks applicable to web deployment. (More on these tasks will be said later.)

Capistrano can do just about anything you can write a shell script for. You just run those snippets of shell script on remote servers, possibly interacting with them based on their output. You can also upload files, and Capistrano includes some basic templating to allow you to dynamically create and deploy things like maintenance screens, configuration files, shell scripts, and more.
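To give a feel for the templating idea, here is a minimal sketch that fills in a maintenance page using Ruby's standard ERB library. It does not use Capistrano's own helpers; the template text, the render_maintenance_page method, and its parameters are hypothetical names made up for this illustration.

```ruby
require "erb"

# Hypothetical template for a "down for maintenance" page.
MAINTENANCE_TEMPLATE = <<~HTML
  <html>
    <body>
      <h1><%= application %> is down for maintenance</h1>
      <p>We expect to be back by <%= back_at %>.</p>
    </body>
  </html>
HTML

# Fill in the template with the given values and return the resulting HTML.
# The method's local variables are exposed to the template via `binding`.
def render_maintenance_page(application, back_at)
  ERB.new(MAINTENANCE_TEMPLATE).result(binding)
end

puts render_maintenance_page("flipper", "14:00 UTC")
```

The rendered string could then be uploaded to the shared system directory, so that it survives from one release to the next.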

What assumptions does it make?

As with the rest of Rails, Capistrano makes many assumptions, both about your code, and the way you do things with it (like deployment).

There are basically two levels of assumptions in Capistrano: core assumptions, and assumptions made by the default tasks.

The core assumptions of Capistrano tend to be quite general (though there are some exceptions), and are not usually possible to override. They are:

  • You are interacting with at least one remote server.
  • You may need to tunnel through a gateway server to access your target server.
  • You are using SSH to connect to the servers.
  • The remote server is capable of understanding POSIX shell commands. (Windows, by default, does not fall into this category. Neither do shells like csh and tcsh. Sorry.)
  • The password for all servers is the same.
  • Some things you only want to execute on a subset of your production environment, rather than on all of your production servers.

The assumptions made by the default tasks are more specific, but are either configurable or overridable. Some of them are:

  • You are deploying a web application.
  • You are using Ruby on Rails to develop your application.
  • You are using subversion to manage your source code.
  • You are deploying your application to "/u/apps/#{application}" on every machine.
  • You are using FastCGI to power your application.
  • You are fronting your app with either lighttpd or apache.

This manual will hold to these same assumptions, but will also show (where applicable) how to configure or override them.

Quick Start

Getting started

An example is worth a lot. This chapter will describe a very simple one-box environment, and demonstrate how to use Capistrano to manage it. This will introduce some of the basic concepts, which we can build on in the next chapter.

The production environment is this: a single production machine (we’ll call it “simple.capistrano.com”), running MySQL 4.x, using FastCGI and Apache. The application uses the latest bleeding-edge version of Rails.

This imaginary application (we’ll call it “Flipper”) is stored in a subversion repository at http://svn.capistrano.com/flipper/trunk.

One disclaimer: the default Capistrano tasks assume a distributed environment in which the FastCGI processes are managed separately from the web server. For the sake of simplicity, we’ll assume Apache is managing the FastCGI processes for this example, and delve into the more complex setup in the next chapter.

Installing Capistrano

Capistrano is most easily installed as a gem. Just do gem install capistrano and you’re good to go.

Once you have Capistrano installed, you should be able to invoke the cap utility. To ensure it is installed correctly, just execute cap -h. You should see a help screen. (If you don’t, Capistrano was either not installed, or not installed correctly.)

Now that Capistrano is installed, you can “capistranize” your rails application in one simple command:

  cap --apply-to /path/to/my/app MyApplicationName

The /path/to/my/app is the location of the base directory of your application (its “rails root”). MyApplicationName is the name of your application. (You can change this later, easily, so if you don’t know what to put here right now, just put “application”.)

And now you should be set to get started!

Deployment Recipe

The deployment recipe is to Capistrano what the Rakefile is to Rake. It describes the tasks that are to be performed, and the subsets of servers they are to be performed on. By default, it is called deploy.rb and resides in the config directory.

When developing a deployment recipe, it helps to have a template to work from. Rails provides a “deployment” generator that creates a default deployment recipe in the config/deploy.rb file, which you can tailor to your specific needs.

Our deployment script starts by setting the two required variables:

Required variables for deployment recipes [ruby]
  set :application, "flipper"
  set :repository, "http://svn.capistrano.com/flipper/trunk"

The :application variable names the application being deployed. This is used for various things, but most notably to describe the path being deployed to on the remote server.

The :repository variable is the location of the (subversion, in this case) repository that stores our code. (Note that, for subversion, you cannot use file:// repositories with Capistrano.)

Once you’ve defined the application and repository, all you need to define further is the list of roles (and the servers in each role). In our case, we only have one server, so that server is going to be pulling multiple duties:

Defining our roles [ruby]
  role :app, "simple.capistrano.com"
  role :web, "simple.capistrano.com"
  role :db, "simple.capistrano.com"

You can define whatever roles you want, but the default Capistrano tasks look for those three: :app, :web, and :db. The :app role describes which servers are acting as the application servers (the servers running the FastCGI instances). The :web role describes the servers running Apache, and the :db role describes the servers running your database(s). In our case, they’re all the same box.

That’s it! Your recipe is ready to use. The default tasks provided by Capistrano are sufficient for what we need to do right now, but we’ll demonstrate doing some custom tasks shortly.

Setup

Okay, now that we’ve got a basic deployment recipe going, we can try it out by executing the setup task. This task will set up the basic deployment directory structure on our production box for us.

The deployment directory structure is:

Deployment directory structure [chart]
[deploy_to]
  +- releases
  |    +- 20050725121411
  |    +- 20050801090107
  |    +- 20050802231414
  |    ...
  |    +- 20050824141402
  |    |   +- Rakefile
  |    |   |  app
  |    |   |  config
  |    |   |  db
  |    |   |  lib
  |    |   |  log --> [deploy_to]/shared/log
  |    |   |  public
  |    |   |    +- ...
  |    |   |       system --> [deploy_to]/shared/system
  |    |   |       ...
  |    |   |  script
  |    |   |  test
  |    |   |  vendor
  |
  +- shared
  |    +- log
  |    +- system
  |
  + current --> [deploy_to]/releases/20050824141402

The [deploy_to] represents the root of your deployment path. By default, Capistrano uses "/u/apps/#{application}" as the root of the deployment path, but you can specify whatever root you want via the :deploy_to variable in your recipe file:

Custom deployment root [ruby]
  set :deploy_to, "/var/www/flipper"

Beneath the deployment root are two other directories, releases and shared. The releases directory contains one subdirectory for every released version of your software. Each subdirectory is named for the time (in UTC) at which it was deployed.

The shared directory contains directories and files that should be shared between multiple releases, like log files and static system HTML files (like a “down for maintenance” page).

Finally, the deployment root contains a symlink called current that points to the current release.
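The layout described above can be sketched in a few lines of plain Ruby. This is only an illustration of how the paths fit together, not Capistrano's own code; the release_name and deployment_paths methods are hypothetical helpers, and the 14-digit timestamp format is inferred from the directory names in the chart.

```ruby
# Build a release directory name (YYYYMMDDHHMMSS) from a timestamp, in UTC.
def release_name(time = Time.now)
  time.getutc.strftime("%Y%m%d%H%M%S")
end

# Compute the main paths of the layout for a given deployment root.
def deployment_paths(deploy_to, time = Time.now)
  {
    :releases => File.join(deploy_to, "releases"),
    :shared   => File.join(deploy_to, "shared"),
    :current  => File.join(deploy_to, "current"),
    :release  => File.join(deploy_to, "releases", release_name(time))
  }
end

paths = deployment_paths("/u/apps/flipper", Time.utc(2005, 8, 24, 14, 14, 2))
puts paths[:release]  # /u/apps/flipper/releases/20050824141402
```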

It isn’t necessary to build all these directories yourself. You can use the default setup Capistrano task to do it for you. Just type the following:

Executing the setup task [shell]
rake remote:exec ACTION=setup

This will prompt you for your server’s password. (If you don’t want the password to echo to the screen as you type it, be sure you have the termios gem installed—only guaranteed to work in *nix environments.)

After you enter the password, Capistrano will go out to your server and build the necessary directories, chmod-ing them as necessary.

Nifty, huh? But this is only the beginning…

Apache Configuration

We should take a moment here and make sure we’ve got Apache configured for our application. Anticipating only a moderate load (at least initially), we figure five FastCGI instances should be enough for getting on with. The following snippets of Apache configuration should be sufficient to configure our web server for that:

Configuration snippets [apache]
  ...
  LoadModule fastcgi_module     libexec/apache/mod_fastcgi.so
  ...
  AddModule mod_fastcgi.c
  ...
  AddHandler fastcgi-script fcgi
  ...
  FastCgiIpcDir /tmp/fcgi_ipc
  FastCgiServer /u/apps/flipper/current/public/dispatch.fcgi -initial-env RAILS_ENV=production -processes 5 -idle-timeout 600
  ...

Of course, we’ll also need to configure vhosts as appropriate using (as shown above) /u/apps/flipper/current as the RAILS_ROOT of our application.

Deploying

Okay, let’s look at writing our first custom task. We can’t use Capistrano’s default deployment task because it assumes we are using a distributed setup. As a result, it will try to restart the application in a way incompatible with our single-server setup.

To make it work, we’ll just add the following task to our deploy.rb file:

Redefining the restart task [ruby]
  desc "Restart the web server"
  task :restart, :roles => :app do
    sudo "apachectl graceful"
  end

The first line gives us a description of the task we are defining. (You can see all available deployment tasks, and their descriptions, by typing rake show_deploy_tasks.) The next line defines a task named restart, that only applies to servers in the app role. When invoked, it will execute apachectl graceful on all app servers, via sudo.

Once you’ve got that task defined, we can try it out. Just type:

Deploy the application [shell]
  rake deploy

This will (again) prompt for your password for the remote server, and then will do the following things:

  • Check out the latest revision of your application to the releases directory
  • Update (or create) the current symlink so it points to this new revision
  • Invoke the restart task that we just redefined

The checkout/symlink process is roughly atomic, so if any part of those two tasks fails, the symlink will be restored to the prior revision and the newly checked out revision deleted.

Note that by default, Capistrano checks out the latest revision of your code. If you ever want to check out a revision other than the latest, you can specify the revision you want via the :revision variable (see chapter 4 for more about variables).

Rolling back a release

So, let’s assume we’ve gone through this process a few times, and everything has gone well. Suddenly, though, we push a release into production that is a lemon—things start going crazy and we need to get it out fast.

Simple. Just type:

Rolling back a release from production [shell]
  rake rollback

This will go to the remote server, update the current symlink to point to the previous revision, delete the bad revision from off of the server, and then restart the web server.
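Because release directories are named by timestamp, working out which release is “previous” is just a matter of sorting. The sketch below illustrates that ordering logic in plain Ruby; it is not how Capistrano itself is implemented, and plan_rollback is a hypothetical helper.

```ruby
# Given the names found in the releases directory, decide which release a
# rollback would point "current" at and which one it would delete.
# Release names are timestamps, so sorting them as strings orders them by time.
def plan_rollback(release_names)
  ordered = release_names.sort
  raise ArgumentError, "need at least two releases to roll back" if ordered.size < 2

  { :new_current => ordered[-2], :remove => ordered[-1] }
end

plan = plan_rollback(["20050801090107", "20050824141402", "20050802231414"])
puts "point current at #{plan[:new_current]}, delete #{plan[:remove]}"
```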

A More Complicated Example

Getting started

In the previous chapter, we looked at a simple deployment environment that consisted of a single production box. Although this is a valid environment for small deployments (Basecamp started out this way, for example), it rapidly becomes untenable as an application grows.

This chapter will revisit the “flipper” application from the previous chapter. Let’s assume a year has passed, and we have much higher usage. The application has definitely outgrown its single box. Instead, we’ll move to the following setup:

  • Two web servers accessed via load-balancers. The web servers will be running Apache.
  • Two application servers accessed via load-balancers from the web servers. The application servers run standalone FastCGI processes.
  • Two database servers, one as master, one as slave.

This configuration should allow us to scale nicely to much higher usage. And Capistrano allows us to deploy to this kind of configuration with very little effort.

Deployment Recipe

The first thing we need to do is revisit our deployment recipe. The roles, in particular, need to be completely reworked, and we can also get rid of our custom restart task. The complete deploy.rb file looks like this:

Multi-server deployment recipe [ruby]
  set :application, "flipper"
  set :repository, "http://svn.capistrano.com/flipper/trunk"

  role :web, "www1.capistrano.com", "www2.capistrano.com"
  role :app, "app1.capistrano.com", "app2.capistrano.com"
  role :db, "db1.capistrano.com", :primary => true
  role :db, "db2.capistrano.com"

We now have two servers (www1 and www2) in the web role, and two servers (app1 and app2) in the app role. Fairly self-explanatory.

Looking at the db role, though, we have one server (db1) with the extra information :primary => true. This tells Capistrano that some tasks should be executed only on this server, and not on all db servers. (This is useful for things like migrations, where you only want them applied to the primary copy of the data. You could also add :slave => true to the db2 server and then define a backup task that only ran on the slave.)

We can now run the setup task again to make sure our directories are all set up on all six machines. Just type:

Running setup [shell]
  rake remote:exec ACTION=setup

Spinner

Rails comes with three utilities (spinner, spawner, and reaper) for managing your FastCGI processes.

The spinner script is located in the script/process directory of your application. (If your application doesn’t have this script, you probably just need to update your application to the latest version. Rails 0.13.1 was the last version of Rails without the scripts.)

The spinner script is intended to be a continually running process that watches the spawned FastCGI processes. When you start the spinner, you also specify a command to invoke that will start your FCGI processes. This command is usually the spawner:

Spinner Example [shell]
  /u/apps/flipper/current/script/process/spinner \
    -c '/u/apps/flipper/current/script/process/spawner -p 7000 -i 5' \
    -d

In the above example, the spinner is given the command to execute (the reference to spawner, which we’ll describe next), and is told to daemonize (the -d switch). By default, the spinner will attempt to execute the given command every 5 seconds. This is an admittedly brute force method of making sure your FastCGI listeners are always up.

Because it is tedious to type the above command frequently, we’ll extract the whole thing into its own script, and put it in script/spin.

Spawner

The spawner script is used to spawn multiple FastCGI listeners. You can give it various parameters (try spawner -h to see them all), but the notable ones in this context are:

  • -p: the first port number for the listeners to use
  • -i: the number of listener instances to start, one per port, starting on the port given by -p

Thus, as used above by the spinner, each time the spinner executes the spawner command (by default, once every 5 seconds), it will try to start 5 FastCGI listeners on ports 7000-7004. A listener can’t start if there is already one listening on that port, so only those listeners that have died will actually be respawned.
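The port arithmetic is simple enough to sketch in plain Ruby. The two methods below are hypothetical helpers written for this illustration (they are not part of the spawner script); they only show which ports the -p and -i values cover, and which of those would get a fresh listener when some are already taken.

```ruby
# Ports used by `instances` listeners, starting at `first_port`
# (corresponding to the -p and -i options described above).
def listener_ports(first_port, instances)
  (first_port...(first_port + instances)).to_a
end

# Ports that still need a listener, given the ports already in use.
def ports_to_respawn(first_port, instances, ports_in_use)
  listener_ports(first_port, instances) - ports_in_use
end

p listener_ports(7000, 5)                        # [7000, 7001, 7002, 7003, 7004]
p ports_to_respawn(7000, 5, [7000, 7001, 7003])  # [7002, 7004]
```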

Reaper

The reaper is the opposite of the spawner—it gracefully restarts all running FCGI listeners (sending them USR2 signals, by default).

The reaper also sends (by default) a USR1 signal to the active spinner processes. This causes the spinner to shift into high gear, attempting to restart FastCGI listeners every half second, instead of every 5. Then, when the reaper is done, it drops the spinner back down into low gear. This makes sure that new listeners are started as promptly as possible if any are killed during the restart.

This means that once the spinner is going, all it takes to restart your FastCGI processes is to invoke the reaper on them. The rest happens automatically.

The restart task invokes the reaper without arguments by default, so if you want to use a different restart mechanism (e.g., USR1 to kill the processes instead of USR2 to restart them) you will need to implement your own restart task.

Deploying

The first deployment is a bit tricky with this setup, because you have to do some bootstrapping. The spinner isn’t running, and you have to get it running. But we can’t get it running until we’ve deployed the application…

Not to worry. We’ll just create a couple of custom tasks that will get everything set up for us:

Tasks for initial deployment [ruby]
  desc "Start the spinner daemon"
  task :spinner, :roles => :app do
    run "#{current_path}/script/spin"
  end

  desc "Used only for deploying when the spinner isn't running"
  task :cold_deploy do
    transaction do
      update_code
      symlink
    end

    spinner
  end

The first task only applies to the app servers, and all it does is start the spinner by invoking our custom spin script.

The second task is a more complicated one. It calls the update_code and symlink tasks in a transaction. This means that if either of those tasks fails, they will be rolled back, leaving the system in a consistent state. Once those two tasks finish successfully (executing on all boxes), our new spinner task is invoked (which will only be executed on the app servers, remember).
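If the idea of a transaction is new to you, the following toy version in plain Ruby may help. It is a sketch of the general technique only (register an undo step as you go, and run the undo steps in reverse if something fails), not Capistrano's implementation; the ToyTransaction class and the messages it logs are invented for this example.

```ruby
# A toy transaction: each step may register an undo block. If a later step
# raises, the undo blocks run in reverse order and the error is re-raised.
class ToyTransaction
  def initialize
    @undo_stack = []
  end

  # Remember how to undo the step that is about to run.
  def on_rollback(&block)
    @undo_stack << block
  end

  def run
    yield self
  rescue StandardError
    @undo_stack.reverse_each(&:call)
    raise
  end
end

log = []
begin
  ToyTransaction.new.run do |tx|
    tx.on_rollback { log << "remove new release" }
    log << "update_code"
    tx.on_rollback { log << "restore old symlink" }
    raise "symlink failed"
  end
rescue RuntimeError => e
  log << "aborted: #{e.message}"
end
puts log
```

Here the simulated symlink step fails, so the log shows the symlink being restored first and the new release being removed second, which is the reverse of the order in which the steps were attempted.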

Once that’s all done, you just have to invoke the cold_deploy task, and you’re golden!

Invoking cold_deploy [shell]
  rake remote:exec ACTION=cold_deploy

Once you’ve got the spinner running, future deployments can simply use the default deploy task:

Invoking deploy [shell]
  rake deploy

Recipes

Introduction

At this point, you’ve seen a few Capistrano recipes. You’ve been exposed to all three of the building blocks of recipes: variables, roles, and tasks. In this chapter, we’ll take a closer look at each of these components and understand better what they can do for us.

Variables

Capistrano variables are set using the set keyword. Once set, you can access them in your recipes by name:

Using variables [ruby]
  set :application, "flipper"
  puts "The application name is #{application}"

(Note that because Capistrano recipe files are really just specialized Ruby scripts, you can do most anything in a recipe file that you would be able to do in a full-fledged Ruby script.)

You can set any variables you want. This allows you to create (for instance) configurable tasks that you can then share with others—you just define your tasks to use certain variables, and then others can set those variables in their own scripts. The subversion and darcs scm modules use this approach, allowing you to set (respectively) the :svn and :darcs variables to define where the executables are on the remote hosts (if they aren’t in the default path).

Capistrano also provides several predefined variables. Some of the more commonly used of these are:

Variable | Default | Description
application | (required) | The name of your application. Used to build other values, like the deployment directory.
repository | (required) | The location of your code’s scm repository.
gateway | nil | The address of the server to use as a gateway. If given, all other connections will be tunneled through this server.
user | (current user) | The name of the user to use when logging into the remote host(s).
password | (prompted) | The password to use for logging into the remote host(s). Probably not a good idea to set this in recipe files, for various reasons.
deploy_to | "/u/apps/#{application}" | The root of the directory tree on the remote host(s) that the application should be deployed to.
version_dir | "releases" | The directory under deploy_to that should contain each deployed revision.
current_dir | "current" | The name to use (relative to deploy_to) for the symlink that points at the current release.
shared_dir | "shared" | The name of the directory under deploy_to that will contain directories and files to be shared between all releases.
revision | (latest revision) | This specifies the revision you want to check out on the remote machines. (Because the definition of a “revision” differs from SCM to SCM, the actual format of this variable is rather free form.)
scm | :subversion | The source control module to use. Currently supported modules are :subversion, :cvs, and :darcs.
svn | (path) | The location on the remote host(s) of the svn executable. This is useful if subversion is installed in a non-standard path on the servers.
checkout | "co" | The subversion operation to use when checking out the code on the remote host. This can be set to "export" if you would rather do an svn export instead of co.
cvs | (path) | The location on the remote host(s) of the cvs executable. This is useful if CVS is installed in a non-standard path on the servers.
darcs | (path) | The location on the remote host(s) of the darcs executable. This is useful if darcs is installed in a non-standard path on the servers.
ssh_options | Hash.new | This is a hash of additional options that you would like passed to the SSH connection routine. This lets you set (among other things) a non-standard port to connect on (ssh_options[:port] = 2345).
use_sudo | true | Whether or not tasks that can use sudo ought to use sudo. In a shared environment, this is typically not desirable (or possible), and in that case you should set this variable to false, which will cause those tasks to simply try to run the command directly.

One last trick you can use with variables. Sometimes you want a variable to be evaluated lazily, like deploy_to is. deploy_to is set at the very beginning, by Capistrano, to "/u/apps/#{application}", but at this point the application variable has not been set. So what Capistrano does is set the deploy_to variable to a Proc instance, which gets evaluated the first time deploy_to is referenced:

Defining a variable to be lazily evaluated [ruby]
  set(:deploy_to) { "/u/apps/#{application}" }

Any time you set the value of a variable to be a block (or a Proc instance), the first time that variable is accessed the block will be executed, and the return value cached and returned.
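To see why this works even though application is set later, here is a toy configuration class in plain Ruby that behaves the same way. It is a sketch of the technique under our own assumptions, not Capistrano's code; the ToyConfig class and its fetch method are hypothetical.

```ruby
# A toy configuration object: values given as blocks are evaluated on first
# access, and the result is cached for later reads.
class ToyConfig
  def initialize
    @variables = {}
  end

  def set(name, value = nil, &block)
    @variables[name] = block || value
  end

  def fetch(name)
    value = @variables.fetch(name)
    # Evaluate a block once, in the context of this object, and keep the result.
    value = @variables[name] = instance_exec(&value) if value.is_a?(Proc)
    value
  end

  # Let bare names such as `application` resolve to variables inside blocks.
  def method_missing(name, *args)
    return fetch(name) if args.empty? && @variables.key?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @variables.key?(name) || super
  end
end

config = ToyConfig.new
config.set(:deploy_to) { "/u/apps/#{application}" }  # application is not set yet
config.set :application, "flipper"
puts config.fetch(:deploy_to)  # /u/apps/flipper
```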

Roles

Roles, as we have seen, allow you to define named subsets of your production servers. You can then define tasks that are only executed on these specific subsets.

To define a new role, you use the role keyword, followed by a comma-delimited list of server names that belong in that role. Servers can be put in multiple roles (such as when you have one server that hosts everything).

Defining roles [ruby]
  role :web, "www.capistrano.com"
  role :app, "app1.capistrano.com", "app2.capistrano.com"
  role :db, "app2.capistrano.com"
  role :spare, "genghis.capistrano.com"

You can define as many servers in as many roles as you want. You can even use any name you want for the roles, but Capistrano’s standard tasks are written to look for three in particular: web, app, and db.

If the last parameter to role is a Hash, the values will be used to further specialize the servers in that list, creating (in effect) sub-roles:

Defining roles [ruby]
  role :db, "master.capistrano.com", :primary => true
  role :db, "slave.capistrano.com"

In the above example, there are two servers in the db role, so any task associated with the db role will be executed on both of them. However, one of the servers (master.capistrano.com) is also given the more specific information of :primary => true (meaning, in this case, that this server is the primary database server). Tasks may then be defined that run only on servers in the db role, and with the :primary => true setting.

Tasks

Tasks are like methods. You create them (using the task keyword), give them a name and then define what they ought to do. By default, a task is associated with all servers, unless you explicitly specify the subset of servers to be used.

A task may invoke other tasks, simply by naming them. In this sense, a task really is like a method, because it can be invoked anywhere:

Defining tasks [ruby]
  task :hello_world do
    run "echo Hello, $HOSTNAME"
  end

  task :some_task do
    puts "calling hello_world..."
    hello_world
  end

The above example creates two tasks, hello_world and some_task. Neither task specifies a role, which means that both are potentially associated with all servers. However, let’s look at what this means in practice.

If I execute the some_task task, it will print calling hello_world... to the terminal, and will then invoke hello_world. So far, so good—no servers have been touched, and all activity has been on the local host.

However, when hello_world is invoked, it calls run. All run does is attempt to execute the given command on all associated remote hosts. (We’ll talk more later about the available helper methods, of which run is the most commonly used.) This means that as soon as run is invoked, Capistrano inspects the current task and determines what roles are active, and then determines which servers those roles map to. If no connection has been made to a server yet, the connection is established and cached, and then the command is executed in parallel. This means that no connections are made to the remote hosts until they are actually needed.
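The “connect only when needed, then reuse” behaviour can be sketched in plain Ruby with a hash that builds its values on demand. This is an illustration of the idea only; ToyConnectionPool is a hypothetical class, and a string stands in for what would really be an SSH session.

```ruby
# A toy connection pool: a connection is opened the first time a server is
# needed and reused afterwards. `opened` records which servers were contacted.
class ToyConnectionPool
  attr_reader :opened

  def initialize
    @opened = []
    @connections = Hash.new do |cache, server|
      @opened << server
      cache[server] = "connection to #{server}"  # stand-in for a real SSH session
    end
  end

  # Resolve the servers for the given roles (removing duplicates) and make
  # sure each one has a connection.
  def connections_for(roles, role_map)
    servers = roles.flat_map { |role| role_map.fetch(role, []) }.uniq
    servers.map { |server| @connections[server] }
  end
end

role_map = {
  :web => ["www1.capistrano.com"],
  :app => ["app1.capistrano.com", "app2.capistrano.com"],
  :db  => ["app2.capistrano.com"]
}

pool = ToyConnectionPool.new
p pool.opened  # [] (nothing has been contacted yet)
pool.connections_for([:db, :app], role_map)
p pool.opened  # ["app2.capistrano.com", "app1.capistrano.com"]
```

Note that app2 appears in both roles but is only contacted once, and the web server is never contacted at all.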

In the above example, then, no connection is established to any server until hello_world is invoked, and then connections are made to all defined servers in all roles. If we only wanted the servers in the db and app roles to be used for that task, we could specify that:

Specifying roles [ruby]
  task :hello_world, :roles => [:db, :app] do
    run "echo Hello, $HOSTNAME"
  end

If you only want a single role to be used, you can specify it directly, without putting it in an array (i.e., :roles => :db).

As was hinted at earlier in this manual, you can also specify extra information when adding a server to a role:

Extra role information [ruby]
  role :db, "master.capistrano.com", :primary => true
  role :db, "slave.capistrano.com"

In the above example, the “master” server has the extra information :primary => true, while the “slave” server does not. Both are in the db role, but you can define a task that will only execute on the “master” server like this:

Using extra information [ruby]
  task :hello_world, :roles => :db, :only => { :primary => true } do
    run "echo Hello, $HOSTNAME"
  end

In this case, all servers in the db role, with :primary => true in their extra information hash, will be targeted for the hello_world task.
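To make the selection rules concrete, here is a small, self-contained sketch in plain Ruby. It is only a model of the behavior described above, not Capistrano's own code: the ROLES table, the servers_for helper, and the app1.capistrano.com host are made up for illustration.

```ruby
# A simplified model of role-based server selection (not Capistrano's internals).
# Each role maps to a list of [host, extra-information hash] pairs.
ROLES = {
  :db  => [["master.capistrano.com", { :primary => true }],
           ["slave.capistrano.com",  {}]],
  :app => [["app1.capistrano.com",   {}]]   # hypothetical host
}

# Hypothetical helper: returns the hosts a task would target.
# :roles may be a single symbol or an array; with no :roles, every role is used.
# :only keeps just the servers whose extra information matches every pair given.
def servers_for(options = {})
  roles = Array(options[:roles] || ROLES.keys)
  only  = options[:only] || {}

  roles.flat_map { |role| ROLES.fetch(role) }
       .select   { |_host, extra| only.all? { |key, value| extra[key] == value } }
       .map      { |host, _extra| host }
       .uniq
end

p servers_for(:roles => :db, :only => { :primary => true })
p servers_for(:roles => [:db, :app])
```

The first call selects only the master database server; the second selects every server in the db and app roles.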

It should also be mentioned that tasks have complete access to all configuration variables:

Accessing configuration variables [ruby]
  task :hello_world do
    puts "The application is #{application}."
    puts "The repository is #{repository}."
    puts "Currently using #{scm} as the source control system."
    puts "Deploying to #{deploy_to}."
    # etc.
  end

Extending Tasks

Sometimes, you want to attach some logic to an existing task, either to execute before or after the task itself. For instance, the standard setup task builds out the required directories on each of your servers, but what if you have some other specific setup tasks you’d like done at the same time?

Not a problem. Before Capistrano executes a task, it looks for any other task named before_XYZ (where XYZ is the name of the task to be executed). If it finds such a task, it executes it first. Likewise, when it finishes executing a task successfully, it will look for (and execute) after_XYZ.

So, let’s say you want to also create a shared/cache directory on each of your servers:

Defining an "after" task [ruby]
  task :after_setup, :roles => [:web, :app] do
    # shared_path is the full path to the shared directory on the server
    run "mkdir -m 777 #{shared_path}/cache"
  end

Notice that you can also specify roles and so forth for these before and after tasks, so even though setup (in this example) executes on all servers, you can have these extra tasks only run for specific roles.
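The naming convention itself can be modeled in a few lines of plain Ruby. This is a toy runner that assumes nothing about how Capistrano really stores tasks; TASKS, LOG, define_task, and execute_task are hypothetical names.

```ruby
# A toy task runner illustrating the before_XYZ / after_XYZ naming convention.
TASKS = {}
LOG   = []

def define_task(name, &block)
  TASKS[name] = block
end

# Runs before_<name> (if defined), then the task, then after_<name> (if defined).
# An exception in the task propagates, so the after hook runs only on success.
def execute_task(name)
  before = TASKS[:"before_#{name}"]
  after  = TASKS[:"after_#{name}"]

  before.call if before
  TASKS.fetch(name).call
  after.call if after
end

define_task(:setup)       { LOG << "setup: create standard directories" }
define_task(:after_setup) { LOG << "after_setup: create shared/cache" }

execute_task(:setup)
puts LOG
```

Because the hooks are found purely by name, defining after_setup is all it takes to attach extra work to setup.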

Standard Tasks

Overview

Capistrano comes with several tasks predefined, almost all of which are targeted specifically at deploying web applications. Some of the tasks are specific to deploying Rails applications (keep in mind that Capistrano was originally designed to integrate nicely with Rails).

This chapter will introduce each of the standard tasks, and describe how they can be used, configured, and (where necessary) overridden to achieve your own ends.

Note that at any time you can see what tasks are available—including both your own custom tasks and the standard ones—by running rake show_deploy_tasks. Also note that you can look at the definitions for these tasks by finding the capistrano/recipes/standard.rb file, located wherever Capistrano was installed.

Finally, as mentioned in Extending Tasks, you can add before and after hooks to any of these tasks, simply by defining a task with the same name and prepending either before_ or after_ to it.

cleanup

When you’ve deployed your application a few times, you’ll notice the releases directory tends to accumulate a lot of stuff that isn’t necessary any more. You’ll almost never roll back more than one or two releases if anything goes wrong (but if you do need to, there are more efficient ways of doing it than calling rollback over and over).

Thus, this task was introduced in version 0.10.0. The cleanup task will delete unused releases, keeping (by default) only the 5 most recent. If you would rather keep more or fewer than 5, you can set the :keep_releases variable in your recipe file.

This task runs on all roles. Also, it uses sudo to do the delete. There is not currently a way to change this, but if you need to use run instead of sudo, you can copy the task from the standard.rb to your own recipe file and tweak it as necessary.
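The effect of :keep_releases can be pictured with a short sketch. The releases_to_delete helper is hypothetical, and the sketch assumes release directories are named by timestamp, so that sorting the names also sorts them by age.

```ruby
# A sketch of choosing which releases a cleanup pass would remove.
# Returns the oldest releases beyond the newest keep_releases entries.
def releases_to_delete(releases, keep_releases = 5)
  sorted = releases.sort
  return [] if sorted.length <= keep_releases
  sorted[0...(sorted.length - keep_releases)]
end

releases = %w[
  20060301120000 20060302120000 20060303120000 20060304120000
  20060305120000 20060306120000 20060307120000
]

p releases_to_delete(releases)      # default: keep the 5 most recent
p releases_to_delete(releases, 3)   # keep only the 3 most recent
```

With seven releases on disk, the default removes the two oldest, and keeping three removes the four oldest.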

cold_deploy

The cold_deploy task is used when deploying an application for the first time. It will basically start the application’s spinner (via the spinner task) and then do a normal deploy. You’ll rarely need to use this more than once for an application.

deploy

The deploy task is intended to help you push a new release of your software into production. It updates the code on all servers (via the update_code and symlink tasks), and then restarts the FastCGI listeners on the application servers (via the restart task). If you are using a different way of running your applications (like using Apache to manage your FastCGI processes), you may need to override the restart task to meet your specific needs.

The update_code and symlink tasks are executed in a transaction, so if either of them fails your application will be left in its original state.

diff_from_last_deploy

This task simply prints the difference between what was last deployed, and what is currently in your repository. It can be useful for determining what changed since your last deploy.

disable_web

There are times when you want to temporarily disable web access to your application, such as when you are doing database maintenance, or upgrading your Ruby installation. The disable_web task may be used in this instance to put up a static maintenance page that is displayed to visitors, instead of your application.

This task assumes several things:

  • You are using Apache to front your applications.
  • Your web servers are all in the :web role.
  • There is a system symlink in your application’s public directory that points to a #{shared_path}/system directory.
  • You have a rewrite rule set up that redirects all requests to /system/maintenance.html if that file exists.

If all four of these conditions hold, all you need to do to disable web access to your application is rake remote_exec ACTION=disable_web. If any one of those conditions doesn’t hold for your environment, then you’ll need to override the entire disable_web task and script it for your specific needs.

Additionally, you can specify the UNTIL and REASON environment variables, which will be used to tailor the maintenance.html file that gets generated. UNTIL should be a time (like “10pm UTC”) or period (“this evening”, or “tomorrow morning”)—basically any phrase that can complete the phrase "back by #{time}".

The REASON environment variable may be used to specify the purpose of the downtime. By default, the word “maintenance” will be used, but any term can be used that will complete the phrase "down for #{reason}".

So, you can create a customized maintenance screen by typing:

Customized disable_web [shell]
  rake remote_exec ACTION=disable_web \
          UNTIL="tomorrow morning" \
          REASON="a vital database upgrade"

To help get you started using this task, here’s an Apache rewrite condition that looks for and displays the maintenance.html page, but only if it exists:

Apache rewrite support for disable_web [apache]
  RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
  RewriteCond %{SCRIPT_FILENAME} !maintenance.html
  RewriteRule ^.*$ /system/maintenance.html [L]

To re-enable your application, you can use the enable_web task.

enable_web

The enable_web task is the reverse of the disable_web task, and makes the same assumptions about your environment. All it does is delete the maintenance.html file in #{shared_path}/system. Assuming your Apache rewrite rules are set up right, deleting that file should be all it takes to unlock your app and let visitors in again.

invoke

For most things, you’ll want to create tasks to describe the operations you perform on your servers. However, sometimes there is just a one-off command you want to execute—updating a single file, or dumping the contents of some file. In those instances, you can use the invoke task to easily execute some arbitrary command-line on your servers.

To use it, just specify the COMMAND environment variable. To restrict the command to a specific set of roles, you can set the ROLES environment variable to a comma-delimited list of role names. (By default, the command will be executed on all roles.) Finally, if you want the command to be executed via sudo, you can set the SUDO environment variable to some non-blank value.

Using the invoke task [shell]
  rake remote_exec ACTION=invoke \
          COMMAND="svn up /u/apps/flipper/current/app/views" \
          ROLES=app
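One way to picture how the three environment variables fit together is the sketch below. parse_invoke_env is a hypothetical helper written for this illustration; it is not part of Capistrano and only mirrors the rules described above.

```ruby
# A sketch of how COMMAND, ROLES and SUDO could be interpreted.
def parse_invoke_env(env)
  command = env["COMMAND"].to_s
  raise ArgumentError, "COMMAND must be set" if command.strip.empty?

  # ROLES is a comma-delimited list of role names.
  roles = env["ROLES"].to_s.split(",").map { |name| name.strip.to_sym }
  {
    :command => command,
    :roles   => roles.empty? ? :all : roles,                   # no ROLES means every role
    :method  => env["SUDO"].to_s.strip.empty? ? :run : :sudo   # any non-blank SUDO selects sudo
  }
end

p parse_invoke_env(
  "COMMAND" => "svn up /u/apps/flipper/current/app/views",
  "ROLES"   => "app"
)
```

In a real shell session the values would come from ENV; a plain hash is used here so the sketch can be run anywhere.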

migrate

The migrate task exists to help you run ActiveRecord migrations against your production database. It assumes that:

  • your database servers are in the :db role, and
  • your primary database server has :primary => true associated with it.

The migrate task will only be executed for the :db server with :primary => true.

By default, all this task does is change to the directory of your current release (as indicated by the current symlink), and run rake RAILS_ENV=production migrate. You can specify that it should run against the latest release (regardless of what the current release is) by setting the migrate_target variable to :latest before invoking this task. Likewise, if you want to specify additional environment variables (besides RAILS_ENV) you can set the migrate_env variable to a space-delimited list of name=value pairs to use.

(Note that for long-running migrations, or those that lock particularly busy tables, you may want to run disable_web first to reduce contention for the database.)
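As a rough picture of how these settings shape the command, consider the sketch below. migrate_command is a hypothetical helper, and the release path and VERSION=12 are made-up values; the real task builds and runs its command on the primary db server.

```ruby
# A sketch of how the migrate command line could be assembled.
# directory is the release to run in; migrate_env holds extra name=value pairs.
def migrate_command(directory, migrate_env = "")
  parts = ["rake", "RAILS_ENV=production", migrate_env.strip, "migrate"]
  "cd #{directory} && " + parts.reject(&:empty?).join(" ")
end

# Default behavior: run against the current release.
puts migrate_command("/u/apps/flipper/current")

# With migrate_target set to :latest and an extra environment variable.
puts migrate_command("/u/apps/flipper/releases/20060301120000", "VERSION=12")
```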

restart

The restart task is used to restart all FastCGI listeners for your application. It simply calls the reaper command, without arguments, which falls back to the default behavior of sending the USR2 signal to all active processes for your application. (The spinner/spawner/reaper setup is described in greater detail in Chapter 3: A More Complicated Example.)

By default, sudo is used to invoke the reaper. If your reaper is running as your user and you do not need to use (or have access to) sudo, you can set the :use_sudo variable to false, so that the reaper is invoked via run instead.

The restart task is only executed on the servers in the :app role.

rollback

The rollback task will roll your application back to the previously deployed version. It does this by first calling the rollback_code task, and then invoking restart to get your FastCGI listeners looking at the right version.

This task can be a lifesaver. Unless you never make any mistakes, someday you’re bound to deploy a lemon, and you’ll be grateful on that day that you can easily and cleanly roll back to your previous version.

(Note that this only rolls back the code; it does not undo any database migrations that might have been applied by the latest deployment. If you need to roll back database migrations or other wider-ranging environment changes, you can either write your own tasks, or run the disable_web task to give you enough time to manually roll the larger changes back. Not a beautiful solution, but as Capistrano matures, so will its ability to cope with these larger issues, out of the box.)

rollback_code

The rollback_code task is primarily used as a single component of the rollback task, but it may occasionally be useful on its own. All it does is determine what the previous release was (if one exists), update the current symlink to point to that, and then delete the latest release. It affects all servers.
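Those three steps can be tried out locally against a throwaway directory. In the sketch below, rollback_code_locally is a hypothetical helper; the real task performs the equivalent steps on the remote servers, and timestamp-named release directories are assumed.

```ruby
require "fileutils"
require "tmpdir"

# A local sketch of rolling the current symlink back by one release.
def rollback_code_locally(deploy_to)
  releases_dir = File.join(deploy_to, "releases")
  releases = Dir.children(releases_dir).sort
  raise "no previous release to roll back to" if releases.length < 2

  previous = File.join(releases_dir, releases[-2])
  latest   = File.join(releases_dir, releases[-1])
  current  = File.join(deploy_to, "current")

  # Repoint the current symlink, then discard the release being rolled back.
  File.unlink(current) if File.symlink?(current)
  File.symlink(previous, current)
  FileUtils.rm_rf(latest)
  previous
end

Dir.mktmpdir do |deploy_to|
  %w[20060301120000 20060302120000].each do |name|
    FileUtils.mkdir_p(File.join(deploy_to, "releases", name))
  end
  File.symlink(File.join(deploy_to, "releases", "20060302120000"),
               File.join(deploy_to, "current"))

  rollback_code_locally(deploy_to)
  puts File.basename(File.readlink(File.join(deploy_to, "current")))
end
```

After the call, current points at the older of the two releases and the newer one is gone.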

setup

The setup task only needs to be run once, at the beginning of your application’s lifecycle (or any time a new server is added to your production environment). It is non-destructive, though, and may safely be executed against an existing production system.

It runs against all servers, and sets up the expected directory tree. Specifically, it

  • Creates the releases_path directory and chmods it to 0775.
  • Creates the shared_path directory.
  • Creates the shared_path/system directory and chmods it to 0775.
  • Creates the shared_path/log directory and chmods it to 0777.

You can define additional setup logic by creating an after_setup task, which will be called after this task.

show_tasks

The show_tasks task never does any work on any remote servers. All it does is inspect the existing tasks and display them to standard out in alphabetical order, along with their descriptions. This will include both the standard tasks (described here), as well as your own custom tasks.

The default Rails Rakefile makes it easy to execute this task:

  rake show_deploy_tasks

spinner

The spinner task may be used to start the spinner process for your application (as described in chapter 3). It assumes that you have a file script/spin in your application, that describes the process for starting the spinner.

Also, by default the spinner will be started as the app user. If you wish to start it as a different user, set the :spinner_user variable to something else. (This only works if you are using sudo to start the spinner. If you can’t use sudo, or don’t want to use sudo, set the :use_sudo variable to false, and the spinner will always be started as you.)

symlink

The symlink task simply attempts to update the current symlink to the latest deployed version of the code. You will almost never need to invoke this task directly, but it is used internally by other tasks.

update_code

The standard update_code task will deploy the latest revision of your code to all of your servers. It also does some tweaking and linking to hook up the new release to the shared directory. Specifically, this task will:

  • Check out your source code (according to your selected SCM)
  • Delete the log and public/system directories in your new release (if they exist)
  • Symlink log to #{shared_path}/log
  • Symlink public/system to #{shared_path}/system

Note that because it deletes the log and public/system directories, you ought not to store anything in those directories that you want deployed to production.

This task is frequently extended with after hooks (by creating an after_update_code task) to allow you to add application-specific deployment logic. You need to change the permissions on one of your scripts? Or update your database.yml or environment.rb file dynamically? The after_update_code task is where you’ll do it.

If update_code is run inside of a transaction and it fails for whatever reason (the checkout fails, or whatever), the new release will be deleted from the server, leaving your system in the state it was originally.
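The delete-and-symlink steps that follow the checkout can be reproduced locally. This is a sketch under stated assumptions: link_shared_directories is a hypothetical helper, the directory names are invented, and the real task runs the equivalent commands on each server.

```ruby
require "fileutils"
require "tmpdir"

# Replaces log and public/system in a release with symlinks into shared_path.
def link_shared_directories(release_path, shared_path)
  { "log" => "log", File.join("public", "system") => "system" }.each do |in_release, in_shared|
    target = File.join(release_path, in_release)
    FileUtils.rm_rf(target)                    # drop whatever the checkout created
    FileUtils.mkdir_p(File.dirname(target))    # make sure the parent directory exists
    File.symlink(File.join(shared_path, in_shared), target)
  end
end

Dir.mktmpdir do |root|
  shared_path  = File.join(root, "shared")
  release_path = File.join(root, "releases", "20060301120000")
  FileUtils.mkdir_p([File.join(shared_path, "log"),
                     File.join(shared_path, "system"),
                     File.join(release_path, "log"),
                     File.join(release_path, "public")])

  link_shared_directories(release_path, shared_path)
  puts File.symlink?(File.join(release_path, "log"))
  puts File.symlink?(File.join(release_path, "public", "system"))
end
```

Because the release's own log directory is removed before linking, anything that was checked out into it is lost, which is why those directories should not hold files you need in production.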

Creating Tasks

Overview

Creating new tasks is easy. You might even be surprised how easy it is to do fairly complex things. After all, this is all just Ruby code, and anything you can do in Ruby, you can do in a Capistrano task.

There are several methods available to tasks to make your life (and tasks) easier. This chapter will introduce each of them, and show how they can be used.

run

The run helper takes a single string identifying the command to execute. This command can be any valid shell command, or even multiple commands chained together by &&. This command (or commands) will be executed on all servers associated with the current task, in parallel. If the executed command fails (returns non-zero) on any server, run will raise an exception.

Example of using the run helper [ruby]
  run <<-CMD
    if [[ -f #{release_path}/status.txt ]]; then
      cat #{release_path}/status.txt
    fi
  CMD

Additionally, you can pass a block to run. The block will be invoked every time the command produces output (stderr or stdout). The block should accept three parameters: the channel (an object representing the underlying SSH channel being used to communicate with the server), the stream (a symbol, either :err or :out), and the data itself.

The channel object allows you to send data back to the process, on its stdin stream, by calling send_data on the channel. Also, you can access the name of the host that produced the output via channel[:host].

Example of capturing output [ruby]
  run "sudo ls -la" do |channel, stream, data|
    if data =~ /^Password:/
      logger.info "#{channel[:host]} asked for password"
      channel.send_data "mypass\n"
    end
  end

By default, the run command simply echoes all output from all hosts to the terminal.

sudo

The sudo command is exactly like the run command, except that it executes the command via sudo. This assumes that sudo is in a standard path on the remote host, and that the user you used to log into the server has permission to use sudo for the requested operation.

If a password is requested, the password used to log into the server will be used.

sudo example [ruby]
  sudo "apachectl graceful"

Just like run, sudo can take a block to process output as well.

put

The put helper lets you transfer data from the local host to a file on the remote hosts. A single call to put writes the file to all associated servers. If Net::SFTP is available, it will be used to transfer the files; otherwise a less robust method is used (piping the data to cat).

To use put, just pass two parameters: a string containing the data to transfer, and the name of the file to receive the data on each remote host. Optionally, you can also specify :mode => value to set the mode of the file. (Note, this will overwrite the file on the remote host!)

Using put [ruby]
  put(File.read('templates/database.yml'),
      "#{release_path}/config/database.yml",
      :mode => 0444)

Also note that unless Net::SFTP is available, put cannot be used to (reliably) transfer binary files.

delete

The delete command is just a convenience for executing rm via run. It just attempts to do an rm -f (note the -f! Use with caution!) on the remote server(s), for the named file. To do a recursive delete, pass :recursive => true:

Demonstrating delete [ruby]
  delete "#{release_path}/certs", :recursive => true

render

The render command is kind of an oddball, since it doesn’t change the remote servers at all. It basically just provides an interface for easily rendering ERb templates and returning the result.

So, how does this belong in something like Capistrano?

Consider the disable_web task. It dynamically generates and stores a maintenance.html file on each web server, allowing you to specify a few different components of the presentation (the reason for the downtime, and the estimated end time).

The render command makes this easy. You just have a maintenance.rhtml template that you pass to render, along with the variables you want to use in the render, and then pass the result to put.

You can use this for all sorts of things—dynamically constructing your database.yml, or customizing a script, or whatever you can think of.
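If you want to experiment with this kind of template outside of a recipe, Ruby's standard ERB library behaves much the same way. The sketch below is an approximation, not the render helper itself: render_template is a hypothetical name, the template text is invented, and result_with_hash needs Ruby 2.5 or later.

```ruby
require "erb"

# Renders an ERb template string, exposing the given hash as local variables.
def render_template(template, variables = {})
  ERB.new(template).result_with_hash(variables)
end

page = render_template(
  "Down for <%= reason %>, back by <%= deadline %>.",
  :reason => "maintenance", :deadline => "10pm UTC"
)
puts page
```

The resulting string is ordinary data, so in a recipe it could be handed straight to put.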

If you pass a string to render, it is interpreted as the name of a template file to render. The name need not be suffixed with “.rhtml”; if a file exists with the given name and “.rhtml” appended to it, that file will be used. The given file must exist relative either to the current directory, or the capistrano/recipes/templates directory (for access to standard template files).

Rendering a file [ruby]
  render "maintenance"

The above will render the file “maintenance.rhtml” (or “maintenance”, if “maintenance.rhtml” does not exist) and return the result as a string. You can also specify a hash of variables to use for the render (these will be treated as local variables within the scope of the render):

Rendering a file with variables [ruby]
  render "maintenance", :deadline => ENV['UNTIL'], :reason => ENV['REASON']

If you don’t want to render a file, but instead have a string containing an ERb template that you want to render, you can do it like this:

Rendering a string [ruby]
  render :template => "Hello <%= target %>", :target => "world"

transaction

The transaction helper lets you execute a series of other tasks with some (limited) ability to roll back their effects if any of them fail. What it really does is execute the attached block, and if an exception is raised it looks to see what tasks have been executed, and then executes the on_rollback handler (see below) for each one (if one exists).

This means that the rollback is only as accurate as the on_rollback handlers for the associated tasks. And not all tasks specify on_rollback.

Using a transaction [ruby]
  task :push_latest do
    transaction do
      update_code
      symlink
    end
  end

Of the standard tasks, the following define an on_rollback handler:

  • disable_web
  • symlink
  • update_code

on_rollback

The on_rollback helper allows a task to specify a callback to use if that task raises an exception when invoked inside of a transaction (see transaction, above). It accepts no parameters, only a block:

Specifying an on_rollback handler [ruby]
  task :update_code do
    on_rollback { delete release_path, :recursive => true }
    ...
  end

Note that the on_rollback handler is only executed when an exception is raised while the task is running inside the scope of a transaction call. If the task raises an exception when no transaction is active, the on_rollback handler is not invoked.
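The interplay between transaction and on_rollback can be modeled with a small class. This is a toy, not Capistrano's implementation: MiniActor and its methods are hypothetical, nested transactions are ignored, and running the handlers most-recent-first is an assumption of the sketch.

```ruby
# A toy model of transaction and on_rollback.
class MiniActor
  attr_reader :log

  def initialize
    @log = []
    @rollback_handlers = nil   # nil means no transaction is active
  end

  # Runs the block; if it raises, calls the registered handlers
  # (most recent first) and then re-raises the original exception.
  def transaction
    @rollback_handlers = []
    yield
  rescue StandardError
    @rollback_handlers.reverse_each(&:call)
    raise
  ensure
    @rollback_handlers = nil
  end

  # Registers a handler only while a transaction is active.
  def on_rollback(&handler)
    @rollback_handlers << handler if @rollback_handlers
  end

  def update_code
    on_rollback { @log << "delete new release" }
    @log << "check out code"
  end

  def symlink
    on_rollback { @log << "restore old symlink" }
    raise "symlink failed"
  end
end

actor = MiniActor.new
begin
  actor.transaction do
    actor.update_code
    actor.symlink
  end
rescue RuntimeError => e
  actor.log << "aborted: #{e.message}"
end
puts actor.log
```

Calling symlink on its own, with no transaction around it, raises the same exception but leaves the handlers untouched, which matches the rule described above.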

Extending Capistrano

Task Libraries

Eventually, you’re going to find yourself with a task or two that you’ve written, that you want to use in other applications. Or perhaps you showed it to a friend and they wanted to use it themselves.

Capistrano provides a way of loading “task libraries” that have been installed in the Ruby load path (such as via rubygems).

Writing a Task Library

As the author of a task library, you simply write your tasks as you normally would, but then you wrap them in a block so that Capistrano can load them into the currently executing configuration:

A task library [ruby]
  Capistrano.configuration(:must_exist).load do
    task :my_funky_task, :roles => :app do
      ...
    end

    task :another_funky_task do
      ...
    end
  end

The :must_exist parameter simply guards against your file being loaded outside of a Capistrano recipe file. If it is, an exception will be raised indicating that was the case.

Then, you package the file up (let’s call it "custom-tasks.rb") and distribute it, either via rubygems or with a setup.rb file (http://i.loveruby.net/en/projects/setup/).

Using a Task Library

Now, you (or your friend, or anybody else) can use that library simply by installing it. In your deploy.rb, you just require the file like you would any other ruby file:

Using a task library [ruby]
  require 'custom-tasks'

Doing cap show_tasks now ought to list your two custom tasks, along with all the standard ones.

Extension Libraries

Sometimes, you’ll write methods that you want multiple tasks to share. The methods themselves aren’t tasks, they are simply lower-level operations, like the run or put or delete methods that Capistrano itself provides.

Capistrano allows you to easily distribute and share libraries of these extension methods, as well as tasks. Simply put your extension methods in a module, register the module with Capistrano, and then package it up and ship it. People can then use your extension methods simply by requiring the file, the same as with task libraries.

Sample extension library [ruby]
  require 'capistrano'

  module MyReportingMethods
    def display(options={})
      ...
      run(...)
      ...
      put(...)
      ...
    end
  end

  Capistrano.plugin :report, MyReportingMethods

The last line is where your plugin is registered with Capistrano. You simply give it a name (:report, in this case) and point it at your new module.

Once a recipe file loads this extension, it can access your report’s display method via report.display(...), effectively namespacing your extension methods.

Using an extension library [ruby]
  require 'my_reporting_methods'

  task :show_general_report do
    report.display
  end

  task :show_app_report, :roles => :app do
    report.display
  end
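To see why report.display can still reach helpers like run, it may help to look at a toy model of the namespacing. Nothing here is Capistrano's own code: MiniConfiguration, its plugin method, and the uptime command are hypothetical, and the real plugin mechanism may well be built differently.

```ruby
# A toy model of registering a module under a named accessor.
class MiniConfiguration
  attr_reader :commands

  def initialize
    @commands = []
  end

  # Defines an accessor (e.g. report) that returns a proxy object carrying
  # the module's methods and forwarding run back to this configuration.
  def self.plugin(name, mod)
    define_method(name) do
      config = self
      proxy = Object.new
      proxy.extend(mod)
      proxy.define_singleton_method(:run) { |command| config.run(command) }
      proxy
    end
  end

  # Stand-in for the real run helper: just records the command.
  def run(command)
    @commands << command
  end
end

module MyReportingMethods
  def display(options = {})
    run("uptime")
  end
end

MiniConfiguration.plugin :report, MyReportingMethods

config = MiniConfiguration.new
config.report.display
p config.commands
```

The extension method lives behind the report accessor, so it cannot collide with a task or helper that happens to be called display.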
								
