CherryPy documentation
1. Application developer reference
Abstract
CherryPy lets developers use Python to develop web applications, just as they would use Python for any other type of application. Building a web application with CherryPy is very straightforward and does not require the developer to change habits, or learn many features, before being able to produce a working application. This section will review the basic components which you will use to build a CherryPy application.
1.1. Global Overview
1.1.1. Mapping URI's to handlers
CherryPy has lots of fancy features to help you manage HTTP messages. But the most fundamental thing it does is allow you to map URI's to handler functions. It does this in a very straightforward way: the path portion of a URI is hierarchical, so CherryPy uses a parallel hierarchy of objects, starting with cherrypy.root. If your application receives a request for "/admin/user?name=idunno", then CherryPy will try to find the handler: cherrypy.root.admin.user. If it exists, is callable, and has an "exposed = True" attribute, then CherryPy will hand off control to that function. Any URI parameters (like "name=idunno", above) are passed to the handler as keyword arguments.
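As a rough illustration of the mapping just described, the following sketch walks a small object tree by attribute name and passes the query parameters as keyword arguments. It does not use CherryPy itself: the resolve function and the plain root object are stand-ins invented for this example, and CherryPy's real lookup also handles index and default methods.

```python
from urllib.parse import urlsplit, parse_qsl


class Admin:
    def user(self, name=""):
        return "You asked for user '%s'" % name
    user.exposed = True


class Root:
    pass


def resolve(root, uri):
    """Walk the object tree along the URI path and call the exposed handler."""
    parts = urlsplit(uri)
    node = root
    for name in parts.path.strip("/").split("/"):
        node = getattr(node, name, None)
        if node is None:
            raise LookupError("no handler for %r" % parts.path)
    # The handler must be callable and carry exposed = True.
    if not (callable(node) and getattr(node, "exposed", False)):
        raise LookupError("handler for %r is not exposed" % parts.path)
    # Query parameters become keyword arguments.
    return node(**dict(parse_qsl(parts.query)))


root = Root()          # stands in for cherrypy.root
root.admin = Admin()
print(resolve(root, "/admin/user?name=idunno"))
```

A path that ends on an object which is not callable, such as "/admin" here, raises LookupError in this sketch.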
1.1.1.1. Index methods
There are some special cases, however. To what handler should we map a path like "/admin/search/"? Note the trailing slash after "search": it indicates that our path has three components: "admin", "search", and "". Static webservers interpret this to mean that the search object is a directory, and, since the third component is blank, they use an index.html file if it exists. CherryPy is a dynamic webserver, so it allows you to specify an index method to handle this. In our example, CherryPy will look for a handler at cherrypy.root.admin.search.index. Let's pause and show our example application so far:
Example 3.1. Sample application (handler mapping example)
import cherrypy

class Root:
    def index(self):
        return "Hello, world!"
    index.exposed = True

class Admin:
    def user(self, name=""):
        return "You asked for user '%s'" % name
    user.exposed = True

class Search:
    def index(self):
        return search_page()
    index.exposed = True

cherrypy.root = Root()
cherrypy.root.admin = Admin()
cherrypy.root.admin.search = Search()
So far, we have three exposed handlers:
- root.index. This will be called for the URI's "/" and "/index".
- root.admin.user. This will be called for the URI "/admin/user".
- root.admin.search.index. This will be called for the URI's "/admin/search/" and "/admin/search".
Yes, you read that third line correctly: root.admin.search.index will be called whether or not the URI has a trailing slash. Actually, that isn't quite true; CherryPy will answer a request for "/admin/search" (without the slash) with an HTTP Redirect response. Most browsers will then request "/admin/search/" as the redirection suggests, and then our root.admin.search.index handler will be called. But the final outcome is the same.
1.1.1.2. Positional Parameters
Now, let's consider another special case. What if, instead of passing a user name as a parameter, we wish to use a user id as part of the path? What to do with a URI like "/admin/user/8173/schedule"? This is intended to reference the schedule belonging to "user #8173", but we certainly don't want to have a separate function for each user id!
CherryPy allows you to map a single handler to multiple URI's with the simple approach of not writing handlers you don't need. If a node in the cherrypy.root tree doesn't have any children, that node will be called for all of its child paths, and CherryPy will pass the leftover path info as positional arguments. In our example, CherryPy will call cherrypy.root.admin.user("8173", "schedule"). Let's rewrite our user method to handle such requests:
Example 3.2. A user method which handles positional parameters
class Admin:
    def user(self, *args):
        if not args:
            raise cherrypy.HTTPError(400, "A user id was expected but not supplied.")
        # args arrives as a tuple, so copy it to a list before popping
        args = list(args)
        id = args.pop(0)
        if args and args[0] == 'schedule':
            return self.schedule(id)
        return "You asked for user '%s'" % id
    user.exposed = True
Note that this is different behavior than CherryPy 2.1, which only allowed positional params for methods named "default".
1.1.1.3. Default methods
Are you ready for another special case? What handler is called in our example if you request the URI "/not/a/valid/path"? Given the behavior we have described up to this point, you might deduce that the root.index method will end up handling any path that can't be mapped elsewhere. This would mean, in effect, that CherryPy applications with a root.index could never return a "404 Not Found" response!
To prevent this, CherryPy doesn't try to call index methods unless they are attached to the last node in the path; in our example, the only index method that might be called would be a root.not.a.valid.path.index method. If you truly want an intermediate index method to receive positional parameters, well, you can't do that. But what you can do is define a default method to do that for you, instead of an index method. If we wanted our cherrypy.root to handle any child path, and receive positional parameters, we could rewrite it like this:
Example 3.3. A default method example
class Root:
    def index(self):
        return "Hello, world!"
    index.exposed = True

    def default(self, *args):
        return "Extra path info: %s" % repr(args)
    default.exposed = True
This new Root class would handle the URI's "/" and "/index" via the index method, and would handle URI's like "/not/a/valid/path" and "/admin/unknown" via the default method.
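The three rules introduced so far (index methods, positional parameters and default methods) can be combined into one lookup. The sketch below is a simplified stand-in, not CherryPy's own code: dispatch and is_handler are hypothetical names, and the search order (descend as far as the object tree allows, then back up, trying a default method before the node itself) is one reading of the behaviour described in this chapter.

```python
class Search:
    def index(self):
        return "search page"
    index.exposed = True


class Admin:
    def user(self, *args):
        return "user args: %r" % (args,)
    user.exposed = True


class Root:
    def index(self):
        return "Hello, world!"
    index.exposed = True

    def default(self, *args):
        return "Extra path info: %r" % (args,)
    default.exposed = True


def is_handler(obj):
    return callable(obj) and getattr(obj, "exposed", False)


def dispatch(root, path):
    names = [n for n in path.split("/") if n]
    # "index" is only ever tried as the component after the last real one.
    candidates = names + ["index"]
    nodes = [root]
    for name in candidates:
        child = getattr(nodes[-1], name, None)
        if child is None:
            break
        nodes.append(child)
    # Work backwards from the deepest object that was found.
    for depth in range(len(nodes) - 1, -1, -1):
        node = nodes[depth]
        leftover = names[depth:]          # unused path components
        default = getattr(node, "default", None)
        if is_handler(default):
            return default(*leftover)
        if is_handler(node):
            return node(*leftover)
    raise LookupError("404 Not Found: %s" % path)


root = Root()                             # stands in for cherrypy.root
root.admin = Admin()
root.admin.search = Search()

for uri in ("/", "/admin/user/8192/schedule", "/admin/search/", "/admin/unknown"):
    print(uri, "->", dispatch(root, uri))
```

Because "index" is appended only once, root.index is never considered for a path such as "/not/a/valid/path"; that request falls through to root.default instead.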
1.1.1.4. Traversal examples
For those of you who need to see in exactly what order CherryPy will try various handlers, here are some examples, using the application above. We always start by trying to find the longest object path first, and then work backwards until an exposed, callable handler is found:
Example 3.4. Traversal examples
"/admin/user/8192/schedule"
Trying to reach cherrypy.root.admin.user.8192.schedule.index...
    cherrypy.root exists? Yes.
    .root.admin exists? Yes.
    .admin.user exists? Yes.
    .user.8192 exists? No.
    .user.default is callable and exposed? No.
    .admin.user is callable and exposed? Yes. Call it.

"/admin/search/"
Trying to reach cherrypy.root.admin.search.index...
    cherrypy.root exists? Yes.
    .root.admin exists? Yes.
    .admin.search exists? Yes.
    .search.index exists? Yes.
    Path exhausted.
    .search.index is callable and exposed? Yes. Call it.

"/admin/unknown"
Trying to reach cherrypy.root.admin.unknown.index...
    cherrypy.root exists? Yes.
    .root.admin exists? Yes.
    .admin.unknown exists? No.
    .admin.default is callable and exposed? No.
    .root.admin is callable and exposed? No.
    .root.default is callable and exposed? Yes. Call it.
1.2. Filters
Filters are one of the most important features of CherryPy. The CherryPy core can call user-defined functions at specific points during request processing; a filter is a class which defines those functions. Filters are designed to be called at a low level—the HTTP request/response level—and therefore should only be used in that context.
CherryPy comes with a set of built-in filters, but they're turned off by default. To enable them, you must use the configuration system as follows:
- First you must decide where to enable the filter. CherryPy maintains a tree of published objects; you must decide which branch should use the filter. The filter will then apply to that branch and all its children in the tree. Remember that the tree is accessed as a path and then mapped internally by the core to match the correct exposed object.
- Second, in the config file you must turn the filter on like this:
filterName.on = True
Example 3.5. Turning on a default filter
[/entries/view]
tidy_filter.on = True
tidy_filter.tmp_dir = "/tmp"
tidy_filter.strict_xml = True
The first two lines tell the core to use the tidy filter whenever the path /entries/view (or one of its sub-paths) is requested. The last two lines define some parameters used by the filter.
CherryPy lets you write your own filters, as we will see in the developer reference chapter. However, the way to use them is different from the default filters. You do not declare custom filters within the configuration file; instead, use the _cp_filters attribute in your source code:
Example 3.6. Using a non default filter
import cherrypy
from myfiltermodule import MyFilterClass

class Entry:
    _cp_filters = [MyFilterClass()]

    def view(self, id):
        # do stuff...
        pass
    view.exposed = True

class Root:
    pass

cherrypy.root = Root()
cherrypy.root.entries = Entry()
cherrypy.server.start()
As all objects below cherrypy.root.entries will inherit the filter, there is no need to re-specify it in each _cp_filters underneath.
Keep in mind that the user-defined filters are called in the order you add them to the list.
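To make the ordering and inheritance concrete, here is a small simulation that runs without CherryPy. TagFilter and collect_filters are hypothetical names; the before_finalize hook is used only for illustration and receives the response as an argument, which is a simplification of how real filters work. Collecting filters from the root downwards is also an assumption of this sketch.

```python
class TagFilter:
    """Stand-in filter that appends a tag to the response body."""

    def __init__(self, tag):
        self.tag = tag

    def before_finalize(self, response):
        # Simplification: real filters act on the current response object
        # rather than receiving it as an argument.
        response["body"] += self.tag


class Entry:
    # Filters run in the order they appear in this list.
    _cp_filters = [TagFilter("[first]"), TagFilter("[second]")]


class Root:
    pass


def collect_filters(root, path):
    """Gather _cp_filters from every object on the path, root first."""
    filters = list(getattr(root, "_cp_filters", []))
    node = root
    for name in [n for n in path.split("/") if n]:
        node = getattr(node, name, None)
        if node is None:
            break
        filters.extend(getattr(node, "_cp_filters", []))
    return filters


root = Root()
root.entries = Entry()

response = {"body": "page"}
for f in collect_filters(root, "/entries/view"):
    f.before_finalize(response)
print(response["body"])
```

Any path below /entries picks up both filters, in list order, without declaring them again.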
1.3. Configuration system
The CherryPy configuration system provides fine-grained control over how each part of the application should react. You will use it for two reasons:
Web server settings
Enabling filters per path
You will be able to declare the configuration settings either from a file or from a Python dictionary.
First of all, let's see how a typical configuration file is defined.
Example 3.7. Configuration file
# The configuration file called myconfigfile.conf
[global]
server.socket_port=8080
server.socket_host=""
server.socket_file=""
server.socket_queue_size=5
server.protocol_version="HTTP/1.0"
server.log_to_screen=True
server.log_file=""
server.reverse_dns=False
server.thread_pool=10
server.environment="development"
[/service/xmlrpc]
xmlrpc_filter.on = True
[/admin]
session_authenticate_filter.on=True
[/css/default.css]
static_filter.on = True
static_filter.file = "data/css/default.css"
# From your script...
cherrypy.config.update(file="myconfigfile.conf")
The settings can also be defined using a python dictionary instead of a file as follows:
Example 3.8. Configuration dictionary
settings = {
    'global': {
        'server.socket_port': 8080,
        'server.socket_host': "",
        'server.socket_file': "",
        'server.socket_queue_size': 5,
        'server.protocol_version': "HTTP/1.0",
        'server.log_to_screen': True,
        'server.log_file': "",
        'server.reverse_dns': False,
        'server.thread_pool': 10,
        'server.environment': "development"
    },
    '/service/xmlrpc': {
        'xmlrpc_filter.on': True
    },
    '/admin': {
        'session_authenticate_filter.on': True
    },
    '/css/default.css': {
        'static_filter.on': True,
        'static_filter.file': "data/css/default.css"
    }
}
cherrypy.config.update(settings)
1.3.1. Configuration Sections
Each section of the configuration refers to an object path; the object path is used to look up the correct handler for each Request-URI. Therefore when the server receives a Request-URI of /css/default.css, the static filter will handle the request, and the server will actually return the physical file at data/css/default.css. Since the path /service/xmlrpc has the XML-RPC filter enabled, all the exposed methods of the object cherrypy.root.service.xmlrpc will be treated as XML-RPC methods.
The global entry represents settings which apply outside the request process, including server settings such as the port, the protocol version to use by default, the number of threads to start with the server, etc. This is not the same as the root entry [/], which maps to cherrypy.root.
By default, URI's and object paths are equivalent; however, filters may rewrite the objectPath to produce a different mapping between URI's and handlers. This is necessary, for example, when mounting applications at virtual roots (e.g. serving the object path /welcome at the URI "/users/~rdelon/welcome").
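Since a section applies to its path and to everything beneath it, a lookup has to walk from the requested path up towards the root. The following is a minimal sketch of that idea on a plain dictionary; lookup is a hypothetical helper, and falling back to the global section at the end is an assumption of this example rather than documented behaviour.

```python
settings = {
    "global": {"server.socket_port": 8080},
    "/admin": {"session_authenticate_filter.on": True},
    "/css/default.css": {"static_filter.on": True},
}


def lookup(settings, path, key, default=None):
    """Return key for path, trying each parent path, then 'global'."""
    path = path.rstrip("/") or "/"
    while True:
        section = settings.get(path, {})
        if key in section:
            return section[key]
        if path == "/":
            break
        # Drop the last path component: "/admin/user" -> "/admin" -> "/"
        path = path.rsplit("/", 1)[0] or "/"
    return settings.get("global", {}).get(key, default)


print(lookup(settings, "/admin/user/8173", "session_authenticate_filter.on"))
print(lookup(settings, "/css", "static_filter.on", False))
```

The first call finds the value set on /admin; the second finds nothing on /css or its parents and returns the supplied default.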
1.3.2. Configuration Entries
All values in the configuration file must be valid Python values. Strings must be quoted, booleans must be True or False, etc.
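The practical effect of this rule can be seen by parsing a config file with the standard library and evaluating each value as a Python literal. This is only a sketch of the idea, assuming configparser and ast.literal_eval as stand-ins; parse_config is a hypothetical helper and CherryPy's own parser may behave differently in detail.

```python
import ast
import configparser

CONFIG_TEXT = '''
[global]
server.socket_port = 8080
server.log_to_screen = True
server.environment = "development"

[/css/default.css]
static_filter.on = True
static_filter.file = "data/css/default.css"
'''


def parse_config(text):
    """Parse INI-style text and evaluate every value as a Python literal."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str          # keep option names exactly as written
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        result[section] = {
            key: ast.literal_eval(value)
            for key, value in parser.items(section)
        }
    return result


config = parse_config(CONFIG_TEXT)
print(config["global"])
```

An unquoted string such as `server.environment = development` is not a valid Python literal, so literal_eval raises ValueError for it in this sketch.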
1.3.2.1. server.environment
The server.environment entry controls how CherryPy should run. Three values are built in:
- development
    - log_debug_info_filter is enabled
    - HTTPErrors (and therefore the default _cp_on_error) display tracebacks in the browser if errors occur
    - autoreload is enabled
    - NotFound errors (404) are listed in the error.log
- production
    - log_debug_info_filter is disabled
    - tracebacks are logged, but are not displayed in the browser
    - autoreload is disabled
    - NotFound errors (404) aren't listed in the error log
- staging (same as production for the moment)
Beginning in CherryPy 2.2, the behavior of each environment is defined in cherrypy.config.environments, a dict whose keys are "development", "production", etc., and whose values are dicts of config keys and values. Application developers are free to modify existing environments, or define new environments for use by their deployers, by modifying this container. For example, if you develop an application which absolutely cannot handle autoreload, your app can set cherrypy.config.environments['development']['autoreload.on'] = False. Deployers who selected the "development" environment would then be free from the danger of autoreload interacting with your application. Another example of using config.environments directly might be an application which needs a "development" and "production" environment, but also separate "beta", "rc", "live data" and/or "testing" environments.
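The two uses mentioned above (changing an existing environment and adding a new one) can be sketched on an ordinary dictionary. The environments dict below is a stand-in for cherrypy.config.environments, not the real object; only the 'autoreload.on' key comes from the text, and the rule that explicit settings override environment defaults is an assumption of this example.

```python
# Stand-in for cherrypy.config.environments.
environments = {
    "development": {"autoreload.on": True},
    "production": {"autoreload.on": False},
}

# An application that cannot cope with autoreload changes the default.
environments["development"]["autoreload.on"] = False

# A new environment for deployers, starting from the production defaults.
environments["beta"] = dict(environments["production"])


def effective_settings(global_settings):
    """Environment defaults first, then explicit settings on top (assumed)."""
    defaults = environments[global_settings["server.environment"]]
    merged = dict(defaults)
    merged.update(global_settings)
    return merged


print(effective_settings({"server.environment": "development"}))
print(effective_settings({"server.environment": "beta", "autoreload.on": True}))
```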
1.4. Session Management
Abstract
CherryPy 2.1 includes a powerful sessions system provided via a new session_filter.
1.4.1. Using Sessions
First you need to enable the session filter through the configuration system, by setting session_filter.on to True. This gives you a variable called cherrypy.session, which is a dictionary-like object where you can read/store your session data. This dictionary always has a special key called _id which contains the session id.
Here is sample code showing how to implement a simple counter using sessions:
Example 3.9. Basic example of session usage
import cherrypy

class Root:
    def index(self):
        count = cherrypy.session.get('count', 0) + 1
        cherrypy.session['count'] = count
        return 'Counter: %s' % count
    index.exposed = True

cherrypy.config.update({'session_filter.on': True})
cherrypy.root = Root()
cherrypy.server.start()
1.4.2. Configuring sessions
The following configuration options are available for "session_filter":
- session_filter.on: True or False (default). Enables or disables sessions.
- session_filter.storage_type: Specifies which storage type should be used for storing session data on the server. Built-in types are Ram (default), File and PostgreSQL (see Section 1.4.3, “Choosing the backend” for more info).
- session_filter.storage_path: Specifies the directory in which CherryPy puts the session files when session_filter.storage_type is set to File.
- session_filter.timeout: The number of minutes of inactivity before an individual session can be removed. It can be a float (e.g. 0.5 for 30 seconds). Defaults to 60.
- session_filter.clean_up_delay: Once in a while the server cleans up old/expired sessions. This config option specifies how often this clean-up process should happen. The delay is in minutes. Defaults to 5.
- session_filter.cookie_name: The name of the cookie that CherryPy will use to store the session ID. Defaults to sessionID.
- session_filter.get_db: See the PostgreSQL backend from Section 1.4.3, “Choosing the backend”.
- session_filter.deadlock_timeout: See Section 1.4.5, “Handling concurrent requests for the same session data”.
- session_filter.on_create_session: See Section 1.4.6, “Being notified when sessions are created/deleted”.
- session_filter.on_delete_session: See Section 1.4.6, “Being notified when sessions are created/deleted”.
- session_filter.storage_class: See Section 1.4.4, “Writing your own custom backend”.
1.4.3. Choosing the backend
CherryPy comes with multiple built-in backends for storing session data on the server side. They are:
- Ram: All data is stored in RAM; this is the fastest storage, but it means that the data will be lost if you restart the server, and it also means that it won't scale to multiple processes/machines.
- File: All data is stored on disk; this is a bit slower than Ram storage, but the data will persist if you restart the server. It also means that data can be shared amongst multiple CherryPy processes, either on the same machine, or on multiple machines if all machines have access to the same disk (for example, via NFS).
- PostgreSQL: This backend is included with CherryPy to show how easy it is to implement your own custom backend for the session system. All data is stored in a PostgreSQL database; storing your data in a database is the recommended setup for production if you have a very high traffic website and you need to scale your site across multiple machines. To use this backend, you'll need to create the following table in your PostgreSQL database:
create table session (
    id varchar(40),
    data text,
    expiration_time timestamp
)
You also need to programmatically set the session_filter.get_db config option to a function that returns a DB connection. Note that you should use the psycopg2 module.
Note that when using the Ram backend, the session data is saved as soon as you stick it in cherrypy.session. So even if an error occurs later on in the page handler the data is still saved; this is not the case for the other backends.
1.4.4. Writing your own custom backend
By default, CherryPy comes with 3 built-in backends, but if you have specific needs, it is very easy to implement your own custom backend (for instance, another database, or an XML-RPC server, ...). To do so, all you have to do is write a class that implements the following methods:
class MyCustomBackend:
    def save(self, id, data, expirationTime):
        """ Save the session data and expirationTime for that session id """

    def load(self, id):
        """ Load the session data and expirationTime for 'id' and return
        a tuple (data, expirationTime) (even if the session is
        expired). Return None if id doesn't exist. """

    def clean_up(self):
        """ Delete expired session data from storage and call
        'on_delete_session' for each deleted session id """
Note that if you want to use explicit locking (see Section 1.4.5, “Handling concurrent requests for the same session data”), you also have to implement two extra methods: acquire_lock and release_lock.
Once you have written this class, you have to programmatically set the session_filter.storage_class config option to this class.
If you need help in writing your own custom backend, it is a good idea to look at how the current ones (ram, file and postgresql) are implemented. They are implemented in the file cherrypy/lib/filter/sessionfilter.py.
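As a starting point, here is a minimal in-memory backend that follows the three-method interface shown above. It is a sketch under several assumptions: DictBackend is a hypothetical name, expiration times are taken to be plain numbers (seconds since the epoch), the on_delete_session callback is passed to the constructor and receives the session data, and clean_up accepts an optional now argument purely so the example can be exercised deterministically.

```python
import time


class DictBackend:
    """In-memory stand-in for a custom session backend."""

    def __init__(self, on_delete_session=None):
        self._store = {}
        self._on_delete_session = on_delete_session

    def save(self, id, data, expirationTime):
        self._store[id] = (data, expirationTime)

    def load(self, id):
        # Returns (data, expirationTime), or None for an unknown id.
        return self._store.get(id)

    def clean_up(self, now=None):
        now = time.time() if now is None else now
        # Iterate over a copy so entries can be deleted while looping.
        for id, (data, expirationTime) in list(self._store.items()):
            if expirationTime < now:
                del self._store[id]
                if self._on_delete_session is not None:
                    self._on_delete_session(data)


deleted = []
backend = DictBackend(on_delete_session=deleted.append)
backend.save("a", {"count": 1}, 100)
backend.save("b", {"count": 2}, 300)
backend.clean_up(now=200)
print(backend.load("a"), backend.load("b"), deleted)
```

After the clean-up at time 200, session "a" has expired and been reported to the callback, while session "b" is still stored.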
1.4.5. Handling concurrent requests for the same session data
It is normally quite rare to have two simultaneous requests with the same session ID. It means that the same browser is making two requests to your server at the same time (to dynamic pages; static data like images don't have sessions). However, this case can happen (if you're using frames, for instance), and it will happen more and more often as more and more people start using Ajax.
In that case, we need to make sure that access to the session data is serialized. This way, threads can't both modify the data at the same time and leave it in an inconsistent state.
You can easily make CherryPy serialize access to the session data by setting the session_filter.locking config option to implicit (the default is explicit, which means that CherryPy won't do any locking for you). In the implicit mode, if a browser makes a second request while a first request is still being handled by the server, the second request will block until the first request is finished; only then will it be able to access the session data.
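The effect of serializing access can be demonstrated with the standard threading module. This sketch does not use CherryPy; it keeps one lock per session id (session_locks and handle_request are hypothetical names) so that concurrent read-modify-write cycles on the same session cannot interleave.

```python
import threading
from collections import defaultdict

session_locks = defaultdict(threading.Lock)   # one lock per session id
registry_lock = threading.Lock()              # protects session_locks itself
sessions = {"abc123": {"count": 0}}


def handle_request(session_id):
    with registry_lock:
        lock = session_locks[session_id]
    # A second request for the same session blocks here until the
    # first one leaves the "with" block.
    with lock:
        data = sessions[session_id]
        current = data["count"]
        data["count"] = current + 1


threads = [threading.Thread(target=handle_request, args=("abc123",))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sessions["abc123"]["count"])
```

Requests for different session ids use different locks, so they do not block one another.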
1.4.6. Being notified when sessions are created/deleted
It is possible to configure the session_filter so that it calls some special callback functions from your code when sessions are being created/deleted. To do so you have to set the session_filter.on_create_session and session_filter.on_delete_session config options. When a session is created/deleted, CherryPy will call these functions and pass them the session data.
1.5. Templating language independent
CherryPy is a low-level framework for building web applications, and thus does not offer high-level features such as an integrated templating system. This is quite a different point of view from many other web frameworks. CherryPy does not force you to use a specific templating language; instead, it allows you to plug in your favourite one as you see fit.
CherryPy works with all the main templating systems:
- Cheetah
- XSLT
- CherryTemplate
- HTMLTemplate
- Kid
- Zope Page Template
You will find recipes on how to use them on the CherryPy website.
1.6. Static content handling
Static content is now handled by a filter called "static_filter" that can easily be enabled and configured in your config file. For instance, if you wanted to serve /style.css from /home/site/style.css and /static/* from /home/site/static/*, you can use the following configuration:
Example 3.10. Static filter configuration
[global]
static_filter.root = "/home/site"
[/style.css]
static_filter.on = True
static_filter.file = "style.css"
[/static]
static_filter.on = True
static_filter.dir = "static"
The static_filter.root entry can be either absolute or relative. If absolute, static content is sought within that absolute path. Since CherryPy cannot guess where your application root is located, relative paths are assumed to be relative to the directory where your cherrypy.root class is defined (if you do not provide a root, it defaults to "", and therefore to the directory of your cherrypy.root class).
As an application developer, the design of your application affects whether you choose to use absolute or relative paths. If you are creating a one-off application that will only be deployed once, you might as well use absolute paths. But you can make multiple deployments easier by using relative paths, letting CherryPy calculate the absolute path each time for you. Absolute paths, however, give deployers the ability to place static content on read-only filesystems, or on faster disks.
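The absolute-versus-relative rule boils down to ordinary path joining. The sketch below shows it with posixpath; resolve_static and the /srv/app directory are hypothetical, with app_dir standing in for the directory where the cherrypy.root class is defined.

```python
import posixpath


def resolve_static(app_dir, root, entry):
    """Combine the application directory, static_filter.root and a
    static_filter.file or static_filter.dir entry into one path."""
    # An absolute root replaces app_dir; a relative or empty root extends it.
    base = posixpath.join(app_dir, root)
    return posixpath.normpath(posixpath.join(base, entry))


print(resolve_static("/srv/app", "/home/site", "style.css"))
print(resolve_static("/srv/app", "", "style.css"))
print(resolve_static("/srv/app", "data", "static"))
```

With an absolute root the application directory plays no part, which is what lets deployers move static content to another disk without touching the code.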
1.7. File upload
Before version 2.1, CherryPy handled file uploads by reading the entire file into memory, storing it in a string, and passing it to the page handler method. This worked well for small files, but not so well for large files.
CherryPy 2.1 uses the Python cgi module to parse the POST data. When a file is being uploaded, the cgi module stores it in a temp file and returns a FieldStorage instance which contains information about this file. CherryPy then passes this FieldStorage instance to the method. The FieldStorage instance has the following attributes:
- file: the file(-like) object from which you can read the data
- filename: the client-side filename
- type: the content-type of the file
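A handler that receives such an object can read the upload in chunks instead of loading it all at once. The sketch below relies only on the three attributes listed above; FakeUpload is a hypothetical stand-in for the FieldStorage instance so that the example runs on its own, and the parameter name myFile is arbitrary.

```python
import io


class FakeUpload:
    """Stand-in exposing the three attributes described above."""

    def __init__(self, filename, content_type, data):
        self.filename = filename
        self.type = content_type
        self.file = io.BytesIO(data)


def upload(myFile, chunk_size=8192):
    size = 0
    while True:
        # Read a bounded chunk so large files never sit in memory whole.
        chunk = myFile.file.read(chunk_size)
        if not chunk:
            break
        size += len(chunk)
    return "Received %s (%s), %d bytes" % (myFile.filename, myFile.type, size)
upload.exposed = True


print(upload(FakeUpload("notes.txt", "text/plain", b"x" * 20000)))
```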
1.8. Exceptions and Error Handling
As you read this section, refer to the following diagram to understand the flow of execution:
Figure 3.1. Error flow execution
1.8.1. Unanticipated exceptions
When an unhandled exception is raised inside CherryPy, four actions occur (in order):
- before_error_response filter methods are called
- a _cp_on_error method is called
- response.finalize is called
- after_error_response filter methods are called
The error response filter methods are defined by each filter; they cannot prevent the call to _cp_on_error (unless before_error_response raises an exception, including HTTPRedirect).
The _cp_on_error function is a CherryPy "special attribute"; that is, you can define your own _cp_on_error method for any branch in your cherrypy.root object tree, and it will be invoked for all child handlers. For example:
Example 3.11. A custom _cp_on_error method
import cherrypy

class Root:
    def _cp_on_error(self):
        cherrypy.response.body = ("We apologise for the fault in the website. "
                                  "Those responsible have been sacked.")

    def index(self):
        # Adding an int to a str raises TypeError, which triggers _cp_on_error.
        return "A m" + 00 + "se once bit my sister..."
    index.exposed = True
The default _cp_on_error function simply responds as if an HTTPError 500 had been raised (see the next section).
If an HTTPRedirect is raised during the error-handling process, it will be handled appropriately. If any other kind of error occurs during the handling of an initial error, then CherryPy punts, returning a bare-bones, text/plain error response (containing both tracebacks if server.show_tracebacks is True).
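How a "special attribute" is found for a given request can be pictured as a walk down the object tree that remembers the deepest definition seen so far. The following is an illustrative sketch only: find_special is a hypothetical helper, and the handlers return strings instead of setting cherrypy.response.body so that the example runs without CherryPy.

```python
class Admin:
    def _cp_on_error(self):
        return "admin error page"

    def user(self):
        return "user page"
    user.exposed = True


class Blog:
    def index(self):
        return "blog"
    index.exposed = True


class Root:
    def _cp_on_error(self):
        return "site-wide error page"


def find_special(root, path, name):
    """Return the deepest definition of a special attribute on the path."""
    found = getattr(root, name, None)
    node = root
    for part in [p for p in path.split("/") if p]:
        node = getattr(node, part, None)
        if node is None:
            break
        # Keep the previous match when this node does not define the name.
        found = getattr(node, name, found)
    return found


root = Root()
root.admin = Admin()
root.blog = Blog()

print(find_special(root, "/admin/user", "_cp_on_error")())
print(find_special(root, "/blog/index", "_cp_on_error")())
```

The /admin branch uses its own handler, while /blog falls back to the one defined on the root.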
1.8.2. HTTPError
HTTPError exceptions do not result in calls to _cp_on_error. Instead, they have their own _cp_on_http_error function. Like _cp_on_error, this is a "special attribute" and can be overridden by cherrypy.root objects. The default _cp_on_http_error handler sets the HTTP response to a pretty HTML error page.
1.8.3. HTTPRedirect
HTTPRedirect exceptions are not errors; therefore, there is no way to override their behavior. They set the response to an appropriate status, header set, and body, according to the HTTP spec.
2. Administrator reference
2.1. Install a CherryPy application
2.2. Config options reference
2.2.1. List of core (ie: not for filters) config options:
- [global] server.socket_port: port number where the server is listening (defaults to 8080).
- [global] server.log_file: path to a file to log CherryPy server activity. Items logged include startup config info, tracebacks and HTTP requests. It is disabled by default and everything is logged to the screen.
- [global] server.log_access_file: path to a file where access log data will be stored in Common Log Format. The default is to write access log data to the screen. If a file is specified, the access log data is no longer written to the screen.
- [global] server.log_to_screen: controls whether any log data is written to the screen. It defaults to on (True). For performance reasons, it is best to have this option turned off on a production server.
- [global] server.log_tracebacks: controls whether or not tracebacks are written to the log (screen or otherwise). Defaults to on (True). If set to False, only a 500 return code will be logged in the access log.
- [global] server.max_request_header_size: maximum acceptable size of a request header, in bytes (defaults to 500KB). If a longer request arrives, the server will interrupt it and return a 413 error. This setting is global (i.e. doesn't depend on the path). Set it to zero to remove the limit.
- [global] server.default_content_type: default content type to be used for all responses (defaults to text/html). This setting is global (i.e. doesn't depend on the path).
- [/path] server.max_request_body_size: maximum acceptable size of a request body, in bytes (defaults to 100MB). If a longer request body arrives, the server will interrupt it and return a 413 error. This setting can be configured per path. This is useful to limit the size of uploaded files. Set it to zero to remove the limit.
- TODO: other config options
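The per-path body limit can be illustrated with a small dictionary-based check. This is a sketch, not CherryPy's implementation: status_for is a hypothetical helper, the /upload and /bulk paths are invented for the example, the lookup matches exact section names only, and 100MB is taken to mean 100 * 1024 * 1024 bytes.

```python
DEFAULT_LIMIT = 100 * 1024 * 1024   # assumed reading of the 100MB default

settings = {
    "/upload": {"server.max_request_body_size": 10 * 1024 * 1024},
    "/bulk": {"server.max_request_body_size": 0},   # zero removes the limit
}


def status_for(path, content_length):
    """Return 413 when the body exceeds the limit configured for path."""
    section = settings.get(path, {})
    limit = section.get("server.max_request_body_size", DEFAULT_LIMIT)
    if limit and content_length > limit:
        return 413
    return 200


print(status_for("/upload", 11 * 1024 * 1024))
print(status_for("/bulk", 10 ** 12))
```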
2.3. Configure an application
2.4. Production Setup
2.4.1. Quick overview
2.4.2. Servers
2.4.2.1. Built in server
2.4.2.2. Behind Apache
2.4.2.3. Built in server
2.4.2.4. FastCGI
2.4.2.5. mod_python
3. CherryPy framework developer reference
3.1. Detailed overview of CherryPy
3.1.1. The HTTP conversation (request/response process)
CherryPy is designed to be deployed in a variety of environments, and therefore has a number of layers involved in handling an HTTP request.
Figure 3.2. The HTTP conversation
3.2. Design choices
3.2.1. A layered API
3.2.1.1. Simple apps should not require any knowledge of HTTP
At its most basic, CherryPy is designed to allow the production of simple websites without having to think about any of the details of HTTP. Notice we're saying HTTP (the transport), not HTML (the markup language)! In particular, developers should not have to concern themselves with:
Responding to unpublished requests
Logging and notifying users appropriately when unhandled exceptions occur
The difference between query strings and POSTed params
The decoding and unpacking of request headers and bodies, including file uploads
Response status or headers
For the most part, simple "page handlers" (functions attached to cherrypy.root) should never have to refer to cherrypy at all! They receive params via function arguments, and return content directly. Advanced functionality is most often enabled via the built-in filters, which encapsulate the particulars of HTTP, and can be completely controlled via the config file.
3.2.1.2. Advanced apps should have full control over (valid) HTTP output
Simple apps are produced simply, but when a developer needs to step out of the mundane and provide real value, they should be able to leverage the complete power and flexibility of the HTTP specification. In general, the HTTP request and response messages are completely represented in the cherrypy.request and .response objects. At the lowest level, a developer should be able to generate any valid HTTP response message by modifying cherrypy.response.status, .headers, and/or .body.
3.2.1.2.1. How CherryPy relates to REST (REpresentational State Transfer)
The design of HTTP itself is guided by REST, a set of principles which constrain its expressivity and therefore its implementation. HTTP is a transfer protocol which enables the exchange of representations of resources. In a RESTful design, clients never expect to access a resource directly; instead, they request a representation of that resource. For example, if a resource has both an XML and an HTML representation, then an HTTP/1.1 server might be expected to inspect the Accept request header in order to decide which representation to serve in response.
It's important to clarify some terminology, here. In REST terms, a "resource" is "any concept that might be the target of an author’s hypertext reference...a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time". A resource is not the request, nor the response, in an HTTP conversation. "The resource is not the storage object. The resource is not a mechanism that the server uses to handle the storage object. The resource is a conceptual mapping — the server receives the identifier (which identifies the mapping) and applies it to its current mapping implementation (usually a combination of collection-specific deep tree traversal and/or hash tables) to find the currently responsible handler implementation and the handler implementation then selects the appropriate action+response based on the request content."
CherryPy, therefore, does not provide REST resources, nor model them, nor serve them. Instead, it provides mappings between identifiers (URI's) and handlers (functions). It allows application developers to model resources, perhaps, but it only serves representations of resources.
By default, these identifier-to-handler mappings (which we will call "handler dispatch" from now on) follow a simple pattern: since the path portion of a URI is hierarchical, CherryPy arranges handlers in a similar hierarchy, starting at cherrypy.root, and branching on each attribute; every leaf node in this tree must be "exposed" (but the branches need not be, see section 2.2). Note in particular that, although the query portion of a Request-URI is part of the resource identifier, CherryPy does not use it to map identifiers to handlers. Application developers may use the query string to further identify the requested resource, of course, but CherryPy, not having any domain-specific knowledge about the format or semantics of a query string, doesn't try to guess.
Filters, then, are CherryPy's way to wrap or circumvent the default handler dispatch. EncodingFilter, for example, wraps the response from a handler, encoding the response body as it is produced. StaticFilter, on the other hand, intercepts some requests (based on the path portion of the Request-URI) and implements its own identifier-to-handler mapping. Developers who wish to provide their own handler dispatch mechanisms are encouraged to do so via a filter.
3.3. API reference
3.3.1. cherrypy.thread_data
This object holds attributes that are visible to the current thread only.
3.3.2. cherrypy.request
The cherrypy.request object contains request-related objects. Pretty lame description, but that's all it does; it's a big data dump. At the beginning of each HTTP request, the existing request object is destroyed and a new one is created (one request object for each thread). Therefore, CherryPy (and you yourself) can stick data into cherrypy.request and not worry about it conflicting with other requests.
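Why per-thread objects avoid conflicts can be shown with the standard library's threading.local. This sketch assumes thread-local storage as the mechanism; the request name and the handle function are stand-ins, not CherryPy objects.

```python
import threading

request = threading.local()    # stand-in for cherrypy.request

def handle(path, results):
    request.path = path        # visible only to the thread that set it
    results[threading.current_thread().name] = request.path

results = {}
threads = [
    threading.Thread(target=handle, args=("/page%d" % i, results), name="t%d" % i)
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread sees only the value it stored itself, and the main thread, which never set request.path, sees no such attribute at all.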
3.3.2.1. cherrypy.request.remoteAddr
This attribute is a string containing the IP address of the client. It will be an empty string if it is not available.
3.3.2.2. cherrypy.request.remotePort
This attribute is an int containing the TCP port number of the client. It will be -1 if it is not available.
3.3.2.3. cherrypy.request.remoteHost
This attribute is a string containing the remote hostname of the client.
3.3.2.4. cherrypy.request.headers
This attribute is a dictionary containing the received HTTP headers, with automatically titled keys (e.g., "Content-Type"). As it's a dictionary, no duplicates are allowed.
3.3.2.5. cherrypy.request.header_list
This attribute is a list of (header, value) tuples containing the received HTTP headers. In general, you probably want to use headers instead; this is only here in case you need to inspect duplicates in the request headers.
3.3.2.6. cherrypy.request.requestLine
This attribute is a string containing the first line of the raw HTTP request; for example, "GET /path/page HTTP/1.1".
3.3.2.7. cherrypy.request.simpleCookie
This attribute is a SimpleCookie instance from the standard library's Cookie module which contains the incoming cookie values from the client.
3.3.2.8. cherrypy.request.rfile
This attribute is the input stream from the client, if applicable. See cherrypy.request.processRequestBody for more information.
3.3.2.9. cherrypy.request.body
This attribute is the request entity body, if applicable. See cherrypy.request.processRequestBody for more information.
3.3.2.10. cherrypy.request.processRequestBody
This attribute specifies whether or not the request's body (request.rfile, which is POST or PUT data) will be handled by CherryPy. If True (the default for POST and PUT requests), then request.rfile will be consumed by CherryPy (and unreadable after that). If the request Content-Type is "application/x-www-form-urlencoded", then the rfile will be parsed and placed into request.params; otherwise, it will be available in request.body. If cherrypy.request.processRequestBody is False, then the rfile is not consumed, but will be readable by the exposed method.
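The branching described above can be sketched as follows. The function process_body and its arguments are hypothetical; it only mirrors the three cases in the text (form data goes to params, other content goes to the body, and nothing is consumed when processing is switched off).

```python
import io
from urllib.parse import parse_qs

def process_body(rfile, content_type, params, process_request_body=True):
    """Return the request body, or None if there is none to expose."""
    if not process_request_body:
        return None                        # rfile is left for the handler to read
    raw = rfile.read()                     # rfile is consumed from here on
    if content_type == "application/x-www-form-urlencoded":
        for key, values in parse_qs(raw.decode("ascii")).items():
            params[key] = values[0] if len(values) == 1 else values
        return None
    return raw

form_params = {}
form_body = process_body(io.BytesIO(b"name=idunno&tag=a&tag=b"),
                         "application/x-www-form-urlencoded", form_params)

xml_params = {}
xml_body = process_body(io.BytesIO(b"<doc/>"), "text/xml", xml_params)

untouched = io.BytesIO(b"<doc/>")
skipped = process_body(untouched, "text/xml", {}, process_request_body=False)
```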
3.3.2.11. cherrypy.request.method
This attribute is a string containing the HTTP request method, such as GET or POST.
3.3.2.12. cherrypy.request.protocol
This attribute is a string containing the HTTP protocol of the request in the form of HTTP/x.x.
3.3.2.13. cherrypy.request.version
This attribute is a Version object which represents the HTTP protocol. It's the same as request.protocol, but allows easy comparisons like if cherrypy.request.version >= "1.1": do_http_1_1_thing.
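A version object is useful because plain strings compare character by character. The class below is a hypothetical re-creation written for illustration; it is not CherryPy's Version class and may differ from it in detail.

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, text):
        # "1.1" becomes (1, 1), so comparisons are numeric, not lexical.
        self.parts = tuple(int(p) for p in str(text).split("."))

    @staticmethod
    def _parts(other):
        if isinstance(other, Version):
            return other.parts
        return Version(other).parts

    def __eq__(self, other):
        return self.parts == self._parts(other)

    def __lt__(self, other):
        return self.parts < self._parts(other)

version = Version("1.1")
supports_http_1_1 = version >= "1.1"
```

Accepting a bare string on the right-hand side is what makes the comparison in the text read naturally.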
3.3.2.14. cherrypy.request.queryString
This attribute is a string containing the query string of the request (the part of the URL following '?').
3.3.2.15. cherrypy.request.path
This attribute is a string containing the path of the resource the client requested.
3.3.2.16. cherrypy.request.params
This attribute is a dictionary containing the query string and POST arguments of this request.
3.3.2.17. cherrypy.request.base
This attribute is a string containing the root URL of the server. By default, it is equal to request.scheme://request.headers['Host'].
3.3.2.18. cherrypy.request.browser_url
This attribute is a string containing the URL the client requested. By default, it is equal to request.base + request.path, plus the querystring, if provided.
3.3.2.19. cherrypy.request.objectPath
This attribute is a string containing the path of the exposed method that will be called to handle this request. This is usually the same as cherrypy.request.path, but can be changed in a filter to change which method is actually called.
3.3.2.20. cherrypy.request.originalPath
This attribute is a string containing the original value of cherrypy.request.path, in case it is modified by a filter during the request.
3.3.2.21. cherrypy.request.originalParamMap
This attribute is a dictionary containing the original value of cherrypy.request.params, in case it is modified by a filter during the request.
3.3.2.22. cherrypy.request.scheme
This attribute is a string containing the URL scheme used in this request. It is either "http" or "https".
3.3.3. cherrypy.response
The cherrypy.response object contains response-related objects. Pretty lame description, but that's all it does; it's a big data dump. At the beginning of each HTTP request, the existing response object is destroyed and a new one is created (one response object for each thread). Therefore, CherryPy (and you yourself) can stick data into cherrypy.response and not worry about it conflicting with other requests.
3.3.3.1. cherrypy.response.headers
This attribute is a dictionary with automatically titled keys (e.g., "Content-Length"). It holds all outgoing HTTP headers to the client.
3.3.3.2. cherrypy.response.header_list
This attribute is a list of (header, value) tuples. It's not available until the response has been finalized; it's really only there for the extremely rare cases when you need duplicate response headers. In general, you should use response.headers instead.
3.3.3.3. cherrypy.response.simpleCookie
This attribute is a SimpleCookie instance from the standard library's Cookie module. It contains the outgoing cookie values.
3.3.3.4. cherrypy.response.body
This attribute is originally just the return value of the exposed method, but by the end of the request it must be an iterable (usually a list or generator of strings) which will be the content of the HTTP response.
3.3.3.5. cherrypy.response.status
This attribute is a string containing the HTTP response code in the form "### Reason-Phrase", e.g. "200 OK". You may also set it to an int, in which case the response finalization process will supply a Reason-Phrase for you.
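Supplying a Reason-Phrase for an int status can be sketched with the table of standard phrases in Python's http.client module. The function finalize_status and the "Unknown" fallback are assumptions made for this example, not CherryPy behaviour.

```python
from http.client import responses

def finalize_status(status):
    """Turn an int such as 404 into a full status string; pass strings through."""
    if isinstance(status, int):
        return "%d %s" % (status, responses.get(status, "Unknown"))
    return status

not_found = finalize_status(404)
already_complete = finalize_status("200 OK")
```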
3.3.3.6. cherrypy.response.version
This attribute is a Version object, representing the HTTP protocol version of the response. This is not necessarily the value that will be written in the response! Instead, it should be used to determine which features are available for the response. For example, an HTTP server may send an HTTP/1.1 response even though the client is known to only understand HTTP/1.0—the response.version will be set to Version("1.0") to inform you of this, so that you (and CherryPy) can restrict the response to HTTP/1.0 features only.
3.3.4. cherrypy.server
3.3.4.1. cherrypy.server.start(initOnly=False, serverClass=_missing)
Start the CherryPy Server. Simple websites may call this without any arguments, to run the default server. If initOnly is False (the default), this function will block until KeyboardInterrupt or SystemExit is raised, so that the process will persist. When using one of the built-in HTTP servers, you should leave this set to False. You should only set it to True if you're running CherryPy as an extension to another HTTP server (for example, when using Apache and mod_python with CherryPy), in which case the foreign HTTP server should do its own process-management.
Use the serverClass argument to specify that you wish to use an HTTP server other than the default, built-in WSGIServer. If missing, config.get("server.class") will be checked for an alternate value; otherwise, the default is used. Possible alternate values (you may pass the class names as a string if you wish):
cherrypy._cphttpserver.CherryHTTPServer: this will load the old, single-threaded built-in HTTP server. This server is deprecated and will probably be removed in CherryPy 2.2.
cherrypy._cphttpserver.PooledThreadServer: this will load the old, multi-threaded built-in HTTP server. This server is deprecated and will probably be removed in CherryPy 2.2.
cherrypy._cphttpserver.embedded_server: use this to automatically select between the CherryHTTPServer and the PooledThreadServer based on the value of config.get("server.thread_pool") and config.get("server.socket_file").
None: this will not load any HTTP server. Note that this is not the default; the default (if serverClass is not given) is to load the WSGIServer.
Any other class (or dotted-name string): load a custom HTTP server.
You must call this function from Python's main thread, and set initOnly to False, if you want CherryPy to shut down when KeyboardInterrupt or SystemExit is raised (including Ctrl-C). The only time you might want to do otherwise is if you run CherryPy as a Windows service, or as an extension to, say, mod_python, and even then, you might want to anyway.
3.3.4.2. cherrypy.server.blocking
If the "initOnly" argument to server.start is True, this will be False, and vice-versa.
3.3.4.3. cherrypy.server.httpserverclass
Whatever HTTP server class is set in server.start will be stuck in here.
3.3.4.4. cherrypy.server.httpserver
Whatever HTTP server class is set in server.start will be instantiated and stuck in here.
3.3.4.5. cherrypy.server.state
One of three values, indicating the state of the server:
STOPPED = 0: The server hasn't been started, and will not accept requests.
STARTING = None: The server is in the process of starting, or an error occurred while trying to start the server.
STARTED = 1: The server has started (including an HTTP server if requested), and is ready to receive requests.
3.3.4.6. cherrypy.server.ready
True if the server is ready to receive requests, False otherwise. Read-only.
3.3.4.7. cherrypy.server.wait()
Since server.start usually blocks, other threads need to be started before calling server.start; however, they often must wait for server.start to complete its setup of the HTTP server. Use this function from other threads to make them wait for the HTTP server to be ready to receive requests.
3.3.4.8. cherrypy.server.start_with_callback(func, args=(), kwargs={}, serverClass=_missing)
Since server.start usually blocks, use this to easily run another function in a new thread. It starts the new thread and then runs server.start. The new thread automatically waits for the server to finish its startup procedure.
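The coordination between server.start, server.wait, and start_with_callback can be sketched with a threading.Event. Everything here is a stand-in written for illustration: the real server.start normally blocks, whereas this start returns at once, and the Event-based design is an assumption rather than a description of CherryPy's internals.

```python
import threading

ready = threading.Event()      # set once the stand-in server has finished starting
log = []

def start():
    """Stand-in for server.start: finish setup, then announce readiness."""
    log.append("server ready")
    ready.set()

def wait():
    """Stand-in for server.wait: block the calling thread until ready."""
    ready.wait()

def start_with_callback(func, args=(), kwargs=None):
    def runner():
        wait()                             # do nothing until startup is done
        func(*args, **(kwargs or {}))
    worker = threading.Thread(target=runner)
    worker.start()                         # start the new thread first...
    start()                                # ...then start the server
    return worker

worker = start_with_callback(log.append, ("callback ran",))
worker.join()
```

The callback thread is created before the server starts, yet its work always runs after startup has completed.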
3.3.4.9. cherrypy.server.stop()
Stop the CherryPy Server. Well, "suspend" might be a better term—this doesn't terminate the process.
3.3.4.10. cherrypy.server.interrupt
Usually None, set this to KeyboardInterrupt() or SystemExit() to shut down the entire process. That is, the new exception will be raised in the main thread.
3.3.4.11. cherrypy.server.restart()
Restart the CherryPy Server.
3.3.4.12. cherrypy.server.on_start_server_list
A list of functions that will be called when the server starts.
3.3.4.13. cherrypy.server.on_stop_server_list
A list of functions that will be called when the server stops.
3.3.4.14. cherrypy.server.on_start_thread_list
A list of functions that will be called when each request thread is started. Note that such threads do not need to be started or controlled by CherryPy; for example, when using CherryPy with mod_python, Apache will start and stop the request threads. Nevertheless, CherryPy will run the on_start_thread_list functions upon the first request using each distinct thread.
3.3.4.15. cherrypy.server.on_stop_thread_list
A list of functions that will be called when each request thread is stopped.
3.3.4.16. cherrypy.server.request()
HTTP servers should call this function to create a new Request and Response object. The return value is the Request object; call its run method to have the CherryPy core process the request data and populate the response.
3.3.5. cherrypy.config
3.3.5.1. cherrypy.config.get(key, defaultValue = None, returnSection = False)
This function returns the configuration value for the given key. The function checks if the setting is defined for the current request path; it walks up the request path until the key is found, or it returns the default value. If returnSection is True, the function returns the configuration path where the key is defined instead.
3.3.5.2. cherrypy.config.getAll(key)
The getAll function returns a list containing a (path, value) tuple for all occurrences of the key within the request path. This function allows applications to inherit configuration data defined for parent paths.
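The lookup rules for config.get and config.getAll can be sketched as a walk from the request path up through its parents. The names config_map, walk_up, get, and get_all are hypothetical, the request path is passed explicitly instead of being read from the current request, and both the final "global" fallback and the deepest-first ordering of get_all are assumptions of this example.

```python
config_map = {
    "global": {"server.thread_pool": 10},
    "/": {"color": "blue"},
    "/admin": {"color": "red"},
}

def walk_up(path):
    """Yield the path, then each parent path, then "/" and finally "global"."""
    path = path.rstrip("/") or "/"
    while True:
        yield path
        if path == "/":
            break
        path = path[:path.rfind("/")] or "/"
    yield "global"

def get(key, path, default=None, return_section=False):
    for section in walk_up(path):
        if key in config_map.get(section, {}):
            return section if return_section else config_map[section][key]
    return default

def get_all(key, path):
    return [(section, config_map[section][key])
            for section in walk_up(path)
            if key in config_map.get(section, {})]

nearest = get("color", "/admin/search")
section = get("color", "/admin/search", return_section=True)
inherited = get_all("color", "/admin/search")
```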
3.3.5.3. cherrypy.config.update(updateMap=None, file=None)
Function to update the configuration map. The "updateMap" argument is a dictionary of the form {'sectionPath' : { } }. The "file" argument is the path to the configuration file.
3.3.5.4. cherrypy.config.environments
Dict containing config defaults for each named server.environment.
3.3.6. cherrypy exceptions
3.3.6.1. cherrypy.HTTPError
This exception can be used to automatically send a response using an HTTP status code, with an appropriate error page.
3.3.6.1.1. cherrypy.NotFound
This exception is raised when CherryPy is unable to map a requested path to an internal method. It's a subclass of HTTPError (404).
3.3.6.2. cherrypy.HTTPRedirect
This exception will force an HTTP redirect.
3.3.6.3. cherrypy.InternalRedirect
This exception will redirect processing to another path within the site (without informing the client). Provide the new path as an argument when raising the exception. You may also provide a second "params" argument which will replace the current request params (usually a dict, but you may also supply a GET-param-style string). This exception is only handled from within page handlers and before_main filter methods.
3.3.7. The CherryPy library
3.3.7.1. cherrypy.lib.cptools
3.3.7.1.1. ExposeItems
Utility class that exposes a getitem-aware object. It does not provide index() or default() methods, and it does not expose the individual item objects - just the list or dict that contains them. User-specific index() and default() methods can be implemented by inheriting from this class.
3.3.7.1.2. PositionalParametersAware
Utility class that restores positional parameters functionality that was found in 2.0.0-beta.
3.3.7.1.3. getAccept(headername)
Returns a list of AcceptValue objects from the specified Accept-* header (or None if the header is not present). The list is sorted so that the most-preferred values are first in the list.
Each AcceptValue object has a value attribute, a string which is the value itself. For example, if headername is "Accept-Encoding", the value attribute might be "gzip". It also has a (read-only) qvalue attribute, a float between 0 and 1 which specifies the client's preference for the value; higher numbers are preferred. Finally, each AcceptValue also has a params attribute, a dict; for most headers, this dict will only possess the original "q" value as a string.
If headername is "Accept" (the default), then the params attribute may contain extra parameters which further differentiate the value. In addition, params["q"] may itself be an AcceptValue object, with its own params dict. Don't ask us why; ask the authors of the HTTP spec.
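The sorting by preference can be sketched with a small parser for simple Accept-* headers such as Accept-Encoding. The function parse_accept is hypothetical: it takes the raw header value as an argument, returns plain (value, qvalue, params) tuples instead of AcceptValue objects, and does not handle the nested parameters of the "Accept" header.

```python
def parse_accept(header_value):
    """Return (value, qvalue, params) tuples, most preferred first."""
    if header_value is None:
        return None
    result = []
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        value, _, rest = item.partition(";")
        params = {}
        for param in rest.split(";"):
            key, sep, val = param.partition("=")
            if sep:
                params[key.strip()] = val.strip()
        # A missing q parameter means full preference (1.0).
        qvalue = float(params.get("q", "1"))
        result.append((value.strip(), qvalue, params))
    # list.sort is stable, so equal qvalues keep their header order.
    result.sort(key=lambda entry: entry[1], reverse=True)
    return result

accepted = parse_accept("gzip;q=0.5, identity, deflate;q=0.8")
preferred_order = [value for value, qvalue, params in accepted]
```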
3.3.7.1.4. getRanges(content_length)
Returns a list of (start, stop) indices from a Range request header. Returns None if no such header is provided in the request. Each (start, stop) tuple will be composed of two ints, which are suitable for use in a slicing operation. That is, the header "Range: bytes=3-6", if applied against a Python string, is requesting resource[3:7]. This function will return the list [(3, 7)].
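The conversion from inclusive byte positions to slice indices can be sketched as below. The function parse_ranges is hypothetical and takes the header value as an explicit argument; the handling of open-ended ("5-") and suffix ("-3") ranges follows the HTTP byte-range syntax and is an assumption about what such a helper would do, not a statement about getRanges itself.

```python
def parse_ranges(header_value, content_length):
    """Convert "bytes=first-last" specs into (start, stop) slice indices."""
    if not header_value:
        return None
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        return None
    result = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        if first:
            start = int(first)
            # The header's last position is inclusive; a slice stop is exclusive.
            end = int(last) if last else content_length - 1
            result.append((start, min(end, content_length - 1) + 1))
        else:
            # Suffix form "-N": the final N bytes of the resource.
            count = int(last)
            result.append((max(content_length - count, 0), content_length))
    return result

resource = "0123456789"
ranges = parse_ranges("bytes=3-6", len(resource))
start, stop = ranges[0]
chunk = resource[start:stop]
```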
3.3.7.1.5. headers
A subclass of Python's builtin dict class; CherryPy's default request.headers and response.headers objects are instances of this class. The keys are automatically titled (str(key).title()) in order to provide case-insensitive comparisons and avoid duplicates.
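The effect of titling keys can be sketched with a small dict subclass. TitledDict is a hypothetical name and covers only a few dict methods; it is meant to show the idea, not to reproduce the cptools class.

```python
class TitledDict(dict):
    """A dict whose keys are normalised with str(key).title()."""

    def __setitem__(self, key, value):
        dict.__setitem__(self, str(key).title(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, str(key).title())

    def __contains__(self, key):
        return dict.__contains__(self, str(key).title())

    def get(self, key, default=None):
        return dict.get(self, str(key).title(), default)

headers = TitledDict()
headers["content-type"] = "text/html"
headers["CONTENT-TYPE"] = "text/plain"     # same key after titling: overwrites
```

Because both spellings collapse to "Content-Type", the second assignment replaces the first instead of creating a duplicate.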
3.3.7.1.6. parseRequestLine(requestLine)
Returns (method, path, querystring, protocol) from an HTTP requestLine. The default Request processor calls this function.
3.3.7.1.7. parseQueryString(queryString, keep_blank_values=True)
Returns a dict of {'key': 'value'} pairs from an HTTP "key=value" query string. Also handles server-side image map query strings. The default Request processor calls this function.
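What these two parsing helpers produce can be sketched with the standard library. The functions below are hypothetical re-creations built on urllib.parse; they ignore server-side image map query strings, and collapsing single-item lists to a plain string is an assumption of this example.

```python
from urllib.parse import parse_qs

def parse_request_line(request_line):
    """Split "GET /path?query HTTP/1.1" into its four parts."""
    method, uri, protocol = request_line.strip().split(" ", 2)
    path, _, query_string = uri.partition("?")
    return method, path, query_string, protocol

def parse_query_string(query_string, keep_blank_values=True):
    parsed = parse_qs(query_string, keep_blank_values=keep_blank_values)
    # Repeated keys stay lists; single occurrences become plain strings.
    return {key: (vals[0] if len(vals) == 1 else vals)
            for key, vals in parsed.items()}

method, path, query_string, protocol = parse_request_line(
    "GET /path/page?name=idunno&tag=a&tag=b&empty= HTTP/1.1")
params = parse_query_string(query_string)
```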
3.3.7.1.8. paramsFromCGIForm(form)
Returns a dict of {'key': 'value'} pairs from a cgi.FieldStorage object. The default Request processor calls this function.
3.3.7.1.9. serveFile(path, contentType=None, disposition=None, name=None)
Set status, headers, and body in order to serve the file at the given path. The Content-Type header will be set to the contentType arg, if provided. If not provided, the Content-Type will be guessed by the extension of the file. If disposition is not None, the Content-Disposition header will be set to "<disposition>; filename=<name>". If name is None, it will be set to the basename of path. If disposition is None, no Content-Disposition header will be written.
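The header rules above can be sketched as a function that only computes the headers, without touching the response or reading the file. The name file_headers is hypothetical, and the "text/plain" fallback for unknown extensions is an assumption of this example.

```python
import mimetypes
import os

def file_headers(path, content_type=None, disposition=None, name=None):
    headers = {}
    if content_type is None:
        # Guess from the file extension when no type is given.
        content_type = mimetypes.guess_type(path)[0] or "text/plain"
    headers["Content-Type"] = content_type
    if disposition is not None:
        if name is None:
            name = os.path.basename(path)
        headers["Content-Disposition"] = "%s; filename=%s" % (disposition, name)
    return headers

download = file_headers("/srv/docs/report.html", disposition="attachment")
inline = file_headers("/srv/docs/data.bin", content_type="application/octet-stream")
```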
3.3.7.2. cherrypy.lib.covercp
This module both provides code-coverage tools, and may also be run as a script. To use this module, or the coverage tools in the test suite, you need to download 'coverage.py', either Gareth Rees' original implementation or Ned Batchelder's enhanced version.
Set cherrypy.codecoverage to True to turn on coverage tracing. Then, use the covercp.serve() function to browse the results in a web browser. If you run this module as a script (i.e., from the command line), it will call serve() for you.
3.3.7.3. cherrypy.lib.profiler
You can profile any of your page handlers (exposed methods) as follows:
Example 3.12. Profiling example
import cherrypy
from cherrypy.lib import profiler

class Root:
    # One Profiler shared by all requests; profile data is written to this directory.
    p = profiler.Profiler("/path/to/profile/dir")

    def index(self):
        self.p.run(self._index)
    index.exposed = True

    def _index(self):
        return "Hello, world!"

cherrypy.root = Root()
Set the config entry: "profiling.on = True" if you'd rather turn on profiling for all requests. Then, use the serve() function to browse the results in a web browser. If you run this module as a script (i.e., from the command line), it will call serve() for you.
Developers: this module should be used whenever you make significant changes to CherryPy, to get a quick sanity-check on the performance of the request process. Basic requests should complete in about 5 milliseconds on a reasonably-fast machine running Python 2.4 (Python 2.3 will be much slower due to threadlocal being implemented in Python, not C). You can profile the test suite by supplying the --profile option to test.py.
3.3.7.4. cherrypy.lib.autoreload
This module provides a brute-force method of reloading application files on the fly. When the config entry "autoreload.on" is True (or when "server.environment" is "development"), CherryPy uses the autoreload module to restart the current process whenever one of the files in use is changed. The mechanism by which it does so is pretty complicated:
Figure 3.3. The autoreload process
3.3.8. Special functions and attributes
3.3.8.1. _cp_on_error
_cp_on_error is a function for handling unanticipated exceptions, whether raised by CherryPy itself, or in user applications. The default simply responds as if HTTPError(500) had been raised.
3.3.8.2. _cp_on_http_error
_cp_on_http_error handles HTTPError responses, setting cherrypy.response.status, headers, and body.
3.3.8.3. _cp_filters
User defined filters are enabled using the class attribute _cp_filters. Any filter instances placed in _cp_filters will be applied to all methods of the class.
3.3.8.4. _cp_log_access
Function to log HTTP requests into the access.log file.
3.3.8.5. _cp_log_message
Function to log errors into the error.log file. The cherrypy.log function is syntactic sugar for this one.
3.3.9. Filter API
CherryPy provides a set of hooks which are called at specific places during the request process. A filter should inherit from the BaseFilter class and implement the hooks it requires to add extra code during the process. CherryPy will go through all the filters which are on (built-in and user defined) for the requested path and call all hooks that are implemented by each filter.
3.3.9.1. on_start_resource
This hook is called right at the beginning of the request process. The only work CherryPy has done when this hook is called is to parse the first line of the HTTP request. This is needed so that filters have access to the object path translated from the path specified in the HTTP request.
This hook is always called.
3.3.9.2. before_request_body
This hook is called right after CherryPy has parsed the HTTP request headers but before it tries to parse the request body. If a filter which implements this hook sets cherrypy.request.processRequestBody to False, CherryPy will not parse the request body at all. This can be handy when you know your user agent sends the data in a form that the default CherryPy request body parsing function cannot understand.
For example, if your user agent sends a request body that is an unquoted XML string, you may want a filter to parse that string and generate an XML DOM instance. The filter could then add that instance to cherrypy.request.params, which in turn would be passed to your page handler as if it had been sent that way in the HTTP request. The filter thus turns the XML string into an XML DOM instance transparently. In that case you do not want CherryPy to parse the request body. This hook can also be used to scan the request body before it is processed any further and to reject it if needed.
This hook is not called if an error occurs earlier in the process.
3.3.9.3. before_main
This hook is called right before CherryPy calls your page handler (exposed callable). It is handy if, after considering the HTTP request headers or body, you decide not to call the page handler at all; in that case, set cherrypy.request.executeMain to False.
This hook is not called if an error occurs earlier in the process.
3.3.9.4. before_finalize
This hook is called right after the page handler has been processed (depending on the before_main hook behavior) and before CherryPy formats the final response object. It lets you, for example, inspect what your page handler returned and change some headers if needed.
This hook is not called if an error occurs earlier in the process.
3.3.9.5. on_end_resource
This hook is called at the end of the process so that you can finely tweak your HTTP response if needed (e.g., adding headers to cherrypy.response.header_list). Note that cherrypy.response.headers will not be processed any longer at that stage.
This hook is always called.
3.3.9.6. before_error_response
This hook is called when an error has occurred during request processing. It allows you to call code before the _cp_on_error handler and the response finalizing stage are run.
3.3.9.7. after_error_response
This hook is called when an error has occurred during request processing. It allows you to call code after the _cp_on_error handler and the response finalizing stage have run.
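The order in which the hooks fire can be sketched with a toy request loop. The classes and run_request below are stand-ins written for this example (the real BaseFilter lives in CherryPy and its request loop does much more); the fixed error string takes the place of whatever _cp_on_error would produce.

```python
HOOKS = ("on_start_resource", "before_request_body", "before_main",
         "before_finalize", "on_end_resource",
         "before_error_response", "after_error_response")

class BaseFilter:
    """Stand-in base class: every hook exists and does nothing by default."""

for _name in HOOKS:
    setattr(BaseFilter, _name, lambda self: None)

def _recording_hook(name):
    def hook(self):
        self.trace.append(name)
    return hook

class Recorder(BaseFilter):
    """Filter that records the name of every hook it receives."""
    def __init__(self):
        self.trace = []

for _name in HOOKS:
    setattr(Recorder, _name, _recording_hook(_name))

def run_request(filters, handler):
    def call(hook):
        for f in filters:
            getattr(f, hook)()
    try:
        try:
            call("on_start_resource")
            call("before_request_body")
            call("before_main")
            body = handler()
            call("before_finalize")
        except Exception:
            call("before_error_response")
            body = "500 Internal Server Error"   # stand-in for _cp_on_error
            call("after_error_response")
    finally:
        call("on_end_resource")                  # always called
    return body

def broken_handler():
    raise ValueError("boom")

ok, failed = Recorder(), Recorder()
ok_body = run_request([ok], lambda: "Hello, world!")
failed_body = run_request([failed], broken_handler)
```

In the failing run, before_finalize is skipped because the error happened before it, while on_end_resource still fires.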
3.4. Filters explained
Filters provide a powerful mechanism for extending CherryPy. The aim is to provide code called at the HTTP request level itself. More specifically it means that you can write code that will be called:
- before a request is processed
- after a request has been processed
- before a response is sent to the client
- after a response is sent to the client
3.4.1. Builtin Filters
3.4.1.1. baseurlfilter
The baseurlfilter changes the base url of a request. It is useful for running CherryPy behind Apache with mod_rewrite.
The baseurlfilter has the following configuration options:
base_url_filter.base_url
base_url_filter.use_x_forwarded_host
3.4.1.2. cachefilter
The cachefilter stores responses in memory. If an identical request is subsequently made, then the cached response is output without calling the page handler.
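The idea of answering a repeated request from memory can be sketched as follows. MemoryCache is a hypothetical class, and keying the cache on the (path, query string) pair is an assumption about what makes two requests "identical"; the real filter may use other criteria and also has to deal with expiry.

```python
class MemoryCache:
    def __init__(self):
        self.store = {}

    def respond(self, key, handler):
        # Call the page handler only the first time a key is seen.
        if key not in self.store:
            self.store[key] = handler()
        return self.store[key]

calls = []

def page():
    calls.append(1)            # count how often the handler really runs
    return "expensive page"

cache = MemoryCache()
first = cache.respond(("/report", "year=2005"), page)
second = cache.respond(("/report", "year=2005"), page)
```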
3.4.1.3. decodingfilter
The decoding filter can be configured to automatically decode incoming requests.
The decodingfilter has the following configuration options:
decoding_filter.encoding
3.4.1.4. encodingfilter
The encodingfilter can be configured to automatically encode outgoing responses.
The encodingfilter has the following configuration options:
encoding_filter.encoding: Force all text responses to be encoded with this encoding.
encoding_filter.default_encoding: Default all text responses to this encoding (if the user-agent does not request otherwise).
3.4.1.5. gzipfilter
The gzipfilter will automatically gzip outgoing responses, if gzip is supported by the client.
The gzipfilter does not have any configuration options.
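Compressing only when the client supports it can be sketched with the standard gzip module. The function maybe_gzip is hypothetical; it checks the Accept-Encoding value in the simplest possible way and ignores q-values, which a real implementation would have to honour.

```python
import gzip

def maybe_gzip(body, accept_encoding):
    """Return (body, extra_headers), compressing if the client lists gzip."""
    encodings = [item.split(";")[0].strip() for item in accept_encoding.split(",")]
    if "gzip" in encodings:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

page = b"Hello, world! " * 100
zipped, zipped_headers = maybe_gzip(page, "gzip, deflate")
plain, plain_headers = maybe_gzip(page, "identity")
```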
3.4.1.6. logdebuginfofilter
The logdebuginfofilter adds debug information to each page. The filter is automatically turned on when "server.environment" is set to "development".
The logdebuginfofilter has the following configuration options:
log_debug_info_filter.mime_types, ['text/html']
log_debug_info_filter.log_as_comment, False
log_debug_info_filter.log_build_time, True
log_debug_info_filter.log_page_size, True
3.4.1.7. staticfilter
The static filter allows CherryPy to serve static files.
The staticfilter has the following configuration options:
static_filter.file
static_filter.dir
static_filter.root
3.4.1.8. nsgmlsfilter
The nsgmlsfilter parses and validates SGML responses.
3.4.1.9. tidyfilter
The tidyfilter cleans up returned html by running the response through Tidy.
Note that we use the standalone Tidy tool rather than the python mxTidy module. This is because this module doesn't seem to be stable and it crashes on some HTML pages (which means that the server would also crash.)
The tidyfilter has the following configuration options:
tidy_filter.tmp_dir
tidy_filter.strict_xml, False
tidy_filter.tidy_path
3.4.1.10. virtualhostfilter
The virtualhostfilter changes the objectPath based on the Host header. Use this filter when running multiple sites within one CherryPy server.
The virtualhostfilter has the following configuration options:
virtual_host_filter.prefix, '/'
3.4.1.11. xmlrpcfilter
The xmlrpcfilter converts XMLRPC to the CherryPy2 object system and vice-versa.
PLEASE NOTE:
before_request_body: Unmarshalls the posted data to a methodname and parameters.
- These are stored in cherrypy.request.rpcMethod and .rpcParams
- The method is also stored in cherrypy.request.path, so CP2 will find the right method to call for you, based on the root's position.
before_finalize: Marshalls cherrypy.response.body to xmlrpc.
- Until resolved: cherrypy.response.body must be a python source string; this string is 'eval'ed to return the results. This will be resolved in the future.
- Content-Type and Content-Length are set according to the new (marshalled) data
The xmlrpcfilter does not have any configuration options.
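The unmarshalling and marshalling steps can be sketched with the standard xmlrpc.client module. This shows the data conversion only, not the filter itself; the "math.add" method name is made up, and turning the dots of the method name into path separators is an assumption about how a method name might map to a path.

```python
import xmlrpc.client

# What a client would POST: a call to the (made-up) method math.add(2, 3).
posted = xmlrpc.client.dumps((2, 3), methodname="math.add")

# before_request_body step: recover the method name and the parameters.
rpc_params, rpc_method = xmlrpc.client.loads(posted)
path = "/" + rpc_method.replace(".", "/")      # assumed name-to-path mapping

# The page handler would run here; this sketch just adds the numbers.
result = sum(rpc_params)

# before_finalize step: marshal the result back into an XML-RPC response.
response_body = xmlrpc.client.dumps((result,), methodresponse=True)
content_length = len(response_body.encode("utf-8"))
```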
3.4.1.12. sessionauthenticatefilter
The sessionauthenticatefilter provides simple form-based authentication and access control.
3.4.1.13. sessionfilter
The sessionfilter has its own section.
3.4.2. Writing Filters
3.4.2.1. Extending basefilter
3.4.2.2. Defining configurable settings
3.4.2.3. Pitfalls to avoid
3.5. Web server architecture (HTTP servers which CP supplies)
3.5.1. Design choices
3.5.2. WSGI server
CherryPy 2.1 supports arbitrary WSGI servers, and includes its own WSGI server (the default). This means that you should be able to deploy your CherryPy application using Apache or IIS (among others) without any changes to your application--only the deployment scripts will change.
3.5.2.1. mod_python
3.5.2.2. IIS/ASP
3.5.2.3. FastCGI
3.5.2.4. SCGI
3.5.3. HTTP server
Appendix A. Appendix