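WSGI stands for Web Server Gateway Interface. It is a standard interface specification that defines how a web server and a Python web application or framework pass data to each other, so that one web application can work with many different web servers.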
HTTP client --- web server --- WSGI --- Flask
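Purpose:
- Tell the web server how to call the web application and hand the user's request to it
- Tell the application what the user requested and how to return a response to the web server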
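The two WSGI roles
server/gateway: usually the web server. It accepts the client's request, calls the application, wraps the application's result in an HTTP response, and returns it to the client.
application/framework: the Python application.
The application is a callable object that takes two arguments. It can be a function, a method, a class, or an instance with a __call__ method.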
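How the roles are implemented
application side: implemented by the Python framework, which gives developers an interface for reading the request and helps them build the response.
server side: general-purpose web servers usually have no built-in WSGI support, so it is added through an extension or a companion server, for example Apache's mod_wsgi module, or uWSGI running behind Nginx. This component implements the WSGI server side, manages processes, and calls the application.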
application
Defined in the web framework.
Here are two examples of application objects, one implemented as a function and the other as a class.
HELLO_WORLD = b"Hello world!\n"

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

HELLO_WORLD = b"Hello world!\n"

class AppClass:
    """Produce the same output, but using a class

    (Note: 'AppClass' is the "application" here, so calling it
    returns an instance of 'AppClass', which is then the iterable
    return value of the "application callable" as required by
    the spec.

    If we wanted to use *instances* of 'AppClass' as application
    objects instead, we would have to implement a '__call__'
    method, which would be invoked to execute the application,
    and we would need to create an instance for use by the
    server or gateway.
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        self.start(status, response_headers)
        yield HELLO_WORLD
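The docstring above mentions a third variant: using an instance as the application object by giving its class a `__call__` method. The following is a minimal sketch of that variant. The names `GreetingApp`, `greeting`, and `fake_start_response` are made up for illustration and do not come from PEP 3333.

```python
class GreetingApp:
    """Application object implemented as an instance with __call__."""

    def __init__(self, greeting):
        # Per-application configuration lives on the instance.
        self.greeting = greeting

    def __call__(self, environ, start_response):
        body = self.greeting.encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


# The instance, not the class, is what a server would call.
application = GreetingApp("Hello world!\n")

# Stand-in for the server side, only to show the call shape.
captured = {}


def fake_start_response(status, response_headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = response_headers
    return lambda data: None  # the write() callable, unused here


body = b"".join(application({}, fake_start_response))
```

Compared with `AppClass`, the instance is created once and reused for every request, so per-request state should stay in local variables rather than on `self`.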
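The server calls the application object
Every web application has exactly one entry point: the application defined according to the WSGI specification. This callable is defined in one file or module of the Python application (the entry file).
Each time the web server receives a request from an HTTP client, it calls this application and passes the user's request to the web application.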
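The server passes two arguments when calling the application
The application object must accept two positional arguments, environ and start_response. The parameter names are not mandated. The server/gateway must therefore build and pass both as positional arguments, not keyword arguments, when it calls the application object, for example: result = application(environ, start_response).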
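The environ argument is a dictionary containing CGI-style environment variables. It must be a built-in Python dict, and the application may modify it as needed. The dictionary must also contain the variables required by the WSGI specification, and it may contain server-specific extension variables and any operating system environment variables.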
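Commonly used members of environ. First come the variables required by the CGI specification, which must be present unless their value would be an empty string:
- REQUEST_METHOD: the HTTP request method as a string, such as 'GET' or 'POST'
- SCRIPT_NAME: the leading part of the request path that was used to locate the application object; for example, the server can use a path prefix to decide which mounted application handles the request
- PATH_INFO: the rest of the request path, which is the part the application itself has to handle
- QUERY_STRING: the query string of the request, that is, everything after the ? in the URL
- CONTENT_TYPE: the value of the Content-Type request header
- CONTENT_LENGTH: the value of the Content-Length request header
- SERVER_NAME and SERVER_PORT: the server name and port; combined with SCRIPT_NAME and PATH_INFO above they yield the full URL
- SERVER_PROTOCOL: the HTTP protocol version, HTTP/1.0 or HTTP/1.1
- HTTP_*: variables corresponding to the headers of the HTTP request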
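The WSGI specification additionally requires these environ variables:
- wsgi.version: the WSGI version as a tuple; (1, 0) means version 1.0
- wsgi.url_scheme: http or https
- wsgi.input: a file-like input stream from which the application can read the HTTP request body
- wsgi.errors: an output stream to which the application can write error messages
- wsgi.multithread: True if the application object may be called by several threads at the same time
- wsgi.multiprocess: True if the application object may be called by several processes at the same time
- wsgi.run_once: True if the server expects the application object to be called only once during the lifetime of the process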
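To make the variables above concrete, here is a small sketch that reads several of them from an environ dict. The helper `describe_request` and the hand-built request values are assumptions for illustration; `setup_testing_defaults` and `request_uri` are real helpers from the standard library's `wsgiref.util`.

```python
from io import BytesIO
from urllib.parse import parse_qs
from wsgiref.util import request_uri, setup_testing_defaults


def describe_request(environ):
    """Collect a few commonly used values from a WSGI environ dict."""
    # CONTENT_LENGTH may be missing or empty, so treat both as zero.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return {
        "method": environ["REQUEST_METHOD"],
        "path": environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", ""),
        "query": parse_qs(environ.get("QUERY_STRING", "")),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
        "body": environ["wsgi.input"].read(length),
        "url": request_uri(environ),
    }


# Hand-built environ for illustration; a real server fills this in.
environ = {
    "REQUEST_METHOD": "POST",
    "PATH_INFO": "/hello",
    "QUERY_STRING": "name=wsgi",
    "CONTENT_LENGTH": "4",
    "HTTP_USER_AGENT": "demo-client",
    "wsgi.input": BytesIO(b"ping"),
}
setup_testing_defaults(environ)  # fills in the remaining required keys
info = describe_request(environ)
```

Reading exactly CONTENT_LENGTH bytes from `wsgi.input` matters on real servers, because the stream may not signal end-of-file at the end of the request body.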
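start_response is a callable that accepts two required positional arguments and one optional argument. They are conventionally named status, response_headers, and exc_info, but the names are not mandated. Its signature is start_response(status, response_headers, exc_info=None).
The status argument is a string describing the HTTP response status, for example 200 OK. response_headers is a list of (header_name, header_value) tuples describing the HTTP response headers. The optional exc_info argument is used only when the application has caught an error and is trying to show an error message in the browser.
start_response must return a write(body_data) callable that takes one positional argument: a chunk of the HTTP response body.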
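The optional exc_info argument is easiest to understand in use. The sketch below follows the error-handling pattern described in PEP 3333: the application catches its own failure and calls start_response again with `sys.exc_info()`. The names `risky_app` and `call`, and the `/boom` path, are hypothetical.

```python
import sys


def risky_app(environ, start_response):
    """Report failures through the optional exc_info argument."""
    try:
        if environ.get("PATH_INFO") == "/boom":
            raise RuntimeError("simulated failure")
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"all good\n"]
    except Exception:
        # Passing exc_info lets the server replace headers that were set
        # but not yet sent, or re-raise if output has already started.
        start_response(
            "500 Internal Server Error",
            [("Content-Type", "text/plain")],
            sys.exc_info(),
        )
        return [b"something went wrong\n"]


def call(app, path):
    """Tiny stand-in for a server: returns (status, body)."""
    seen = []

    def start_response(status, response_headers, exc_info=None):
        seen.append(status)
        return lambda data: None  # the write() callable, unused here

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return seen[-1], body
```

Calling `call(risky_app, "/")` yields the normal response, while `call(risky_app, "/boom")` yields the 500 status and the error body.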
Example of a web server
import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')

def run_with_cgi(application):
    environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}  # build environ
    environ['wsgi.input']        = sys.stdin.buffer
    environ['wsgi.errors']       = sys.stderr
    environ['wsgi.version']      = (1, 0)
    environ['wsgi.multithread']  = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once']     = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        out = sys.stdout.buffer

        if not headers_set:
            raise AssertionError("write() before start_response()")

        elif not headers_sent:
            # Before the first output, send the stored headers
            status, response_headers = headers_sent[:] = headers_set
            out.write(wsgi_to_bytes('Status: %s\r\n' % status))
            for header in response_headers:
                out.write(wsgi_to_bytes('%s: %s\r\n' % header))
            out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):  # define start_response
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None     # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]

        # Note: error checking on the headers should happen here,
        # *after* the headers are set.  That way, if an error
        # occurs, start_response can only be re-called with
        # exc_info set.

        return write

    result = application(environ, start_response)  # call the application
    try:
        for data in result:
            if data:    # don't send headers until body appears
                write(data)
        if not headers_sent:
            write(b'')  # send headers now if body was empty (must be bytes)
    finally:
        if hasattr(result, 'close'):
            result.close()
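The standard library already ships a server-side implementation in `wsgiref`, which is handy for trying an application without writing the gateway by hand. The sketch below pushes one request through `wsgiref.handlers.SimpleHandler` using in-memory streams instead of a socket. The names `demo_app` and `serve_once` are hypothetical; the request is the default GET for "/" produced by `setup_testing_defaults`.

```python
from io import BytesIO, StringIO
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def serve_once(application):
    """Run one request through wsgiref and return the raw response bytes."""
    environ = {}
    setup_testing_defaults(environ)  # a GET request for "/"
    stdout = BytesIO()
    # Arguments: stdin (request body), stdout (response), stderr, base environ.
    handler = SimpleHandler(BytesIO(), stdout, StringIO(), environ)
    handler.run(application)
    return stdout.getvalue()


raw = serve_once(demo_app)
head, _, body = raw.partition(b"\r\n\r\n")
```

Here `head` holds the status line and headers that the handler generated and `body` holds the bytes returned by the application. For a real socket, `wsgiref.simple_server.make_server(host, port, app)` returns a server object whose `serve_forever()` method does the same job over HTTP.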
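The application returns data
The application object that was called carries out its business logic based on the contents of environ and returns data to the server:
- First it calls start_response(), handing status and response_headers to the server as the HTTP response headers. This also signals to the server that the HTTP body is about to be returned.
- Then it returns an iterable as the HTTP response body. If the response is empty, it returns an empty iterable such as [] rather than None, because the server iterates over the return value.
This lets the server follow the order of an HTTP message: it sends the response headers first and the response body afterwards.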
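The two steps above can be sketched with a body that is produced lazily and with an empty body. PEP 3333 requires the return value to be an iterable of bytestrings, so the empty case uses an empty list. The names `streaming_app`, `empty_app`, and `collect` are made up for this illustration.

```python
def streaming_app(environ, start_response):
    """Return the body lazily, one chunk per iteration."""
    start_response("200 OK", [("Content-Type", "text/plain")])

    def chunks():
        for i in range(3):
            yield ("line %d\n" % i).encode("ascii")

    return chunks()


def empty_app(environ, start_response):
    """An empty body is an empty iterable."""
    start_response("204 No Content", [])
    return []


def collect(app):
    """Minimal caller that records the status and joins the body."""
    statuses = []

    def start_response(status, response_headers, exc_info=None):
        statuses.append(status)
        return lambda data: None  # the write() callable, unused here

    result = app({}, start_response)
    try:
        body = b"".join(result)
    finally:
        # The spec asks callers to invoke close() if the iterable has one.
        if hasattr(result, "close"):
            result.close()
    return statuses[0], body
```

A real server would send each chunk to the client as it is yielded instead of joining them, which is what makes the iterable form suitable for large responses.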
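WSGI middleware
Note that some components can play both WSGI roles at once, with the capabilities of each. Middleware is the typical case: it runs between the server and the application.
To the server, the middleware is an application; to the application, the middleware is a server.
It can build an environ, define a start_response, and call an application object. It can also run its own logic, call start_response, and return a result. The server takes that result and sends it to the client.
Middleware can be stacked in several layers, and each layer can handle every request and response that passes through it, for example by inspecting or modifying them.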
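How middleware works:
In the figure, the three colored boxes at the top represent the roles and the white boxes in the middle represent operations, numbered 1 to 5 in the order they occur. Walking through the figure shows how middleware works:
- After receiving the client's HTTP request, the Server builds environ_s and has start_response_s defined.
- The Server calls the Middleware's application object, passing environ_s and start_response_s.
- The Middleware runs its own logic based on environ_s, builds environ_m, and defines start_response_m.
- The Middleware calls the Application's application object, passing environ_m and start_response_m. When the Application's application object has finished, it calls start_response_m and returns its result to the Middleware, which stores it in result_m.
- The Middleware processes result_m, produces result_s, calls start_response_s, and returns result_s to the Server. Once the Server has result_s, it can send the result to the client.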
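The five steps above can be sketched as a small middleware class. The variable names mirror the ones used in the steps (environ_s, start_response_m, result_m, and so on); the class `UppercaseMiddleware`, the wrapped `hello_app`, and the `demo.seen_by_middleware` key are hypothetical. Buffering the whole body, as done here, keeps the sketch short but gives up streaming.

```python
class UppercaseMiddleware:
    """Wraps a WSGI application and upper-cases its response body."""

    def __init__(self, app):
        self.app = app  # the wrapped application

    def __call__(self, environ_s, start_response_s):
        # Steps 1-2: the server called us with its environ and start_response.
        # Step 3: derive the environ and start_response for the wrapped app.
        environ_m = dict(environ_s)
        environ_m["demo.seen_by_middleware"] = True
        captured = {}

        def start_response_m(status, response_headers, exc_info=None):
            # Hold the status and headers back until the body is processed.
            captured["status"] = status
            captured["headers"] = response_headers
            return lambda data: None  # legacy write() is ignored in this sketch

        # Step 4: call the wrapped application and keep its result.
        result_m = self.app(environ_m, start_response_m)
        try:
            body = b"".join(result_m).upper()
        finally:
            if hasattr(result_m, "close"):
                result_m.close()

        # Step 5: fix up the headers, start the real response, return result_s.
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() != "content-length"]
        headers.append(("Content-Length", str(len(body))))
        start_response_s(captured["status"], headers)
        return [body]


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


wrapped = UppercaseMiddleware(hello_app)
```

Because `wrapped` is itself a WSGI application, it can be handed to any server or wrapped again by another middleware layer.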
WSGI application-side code in web frameworks
Pyramid
from pyramid.config import Configurator
from pyramid.response import Response

def hello_world(request):
    return Response(
        'Hello world from Pyramid!\n',
        content_type='text/plain',
    )

config = Configurator()
config.add_route('hello', '/hello')
config.add_view(hello_world, route_name='hello')
app = config.make_wsgi_app()
pyramid/config/__init__.py
from pyramid.router import Router

class Configurator(
    TestingConfiguratorMixin,
    TweensConfiguratorMixin,
    SecurityConfiguratorMixin,
    ViewsConfiguratorMixin,
    RoutesConfiguratorMixin,
    ZCAConfiguratorMixin,
    I18NConfiguratorMixin,
    RenderingConfiguratorMixin,
    AssetsConfiguratorMixin,
    SettingsConfiguratorMixin,
    FactoriesConfiguratorMixin,
    AdaptersConfiguratorMixin,
    ):
    """
    A Configurator is used to configure a :app:`Pyramid`
    :term:`application registry`.
    """

    def make_wsgi_app(self):
        self.commit()
        app = Router(self.registry)
        global_registries.add(self.registry)
        self.manager.push({'registry':self.registry, 'request':None})
        try:
            self.registry.notify(ApplicationCreated(app))
        finally:
            self.manager.pop()
        return app
pyramid/router.py
@implementer(IRouter)
class Router(object):

    debug_notfound = False
    debug_routematch = False

    threadlocal_manager = manager

    def __init__(self, registry):
        q = registry.queryUtility
        self.logger = q(IDebugLogger)
        self.root_factory = q(IRootFactory, default=DefaultRootFactory)
        self.routes_mapper = q(IRoutesMapper)
        self.request_factory = q(IRequestFactory, default=Request)
        self.request_extensions = q(IRequestExtensions)
        tweens = q(ITweens)
        if tweens is None:
            tweens = excview_tween_factory
        self.orig_handle_request = self.handle_request
        self.handle_request = tweens(self.handle_request, registry)
        self.root_policy = self.root_factory # b/w compat
        self.registry = registry
        settings = registry.settings
        if settings is not None:
            self.debug_notfound = settings['debug_notfound']
            self.debug_routematch = settings['debug_routematch']

    def handle_request(self, request):
        attrs = request.__dict__
        registry = attrs['registry']

        request.request_iface = IRequest
        context = None
        routes_mapper = self.routes_mapper
        debug_routematch = self.debug_routematch
        adapters = registry.adapters
        has_listeners = registry.has_listeners
        notify = registry.notify
        logger = self.logger

        has_listeners and notify(NewRequest(request))
        # find the root object
        root_factory = self.root_factory
        if routes_mapper is not None:
            info = routes_mapper(request)
            match, route = info['match'], info['route']
            if route is None:
                if debug_routematch:
                    msg = ('no route matched for url %s' %
                           request.url)
                    logger and logger.debug(msg)
            else:
                attrs['matchdict'] = match
                attrs['matched_route'] = route

                if debug_routematch:
                    msg = (
                        'route matched for url %s; '
                        'route_name: %r, '
                        'path_info: %r, '
                        'pattern: %r, '
                        'matchdict: %r, '
                        'predicates: %r' % (
                            request.url,
                            route.name,
                            request.path_info,
                            route.pattern,
                            match,
                            ', '.join([p.text() for p in route.predicates]))
                        )
                    logger and logger.debug(msg)

                request.request_iface = registry.queryUtility(
                    IRouteRequest,
                    name=route.name,
                    default=IRequest)

                root_factory = route.factory or self.root_factory

        root = root_factory(request)
        attrs['root'] = root

        # find a context
        traverser = adapters.queryAdapter(root, ITraverser)
        if traverser is None:
            traverser = ResourceTreeTraverser(root)
        tdict = traverser(request)

        context, view_name, subpath, traversed, vroot, vroot_path = (
            tdict['context'],
            tdict['view_name'],
            tdict['subpath'],
            tdict['traversed'],
            tdict['virtual_root'],
            tdict['virtual_root_path']
            )

        attrs.update(tdict)
        has_listeners and notify(ContextFound(request))

        # find a view callable
        context_iface = providedBy(context)
        response = _call_view(
            registry,
            request,
            context,
            context_iface,
            view_name
            )

        if response is None:
            if self.debug_notfound:
                msg = (
                    'debug_notfound of url %s; path_info: %r, '
                    'context: %r, view_name: %r, subpath: %r, '
                    'traversed: %r, root: %r, vroot: %r, '
                    'vroot_path: %r' % (
                        request.url, request.path_info, context,
                        view_name, subpath, traversed, root, vroot,
                        vroot_path)
                    )
                logger and logger.debug(msg)
            else:
                msg = request.path_info
            raise HTTPNotFound(msg)

        return response

    def invoke_subrequest(self, request, use_tweens=False):
        registry = self.registry
        has_listeners = self.registry.has_listeners
        notify = self.registry.notify
        threadlocals = {'registry':registry, 'request':request}
        manager = self.threadlocal_manager
        manager.push(threadlocals)
        request.registry = registry
        request.invoke_subrequest = self.invoke_subrequest

        if use_tweens:
            handle_request = self.handle_request
        else:
            handle_request = self.orig_handle_request

        try:

            try:
                extensions = self.request_extensions
                if extensions is not None:
                    apply_request_extensions(request, extensions=extensions)
                response = handle_request(request)

                if request.response_callbacks:
                    request._process_response_callbacks(response)

                has_listeners and notify(NewResponse(request, response))

                return response

            finally:
                if request.finished_callbacks:
                    request._process_finished_callbacks()

        finally:
            manager.pop()

    def __call__(self, environ, start_response):  # the application defined per the WSGI spec
        """
        Accept ``environ`` and ``start_response``; create a
        :term:`request` and route the request to a :app:`Pyramid`
        view based on introspection of :term:`view configuration`
        within the application registry; call ``start_response`` and
        return an iterable.
        """
        request = self.request_factory(environ)
        response = self.invoke_subrequest(request, use_tweens=True)
        return response(request.environ, start_response)
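The last line of `Router.__call__` shows a pattern worth spelling out: the response object is itself a WSGI callable, so the framework's entry point only has to build a response and then call it with environ and start_response. The following is a minimal sketch of that idea. `MiniResponse`, `hello_view`, and `application` are hypothetical and are not Pyramid's or WebOb's actual classes.

```python
class MiniResponse:
    """A response object that is itself a WSGI application."""

    def __init__(self, body, status="200 OK", content_type="text/plain"):
        self.body = body.encode("utf-8") if isinstance(body, str) else body
        self.status = status
        self.headers = [
            ("Content-Type", content_type),
            ("Content-Length", str(len(self.body))),
        ]

    def __call__(self, environ, start_response):
        # Speak WSGI: announce status and headers, then return the body.
        start_response(self.status, self.headers)
        return [self.body]


def hello_view(request_environ):
    """Hypothetical view: takes request data, returns a response object."""
    return MiniResponse("Hello from %s\n" % request_environ.get("PATH_INFO", "/"))


def application(environ, start_response):
    # Framework-style entry point: build the response, then let it speak WSGI.
    response = hello_view(environ)
    return response(environ, start_response)
```

With this split, view code never touches start_response directly; it only returns response objects, and the single entry point handles the protocol.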
Flask
from flask import Flask
from flask import Response

flask_app = Flask('flaskapp')

@flask_app.route('/hello')
def hello_world():
    return Response(
        'Hello world from Flask!\n',
        mimetype='text/plain'
    )

app = flask_app.wsgi_app
flask/app.py
class Flask(_PackageBoundObject):
    """The flask object implements a WSGI application and acts as the central
    object.  It is passed the name of the module or package of the
    application.  Once it is created it will act as a central registry for
    the view functions, the URL rules, template configuration and much more.

    The name of the package is used to resolve resources from inside the
    package or the folder the module is contained in depending on if the
    package parameter resolves to an actual python package (a folder with
    an `__init__.py` file inside) or a standard module (just a `.py` file).

    For more information about resource loading, see :func:`open_resource`.

    Usually you create a :class:`Flask` instance in your main module or
    in the `__init__.py` file of your package like this::

        from flask import Flask
        app = Flask(__name__)

    .. admonition:: About the First Parameter

        The idea of the first parameter is to give Flask an idea what
        belongs to your application.  This name is used to find resources
        on the file system, can be used by extensions to improve debugging
        information and a lot more.

        So it's important what you provide there.  If you are using a single
        module, `__name__` is always the correct value.  If you however are
        using a package, it's usually recommended to hardcode the name of
        your package there.

        For example if your application is defined in `yourapplication/app.py`
        you should create it with one of the two versions below::

            app = Flask('yourapplication')
            app = Flask(__name__.split('.')[0])

        Why is that?  The application will work even with `__name__`, thanks
        to how resources are looked up.  However it will make debugging more
        painful.  Certain extensions can make assumptions based on the
        import name of your application.  For example the Flask-SQLAlchemy
        extension will look for the code in your application that triggered
        an SQL query in debug mode.  If the import name is not properly set
        up, that debugging information is lost.  (For example it would only
        pick up SQL queries in `yourapplication.app` and not
        `yourapplication.views.frontend`)
    """

    #: Default configuration parameters.
    default_config = ImmutableDict({
        'DEBUG':                                False,
        'TESTING':                              False,
        'PROPAGATE_EXCEPTIONS':                 None,
        'PRESERVE_CONTEXT_ON_EXCEPTION':        None,
        'SECRET_KEY':                           None,
        'PERMANENT_SESSION_LIFETIME':           timedelta(days=31),
        'USE_X_SENDFILE':                       False,
        'LOGGER_NAME':                          None,
        'SERVER_NAME':                          None,
        'APPLICATION_ROOT':                     None,
        'SESSION_COOKIE_NAME':                  'session',
        'SESSION_COOKIE_DOMAIN':                None,
        'SESSION_COOKIE_PATH':                  None,
        'SESSION_COOKIE_HTTPONLY':              True,
        'SESSION_COOKIE_SECURE':                False,
        'MAX_CONTENT_LENGTH':                   None,
        'SEND_FILE_MAX_AGE_DEFAULT':            12 * 60 * 60,  # 12 hours
        'TRAP_BAD_REQUEST_ERRORS':              False,
        'TRAP_HTTP_EXCEPTIONS':                 False,
        'PREFERRED_URL_SCHEME':                 'http',
        'JSON_AS_ASCII':                        True,
        'JSON_SORT_KEYS':                       True,
        'JSONIFY_PRETTYPRINT_REGULAR':          True,
    })

    def __init__(self, import_name, static_path=None, static_url_path=None,
                 static_folder='static', template_folder='templates',
                 instance_path=None, instance_relative_config=False):
        _PackageBoundObject.__init__(self, import_name,
                                     template_folder=template_folder)
        if static_path is not None:
            from warnings import warn
            warn(DeprecationWarning('static_path is now called '
                                    'static_url_path'), stacklevel=2)
            static_url_path = static_path

        if static_url_path is not None:
            self.static_url_path = static_url_path
        if static_folder is not None:
            self.static_folder = static_folder
        if instance_path is None:
            instance_path = self.auto_find_instance_path()
        elif not os.path.isabs(instance_path):
            raise ValueError('If an instance path is provided it must be '
                             'absolute.  A relative path was given instead.')

        self.instance_path = instance_path
        self.config = self.make_config(instance_relative_config)

        # Prepare the deferred setup of the logger.
        self._logger = None
        self.logger_name = self.import_name

        self.view_functions = {}
        self._error_handlers = {}
        self.error_handler_spec = {None: self._error_handlers}
        self.url_build_error_handlers = []
        self.before_request_funcs = {}
        self.before_first_request_funcs = []
        self.after_request_funcs = {}
        self.teardown_request_funcs = {}
        self.teardown_appcontext_funcs = []
        self.url_value_preprocessors = {}
        self.url_default_functions = {}
        self.template_context_processors = {
            None: [_default_template_ctx_processor]
        }
        self.blueprints = {}
        self.extensions = {}
        self.url_map = Map()

        self._got_first_request = False
        self._before_request_lock = Lock()

        if self.has_static_folder:
            self.add_url_rule(self.static_url_path + '/<path:filename>',
                              endpoint='static',
                              view_func=self.send_static_file)

    def make_response(self, rv):
        """Converts the return value from a view function to a real
        response object that is an instance of :attr:`response_class`.

        The following types are allowed for `rv`:

        .. tabularcolumns:: |p{3.5cm}|p{9.5cm}|

        ======================= ===========================================
        :attr:`response_class`  the object is returned unchanged
        :class:`str`            a response object is created with the
                                string as body
        :class:`unicode`        a response object is created with the
                                string encoded to utf-8 as body
        a WSGI function         the function is called as WSGI application
                                and buffered as response object
        :class:`tuple`          A tuple in the form ``(response, status,
                                headers)`` where `response` is any of the
                                types defined here, `status` is a string
                                or an integer and `headers` is a list of
                                a dictionary with header values.
        ======================= ===========================================

        :param rv: the return value from the view function

        .. versionchanged:: 0.9
           Previously a tuple was interpreted as the arguments for the
           response object.
        """
        status = headers = None
        if isinstance(rv, tuple):
            rv, status, headers = rv + (None,) * (3 - len(rv))

        if rv is None:
            raise ValueError('View function did not return a response')

        if not isinstance(rv, self.response_class):
            # When we create a response object directly, we let the constructor
            # set the headers and status.  We do this because there can be
            # some extra logic involved when creating these objects with
            # specific values (like default content type selection).
            if isinstance(rv, (text_type, bytes, bytearray)):
                rv = self.response_class(rv, headers=headers, status=status)
                headers = status = None
            else:
                rv = self.response_class.force_type(rv, request.environ)

        if status is not None:
            if isinstance(status, string_types):
                rv.status = status
            else:
                rv.status_code = status
        if headers:
            rv.headers.extend(headers)

        return rv

    def wsgi_app(self, environ, start_response):  # the application defined per the WSGI spec
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied without losing a
        reference to the class.  So instead of doing this::

            app = MyMiddleware(app)

        It's a better idea to do this instead::

            app.wsgi_app = MyMiddleware(app.wsgi_app)

        Then you still have the original application object around and
        can continue to call methods on it.

        .. versionchanged:: 0.7
           The behavior of the before and after request callbacks was changed
           under error conditions and a new callback was added that will
           always execute at the end of the request, independent on if an
           error occurred or not.  See :ref:`callbacks-and-errors`.

        :param environ: a WSGI environment
        :param start_response: a callable accepting a status code,
                               a list of headers and an optional
                               exception context to start the response
        """
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.make_response(self.handle_exception(e))
            return response(environ, start_response)
        finally:
            if self.should_ignore_error(error):
                error = None
            ctx.auto_pop(error)

    def __call__(self, environ, start_response):
        """Shortcut for :attr:`wsgi_app`."""
        return self.wsgi_app(environ, start_response)
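The `wsgi_app` docstring recommends `app.wsgi_app = MyMiddleware(app.wsgi_app)` over `app = MyMiddleware(app)`. The sketch below illustrates why that works, using a stand-in class instead of Flask so that it runs with the standard library alone. `TinyApp` and `HeaderMiddleware` are hypothetical names, and the `X-Demo` header is made up.

```python
class TinyApp:
    """Stand-in for a framework object; not Flask itself."""

    def __init__(self, name):
        self.name = name

    def wsgi_app(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("hello from %s\n" % self.name).encode("utf-8")]

    def __call__(self, environ, start_response):
        # Looked up on every call, so a replaced wsgi_app takes effect.
        return self.wsgi_app(environ, start_response)


class HeaderMiddleware:
    """Adds one response header to whatever the wrapped app sends."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [("X-Demo", "1")]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


app = TinyApp("tiny")
app.wsgi_app = HeaderMiddleware(app.wsgi_app)  # app is still a TinyApp
```

After the assignment, `app` keeps its type and its other methods, while every request that goes through `app(environ, start_response)` passes through the middleware first.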
Django
# -*- coding:utf-8 -*-
"""
WSGI config for helloworld project.

It exposes the WSGI callable as a module-level variable named ``application``.

For more information on this file, see
https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/
"""

# Configure the settings module
import os
# os.environ.setdefault("DJANGO_SETTINGS_MODULE", "helloworld.settings")
os.environ["DJANGO_SETTINGS_MODULE"]="helloworld.settings"

'''
If the "DJANGO_SETTINGS_MODULE" variable is not set, the default wsgi.py sets it to mysite.settings, where mysite is the name of your project. This is how runserver finds the settings by default.
'''

# Create the WSGI application; WSGI middleware, if any, would wrap it here
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
django.core.wsgi
import django
from django.core.handlers.wsgi import WSGIHandler


def get_wsgi_application():
    """
    The public interface to Django's WSGI support. Should return a WSGI
    callable.

    Allows us to avoid making django.core.handlers.WSGIHandler public API, in
    case the internal WSGI implementation changes or moves in the future.
    """
    django.setup()
    return WSGIHandler()
django.core.handlers.wsgi
class WSGIHandler(base.BaseHandler):
    initLock = Lock()
    request_class = WSGIRequest

    def __call__(self, environ, start_response):  # the application defined per the WSGI spec
        # Set up middleware if needed. We couldn't do this earlier, because
        # settings weren't available.
        if self._request_middleware is None:
            with self.initLock:
                try:
                    # Check that middleware is still uninitialized.
                    if self._request_middleware is None:
                        self.load_middleware()
                except:
                    # Unload whatever middleware we got
                    self._request_middleware = None
                    raise

        set_script_prefix(get_script_name(environ))
        signals.request_started.send(sender=self.__class__, environ=environ)
        try:
            request = self.request_class(environ)
        except UnicodeDecodeError:
            logger.warning('Bad Request (UnicodeDecodeError)',
                exc_info=sys.exc_info(),
                extra={
                    'status_code': 400,
                }
            )
            response = http.HttpResponseBadRequest()
        else:
            response = self.get_response(request)

        response._handler_class = self.__class__

        status = '%s %s' % (response.status_code, response.reason_phrase)
        response_headers = [(str(k), str(v)) for k, v in response.items()]
        for c in response.cookies.values():
            response_headers.append((str('Set-Cookie'), str(c.output(header=''))))
        start_response(force_str(status), response_headers)
        if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
            response = environ['wsgi.file_wrapper'](response.file_to_stream)
        return response
References
- WSGI简介 (Introduction to WSGI)
- PEP-3333