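The Web Server Gateway Interface (WSGI) is a standard interface between web server software and web applications written in Python. wsgiref is a reference implementation of the WSGI specification defined in PEP 333, and it can be used to add WSGI support to a web server or framework. Its functionality is split across several modules, listed further down. A minimal server built with it looks like this: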
from wsgiref.simple_server import make_server


def hello_world_app(environ, start_response):
    """Every WSGI application must provide an application object: a
    callable that accepts the environ and start_response parameters.
    """
    status = "200 OK"
    headers = [('Content-Type', 'text/plain; charset=utf-8')]
    start_response(status, headers)
    return [b'Hello, World']


httpd = make_server('', 8000, hello_world_app)
print('Serving on port 8000 ...')
# Serve until the process is killed
httpd.serve_forever()

Run the code above and request it with curl -i localhost:8000. The result looks like this:

$ python test_wsgiref.py
Serving on port 8000 ...
127.0.0.1 - - [20/Oct/2018 15:10:13] "GET / HTTP/1.1" 200 12

$ curl -i localhost:8000
HTTP/1.0 200 OK
Date: Sat, 20 Oct 2018 07:10:13 GMT
Server: WSGIServer/0.1 Python/2.7.10
Content-Type: text/plain; charset=utf-8
Content-Length: 12

Hello, World%
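Because a WSGI application is just a callable, it can also be exercised without starting a server at all. The sketch below is only an illustration and not part of the original example: it repeats hello_world_app so that the block runs on its own, and the call_app helper and the captured dictionary are hypothetical names introduced here. It relies on setup_testing_defaults from wsgiref.util to fill in a plausible environ.

```python
from wsgiref.util import setup_testing_defaults


def hello_world_app(environ, start_response):
    """Same application as in the example, repeated so this block is self-contained."""
    status = "200 OK"
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    start_response(status, headers)
    return [b"Hello, World"]


def call_app(app):
    """Hypothetical helper: invoke a WSGI app roughly the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD, wsgi.input, etc.
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would start writing the response here; we only record it.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(hello_world_app)
print(status, headers, body)
```

This is the whole contract: the server supplies environ and start_response, and the application returns an iterable of bytes.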
The wsgiref source can be found in the cpython project on GitHub. The package is laid out as follows:
.
├── __init__.py
├── handlers.py
├── headers.py
├── simple_server.py
├── util.py
└── validate.py
* util -- Miscellaneous useful functions and wrappers
* headers -- Manage response headers
* handlers -- Base classes for server/gateway implementations (the core processing logic)
* simple_server -- A simple BaseHTTPServer that supports WSGI
* validate -- Validation wrapper that sits between an app and a server to detect errors in either
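To make the first two modules more concrete, here is a small sketch, assuming Python 3. The path, query string, and file name are made-up values for illustration; request_uri, setup_testing_defaults, and Headers are the real wsgiref helpers.

```python
from wsgiref.headers import Headers
from wsgiref.util import request_uri, setup_testing_defaults

# util: build a test environ, then reconstruct the full request URL from it.
environ = {"PATH_INFO": "/hello", "QUERY_STRING": "name=wsgi"}  # made-up request
setup_testing_defaults(environ)
url = request_uri(environ)

# headers: a case-insensitive, dict-like wrapper around a list of header tuples.
response_headers = Headers()
response_headers["Content-Type"] = "text/plain; charset=utf-8"
response_headers.add_header(
    "Content-Disposition", "attachment", filename="report.txt"
)

print(url)
print(response_headers["content-type"])  # lookup ignores case
print(response_headers.items())          # list of (name, value) tuples
```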
The simple example above used the make_server function from the simple_server module to start a WSGI server, so that is a natural entry point. Here is the source of make_server:
def make_server(
    host, port, app, server_class=WSGIServer, handler_class=WSGIRequestHandler
):
    """Create a new WSGI server listening on `host` and `port` for `app`"""
    # Initialize the WSGIServer instance
    # WSGIServer -> HTTPServer.__init__
    #            -> TCPServer.__init__
    #               -> TCPServer.server_bind
    #                  -> TCPServer.socket.bind (bind the socket to the listening address)
    #               -> TCPServer.server_activate
    #                  -> TCPServer.socket.listen (start listening for TCP connections)
    server = server_class((host, port), handler_class)
    server.set_app(app)
    return server
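As the code shows, this function creates a server that listens on the given port of host. When a client request arrives, it is processed by WSGIServer and WSGIRequestHandler and then passed to the app application, and app returns the response.

Although the code is only a few lines long, it shows what a WSGI service needs in order to start:
* an address (host and port) to listen on
* a server class, WSGIServer by default
* a request handler class, WSGIRequestHandler by default
* the WSGI application itself (app)

When the server instance is created, the default server_class is WSGIServer. WSGIServer is a subclass of HTTPServer, HTTPServer is a subclass of TCPServer, and the base class of TCPServer is BaseServer. Instantiating WSGIServer therefore walks up the inheritance chain, and it is ultimately TCPServer that binds the socket (bind) and starts listening (listen).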
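The wiring described above can be checked on a live object. This is a minimal sketch, assuming Python 3.6+ (where servers can be used as context managers); it uses demo_app, the sample application shipped in wsgiref.simple_server, and port 0 so that the operating system picks a free port. The variable names are introduced here for illustration.

```python
from wsgiref.simple_server import (
    WSGIRequestHandler,
    WSGIServer,
    demo_app,
    make_server,
)

# Port 0 asks the OS for any free port, so this does not collide with other services.
with make_server("127.0.0.1", 0, demo_app) as httpd:
    # By the time make_server returns, the socket is already bound and listening.
    bound_host, bound_port = httpd.server_address
    is_wsgi_server = isinstance(httpd, WSGIServer)
    handler_class = httpd.RequestHandlerClass  # class used for each request
    stored_app = httpd.get_app()               # what set_app(app) stored

print(f"listening on {bound_host}:{bound_port}")
print("server class:", type(httpd).__name__)
print("handler class:", handler_class.__name__)
print("application:", stored_app.__name__)
```

No request is served here; calling httpd.serve_forever() or httpd.handle_request() inside the with block would start handling clients.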
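As mentioned above, once the WSGI server receives a client request, the request passes through WSGIServer and WSGIRequestHandler. What do these two classes actually do? A diagram gives a quick overview:
The roles of the WSGIServer and WSGIRequestHandler classes are shown in the figure below:
In short, WSGIServer wraps the socket connection and hands each incoming request to WSGIRequestHandler, which parses the HTTP request. Let's look inside WSGIServer to see what the class does: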
class WSGIServer(HTTPServer):
    """BaseHTTPServer that implements the Python WSGI protocol"""

    application = None

    def server_bind(self):
        """Override server_bind to store the server name."""
        HTTPServer.server_bind(self)
        self.setup_environ()

    def setup_environ(self):
        # Set up base environment
        env = self.base_environ = {}
        env['SERVER_NAME'] = self.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PORT'] = str(self.server_port)
        env['REMOTE_HOST']=''
        env['CONTENT_LENGTH']=''
        env['SCRIPT_NAME'] = ''

    def get_app(self):
        return self.application

    def set_app(self,application):
        self.application = application
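The code above shows that WSGIServer inherits from HTTPServer and adds a few pieces required by the WSGI specification:
* server_bind is overridden so that setup_environ is called right after the socket is bound
* setup_environ initializes base_environ, the environ entries shared by every request (SERVER_NAME, SERVER_PORT, GATEWAY_INTERFACE, and so on)
* get_app and set_app read and store the WSGI application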
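The result of setup_environ can be inspected directly on a server instance. A minimal sketch, again assuming Python 3.6+ and using demo_app from wsgiref.simple_server with an OS-assigned port; the value of SERVER_NAME depends on how the local machine resolves 127.0.0.1, so it is printed but not relied upon.

```python
from wsgiref.simple_server import demo_app, make_server

with make_server("127.0.0.1", 0, demo_app) as httpd:
    # server_bind() has already run, so base_environ is populated.
    base_environ = dict(httpd.base_environ)
    port = httpd.server_port

for key in sorted(base_environ):
    print(f"{key} = {base_environ[key]!r}")
```

Every request later starts from a copy of this dictionary, so anything placed here is visible to the application on each call.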
The inheritance chain of WSGIServer is shown below:
+------------+
| BaseServer |
+------------+
      |
      v
+------------+        +------------------+
| TCPServer  |------->| UnixStreamServer |
+------------+        +------------------+
      |
      v
+------------+
| HTTPServer |
+------------+
      |
      v
+------------+
| WSGIServer |
+------------+
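The chain shows that WSGIServer inherits from HTTPServer, HTTPServer inherits from TCPServer, and TCPServer inherits from BaseServer. HTTPServer comes from server.py in the http package (http.server), while TCPServer and BaseServer come from the socketserver module.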
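The same chain can be read off the class itself through its method resolution order. A short sketch, assuming Python 3 (the chain and origins names are introduced here for illustration):

```python
from wsgiref.simple_server import WSGIServer

# __mro__ lists the class followed by its ancestors in lookup order.
chain = [cls.__name__ for cls in WSGIServer.__mro__]
origins = {cls.__name__: cls.__module__ for cls in WSGIServer.__mro__}

print(" -> ".join(chain))
for name in chain:
    print(f"{name:<12} defined in {origins[name]}")
```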
Next, let's look at the implementation of the WSGIRequestHandler class:
class WSGIRequestHandler(BaseHTTPRequestHandler):

    server_version = "WSGIServer/" + __version__

    def get_environ(self):
        env = self.server.base_environ.copy()
        env['SERVER_PROTOCOL'] = self.request_version
        env['SERVER_SOFTWARE'] = self.server_version
        env['REQUEST_METHOD'] = self.command
        if '?' in self.path:
            path,query = self.path.split('?',1)
        else:
            path,query = self.path,''

        env['PATH_INFO'] = urllib.parse.unquote(path, 'iso-8859-1')
        env['QUERY_STRING'] = query

        host = self.address_string()
        if host != self.client_address[0]:
            env['REMOTE_HOST'] = host
        env['REMOTE_ADDR'] = self.client_address[0]

        if self.headers.get('content-type') is None:
            env['CONTENT_TYPE'] = self.headers.get_content_type()
        else:
            env['CONTENT_TYPE'] = self.headers['content-type']

        length = self.headers.get('content-length')
        if length:
            env['CONTENT_LENGTH'] = length

        for k, v in self.headers.items():
            k=k.replace('-','_').upper(); v=v.strip()
            if k in env:
                continue                    # skip content length, type,etc.
            if 'HTTP_'+k in env:
                env['HTTP_'+k] += ','+v     # comma-separate multiple headers
            else:
                env['HTTP_'+k] = v
        return env

    def get_stderr(self):
        return sys.stderr

    def handle(self):
        """Handle a single HTTP request"""
        # Read the request line sent by the client
        self.raw_requestline = self.rfile.readline(65537)
        # If the request URI is too long, respond with a 414 error
        if len(self.raw_requestline) > 65536:
            self.requestline = ''
            self.request_version = ''
            self.command = ''
            self.send_error(414)
            return

        # Parse the client's request line and request headers
        if not self.parse_request(): # An error code has been sent, just exit
            return

        # Call the WSGI application through ServerHandler
        handler = ServerHandler(
            self.rfile, self.wfile, self.get_stderr(), self.get_environ()
        )
        handler.request_handler = self      # backpointer for logging
        handler.run(self.server.get_app())
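The code above shows that WSGIRequestHandler inherits from BaseHTTPRequestHandler, whose main job is to handle client HTTP requests. WSGIRequestHandler adds the WSGI-specific parts on top. The class provides three methods:
* get_environ builds the environ dictionary for the current request from the server's base environ, the request line, and the request headers
* get_stderr returns the stream used for error output (sys.stderr)
* handle reads and parses a single HTTP request, then uses ServerHandler to call the WSGI application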
The inheritance chain of WSGIRequestHandler is shown below:
+------------------------+
|   BaseRequestHandler   |
+------------------------+
            |
            v
+------------------------+
|  StreamRequestHandler  |
+------------------------+
            |
            v
+------------------------+
| BaseHTTPRequestHandler |
+------------------------+
            |
            v
+------------------------+
|   WSGIRequestHandler   |
+------------------------+
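The header loop at the end of get_environ is the part that turns HTTP request headers into HTTP_* environ keys. The sketch below isolates that logic as a pure function so it can be tried without a socket. Note that headers_to_environ is a hypothetical name introduced here, not a wsgiref API, and the sample headers are made up.

```python
def headers_to_environ(header_items, base_env=None):
    """Hypothetical helper mirroring the header loop in get_environ."""
    env = dict(base_env or {})
    for name, value in header_items:
        key = name.replace("-", "_").upper()
        value = value.strip()
        if key in env:
            # Already present without a prefix (e.g. CONTENT_TYPE), so skip it.
            continue
        http_key = "HTTP_" + key
        if http_key in env:
            env[http_key] += "," + value  # comma-separate repeated headers
        else:
            env[http_key] = value
    return env


# Made-up request headers for illustration.
sample_headers = [
    ("Content-Type", "text/plain"),
    ("Accept", "text/html"),
    ("Accept", "application/json"),
    ("X-Request-Id", " abc "),
]
env = headers_to_environ(sample_headers, {"CONTENT_TYPE": "text/plain"})
for key in sorted(env):
    print(f"{key} = {env[key]!r}")
```

Content-Type is skipped because CONTENT_TYPE was already set from the headers earlier in get_environ, which is why applications never see an HTTP_CONTENT_TYPE key.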
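ServerHandler takes four arguments: the read side of the socket (self.rfile), the write side (self.wfile), the error stream (self.get_stderr()), and a dictionary describing the request (self.get_environ()). The get_environ method is where the environ is assembled: it returns a dictionary that combines the server's base environment variables with the variables of the current request.
ServerHandler inherits from SimpleHandler, which in turn inherits from BaseHandler. Let's continue with the ServerHandler source: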
# The ServerHandler class
class ServerHandler(SimpleHandler):

    server_software = software_version

    def close(self):
        try:
            self.request_handler.log_request(
                self.status.split(' ',1)[0], self.bytes_sent
            )
        finally:
            SimpleHandler.close(self)

# The SimpleHandler class
class SimpleHandler(BaseHandler):
    """Handler that's just initialized with streams, environment, etc.

    This handler subclass is intended for synchronous HTTP/1.0 origin servers,
    and handles sending the entire response output, given the correct inputs.

    Usage::

        handler = SimpleHandler(
            inp,out,err,env, multithread=False, multiprocess=True
        )
        handler.run(app)"""

    def __init__(self,stdin,stdout,stderr,environ,
        multithread=True, multiprocess=False
    ):
        self.stdin = stdin
        self.stdout = stdout
        self.stderr = stderr
        self.base_env = environ
        self.wsgi_multithread = multithread
        self.wsgi_multiprocess = multiprocess

    def get_stdin(self):
        return self.stdin

    def get_stderr(self):
        return self.stderr

    def add_cgi_vars(self):
        self.environ.update(self.base_env)

    def _write(self,data):
        result = self.stdout.write(data)
        if result is None or result == len(data):
            return
        from warnings import warn
        warn("SimpleHandler.stdout.write() should not do partial writes",
            DeprecationWarning)
        while True:
            data = data[result:]
            if not data:
                break
            result = self.stdout.write(data)

    def _flush(self):
        self.stdout.flush()
        self._flush = self.stdout.flush

# Part of the BaseHandler code
class BaseHandler:

    def run(self, application):
        """Invoke the application"""
        # Note to self: don't move the close()!  Asynchronous servers shouldn't
        # call close() from finish_response(), so if you close() anywhere but
        # the double-error branch here, you'll break asynchronous servers by
        # prematurely closing.  Async servers must return from 'run()' without
        # closing if there might still be output to iterate over.
        try:
            self.setup_environ()
            self.result = application(self.environ, self.start_response)
            self.finish_response()
        except:
            try:
                self.handle_error()
            except:
                # If we get an error handling an error, just give up already!
                self.close()
                raise   # ...and let the actual server figure it out.
The inheritance chain of ServerHandler is shown below:
+---------------+
|  BaseHandler  |
+---------------+
        |
        v
+---------------+
| SimpleHandler |
+---------------+
        |
        v
+---------------+
| ServerHandler |
+---------------+
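Since SimpleHandler only needs file-like streams and an environ, the run() flow (setup_environ, call the application, finish_response) can be tried entirely in memory. The following is a minimal sketch assuming Python 3: the in-memory streams stand in for the socket's rfile and wfile, setup_testing_defaults supplies the environ that WSGIRequestHandler.get_environ() would normally build, and hello_world_app is repeated so the block is self-contained.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def hello_world_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Hello, World"]


environ = {}
setup_testing_defaults(environ)  # provides SERVER_PROTOCOL, REQUEST_METHOD, ...

stdin = io.BytesIO()    # request body (empty here)
stdout = io.BytesIO()   # the raw HTTP response is written here
stderr = io.StringIO()  # error output

handler = SimpleHandler(stdin, stdout, stderr, environ)
handler.run(hello_world_app)  # setup_environ -> application -> finish_response

raw_response = stdout.getvalue()
print(raw_response.decode("iso-8859-1"))
```

ServerHandler does the same thing with the real socket streams, and additionally logs the request in close().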