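Abstract

This article shows how to build a simple web server with the BaseHTTPRequestHandler and HTTPServer classes that ship with Python. The goal is to deepen our understanding of the HTTP protocol and of how a web server is implemented and runs. It walks through the interaction between server and client in detail: how the server processes a client request and then returns its resources in the response. Building the project also introduces Python's basic network programming modules and the CGI protocol, which gives a foundation for studying Python network development in more depth.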

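1. Introduction

Anyone who has done B/S (browser/server) development will be familiar with web application servers. A dynamic page is processed inside a container such as a web server, and the resulting data is returned to the client for display. For PHP the most common server is Apache, and for JSP lightweight application servers such as Tomcat are widely used. These servers are easy to deploy, but how they work internally is worth exploring.

I have been learning Python recently, so I decided to implement the simplest possible web server: one that responds to client requests so that resource pages display correctly.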

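2. System Architecture

This section introduces the technologies and modules used and the principles behind the implementation, so that the overall structure of the system is clear.

To implement a simple server in Python, you first need to understand the following points: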

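2.1 B/S architecture

B stands for browser and S stands for server. The B/S interaction flow is:

1. The user sends a request from the client to the server and waits for the server to respond.

2. The server receives the request, processes the request data, and produces the response data.

3. The server returns that data to the client browser.

4. The browser receives and parses the resource files, then renders them in the client interface.

A simplified diagram:

Figure 1: B/S architecture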

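2.2 How HTTP works

HTTP is the standard for requests and responses between client and server. It is an application-layer protocol that runs on top of TCP/IP.

How it works: to make an HTTP request, the client opens a TCP connection to a given port on the server (80 is the HTTP default). The server listens on that port, and once it receives a request it returns a status line containing a status code, followed by the response content.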
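To make the exchange concrete, here is a small sketch of the text that travels over that TCP connection. It involves no networking: build_request and parse_response are hypothetical helper names, and SAMPLE_RESPONSE is made-up data that only illustrates the message layout (start line, headers, blank line, body).

```python
def build_request(path, host):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET {0} HTTP/1.1".format(path),  # request line: method, path, version
        "Host: {0}".format(host),         # Host is mandatory in HTTP/1.1
        "Connection: close",              # ask the server to close after replying
        "",                               # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split raw response bytes into (status_code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)  # e.g. HTTP/1.1 200 OK
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Made-up response used only to show the layout of a reply.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>hello</h1>"
)
```

Calling parse_response(SAMPLE_RESPONSE) yields the status code, the reason phrase, a dictionary of lower-cased header names, and the body bytes.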

HTTP status codes:

· 1xx Informational: the request has been received and processing continues

· 2xx Success: the request was successfully received, understood, and accepted

· 3xx Redirection: further action is needed to complete the request

· 4xx Client error: the request contains bad syntax or cannot be fulfilled

· 5xx Server error: the server failed while handling an apparently valid request
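The first digit of a status code is what selects the class. As a small illustration, the sketch below uses the standard library's http.HTTPStatus enum to look up a code's reason phrase and derives the class from the hundreds digit. The describe function and the CLASSES table are names chosen for this example.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a short description such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not define
    return "{0} {1} ({2})".format(
        status.value, status.phrase, CLASSES[status.value // 100]
    )
```

For example, describe(404) returns '404 Not Found (client error)'.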


Figure 2: HTTP request and response data

2.3 URL

A URL (Uniform Resource Locator) is the address that identifies a resource on the internet. It is what you type into the browser's address bar.
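A URL has several parts, and the server mostly cares about the path. The sketch below splits a URL with the standard library's urllib.parse. The URL itself is made up for illustration (port 8000 matches the server built later in this article).

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL used only to show the parts.
parts = urlsplit("http://localhost:8000/register.html?user=alice#top")

scheme = parts.scheme           # protocol used to reach the resource
host = parts.hostname           # machine that serves it
port = parts.port               # TCP port the server listens on
path = parts.path               # which resource on that server
query = parse_qs(parts.query)   # parameters after '?', as a dict of lists
fragment = parts.fragment       # part after '#', handled by the browser only
```

The path component is what BaseHTTPRequestHandler exposes as self.path (together with any query string).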

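2.4 HTTPServer and BaseHTTPRequestHandler

HTTPServer

HTTPServer inherits from socketserver.TCPServer (SocketServer.TCPServer in Python 2). It accepts requests and hands them to a handler class.

HTTPServer processes a request as follows:

1. Bind the handler class (a subclass of BaseHTTPRequestHandler) to the server: http_server = HTTPServer(('', int(port)), ServerHTTP).

2. Listen on the port: serve_forever() loops, polling the listening socket (select.select() in Python 2, the selectors module in Python 3), and accepts each connection as it arrives.

3. Respond: the server instantiates the handler class ServerHTTP with the connection as an argument, and the handler calls the ServerHTTP.do_XXX method that matches the request method (do_GET for GET, and so on). A plain HTTPServer does this in the serving thread, one request at a time. A new thread per request is created only by a threading server such as ThreadingHTTPServer.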
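The three steps above can be sketched as a runnable example. It is a minimal illustration, not the server built later in this article: the handler name ServerHTTP comes from step 1, while the response body, the demo() helper, and binding to port 0 (so the operating system picks a free port) are assumptions made for the demo.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class ServerHTTP(BaseHTTPRequestHandler):
    def do_GET(self):
        # Called by the base class because the request method is GET.
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request log line


def demo():
    """Start a server, send one GET and one POST, return {method: status}."""
    server = HTTPServer(("127.0.0.1", 0), ServerHTTP)   # step 1: bind handler
    port = server.server_address[1]
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()                                      # step 2: listen
    results = {}
    try:
        for method in ("GET", "POST"):
            conn = HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request(method, "/")
            response = conn.getresponse()               # step 3: respond
            response.read()
            results[method] = response.status
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results
```

Because ServerHTTP defines do_GET but no do_POST, the GET request is answered with 200 while the POST request gets 501 (the base class reports an unsupported method).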

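BaseHTTPRequestHandler

BaseHTTPRequestHandler inherits from socketserver.StreamRequestHandler (SocketServer.StreamRequestHandler in Python 2) and produces the response to a request on an HTTP connection. It builds on the TCPServer machinery and wraps the payload in HTTP framing (status line and headers), so that what goes out is a valid HTTP message.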

3. Implementation

3.1 Import the required modules

import sys, os, subprocess

from http.server import BaseHTTPRequestHandler, HTTPServer

3.2 Base class for request cases

class base_case(object):
    """Parent class for all request cases."""

    def handle_file(self, handler, full_path):
        # Send the file's bytes, or an error page if the file cannot be read
        try:
            with open(full_path, 'rb') as reader:
                content = reader.read()
            handler.send_content(content)
        except IOError as msg:
            msg = "'{0}' cannot be read: {1}".format(full_path, msg)
            handler.handle_error(msg)

    def index_path(self, handler):
        return os.path.join(handler.full_path, 'index.html')

    def test(self, handler):
        # Subclasses return True if this case applies to the request
        assert False, 'Not implemented.'

    def act(self, handler):
        # Subclasses handle the request here
        assert False, 'Not implemented.'

3.3 CGI handling case

class case_cgi_file(base_case):
    """Run a .py file and send its standard output as the response."""

    def run_cgi(self, handler):
        data = subprocess.check_output(["python3", handler.full_path], shell=False)
        handler.send_content(data)

    def test(self, handler):
        return os.path.isfile(handler.full_path) and \
               handler.full_path.endswith('.py')

    def act(self, handler):
        self.run_cgi(handler)
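As an illustration of what this case would run, here is a hypothetical script (the file name time.py and the render function are assumptions, not part of the original project). Note that case_cgi_file simply captures the script's standard output and sends it as the body. Full CGI additionally passes request data to the script through environment variables and lets the script emit its own headers, which this simplified version does not do.

```python
#!/usr/bin/env python3
"""Hypothetical script, e.g. saved as time.py next to server.py."""
from datetime import datetime


def render(now):
    """Return a tiny HTML page that shows the given time."""
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return "<html><body><p>Generated {0}</p></body></html>".format(stamp)


if __name__ == "__main__":
    # Whatever is printed here becomes the response body.
    print(render(datetime.now()))
```

With the server running, requesting /time.py would make case_cgi_file execute the script in a subprocess and return the printed page.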

3.4 Case: the file or directory does not exist

class case_no_file(base_case):

    def test(self, handler):
        return not os.path.exists(handler.full_path)

    def act(self, handler):
        raise ServerException("'{0}' not found".format(handler.path))

3.5 Case: the requested file exists

class case_existing_file(base_case):

    def test(self, handler):
        return os.path.isfile(handler.full_path)

    def act(self, handler):
        self.handle_file(handler, handler.full_path)

3.6 Case: the URL names a directory, so return its index page

class case_directory_index_file(base_case):

    def test(self, handler):  # is there an index.html under the target path?
        return os.path.isdir(handler.full_path) and \
               os.path.isfile(self.index_path(handler))

    def act(self, handler):  # respond with the contents of index.html
        self.handle_file(handler, self.index_path(handler))

3.7 Default case

class case_always_fail(base_case):

    def test(self, handler):
        return True

    def act(self, handler):
        raise ServerException("Unknown object '{0}'".format(handler.path))

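3.8 The RequestHandler class: respond normally when the requested path is valid, otherwise return an error page

class RequestHandler(BaseHTTPRequestHandler):

    # Cases are tried in order; the first one whose test() passes handles the request
    Cases = [case_no_file(),
             case_cgi_file(),
             case_existing_file(),
             case_directory_index_file(),
             case_always_fail()]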

    # Error page template returned when the requested path is invalid
    Error_Page = """\
        <html>
        <body>
        <h1>Error accessing {path}</h1>
        <p>{msg}</p>
        </body>
        </html>
        """

    # Override do_GET
    def do_GET(self):
        try:
            # Build the full path of the requested resource
            self.full_path = os.getcwd() + self.path

            # Try each case in turn and let the first match handle the request
            for case in self.Cases:
                if case.test(self):
                    case.act(self)
                    break

        # Report any failure to the client
        except Exception as msg:
            self.handle_error(msg)

    def handle_error(self, msg):
        content = self.Error_Page.format(path=self.path, msg=msg)
        self.send_content(content.encode("utf-8"), 404)

    # Send the data to the client
    def send_content(self, content, status=200):
        self.send_response(status)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)
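One caveat about do_GET: it builds full_path by concatenating os.getcwd() and self.path, so a query string ends up in the file name, and a request path containing ".." can point outside the served directory. The sketch below shows one possible hardening, which is not part of the original code. resolve_path is a hypothetical helper, and treating the current directory as the document root is an assumption.

```python
import os
from urllib.parse import urlsplit, unquote


def resolve_path(root, request_path):
    """Map a request path to a file path under root.

    Returns None when the resolved path would fall outside root.
    """
    # Drop ?query and #fragment, then decode %xx escapes.
    path = unquote(urlsplit(request_path).path)
    root = os.path.realpath(root)
    # realpath collapses ".." segments and follows symlinks.
    candidate = os.path.realpath(os.path.join(root, path.lstrip("/")))
    # The candidate is acceptable only if root is its ancestor (or itself).
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

In do_GET this could replace the concatenation, for example self.full_path = resolve_path(os.getcwd(), self.path), with a None result reported through handle_error.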

3.9 Server exception class

class ServerException(Exception):
    """Raised for errors detected while handling a request."""
    pass

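3.10 Main entry point

if __name__ == '__main__':
    serverAddress = ('', 8000)  # listen on port 8000 on all interfaces
    server = HTTPServer(serverAddress, RequestHandler)  # bind the handler class to the server
    # serve_forever() loops waiting for requests; each one is answered by a new
    # RequestHandler instance, in this same thread (HTTPServer is not threaded)
    server.serve_forever()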

3.11 Finally, create a few HTML pages in the project for testing:

index.html

Run server.py, then open a browser and request the pages to test the server.
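Besides a browser, the server can be checked from a script. The sketch below is one way to do that with the standard library's http.client. check_pages is a hypothetical helper, and the host, port, and page names in the usage note are assumptions taken from this article's setup.

```python
from http.client import HTTPConnection


def check_pages(host, port, paths):
    """Request each path with GET and return a {path: status_code} dict."""
    results = {}
    for path in paths:
        conn = HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before closing the connection
            results[path] = response.status
        finally:
            conn.close()
    return results
```

With server.py running, check_pages("localhost", 8000, ["/", "/register.html", "/missing.html"]) should report 200 for the first two paths if index.html and register.html exist in the working directory, and 404 for the last one, since handle_error always answers with status 404.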

4. Results

Directory layout:

The browser requests the index home page directly:

The browser requests register.html, which exists in the same directory:

The browser requests a page that does not exist:

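5. Summary and Outlook

Building this simple Python web server deepened my understanding of the B/S architecture. Communication between browser and server rests on protocols such as HTTP and TCP/IP, and Python classes such as BaseHTTPRequestHandler and HTTPServer implement this functionality well.

The server correctly handles requests from a client browser and responds to them. Its shortcoming is that it is largely limited to static pages: apart from the simple script runner in 3.3, support for dynamic pages is not yet implemented.

The core code and the approach in this article are mainly based on the "500 Lines or Less" project; the author is Greg Wilson of Mozilla.

Source: https://github.com/aosabook/500lines/blob/master/web-server, for anyone who would like to read further.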
