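Studying How Tomcat Works (3): The Connector (Part 1): Parsing the HTTP Request Path and Parameters

Preface

This article is based on Chapter 3 of How Tomcat Works. The program in that chapter is fairly large: it implements simple request parsing, request header parsing, and more. Because that program is complex, I wrote a simpler version of my own. It skips many special cases, but the code is easier to follow. Chapter 3 also covers a lot of material, so I have split it across several articles. The program in this article is meant to show what a web server actually does when it parses the path and parameters of a request.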

 

The Program

Bootstrap Class

We create a bootstrap class that serves as the main entry point. It creates a connector and starts it.

public final class Bootstrap {
	public static void main(String[] args) {
		HttpConnector connector = new HttpConnector();
		connector.start();
	}
}

 

Connector

The connector is a class that implements Runnable, which means its run method is meant to be executed on a thread.

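import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class HttpConnector implements Runnable {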

	boolean stopped = false;
	private String scheme = "http";

	public String getScheme() {
		return scheme;
	}

	public void run() {
		ServerSocket serverSocket = null;
		int port = 8080;
		try {
			serverSocket = new ServerSocket(port, 1, InetAddress.getByName("127.0.0.1"));
		} catch (IOException e) {
			e.printStackTrace();
			System.exit(1);
		}
		while (!stopped) {
			Socket socket = null;
			try {
				socket = serverSocket.accept();
			} catch (Exception e) {
				continue;
			}
			// Hand this socket off to an HttpProcessor
			HttpProcessor processor = new HttpProcessor(this);
			processor.process(socket);
		}
	}

	public void start() {
		Thread thread = new Thread(this);
		thread.start();
	}
}

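The start method is called from the bootstrap class and starts a thread. That thread executes the run method, which creates a ServerSocket on 127.0.0.1:8080 and accepts client connections through accept, much like the HttpServer class in the previous article. Each accepted socket is then handed to the HttpProcessor class, which takes care of all request handling.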

 

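Processor Class

The processor class takes over the part of the previous article's HttpServer class that decides between servlet handling and static resource handling. The code is essentially the same.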

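import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class HttpProcessor {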

	private HttpConnector httpConnector = null;

	public HttpProcessor(HttpConnector httpConnector) {
		this.httpConnector = httpConnector;
	}

	public void process(Socket socket) {
		InputStream input = null;
		OutputStream output = null;
		try {
			input = socket.getInputStream();
			output = socket.getOutputStream();

			HttpRequest request = new HttpRequest(input);
			request.parseRequest();

			Response response = new Response(output);
			response.setRequest(request);

			if (request.getRequestURI().startsWith("/servlet/")) {
				ServletProcessor processor = new ServletProcessor();
				processor.process(request, response);
			} else {
				StaticResourceProcessor processor = new StaticResourceProcessor();
				processor.process(request, response);
			}

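		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			// Close the socket even when parsing or processing fails
			try {
				socket.close();
			} catch (java.io.IOException e) {
				e.printStackTrace();
			}
		}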
	}
}

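The HttpRequest object created in the process method is an enhanced version of the Request class from the previous article. The HttpRequest class is the focus of this article.

 

The HttpRequest Class

HttpRequest implements the HttpServletRequest interface, which declares a great many methods. The code shown here omits part of the class; for the methods that are not used, the stubs generated by your IDE are sufficient.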

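import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import javax.servlet.http.HttpServletRequest;

public class HttpRequest implements HttpServletRequest {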

	private InputStream input = null;
	private String method = null;
	private String requestURI = null;
	private String protocol = null;
	private String queryString = null;
	private String requestedSessionId = null;
	private boolean requestedSessionURL = false;
	// Whether the request parameters have already been parsed
	protected boolean parsed = false;
	protected Map parameters = null;

	public HttpRequest(InputStream input) {
		this.input = input;
	}

	public void parseRequest() {
		StringBuffer request = new StringBuffer(2048);
		byte[] buffer = new byte[2048];
		int i = 0;
		try {
			// Read one byte at a time until the CRLF that ends the request line,
			// stopping early at end of stream or when the buffer is full.
			while (i < buffer.length) {
				if (input.read(buffer, i, 1) == -1) {
					break;
				}
				if (buffer[i] == '\n' && i > 0 && buffer[i - 1] == '\r') {
					i--; // leave the trailing '\r' out of the request line
					break;
				}
				i++;
			}
		} catch (IOException e) {
			e.printStackTrace();
			i = 0;
		}
		// Copy the request line into the StringBuffer
		for (int j = 0; j < i; j++) {
			request.append((char) buffer[j]);
		}
		System.out.println(request);
		method = parseMethod(request.toString());
		protocol = parseProtocol(request.toString());
		String requestString = parseRequestString(request.toString());
		requestString = normalize(requestString);
		parseQueryString(requestString);
		parseSessionId();
	}

//......
}

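parseRequest is the main method that parses the request. It first reads the request one byte at a time until it reaches the CRLF that ends the first line, then copies that first line into a StringBuffer. From that line it parses the client's request method ("GET", "POST", and so on) and the protocol (for example "HTTP/1.1"), extracts the request string, normalizes it, and finally splits it into the request path and the query string. Each step is shown below.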
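To try the read step in isolation, the following minimal sketch feeds a hand-written request through a ByteArrayInputStream instead of a socket. The class name RequestLineReader and the helper readRequestLine are hypothetical names chosen for this illustration; they are not part of the article's program or of the book's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class RequestLineReader {

    // Reads bytes up to (but not including) the first CRLF, or until end of stream.
    public static String readRequestLine(InputStream input) throws IOException {
        StringBuilder line = new StringBuilder();
        int previous = -1;
        int current;
        while ((current = input.read()) != -1) {
            if (previous == '\r' && current == '\n') {
                // Remove the '\r' that was appended on the previous iteration.
                line.setLength(line.length() - 1);
                break;
            }
            line.append((char) current);
            previous = current;
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        String raw = "GET /servlet/ModernServlet?username=jack HTTP/1.1\r\n"
                + "Host: localhost:8080\r\n\r\n";
        InputStream input = new ByteArrayInputStream(raw.getBytes(StandardCharsets.ISO_8859_1));
        // Only the first line is consumed; the headers stay unread in the stream.
        System.out.println(readRequestLine(input));
    }
}
```

Because the helper stops at the first CRLF, the headers that follow the request line remain in the stream for later parsing.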

 

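Getting the Request Method

Getting the method is simple: take everything before the first space on the first line of the HTTP request. We also override getMethod so that callers can read the parsed request method.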

public class HttpRequest implements HttpServletRequest {

	private String method = null;

//......

	private String parseMethod(String requestString) {
		int index1;
		index1 = requestString.indexOf(' ');
		if (index1 != -1) {
			return requestString.substring(0, index1);
		}
		return null;
	}

	@Override
	public String getMethod() {
		return method;
	}

//......
}

 

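Getting the Protocol

This is much like getting the request method: the protocol is everything after the second space on the first line of the HTTP request. We also override getProtocol.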

public class HttpRequest implements HttpServletRequest {

	private String protocol = null;

//......

	private String parseProtocol(String requestString) {
		int index1, index2;
		index1 = requestString.indexOf(' ');
		if (index1 != -1) {
			index2 = requestString.indexOf(' ', index1 + 1);
			if (index2 > index1)
				return requestString.substring(index2 + 1);
		}
		return null;
	}

	@Override
	public String getProtocol() {
		return protocol;
	}

//......
}

 

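Getting the Request URL

Here we take the content between the first and second spaces on the first line of the HTTP request. This is not necessarily a plain path: it may also contain query parameters and a session ID, so it has to be split further later on.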

public class HttpRequest implements HttpServletRequest {


//......

	private String parseRequestString(String requestString) {
		int index1, index2;
		index1 = requestString.indexOf(' ');
		if (index1 != -1) {
			index2 = requestString.indexOf(' ', index1 + 1);
			if (index2 > index1) {
				return requestString.substring(index1 + 1, index2);
			}
		}
		return null;
	}

//......
}
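Taken together, the three helpers above cut the request line at its first two spaces. The standalone sketch below shows that on the request line used later in the test. The class name RequestLineSplitDemo and the split method are hypothetical; the indexOf logic follows the methods shown above.

```java
public class RequestLineSplitDemo {

    // Splits "method SP request-string SP protocol" at its first two spaces.
    // Returns null when the line does not contain two spaces.
    public static String[] split(String requestLine) {
        int first = requestLine.indexOf(' ');
        if (first == -1) {
            return null;
        }
        int second = requestLine.indexOf(' ', first + 1);
        if (second == -1) {
            return null;
        }
        return new String[] {
            requestLine.substring(0, first),          // request method
            requestLine.substring(first + 1, second), // path, possibly with a query string
            requestLine.substring(second + 1)         // protocol
        };
    }

    public static void main(String[] args) {
        String[] parts = split("GET /servlet/ModernServlet?username=jack&password=123456 HTTP/1.1");
        System.out.println("method         = " + parts[0]);
        System.out.println("request string = " + parts[1]);
        System.out.println("protocol       = " + parts[2]);
    }
}
```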

 

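Normalization

The code here is fairly long, but all it does is normalize the request path. For example, a request path of "/." becomes "/", a leading "/%7E" is decoded to "/~", backslashes are turned into forward slashes, repeated slashes and "/./" segments are collapsed, and "/../" segments are resolved against the parent directory. Paths that contain an encoded "%", "/", "." or "\", or that try to climb above the root, are rejected by returning null.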

	protected String normalize(String path) {
		if (path == null)
			return null;

		String normalized = path;

		if (normalized.startsWith("/%7E") || normalized.startsWith("/%7e"))
			normalized = "/~" + normalized.substring(4);
		if ((normalized.indexOf("%25") >= 0) || (normalized.indexOf("%2F") >= 0) || (normalized.indexOf("%2E") >= 0)
				|| (normalized.indexOf("%5C") >= 0) || (normalized.indexOf("%2f") >= 0)
				|| (normalized.indexOf("%2e") >= 0) || (normalized.indexOf("%5c") >= 0)) {
			return null;
		}

		if (normalized.equals("/."))
			return "/";
		if (normalized.indexOf('\\') >= 0)
			normalized = normalized.replace('\\', '/');
		if (!normalized.startsWith("/"))
			normalized = "/" + normalized;

		while (true) {
			int index = normalized.indexOf("//");
			if (index < 0)
				break;
			normalized = normalized.substring(0, index) + normalized.substring(index + 1);
		}

		while (true) {
			int index = normalized.indexOf("/./");
			if (index < 0)
				break;
			normalized = normalized.substring(0, index) + normalized.substring(index + 2);
		}

		while (true) {
			int index = normalized.indexOf("/../");
			if (index < 0)
				break;
			if (index == 0)
				return (null);
			int index2 = normalized.lastIndexOf('/', index - 1);
			normalized = normalized.substring(0, index2) + normalized.substring(index + 3);
		}

		if (normalized.indexOf("/...") >= 0)
			return (null);

		return (normalized);
	}
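A few concrete inputs make the slash and dot-segment rules easier to follow. The sketch below is a reduced version that keeps only those rules and leaves out the percent-encoding checks; the names NormalizeDemo and normalizeSegments are hypothetical and exist only for this illustration.

```java
public class NormalizeDemo {

    // Reduced version of normalize: backslashes, "//", "/./" and "/../" only.
    public static String normalizeSegments(String path) {
        String normalized = path.replace('\\', '/');
        if (!normalized.startsWith("/")) {
            normalized = "/" + normalized;
        }
        int index;
        // Collapse "//" into "/"
        while ((index = normalized.indexOf("//")) >= 0) {
            normalized = normalized.substring(0, index) + normalized.substring(index + 1);
        }
        // Drop "/./" segments
        while ((index = normalized.indexOf("/./")) >= 0) {
            normalized = normalized.substring(0, index) + normalized.substring(index + 2);
        }
        // Resolve "/../" by removing the preceding segment
        while ((index = normalized.indexOf("/../")) >= 0) {
            if (index == 0) {
                return null; // the path would climb above the root
            }
            int parent = normalized.lastIndexOf('/', index - 1);
            normalized = normalized.substring(0, parent) + normalized.substring(index + 3);
        }
        return normalized;
    }

    public static void main(String[] args) {
        System.out.println(normalizeSegments("/a//b"));      // /a/b
        System.out.println(normalizeSegments("/a/./b"));     // /a/b
        System.out.println(normalizeSegments("/a/b/../c"));  // /a/c
        System.out.println(normalizeSegments("a\\b"));       // /a/b
        System.out.println(normalizeSegments("/../secret")); // null
    }
}
```

The last case is the one that matters for safety: a request that tries to escape the web root is rejected instead of being mapped to a file.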

 

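Splitting the Request Path and the Request Parameters

As mentioned above, the request string may contain request parameters, so it has to be split. The path and the parameters are separated by a question mark "?", so we only need to find that character and split there.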

public class HttpRequest implements HttpServletRequest {

	private String requestURI = null;
	private String queryString = null;

//......

	private void parseQueryString(String requestString) {
		int index;
		index = requestString.indexOf('?');
		if (index != -1) {
			requestURI = requestString.substring(0, index);
			queryString = requestString.substring(index + 1);
		} else {
			requestURI = requestString;
		}
	}

	@Override
	public String getRequestURI() {
		return requestURI;
	}

	@Override
	public String getQueryString() {
		return queryString;
	}
//......
}
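The query string obtained this way is still one raw string. As a rough illustration of how it could later be turned into name/value pairs, here is a minimal sketch. The class QueryStringDemo and the method parseParameters are hypothetical and are not the code used by the article or the book. The sketch also assumes UTF-8 encoding and keeps only the last value when a name is repeated, whereas the servlet API allows several values per name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringDemo {

    // Splits "a=1&b=2" into an ordered map of decoded names and values.
    public static Map<String, String> parseParameters(String queryString) {
        Map<String, String> parameters = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return parameters;
        }
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            // A name without '=' is stored with an empty value.
            String name = eq == -1 ? pair : pair.substring(0, eq);
            String value = eq == -1 ? "" : pair.substring(eq + 1);
            parameters.put(decode(name), decode(value));
        }
        return parameters;
    }

    private static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String requestString = "/servlet/ModernServlet?username=jack&password=123456";
        int index = requestString.indexOf('?');
        String requestURI = requestString.substring(0, index);
        String queryString = requestString.substring(index + 1);
        System.out.println("requestURI  = " + requestURI);
        System.out.println("queryString = " + queryString);
        System.out.println("parameters  = " + parseParameters(queryString));
    }
}
```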

 

Testing

Write a servlet class to test the program.

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import java.util.*;

public class ModernServlet extends HttpServlet {

	public void init(ServletConfig config) {
		System.out.println("ModernServlet -- init");
	}

	public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {

		response.setContentType("text/html");
		PrintWriter out = response.getWriter();
		out.print("HTTP/1.1 200 OK\r\n\r\n");

		out.println("<html>");
		out.println("<head>");
		out.println("<title>Modern Servlet</title>");
		out.println("</head>");
		out.println("<body>");

		out.println("<p>Method " + request.getMethod());
		out.println("<p>Query String " + request.getQueryString());
		out.println("<p>Request URI " + request.getRequestURI());
		out.println("</body>");
		out.println("</html>");
	}
}

Start the Bootstrap class, then open a browser and enter http://localhost:8080/servlet/ModernServlet?username=jack&password=123456

[Image 1: screenshot of the result in the browser]

The parsed values are retrieved successfully.

 

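Summary

The program in this article is a stripped-down version meant only to show roughly how a web server works, and it does not handle many special cases. Even the code that accompanies the book is far more complex than this, and Tomcat's own handling is more complex still. Readers can proceed step by step: first understand the general principle, then dig into the source code.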