As is well known, Python ships with a one-line way to start a web server:
python3 -m http.server port
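Running this command in any directory starts a web file server rooted there. It relies on the http.server module, whose most important classes relate to each other roughly as follows: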
+-----------+           +------------------------+
| TCPServer |           | BaseHTTPRequestHandler |
+-----------+           +------------------------+
      ^                             |
      |                             v
      |                +--------------------------+
      +----------------| SimpleHTTPRequestHandler |
      |                +--------------------------+
      |                             |
      |                             v
      |                 +-----------------------+
      +-----------------| CGIHTTPRequestHandler |
                        +-----------------------+
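The relationships in the diagram can be checked directly against the standard library. The following is a minimal sketch, assuming Python 3.7 or later (where SimpleHTTPRequestHandler accepts a directory argument); make_handler and build_server are hypothetical helper names used only for illustration, not part of http.server.

```python
import functools
import socketserver
from http.server import (BaseHTTPRequestHandler, HTTPServer,
                         SimpleHTTPRequestHandler)


def make_handler(directory):
    """Return a handler factory that serves files from `directory`."""
    # functools.partial pre-binds the keyword, so the server can still create
    # the handler with its usual (request, client_address, server) arguments.
    return functools.partial(SimpleHTTPRequestHandler, directory=directory)


def build_server(port, directory=".", host="127.0.0.1"):
    """Roughly what `python3 -m http.server port` sets up (not started yet)."""
    return HTTPServer((host, port), make_handler(directory))


# The server side derives from TCPServer, the handler side from
# BaseHTTPRequestHandler.
print(issubclass(HTTPServer, socketserver.TCPServer))                # True
print(issubclass(SimpleHTTPRequestHandler, BaseHTTPRequestHandler))  # True
```

Calling build_server(8000).serve_forever() would then serve the current directory until interrupted.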
Now let's look at the source code of SimpleHTTPRequestHandler:
class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):

    """Simple HTTP request handler with GET and HEAD commands.

    This serves files from the current directory and any of its
    subdirectories.  The MIME type for files is determined by
    calling the .guess_type() method.

    The GET and HEAD requests are identical except that the HEAD
    request omits the actual contents of the file.

    """

    server_version = "SimpleHTTP/" + __version__

    def __init__(self, *args, directory=None, **kwargs):
        if directory is None:
            directory = os.getcwd()
        self.directory = directory
        super().__init__(*args, **kwargs)

    def do_GET(self):
        """Serve a GET request."""
        f = self.send_head()
        if f:
            try:
                self.copyfile(f, self.wfile)
            finally:
                f.close()

    def do_HEAD(self):
        """Serve a HEAD request."""
        f = self.send_head()
        if f:
            f.close()

    def send_head(self):
        """Common code for GET and HEAD commands.

        This sends the response code and MIME headers.

        Return value is either a file object (which has to be copied
        to the outputfile by the caller unless the command was HEAD,
        and must be closed by the caller under all circumstances), or
        None, in which case the caller has nothing further to do.

        """
        path = self.translate_path(self.path)
        f = None
        if os.path.isdir(path):
            parts = urllib.parse.urlsplit(self.path)
            if not parts.path.endswith('/'):
                # redirect browser - doing basically what apache does
                self.send_response(HTTPStatus.MOVED_PERMANENTLY)
                new_parts = (parts[0], parts[1], parts[2] + '/',
                             parts[3], parts[4])
                new_url = urllib.parse.urlunsplit(new_parts)
                self.send_header("Location", new_url)
                self.end_headers()
                return None
            for index in "index.html", "index.htm":
                index = os.path.join(path, index)
                if os.path.exists(index):
                    path = index
                    break
            else:
                return self.list_directory(path)
        ctype = self.guess_type(path)
        try:
            f = open(path, 'rb')
        except OSError:
            self.send_error(HTTPStatus.NOT_FOUND, "File not found")
            return None
        try:
            fs = os.fstat(f.fileno())
            # Use browser cache if possible
            if ("If-Modified-Since" in self.headers
                    and "If-None-Match" not in self.headers):
                # compare If-Modified-Since and time of last file modification
                try:
                    ims = email.utils.parsedate_to_datetime(
                        self.headers["If-Modified-Since"])
                except (TypeError, IndexError, OverflowError, ValueError):
                    # ignore ill-formed values
                    pass
                else:
                    if ims.tzinfo is None:
                        # obsolete format with no timezone, cf.
                        # https://tools.ietf.org/html/rfc7231#section-7.1.1.1
                        ims = ims.replace(tzinfo=datetime.timezone.utc)
                    if ims.tzinfo is datetime.timezone.utc:
                        # compare to UTC datetime of last modification
                        last_modif = datetime.datetime.fromtimestamp(
                            fs.st_mtime, datetime.timezone.utc)
                        # remove microseconds, like in If-Modified-Since
                        last_modif = last_modif.replace(microsecond=0)
                        if last_modif <= ims:
                            self.send_response(HTTPStatus.NOT_MODIFIED)
                            self.end_headers()
                            f.close()
                            return None
            self.send_response(HTTPStatus.OK)
            self.send_header("Content-type", ctype)
            self.send_header("Content-Length", str(fs[6]))
            self.send_header("Last-Modified",
                             self.date_time_string(fs.st_mtime))
            self.end_headers()
            return f
        except:
            f.close()
            raise
...
We will skip the HTTP parsing that happens first. A GET request is dispatched to do_GET(), which in turn calls send_head().
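To make the dispatch step concrete, here is a simplified sketch of how a parsed request method ends up in a do_* method: the handler looks the method up by name on itself. ToyHandler and dispatch are hypothetical stand-ins written for this illustration; the real logic lives in BaseHTTPRequestHandler and also handles sockets, headers and error responses.

```python
from http import HTTPStatus


class ToyHandler:
    """Hypothetical stand-in for a request handler; no sockets involved."""

    def __init__(self):
        self.calls = []

    def send_head(self):
        self.calls.append("send_head")
        return None  # pretend there is no file body to send

    def do_GET(self):
        self.calls.append("do_GET")
        f = self.send_head()
        if f:
            f.close()


def dispatch(handler, command):
    """Call do_<COMMAND> if the handler defines it, else report 501."""
    method = getattr(handler, "do_" + command, None)
    if method is None:
        return HTTPStatus.NOT_IMPLEMENTED
    method()
    return None


handler = ToyHandler()
dispatch(handler, "GET")
print(handler.calls)                    # ['do_GET', 'send_head']
print(dispatch(handler, "POST").value)  # 501
```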
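send_head() calls self.translate_path(self.path) to normalize the request path into the filesystem path of the file the user is actually asking for. If that path is an existing directory, the first if branch is taken; if the requested path additionally does not end with /, the nested if branch runs and issues an HTTP redirect. This redirect is the crux of the vulnerability.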
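The branching described here can be sketched as a small function that only names the branch send_head() would take. This is an approximation written for illustration: classify is a hypothetical helper, it takes the already translated filesystem path as an argument, and it uses os.path.isfile where the real code simply tries to open the file.

```python
import os
import tempfile
import urllib.parse


def classify(fs_path, request_path):
    """Name the branch a simplified send_head() would take."""
    if os.path.isdir(fs_path):
        url_path = urllib.parse.urlsplit(request_path).path
        if not url_path.endswith("/"):
            return "redirect"  # 301 with a trailing '/' appended
        for index in ("index.html", "index.htm"):
            if os.path.exists(os.path.join(fs_path, index)):
                return "serve index"
        return "list directory"
    # Assumption: the real handler opens the file and answers 404 on OSError.
    return "serve file" if os.path.isfile(fs_path) else "not found"


with tempfile.TemporaryDirectory() as root:
    print(classify(root, "/docs"))   # redirect
    print(classify(root, "/docs/"))  # list directory
    print(classify(os.path.join(root, "missing.txt"), "/missing.txt"))  # not found
```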
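In mainstream browsers such as Chrome and Firefox, a URL that begins with //domain is treated as protocol-relative: it keeps the scheme of the current request and takes domain as the host. For example, if a response from http://example.com redirects to //baidu.com/, the browser navigates to http://baidu.com/ rather than to a /baidu.com/ directory on example.com. So if we send GET //baidu.com HTTP/1.0\r\n\r\n and the server redirects us to //baidu.com/, the result is an open redirect to an arbitrary URL.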
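The way a Location value of this shape is resolved can be reproduced with urllib.parse.urljoin, which applies the RFC 3986 rules for relative references. Browsers follow their own URL specification, but for this simple case the outcome is the same. resolve_location is a hypothetical helper name used only in this sketch.

```python
from urllib.parse import urljoin, urlsplit


def resolve_location(current_url, location):
    """Resolve a Location header value against the URL that was requested."""
    return urljoin(current_url, location)


current = "http://example.com/some/page"

# Two leading slashes: the first segment is read as a host, not a directory.
print(resolve_location(current, "//baidu.com/"))  # http://baidu.com/

# One leading slash: an ordinary absolute path on the same host.
print(resolve_location(current, "/baidu.com/"))   # http://example.com/baidu.com/

# urlsplit agrees that "baidu.com" is the network location here.
print(urlsplit("//baidu.com/").netloc)            # baidu.com
```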
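Here the directory baidu.com does not exist, so we still need to satisfy the if os.path.isdir(path) check. That is easy: because baidu.com does not exist, we simply step back up to the parent directory with an encoded /.. suffix. translate_path() decodes %2f to / and collapses the .., so the filesystem path resolves to the served root (an existing directory), while the redirect is still built from the raw request path: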
GET //baidu.com/%2f.. HTTP/1.0\r\n\r\n
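The two views of this request path can be compared side by side. The sketch below is a simplification, not the library code: fs_components imitates only the decoding and normalization steps of translate_path(), and redirect_location repeats the urlsplit/urlunsplit lines from send_head() shown earlier. Both function names are hypothetical.

```python
import posixpath
import urllib.parse


def fs_components(request_path):
    """Path segments that survive decoding and normalization.

    An empty list means the request maps to the served root directory.
    """
    path = request_path.split("?", 1)[0].split("#", 1)[0]
    path = urllib.parse.unquote(path)  # %2f becomes '/'
    path = posixpath.normpath(path)    # 'name/..' collapses away
    return [w for w in path.split("/") if w and w not in (".", "..")]


def redirect_location(request_path):
    """Location value built from the raw request path, as in send_head()."""
    parts = urllib.parse.urlsplit(request_path)
    new_parts = (parts[0], parts[1], parts[2] + "/", parts[3], parts[4])
    return urllib.parse.urlunsplit(new_parts)


payload = "//baidu.com/%2f.."
print(fs_components("//baidu.com"))  # ['baidu.com']  (no such directory)
print(fs_components(payload))        # []             (the served root)
print(redirect_location(payload))    # //baidu.com/%2f../
```

The filesystem side sees the root directory, so the isdir check passes, while the Location header still starts with //baidu.com.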
Now a quick test: start an http.server instance on port 1234 in a local test directory.
Then open http://127.0.0.1:1234//baidu.com/%2f.. in a browser; it is redirected to http://www.baidu.com/search/error.html.
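Although this flaw lives in the Python standard library, people rarely run python -m http.server directly in production. Still, when auditing similar code it is worth looking at the request handling: check whether any class that defines do_GET or do_POST inherits from and uses SimpleHTTPRequestHandler, and if so, dig further to see whether the redirect is exploitable.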
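As a starting point for that kind of audit, the standard ast module can flag classes that list SimpleHTTPRequestHandler as a direct base. This is only a rough sketch: find_handler_subclasses is a hypothetical helper, it inspects one source string at a time, and it misses aliased imports and indirect inheritance.

```python
import ast

TARGET = "SimpleHTTPRequestHandler"


def find_handler_subclasses(source):
    """Return names of classes in `source` that list TARGET as a direct base."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for base in node.bases:
            # Matches both `SimpleHTTPRequestHandler` and
            # `http.server.SimpleHTTPRequestHandler`.
            if isinstance(base, ast.Name):
                name = base.id
            else:
                name = getattr(base, "attr", None)
            if name == TARGET:
                found.append(node.name)
                break
    return found


sample = '''
import http.server

class Files(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        super().do_GET()

class Other:
    pass
'''
print(find_handler_subclasses(sample))  # ['Files']
```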