WebSocket小试牛刀
一. 为什么需要WebSocket?
1.1 初次接触 WebSocket 的人,都会问同样的问题:我们已经有了 HTTP 协议,为什么还需要另一个协议?
WebSocket是HTML5出的东西(协议),也就是说HTTP协议没有变化,或者说没关系,但HTTP是不支持持久连接的(长连接,循环连接的不算)首先HTTP有1.1和1.0之说,也就是所谓的keep-alive,把多个HTTP请求合并为一个,但是Websocket其实是一个新协议,跟HTTP协议基本没有关系,只是为了兼容现有浏览器的握手规范而已,也就是说它是HTTP协议上的一种补充.
答案其实很简单,因为 HTTP 协议有一个缺陷:通信只能由客户端发起,特别是短链接,无状态,请求在获取相应信息之后立刻断开连接.
1.2 那么,WebSocet到底是什么呢?它能带来什么好处?
Websocket是什么样的协议,具体有什么优点首先,Websocket是一个持久化的协议,相对于HTTP这种非持久的协议来说。
简单的举个例子吧,用目前应用比较广泛的PHP生命周期来解释。
- HTTP的生命周期通过Request来界定,也就是一个Request 一个Response,那么在HTTP1.0中,这次HTTP请求就结束了。
- 在HTTP1.1中进行了改进,使得有一个keep-alive,也就是说,在一个HTTP连接中,可以发送多个Request,接收多个Response。但是请记住 Request = Response,在HTTP中永远是这样,也就是说一个request只能有一个response。而且这个response也是被动的,不能主动发起。
1.3 教练,你BB了这么多,跟Websocket有什么关系呢?
好吧,我正准备说Websocket呢。。首先Websocket是基于HTTP协议的,或者说借用了HTTP的协议来完成一部分握手。在握手阶段是一样的,以下涉及专业技术内容,不想看的可以跳过.
首先我们来看个典型的Websocket握手
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
熟悉HTTP的童鞋可能发现了,这段类似HTTP协议的握手请求中,多了几个东西。
Upgrade: websocket
Connection: Upgrade
这个就是Websocket的核心了,告诉Apache、Nginx等服务器:注意啦,窝发起的是Websocket协议,快点帮我找到对应的助理处理~不是那个老土的HTTP:)
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
首先,Sec-WebSocket-Key 是一个Base64 encode的值,这个是浏览器随机生成的,告诉服务器:泥煤,不要忽悠窝,我要验证尼是不是真的是Websocket助理。然后,Sec_WebSocket-Protocol 是一个用户定义的字符串,用来区分同URL下,不同的服务所需要的协议。简单理解:今晚我要服务A,别搞错啦~最后,Sec-WebSocket-Version 是告诉服务器所使用的Websocket Draft(协议版本),在最初的时候,Websocket协议还在 Draft 阶段,各种奇奇怪怪的协议都有,而且还有很多期奇奇怪怪不同的东西,什么Firefox和Chrome用的不是一个版本之类的,当初Websocket协议太多可是一个大难题。。不过现在还好,已经定下来啦大家都使用的一个东西 脱水:服务员,我要的是13岁的噢→_→
然后服务器会返回下列东西,表示已经接受到请求, 成功建立Websocket啦!
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
这里开始就是HTTP最后负责的区域了,告诉客户,我已经成功切换协议啦~
Upgrade: websocket
Connection: Upgrade
依然是固定的,告诉客户端即将升级的是Websocket协议,而不是mozillasocket,lurnarsocket或者shitsocket。然后,Sec-WebSocket-Accept 这个则是经过服务器确认,并且加密过后的 Sec-WebSocket-Key。服务器:好啦好啦,知道啦,给你看我的ID CARD来证明行了吧。。后面的,Sec-WebSocket-Protocol 则是表示最终使用的协议。至此,HTTP已经完成它所有工作了,接下来就是完全按照Websocket协议进行了。
⚠️ : WebSocket两个重要的流程: 1. 握手; 2. 加密;
二. 剖析WebSocket请求流程
下面讲使用Python编写Socket服务端,一步一步分析请求过程!!!
2.1 服务端
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 8002))
sock.listen(5)
# 等待用户连接
conn, address = sock.accept()
启动Socket服务器后,等待用户【连接】,然后进行收发数据。
2.2 客户端连接
Title
WebSocket协议学习
当客户端向服务端发送连接请求时,不仅连接还会发送[握手]信息,并等待服务端响应,至此连接才创建成功!
2.3 建立连接
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 8002))
sock.listen(5)
# 获取客户端socket对象
conn, address = sock.accept()
# 获取客户端的【握手】信息
data = conn.recv(1024)
...
conn.send('响应【握手】信息')
请求和响应的【握手】信息需要遵循规则:
- 从请求【握手】信息中提取 Sec-WebSocket-Key
- 利用magic_string 和 Sec-WebSocket-Key 进行hmac1加密,再进行base64加密
- 将加密结果响应给客户端
⚠️ :magic string为:258EAFA5-E914-47DA-95CA-C5AB0DC85B11
请求【握手】信息格式为:
GET /chatsocket HTTP/1.1
Host: 127.0.0.1:8002
Connection: Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
Origin: http://localhost:63342
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: mnwFxiOlctXFN/DeMt1Amg==
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
2.4 提取Sec-WebSocket-Key值并加密:
import socket
import base64
import hashlib
def get_headers(data):
"""
将请求头格式化成字典
:param data:
:return:
"""
header_dict = {}
data = str(data, encoding='utf-8')
for i in data.split('\r\n'):
print(i)
header, body = data.split('\r\n\r\n', 1)
header_list = header.split('\r\n')
for i in range(0, len(header_list)):
if i == 0:
if len(header_list[i].split(' ')) == 3:
header_dict['method'], header_dict['url'], header_dict['protocol'] = header_list[i].split(' ')
else:
k, v = header_list[i].split(':', 1)
header_dict[k] = v.strip()
return header_dict
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 8002))
sock.listen(5)
conn, address = sock.accept()
data = conn.recv(1024)
headers = get_headers(data) # 提取请求头信息
# 对请求头中的sec-websocket-key进行加密
response_tpl = "HTTP/1.1 101 Switching Protocols\r\n" \
"Upgrade:websocket\r\n" \
"Connection: Upgrade\r\n" \
"Sec-WebSocket-Accept: %s\r\n" \
"WebSocket-Location: ws://%s%s\r\n\r\n"
magic_string = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'
value = headers['Sec-WebSocket-Key'] + magic_string
ac = base64.b64encode(hashlib.sha1(value.encode('utf-8')).digest())
response_str = response_tpl % (ac.decode('utf-8'), headers['Host'], headers['url'])
# 响应【握手】信息
conn.send(bytes(response_str, encoding='utf-8'))
...
...
...
三. 客户端和服务端收发数据
客户端和服务端传输数据时,需要对数据进行【封包】和【解包】。客户端的JavaScript类库已经封装【封包】和【解包】过程,但Socket服务端需要手动实现。
3.1 第一步:获取客户端发送的数据【解包】
# 基于python实现的解包
info = conn.recv(8096)
payload_len = info[1] & 127
if payload_len == 126:
extend_payload_len = info[2:4]
mask = info[4:8]
decoded = info[8:]
elif payload_len == 127:
extend_payload_len = info[2:10]
mask = info[10:14]
decoded = info[14:]
else:
extend_payload_len = None
mask = info[2:6]
decoded = info[6:]
bytes_list = bytearray()
for i in range(len(decoded)):
chunk = decoded[i] ^ mask[i % 4]
bytes_list.append(chunk)
body = str(bytes_list, encoding='utf-8')
print(body)
3.2 解包详细过程:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len | Extended payload length |
|I|S|S|S| (4) |A| (7) | (16/64) |
|N|V|V|V| |S| | (if payload len==126/127) |
| |1|2|3| |K| | |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
| Extended payload length continued, if payload len == 127 |
+ - - - - - - - - - - - - - - - +-------------------------------+
| |Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
| Masking-key (continued) | Payload Data |
+-------------------------------- - - - - - - - - - - - - - - - +
: Payload Data continued ... :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
| Payload Data continued ... |
+---------------------------------------------------------------+
3.2.1 官方解释
The MASK bit simply tells whether the message is encoded. Messages from the client must be masked, so your server should expect this to be 1. (In fact, section 5.1 of the spec says that your server must disconnect from a client if that client sends an unmasked message.) When sending a frame back to the client, do not mask it and do not set the mask bit. We'll explain masking later. Note: You have to mask messages even when using a secure socket.RSV1-3 can be ignored, they are for extensions.
The opcode field defines how to interpret the payload data: 0x0 for continuation, 0x1 for text (which is always encoded in UTF-8), 0x2 for binary, and other so-called "control codes" that will be discussed later. In this version of WebSockets, 0x3 to 0x7 and 0xB to 0xF have no meaning.
The FIN bit tells whether this is the last message in a series. If it's 0, then the server will keep listening for more parts of the message; otherwise, the server should consider the message delivered. More on this later.
Decoding Payload Length
To read the payload data, you must know when to stop reading. That's why the payload length is important to know. Unfortunately, this is somewhat complicated. To read it, follow these steps:
Read bits 9-15 (inclusive) and interpret that as an unsigned integer. If it's 125 or less, then that's the length; you're done. If it's 126, go to step 2. If it's 127, go to step 3.
Read the next 16 bits and interpret those as an unsigned integer. You're done.
Read the next 64 bits and interpret those as an unsigned integer (The most significant bit MUST be 0). You're done.
Reading and Unmasking the Data
If the MASK bit was set (and it should be, for client-to-server messages), read the next 4 octets (32 bits); this is the masking key. Once the payload length and masking key is decoded, you can go ahead and read that number of bytes from the socket. Let's call the data ENCODED, and the key MASK. To get DECODED, loop through the octets (bytes a.k.a. characters for text data) of ENCODED and XOR the octet with the (i modulo 4)th octet of MASK. In pseudo-code (that happens to be valid JavaScript):
var DECODED = "";
for (var i = 0; i < ENCODED.length; i++) {
DECODED[i] = ENCODED[i] ^ MASK[i % 4];
}
Now you can figure out what DECODED means depending on your application.
3.3 向客户端发送数据【封包】
def send_msg(conn, msg_bytes):
"""
WebSocket服务端向客户端发送消息
:param conn: 客户端连接到服务器端的socket对象,即: conn,address = socket.accept()
:param msg_bytes: 向客户端发送的字节
:return:
"""
import struct
token = b"\x81"
length = len(msg_bytes)
if length < 126:
token += struct.pack("B", length)
elif length <= 0xFFFF:
token += struct.pack("!BH", 126, length)
else:
token += struct.pack("!BQ", 127, length)
msg = token + msg_bytes
conn.send(msg)
return True
四. 基于python实现简单示例
4.1 基于Python socket实现的WebSocket服务端
#! /usr/bin/env python
# -*- coding: utf-8 -*-
# __author__ = "shuke"
# Date: 2018/5/12
import socket
import hashlib
import base64
def get_headers(data):
"""
将请求头格式化成字典
:param data:
:return:
"""
header_dict = {}
data = str(data, encoding='utf-8')
header, body = data.split('\r\n\r\n', 1)
header_list = header.split('\r\n')
for i in range(0, len(header_list)):
if i == 0:
if len(header_list[i].split(' ')) == 3:
header_dict['method'], header_dict['url'], header_dict['protocol'] = header_list[i].split(' ')
else:
k, v = header_list[i].split(':', 1)
header_dict[k] = v.strip()
return header_dict
def send_msg(conn, msg_bytes):
"""
WebSocket服务端向客户端发送消息
:param conn: 客户端连接到服务器端的socket对象,即: conn,address = socket.accept()
:param msg_bytes: 向客户端发送的字节
:return:
"""
import struct
token = b"\x81"
length = len(msg_bytes)
if length < 126:
token += struct.pack("B", length)
elif length <= 0xFFFF:
token += struct.pack("!BH", 126, length)
else:
token += struct.pack("!BQ", 127, length)
msg = token + msg_bytes
conn.send(msg)
return True
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 8002))
sock.listen(5)
# 等待用户连接
conn, address = sock.accept()
# WebSocket发来的连接
# 1. 获取握手数据
data = conn.recv(1024)
headers = get_headers(data)
# 2. 对握手信息进行加密:
magic_string = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'
value = headers['Sec-WebSocket-Key'] + magic_string
ac = base64.b64encode(hashlib.sha1(value.encode('utf-8')).digest())
# 3. 返回握手信息
response_tpl = "HTTP/1.1 101 Switching Protocols\r\n" \
"Upgrade:websocket\r\n" \
"Connection: Upgrade\r\n" \
"Sec-WebSocket-Accept: %s\r\n" \
"WebSocket-Location: ws://127.0.0.1:8002\r\n\r\n"
response_str = response_tpl % (ac.decode('utf-8'),)
conn.sendall(bytes(response_str, encoding='utf-8'))
# 之后,才能进行收发数据。
while True:
# 对数据进行解密
# send_msg(conn, bytes('alex', encoding='utf-8'))
# send_msg(conn, bytes('SB', encoding='utf-8'))
# info = conn.recv(8096)
# print(info)
info = conn.recv(8096)
payload_len = info[1] & 127
if payload_len == 126:
extend_payload_len = info[2:4]
mask = info[4:8]
decoded = info[8:]
elif payload_len == 127:
extend_payload_len = info[2:10]
mask = info[10:14]
decoded = info[14:]
else:
extend_payload_len = None
mask = info[2:6]
decoded = info[6:]
bytes_list = bytearray()
for i in range(len(decoded)):
chunk = decoded[i] ^ mask[i % 4]
bytes_list.append(chunk)
msg = str(bytes_list, encoding='utf-8')
rep = msg + ' hello'
send_msg(conn, bytes(rep, encoding='utf-8'))
4.2 利用JavaScript类库实现客户端
此时,我们在浏览器的Console控制台利用Socket对象发送消息,将会收到返回信息
4.3 验证方式
五. 基于Tornado框架实现Web聊天室
Tornado是一个支持WebSocket的优秀框架,其内部原理正如1~5步骤描述,当然Tornado内部封装功能更加完整。
以下是基于Tornado实现的聊天室示例:
# cat app.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import uuid
import json
import tornado.ioloop
import tornado.web
import tornado.websocket
class IndexHandler(tornado.web.RequestHandler):
def get(self):
self.render('index.html')
class ChatHandler(tornado.websocket.WebSocketHandler):
# 用户存储当前聊天室用户
waiters = set()
# 用于存储历时消息
messages = []
def open(self):
"""
客户端连接成功时,自动执行
:return:
"""
ChatHandler.waiters.add(self)
uid = str(uuid.uuid4())
self.write_message(uid)
for msg in ChatHandler.messages:
content = self.render_string('message.html', **msg)
self.write_message(content)
def on_message(self, message):
"""
客户端连发送消息时,自动执行
:param message:
:return:
"""
msg = json.loads(message)
ChatHandler.messages.append(message)
for client in ChatHandler.waiters:
content = client.render_string('message.html', **msg)
client.write_message(content)
def on_close(self):
"""
客户端关闭连接时,,自动执行
:return:
"""
ChatHandler.waiters.remove(self)
def run():
settings = {
'template_path': 'templates',
'static_path': 'static',
}
application = tornado.web.Application([
(r"/", IndexHandler),
(r"/chat", ChatHandler),
], **settings)
application.listen(8888)
tornado.ioloop.IOLoop.instance().start()
if __name__ == "__main__":
run()
客户端
# cat index.html
Python聊天室
Flask-WebSocket投票示例
原文参考