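A while back the 2014 PyCon in Beijing drew plenty of complaints, so I went to InfoQ and looked up the talks from the 2013 conference instead. The one on the Python-based server architecture of the online shooter mobile game High Noon 2 caught my interest, especially the zero downtime of its game servers. The trick is that the bottom layer is not built on raw sockets
but on ZeroMQ sockets. Because ZeroMQ reconnects automatically after a disconnect, it covers what a game server needs from a hot restart: no code-level hot restart or hot update is required. Once the new code is deployed you simply restart the server, and requests that had not yet been handled carry on being processed. That struck me as very slick, so I have spent my evenings lately studying ZeroMQ, and I lingered quite a while on the LRU part of the guide. This post is about the ZeroMQ LRU queue broker.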
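An ordinary request-reply broker, that is, the ROUTER-DEALER pattern, is fairly easy to understand, but it has an inherent drawback: DEALER load-balances client requests across the servers in round-robin fashion, without regard to how busy each one is. Since the servers differ in processing capacity, some end up very busy while others sit idle. That is not what we want. We want to squeeze the most out of every server, which is where the general-purpose LRU (least recently used) algorithm comes in.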
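To make the imbalance concrete, here is a small simulation sketch. It does not use ZeroMQ at all; it only assumes that every request costs a fixed amount of time on a given server and that all requests are already waiting at the broker. The function names and the cost figures are made up for illustration.

```python
import heapq


def round_robin_finish_times(num_tasks, costs):
    """Task k always goes to worker k % n, however busy that worker is."""
    finish = [0] * len(costs)
    for k in range(num_tasks):
        w = k % len(costs)
        finish[w] += costs[w]  # the task queues up behind earlier ones
    return finish


def idle_first_finish_times(num_tasks, costs):
    """Each task goes to whichever worker becomes free first."""
    # Heap entries are (time the worker becomes free, worker index).
    heap = [(0, w) for w in range(len(costs))]
    heapq.heapify(heap)
    finish = [0] * len(costs)
    for _ in range(num_tasks):
        free_at, w = heapq.heappop(heap)
        finish[w] = free_at + costs[w]
        heapq.heappush(heap, (finish[w], w))
    return finish


# Two fast workers (1 time unit per task) and one slow worker (4 units).
costs = [1, 1, 4]
print("round robin:", round_robin_finish_times(12, costs))
print("idle first :", idle_first_finish_times(12, costs))
```

With round robin the slow worker receives as many tasks as the fast ones, so the whole batch waits for it. Handing work only to idle workers lets the fast ones absorb most of the load.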
The broker here uses the ROUTER-ROUTER pattern. That may look surprising at first, so bear with me.
So how do we make full use of every server?
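1. When a worker starts, it tells the backend that it is ready, and the broker adds it to work_queue, the queue of available workers.
2. When a client request arrives, the broker takes a worker out of work_queue and hands the request to that worker.
3. When the worker is done, it sends the response back for the client and is added to work_queue again.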
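In other words, a worker queue records which workers are available. A worker that has finished a request is by definition idle, so it goes back into the queue.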
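The three steps above can be sketched in plain Python without any sockets. The class name `WorkerQueue` and its methods are hypothetical, chosen only for this illustration; the point is that a FIFO queue always hands out the worker that has been idle the longest.

```python
from collections import deque


class WorkerQueue:
    """FIFO queue of idle worker identities (illustrative helper)."""

    def __init__(self):
        self._ready = deque()

    def mark_ready(self, worker_id):
        # Steps 1 and 3: a worker announces READY or has just finished a job.
        self._ready.append(worker_id)

    def has_ready(self):
        return bool(self._ready)

    def next_worker(self):
        # Step 2: take the worker that has been waiting the longest.
        return self._ready.popleft()


queue = WorkerQueue()
for name in (b"Worker-0", b"Worker-1", b"Worker-2"):
    queue.mark_ready(name)

first = queue.next_worker()   # the longest-idle worker takes the request
queue.mark_ready(first)       # once done, it rejoins at the back
print(first, queue.next_worker())
```

Popping from the front and appending at the back is what makes the queue "least recently used": a worker that has just finished waits behind everyone who is already idle.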
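So why does the backend have to be a ROUTER socket? The key point is that a client request must be sent to one specific worker, and only ROUTER can route by peer identity. Once that is clear, the code is easy to follow.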
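As a rough sketch of what routing by identity means at the frame level, the helpers below build and parse the multipart messages that cross the backend. The helper names are hypothetical and no socket is involved; with pyzmq, lists like these could be handed to `send_multipart` or obtained from `recv_multipart`.

```python
def wrap_for_worker(worker_id, client_addr, request):
    """Frames the broker sends on the backend ROUTER socket.

    ROUTER consumes the first frame to pick the peer, and the worker's REQ
    socket strips the empty delimiter that follows it.
    """
    return [worker_id, b"", client_addr, b"", request]


def parse_from_worker(frames):
    """Split frames received on the backend ROUTER socket.

    Returns (worker_id, client_addr, reply). client_addr and reply are None
    when the worker is only announcing READY.
    """
    worker_id, empty, third = frames[0], frames[1], frames[2]
    if empty != b"":
        raise ValueError("missing empty delimiter after worker identity")
    if third == b"READY":
        return worker_id, None, None
    if frames[3] != b"":
        raise ValueError("missing empty delimiter after client identity")
    return worker_id, third, frames[4]


outgoing = wrap_for_worker(b"Worker-0", b"Client-1", b"HELLO")
incoming = [b"Worker-0", b"", b"Client-1", b"", b"OK"]
print(outgoing)
print(parse_from_worker(incoming))
```

The worker never sees its own identity or the first delimiter, so the first frame it reads is the client identity, which it echoes back in front of its reply.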
"""
Least-recently used (LRU) queue device
Clients and workers are shown here in-process

Author: Guillaume Aubert (gaubert) <guillaume(dot)aubert(at)gmail(dot)com>
"""
from __future__ import print_function

import threading
import time

import zmq

NBR_CLIENTS = 10
NBR_WORKERS = 3


def worker_thread(worker_url, context, i):
    """Worker using REQ socket to do LRU routing.

    Reply frames sent by the worker:
        Frame 0: client identity
        Frame 1: (empty delimiter)
        Frame 2: OK
    """
    socket = context.socket(zmq.REQ)

    # Set the worker identity
    socket.identity = (u"Worker-%d" % (i)).encode('ascii')
    socket.connect(worker_url)

    # Tell the broker we are ready for work
    socket.send(b"READY")

    try:
        while True:
            address = socket.recv()
            empty = socket.recv()
            request = socket.recv()

            print("%s: %s\n" % (socket.identity.decode('ascii'),
                                request.decode('ascii')), end='')

            socket.send(address, zmq.SNDMORE)
            socket.send(b"", zmq.SNDMORE)
            socket.send(b"OK")
    except zmq.ContextTerminated:
        # context terminated so quit silently
        return


def client_thread(client_url, context, i):
    """Basic request-reply client using REQ socket.

    Request frames as received by the frontend ROUTER socket:
        Frame 0: client identity, e.g. Client-1
        Frame 1: (empty delimiter)
        Frame 2: HELLO
    """
    socket = context.socket(zmq.REQ)
    socket.identity = (u"Client-%d" % (i)).encode('ascii')
    socket.connect(client_url)

    # Send request, get reply
    socket.send(b"HELLO")
    reply = socket.recv()

    print("%s: %s\n" % (socket.identity.decode('ascii'),
                        reply.decode('ascii')), end='')


def main():
    """ main method """
    url_worker = "inproc://workers"
    url_client = "inproc://clients"
    client_nbr = NBR_CLIENTS

    # Prepare our context and sockets
    context = zmq.Context()
    frontend = context.socket(zmq.ROUTER)
    frontend.bind(url_client)
    backend = context.socket(zmq.ROUTER)
    backend.bind(url_worker)

    # create workers and clients threads
    for i in range(NBR_WORKERS):
        thread = threading.Thread(target=worker_thread,
                                  args=(url_worker, context, i, ))
        thread.start()

    for i in range(NBR_CLIENTS):
        thread_c = threading.Thread(target=client_thread,
                                    args=(url_client, context, i, ))
        thread_c.start()

    # Logic of LRU loop
    # - Poll backend always, frontend only if 1+ worker ready
    # - If worker replies, queue worker as ready and forward reply
    #   to client if necessary
    # - If client requests, pop next worker and send request to it

    # Queue of available workers
    available_workers = 0
    workers_list = []

    # Always poll for worker activity on backend
    backend_poller = zmq.Poller()
    backend_poller.register(backend, zmq.POLLIN)

    # Poll front-end only if we have available workers. The original
    # registered both sockets on one poller, which spins while clients
    # wait and no worker is free, so a second poller is used here.
    both_poller = zmq.Poller()
    both_poller.register(backend, zmq.POLLIN)
    both_poller.register(frontend, zmq.POLLIN)

    while True:
        poller = both_poller if available_workers > 0 else backend_poller
        socks = dict(poller.poll())

        # Handle worker activity on backend
        if (backend in socks and socks[backend] == zmq.POLLIN):
            # Queue worker address for LRU routing
            worker_addr = backend.recv()

            assert available_workers < NBR_WORKERS

            # add worker back to the list of workers
            available_workers += 1
            workers_list.append(worker_addr)

            # Second frame is empty
            empty = backend.recv()
            assert empty == b""

            # Third frame is READY or else a client reply address
            client_addr = backend.recv()

            # If client reply, send rest back to frontend
            if client_addr != b"READY":
                # Following frame is empty
                empty = backend.recv()
                assert empty == b""

                reply = backend.recv()

                frontend.send(client_addr, zmq.SNDMORE)
                frontend.send(b"", zmq.SNDMORE)
                frontend.send(reply)

                client_nbr -= 1
                if client_nbr == 0:
                    break  # Exit after N messages

        # poll on frontend only if workers are available
        if available_workers > 0:
            if (frontend in socks and socks[frontend] == zmq.POLLIN):
                # Now get next client request, route to LRU worker
                # Client request is [address][empty][request]
                client_addr = frontend.recv()
                empty = frontend.recv()
                assert empty == b""
                request = frontend.recv()

                # Dequeue and drop the next worker address.
                # pop(0) takes the worker that has been idle the longest;
                # a bare pop() would take the most recently added one.
                available_workers -= 1
                worker_id = workers_list.pop(0)

                # worker_id is the worker's identity: the request has to go
                # to exactly that worker, which is why backend must be a
                # ROUTER socket. ZeroMQ uses an empty frame as the envelope
                # delimiter. ROUTER consumes worker_id and the worker's REQ
                # socket strips the first empty frame, so the first frame
                # the worker reads is client_addr. The worker then addresses
                # its response to client_addr. When backend receives it, the
                # broker forwards it through frontend to client_addr, which
                # is why frontend has to be a ROUTER socket as well.
                backend.send(worker_id, zmq.SNDMORE)
                backend.send(b"", zmq.SNDMORE)
                backend.send(client_addr, zmq.SNDMORE)
                backend.send(b"", zmq.SNDMORE)
                backend.send(request)

    # Out of infinite loop: do some housekeeping
    frontend.close()
    backend.close()
    context.term()


if __name__ == "__main__":
    main()

# Whenever a message has to be forwarded to one specific peer,
# that is where ROUTER shines.