Understanding Event-Driven Programming in Python (the Event Loop)

  Two years ago, when I was learning Python, the only concurrency models I knew were multiprocessing and multithreading. Both are scheduled by the operating system rather than by the programmer, and their shortcomings are well known: context-switching and creation overhead. Later I heard about Python coroutines, i.e. user-level threads, which can be scheduled explicitly; they are lightweight, but all of these models fundamentally rely on multiple workers to avoid one worker blocking everything. Then I came across Tornado and learned about asynchronous programming in Python: a supposedly single-threaded, asynchronous, high-performance web server. At the time I kept wondering: if it is single-threaded, how can it perform so well? Was it optimizing something at the TCP layer, or using some TCP-level multiplexing trick? Eventually I learned about the event loop concept inside Tornado, and with it I finally understood event-driven programming, the true face of what Python calls asynchronous programming.

  To understand event-driven programming in Python, you first need to understand Python coroutines, yield, and I/O multiplexing (the select/poll/epoll trio). It helps to consider two scenarios separately: acting as a server and acting as a client. On the server side, the heart of the event loop is I/O multiplexing: every incoming client request is handed to an I/O multiplexing call. In my view this is why Python's asynchronous programming is not truly asynchronous: after select returns the ready events, they are still processed one at a time in a loop, so if one handler blocks, it blocks all the others.
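To make the server-side pattern concrete, here is a minimal, self-contained sketch of "one selector watches many descriptors, then ready events are handled one by one". It uses the stdlib selectors module and a socketpair instead of a real network connection; the names (on_readable, received) are mine, not from Tornado.

```python
import selectors
import socket

# One selector watches many file descriptors; ready events are then
# handled one at a time in a plain Python loop, so a slow handler
# stalls every other ready event in the same iteration.
sel = selectors.DefaultSelector()

# A socketpair keeps the example self-contained (no real network).
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

def on_readable(sock):
    # The handler runs synchronously inside the loop iteration.
    return sock.recv(1024)

sel.register(right, selectors.EVENT_READ, on_readable)

left.sendall(b"ping")

received = []
# One iteration of the "event loop": wait, then walk the ready events.
for key, events in sel.select(timeout=1.0):
    received.append(key.data(key.fileobj))

sel.unregister(right)
left.close()
right.close()
print(received)  # [b'ping']
```

Note that if on_readable slept for ten seconds, nothing else registered with the selector would be serviced in the meantime, which is exactly the blocking problem described above.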

Below is the relevant Tornado source code (the main loop of IOLoop.start). Don't worry about the finer details for now, they are quite involved; the two key steps are waiting for events with poll and looping over the ready events to dispatch their handlers.

 

while True:
    # Prevent IO event starvation by delaying new callbacks
    # to the next iteration of the event loop.
    with self._callback_lock:
        callbacks = self._callbacks
        self._callbacks = []

    # Add any timeouts that have come due to the callback list.
    # Do not run anything until we have determined which ones
    # are ready, so timeouts that call add_timeout cannot
    # schedule anything in this iteration.
    due_timeouts = []
    if self._timeouts:
        now = self.time()
        while self._timeouts:
            if self._timeouts[0].callback is None:
                # The timeout was cancelled.  Note that the
                # cancellation check is repeated below for timeouts
                # that are cancelled by another timeout or callback.
                heapq.heappop(self._timeouts)
                self._cancellations -= 1
            elif self._timeouts[0].deadline <= now:
                due_timeouts.append(heapq.heappop(self._timeouts))
            else:
                break
        if (self._cancellations > 512
                and self._cancellations > (len(self._timeouts) >> 1)):
            # Clean up the timeout queue when it gets large and it's
            # more than half cancellations.
            self._cancellations = 0
            self._timeouts = [x for x in self._timeouts
                              if x.callback is not None]
            heapq.heapify(self._timeouts)

    for callback in callbacks:
        self._run_callback(callback)
    for timeout in due_timeouts:
        if timeout.callback is not None:
            self._run_callback(timeout.callback)
    # Closures may be holding on to a lot of memory, so allow
    # them to be freed before we go into our poll wait.
    callbacks = callback = due_timeouts = timeout = None

    if self._callbacks:
        # If any callbacks or timeouts called add_callback,
        # we don't want to wait in poll() before we run them.
        poll_timeout = 0.0
    elif self._timeouts:
        # If there are any timeouts, schedule the first one.
        # Use self.time() instead of 'now' to account for time
        # spent running callbacks.
        poll_timeout = self._timeouts[0].deadline - self.time()
        poll_timeout = max(0, min(poll_timeout, _POLL_TIMEOUT))
    else:
        # No timeouts and no callbacks, so use the default.
        poll_timeout = _POLL_TIMEOUT

    if not self._running:
        break

    if self._blocking_signal_threshold is not None:
        # clear alarm so it doesn't fire while poll is waiting for
        # events.
        signal.setitimer(signal.ITIMER_REAL, 0, 0)

    try:
        # Wait for I/O events (select/poll/epoll)
        event_pairs = self._impl.poll(poll_timeout)
    except Exception as e:
        # Depending on python version and IOLoop implementation,
        # different exception types may be thrown and there are
        # two ways EINTR might be signaled:
        # * e.errno == errno.EINTR
        # * e.args is like (errno.EINTR, 'Interrupted system call')
        if errno_from_exception(e) == errno.EINTR:
            continue
        else:
            raise

    if self._blocking_signal_threshold is not None:
        signal.setitimer(signal.ITIMER_REAL,
                         self._blocking_signal_threshold, 0)

    # Pop one fd at a time from the set of pending fds and run
    # its handler. Since that handler may perform actions on
    # other file descriptors, there may be reentrant calls to
    # this IOLoop that update self._events
    self._events.update(event_pairs)
    # Loop over the ready events and dispatch their handlers
    while self._events:
        fd, events = self._events.popitem()
        try:
            fd_obj, handler_func = self._handlers[fd]
            handler_func(fd_obj, events)
        except (OSError, IOError) as e:
            if errno_from_exception(e) == errno.EPIPE:
                # Happens when the client closes the connection
                pass
            else:
                self.handle_callback_exception(self._handlers.get(fd))
        except Exception:
            self.handle_callback_exception(self._handlers.get(fd))
    fd_obj = handler_func = None

As a client, I/O multiplexing is actually not used. In Tornado this works by registering callbacks: at the start of each iteration the ioloop walks the callbacks list and runs them. The run_sync function simply takes the user-defined function and, via the internal run function, registers it as a callback to be executed when the loop comes around. So if the user-defined function is a coroutine, it must be wrapped with gen.coroutine to actually drive the coroutine; driving the generator is essentially all that gen.coroutine does.
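The "driving" that gen.coroutine performs can be boiled down to a toy runner. This is an illustration of the idea, not Tornado's actual code: the runner calls send() on the generator, feeding each yielded result back in (standing in for resolving a Future) until the generator finishes.

```python
def drive(gen):
    """Run a generator-based coroutine to completion, sending each
    yielded value straight back in. In Tornado the yielded object
    would be a Future resolved by the ioloop; here we pretend the
    'I/O' completes instantly."""
    result = None
    try:
        while True:
            result = gen.send(result)  # first send(None) starts the generator
    except StopIteration as stop:
        return getattr(stop, "value", None)

def user_coroutine():
    # Hypothetical user function: each yield hands control back to the loop.
    a = yield 1
    b = yield a + 1
    return a + b

print(drive(user_coroutine()))  # 3
```

Without a runner like this, calling user_coroutine() merely creates a generator object and no code inside it ever executes; that is why a coroutine passed to run_sync must go through gen.coroutine.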

The callback-running step already appears in the code above:

for callback in callbacks:
    self._run_callback(callback)

To summarize: on the server side, the heart of the event loop is I/O multiplexing. On the client side, the heart of the event loop is deferring execution with Future objects and driving coroutines with send; I/O multiplexing is not used, although the I/O is still optimized, as you can see in Tornado's TCPClient implementation, where the socket is set to non-blocking. (aiohttp, by contrast, does use I/O multiplexing on the client side, which gives it better performance.) So whether acting as a client or a server, the event loop is not truly asynchronous at its core, and blocking problems can always arise. Of the frameworks I know that use I/O multiplexing, most also hand time-consuming operations to a thread pool, which greatly increases concurrency and achieves an effectively asynchronous result almost without anyone noticing.
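The thread-pool workaround mentioned in the summary can be sketched with the stdlib concurrent.futures module. The names here (blocking_task, pool) are illustrative and not taken from any particular framework: the loop thread submits the blocking work and keeps going, collecting the result only once the future is done.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# The event loop itself stays single-threaded; blocking work is shipped
# to a pool of worker threads so the loop is not stalled by it.
pool = ThreadPoolExecutor(max_workers=4)

def blocking_task(n):
    time.sleep(0.05)  # stands in for a slow, blocking operation
    return n * 2

# Submit and move on; the loop would continue processing other events
# here instead of waiting for the slow operation to finish.
future = pool.submit(blocking_task, 21)
# ... other events handled in the meantime ...
print(future.result())  # 42
pool.shutdown()
```

This is exactly the pattern asyncio later standardized as loop.run_in_executor: the loop stays responsive while the pool absorbs the blocking calls.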



