本文与编程语言无关,只是介绍协程的一些概念,协程能解决的问题等。只是文章的最后列出了一些协程在python中的实现。
wikipedia:Coroutines中对Coroutines进行了如下对解释。
Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations
协程(Coroutines):计算机程序的一种组成,非抢占式的多任务子程序,可以在程序的多个地方
暂停执行
、恢复执行
。
以下内容参考Comparison with subroutines
When subroutines are invoked, execution begins at the start, and once a subroutine exits, it is finished; an instance of a subroutine only returns once, and does not hold state between invocations.
By contrast, coroutines can exit by calling other coroutines, which may later return to the point where they were invoked in the original coroutine; from the coroutine’s point of view, it is not exiting but calling another coroutine.
Thus, a coroutine instance holds state, and varies between invocations; there can be multiple instances of a given coroutine at once.
The difference between calling another coroutine by means of “yielding” to it and simply calling another routine (which then, also, would return to the original point), is that the latter is entered in the same continuous manner as the former. The relation between two coroutines which yield to each other is not that of caller-callee, but instead symmetric.
该节内容参考Comparison with generators
TODO:不是很理解Coroutines与Generator的对比,在python中generator不是coroutines的一种吗
Generators, also known as semicoroutines, are also a generalisation of subroutines, but are more limited than coroutines.
Specifically, while both of these can yield multiple times, suspending their execution and allowing re-entry at multiple entry points,
they differ in that coroutines can control where execution continues after they yield, while generators cannot, instead transferring control back to the generator’s caller.That is, since generators are primarily used to simplify the writing of iterators, the yield statement in a generator does not specify a coroutine to jump to, but rather passes a value back to a parent routine.
改节内容参考Comparison with mutual recursion
Using coroutines for state machines or concurrency is similar to using mutual recursion with tail calls, as in both cases the control changes to a different one of a set of routines.
However, coroutines are more flexible and generally more efficient.
Since coroutines yield rather than return, and then resume execution rather than restarting from the beginning, they are able to hold state, both variables (as in a closure) and execution point, and yields are not limited to being in tail position; mutually recursive subroutines must either use shared variables or pass state as parameters.
Further, each mutually recursive call of a subroutine requires a new stack frame (unless tail call elimination is implemented), while passing control between coroutines uses the existing contexts and can be implemented simply by a jump.
进程
、线程
的概念比较常听到,但是协程
的概念较少。根据协程的定义,也很难清楚协程的作用,协程用来解决什么问题,以及使用的方式。
以下内容参考知乎
一开始大家想要同一时间执行那么三五个程序,大家能一块跑一跑。特别是UI什么的,别一上计算量比较大的玩意就跟死机一样。于是就有了并发,从程序员的角度可以看成是多个独立的逻辑流。内部可以是多cpu并行,也可以是单cpu时间分片,能快速的切换逻辑流,看起来像是大家一块跑的就行。
但是一块跑就有问题了。我计算到一半,刚把多次方程解到最后一步,你突然插进来,我的中间状态咋办,我用来储存的内存被你覆盖了咋办?所以跑在一个cpu里面的并发都需要处理上下文切换的问题。进程就是这样抽象出来个一个概念,搭配虚拟内存、进程表之类的东西,用来管理独立的程序运行、切换。
后来一电脑上有了好几个cpu,好咧,大家都别闲着,一人跑一进程。就是所谓的并行。
因为程序的使用涉及大量的计算机资源配置,把这活随意的交给用户程序,非常容易让整个系统分分钟被搞跪,资源分配也很难做到相对的公平。所以核心的操作需要陷入内核(kernel),切换到操作系统,让老大帮你来做。
有的时候碰着I/O访问,阻塞了后面所有的计算。空着也是空着,老大就直接把CPU切换到其他进程,让人家先用着。当然除了I\O阻塞,还有时钟阻塞等等。一开始大家都这样弄,后来发现不成,太慢了。为啥呀,一切换进程得反复进入内核,置换掉一大堆状态。进程数一高,大部分系统资源就被进程切换给吃掉了。后来搞出线程的概念,大致意思就是,这个地方阻塞了,但我还有其他地方的逻辑流可以计算,这些逻辑流是共享一个地址空间的,不用特别麻烦的切换页表、刷新TLB,只要把寄存器刷新一遍就行,能比切换进程开销少点。
如果连时钟阻塞、 线程切换这些功能我们都不需要了,自己在进程里面写一个逻辑流调度的东西。那么我们即可以利用到并发优势,又可以避免反复系统调用,还有进程切换造成的开销,分分钟给你上几千个逻辑流不费力。这就是用户态线程。
从上面可以看到,实现一个用户态线程有两个必须要处理的问题:一是碰着阻塞式I\O会导致整个进程被挂起;二是由于缺乏时钟阻塞,进程需要自己拥有调度线程的能力。如果一种实现使得每个线程需要自己通过调用某个方法,主动交出控制权。那么我们就称这种用户态线程是协作式的,即是协程。
本质上协程就是用户空间下的线程。
python中也实现了协程,主要有以下几种方式(参考Implementations for Python):
上面列表中国年,比较常用的应该是eventlet,毕竟openstack的很多项目都使用到了eventlet。