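If Python books are any guide, coroutines are the most poorly documented, obscure, and apparently useless feature of Python. -- David Beazley, Python author
With a generator, the caller pulls data out of it.
Structurally, a coroutine looks like a generator: it is simply a function whose body contains the yield keyword.
A coroutine can also receive data from the caller, which uses .send(value) instead of next(). A yield may even carry no data in or out. Beyond data flow, yield is a control-flow device that enables cooperative multitasking (one meaning of "yield" is "to give way"): each coroutine yields control back to a scheduler so that other coroutines can be activated.
Once you think of yield primarily as a control-flow mechanism, you are close to understanding coroutines.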
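A minimal sketch of data flowing in both directions through a single yield. The coroutine name echo and the values sent are made up for illustration.

```python
def echo():
    """Receive strings through send() and yield each one back upper-cased."""
    reply = None
    while True:
        # Execution pauses here: `reply` goes out to the caller,
        # and whatever the caller sends next comes in as `received`.
        received = yield reply
        reply = received.upper()


coro = echo()
first = next(coro)           # advance to the first yield, which yields None
second = coro.send("hello")  # resumes the coroutine; it yields "HELLO"
third = coro.send("world")   # yields "WORLD"
print(first, second, third)
```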
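Python coroutines are the product of a series of enhancements to the humble generator. They were introduced by PEP 342, "Coroutines via Enhanced Generators", implemented in Python 2.5. Passing data in through send() lets a generator be used as a coroutine: a cooperating procedure that yields values to and receives values from its caller.
Besides send(), PEP 342 added throw() and close(): throw() lets the caller raise an exception to be handled inside the generator, and close() terminates it. PEP 380 allowed a generator to return a value and added the yield from syntax.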
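A coroutine can be in one of four states, which the inspect.getgeneratorstate() function reports:
1.GEN_CREATED: waiting to start execution
2.GEN_RUNNING: currently being executed by the interpreter
3.GEN_SUSPENDED: suspended at a yield expression
4.GEN_CLOSED: execution has completed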
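The four states can be observed with a small experiment. This is only a sketch: simple_coro is a made-up coroutine, and it looks up the module-level name coro so that it can inspect its own state while it is running.

```python
from inspect import getgeneratorstate


def simple_coro():
    # Observed from inside the body, i.e. while the interpreter is running it.
    running_state = getgeneratorstate(coro)
    received = yield running_state
    return received


coro = simple_coro()
created = getgeneratorstate(coro)    # 'GEN_CREATED'
running = next(coro)                 # the body runs and yields 'GEN_RUNNING'
suspended = getgeneratorstate(coro)  # 'GEN_SUSPENDED'
try:
    coro.send(42)                    # the coroutine returns, raising StopIteration
except StopIteration:
    pass
closed = getgeneratorstate(coro)     # 'GEN_CLOSED'
print(created, running, suspended, closed)
```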
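A value the caller passes in with send() becomes the value of the yield expression at which the coroutine is waiting, so a coroutine can only accept a sent value while it is suspended at a yield (GEN_SUSPENDED). This is why a coroutine must first be activated with next(coro) (or the equivalent coro.send(None)): in the GEN_CREATED state, sending any value other than None raises TypeError. The initial next() call gets the coroutine ready to receive. Note that in an expression such as "b = yield a", the caller is expected to send in the value that will be bound to b; if the coroutine is resumed with next() instead, b is None.
It is important to understand exactly where execution stops: in "b = yield a", the coroutine is suspended right after evaluating "yield a", and the assignment "b =" is carried out only when it is next resumed. In other words, b can be set only after the client code activates the coroutine again. Keeping this in mind helps when reasoning about asynchronous programming.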
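A sketch that makes the suspension point visible. The coroutine coro_ab and the log list are illustrative names; the log shows that the assignment to b has not yet happened when next() returns.

```python
def coro_ab(a, log):
    log.append(f"started: a = {a}")
    b = yield a                    # suspends after producing a; b is bound on resume
    log.append(f"received: b = {b}")
    c = yield a + b                # suspends again; c is bound on the next resume
    log.append(f"received: c = {c}")


log = []
gen = coro_ab(14, log)
y1 = next(gen)              # runs up to the first yield; y1 is 14
entries_after_prime = len(log)  # only the "started" entry exists so far
y2 = gen.send(28)           # b = 28, runs to the second yield; y2 is 42
try:
    gen.send(99)            # c = 99, then the body ends
except StopIteration:
    finished = True
print(y1, y2, log)
```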
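The first call to next() advances the coroutine to its first yield, where it suspends. This preparation step is called priming the coroutine; to make it more convenient, a priming decorator such as @coroutine can be used.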
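One possible way to write such a priming decorator is sketched below. The decorator is not part of the standard library; its name, and the averager coroutine used to try it out, are assumptions for illustration.

```python
from functools import wraps


def coroutine(func):
    """Decorator: create the generator and advance it to its first yield."""
    @wraps(func)
    def primer(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)        # prime: run to the first yield
        return gen
    return primer


@coroutine
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        term = yield average
        total += term
        count += 1
        average = total / count


avg = averager()     # already primed, ready for send()
a1 = avg.send(10)    # running average after one value
a2 = avg.send(30)    # running average after two values
print(a1, a2)
```

A coroutine primed this way should not be used as the target of yield from, which does its own priming.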
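yield from primes the subgenerator automatically when it is invoked, so the asyncio.coroutine decorator, which is designed to work with yield from, does not do any priming itself.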
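Coroutine termination and exception handling
An unhandled exception inside a coroutine propagates to the caller of the send() or next() that triggered it.
generator.throw
Raises the given exception at the yield where the generator is suspended. If the generator handles the exception, execution continues to the next yield, and the value yielded there becomes the return value of throw(). If the generator does not handle it, the exception propagates to the caller and the generator ends up in the GEN_CLOSED state.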
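Both outcomes of throw() can be tried out with a sketch like the following. DemoException and the handler coroutine are hypothetical names used only here.

```python
from inspect import getgeneratorstate


class DemoException(Exception):
    """Hypothetical exception type, used only for this illustration."""


def handler(events):
    while True:
        try:
            x = yield "waiting"
        except DemoException:
            # Handled: the loop continues to the next yield.
            events.append("handled DemoException")
        else:
            events.append(f"received {x}")


events = []
gen = handler(events)
next(gen)
gen.send(1)
result = gen.throw(DemoException())           # handled; returns the next yielded value
state_after_handled = getgeneratorstate(gen)  # still suspended at a yield
try:
    gen.throw(ZeroDivisionError("boom"))      # not handled inside the generator
except ZeroDivisionError:
    events.append("ZeroDivisionError reached the caller")
state_after_unhandled = getgeneratorstate(gen)
print(result, state_after_handled, state_after_unhandled, events)
```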
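generator.close
Raises GeneratorExit at the yield where the generator is suspended, leaving it in the GEN_CLOSED state. No error is reported to the caller if the generator does not handle the exception or raises StopIteration. On receiving GeneratorExit the generator must not yield a value; if it does, close() raises RuntimeError.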
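A sketch of close() in the normal case and in the case where the generator wrongly yields after receiving GeneratorExit. The generators worker and stubborn are made up for this illustration.

```python
from inspect import getgeneratorstate


def worker(log):
    try:
        while True:
            item = yield
            log.append(f"got {item}")
    finally:
        # Runs when GeneratorExit is raised at the yield above.
        log.append("cleanup")


def stubborn():
    try:
        yield
    except GeneratorExit:
        yield "ignored"   # yielding in response to GeneratorExit is an error


log = []
gen = worker(log)
next(gen)
gen.send("a")
gen.close()                        # no exception reaches the caller
state = getgeneratorstate(gen)

bad = stubborn()
next(bad)
try:
    bad.close()
except RuntimeError:
    close_failed = True
print(state, log, close_failed)
```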
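Returning a value from a coroutine
Some coroutines do not yield anything interesting; they are designed to return a single result at the end, often an accumulated one.
The value given in a coroutine's return statement is smuggled back to the caller as the value attribute of the StopIteration exception. This is a bit of a hack, but it preserves the existing behavior of generators, which raise StopIteration when exhausted. Without yield from, the return value has to be retrieved like this: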
try:
    r = coro.send(None)
except StopIteration as exc:
    r = exc.value
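A complete version of that pattern might look like the sketch below. The averager coroutine, the Result tuple, and the use of None as a stop signal are assumptions chosen for illustration.

```python
from collections import namedtuple

Result = namedtuple("Result", "count average")


def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        term = yield
        if term is None:     # None is used here as the signal to stop
            break
        total += term
        count += 1
        average = total / count
    return Result(count, average)


coro = averager()
next(coro)                   # prime
for value in (10, 30, 6.5):
    coro.send(value)
try:
    coro.send(None)          # makes the coroutine return
except StopIteration as exc:
    result = exc.value       # the returned Result travels in the exception
print(result)
```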
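PEP 380 introduced the yield from syntax, which catches StopIteration internally and automatically.
The title of PEP 380 is "Syntax for Delegating to a Subgenerator".
The main feature of yield from is that it opens a bidirectional channel between the outermost caller and the innermost subgenerator: values can be sent in and yielded back directly, without writing exception-handling boilerplate in the generators in between. Three terms are needed: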
delegating generator
The generator function that contains the yield from <iterable> expression.
subgenerator
The generator obtained from the <iterable> part of the yield from expression.
caller
PEP 380 uses the term "caller" to refer to the client code that calls the delegating generator. Depending on the context, I use "client" instead of "caller", to distinguish it from the delegating generator, which is also a "caller" (it calls the subgenerator).
While the delegating generator is suspended at yield from, the caller sends data directly to the subgenerator, which yields data back to the caller. The delegating generator resumes when the subgenerator returns and the interpreter raises StopIteration with the returned value attached.
The delegating generator never sees the values the caller sends to the subgenerator.
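The three roles can be put together in a small sketch. The names running_total and delegator, and the use of None as a stop signal, are illustrative assumptions.

```python
def running_total():
    """Subgenerator: yields the total so far, returns the final total."""
    total = 0
    while True:
        term = yield total       # this value goes straight to the client
        if term is None:         # None is used here as the stop signal
            return total
        total += term


def delegator(results):
    """Delegating generator: only sees the subgenerator's return value."""
    final = yield from running_total()
    results.append(final)


# Client code
results = []
d = delegator(results)
seen = [next(d)]             # primes delegator, which primes running_total
seen.append(d.send(5))       # sent directly to the subgenerator
seen.append(d.send(7))
try:
    d.send(None)             # subgenerator returns; delegator resumes and finishes
except StopIteration:
    pass
print(seen, results)
```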
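Example 16-17 in the book is very good. The conclusion:
If a subgenerator never terminates, the delegating generator is never resumed and stays suspended at yield from forever. The program still makes progress, because yield from, just like yield, hands control back to the client; it is only that some task is left unfinished.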
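This is not the book's example, only a reduced sketch of the same effect with made-up names: the client never sends the terminating None, so the result is never stored.

```python
from inspect import getgeneratorstate


def averager():
    total = 0.0
    count = 0
    while True:
        term = yield
        if term is None:     # the stop signal that the client below never sends
            break
        total += term
        count += 1
    return total / count


def grouper(results, key):
    # This assignment runs only after averager() returns.
    results[key] = yield from averager()


results = {}
group = grouper(results, "kg")
next(group)
for value in (10, 20, 30):
    group.send(value)        # control comes back to the client each time

state = getgeneratorstate(group)   # grouper is still parked at yield from
pending = dict(results)            # nothing has been stored yet
group.close()                      # tidy up the unfinished task
print(state, pending)
```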
One delegating generator uses yield from to call a subgenerator, which itself is a delegating generator calling another subgenerator with yield from, and so on. Eventually, this chain must end in a simple generator that uses just yield, or in an iterable object.
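A short sketch of such a chain, with made-up generator names. The innermost yield from delegates to a plain string, and list() plays the role of the client that drives the whole chain with next().

```python
def inner():
    yield from "AB"          # the chain ends in a plain iterable
    return "inner done"


def middle():
    outcome = yield from inner()   # receives inner()'s return value
    yield outcome


def outer():
    yield from middle()


items = list(outer())        # list() is the client calling next() repeatedly
print(items)
```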
yield from must be driven by a client that calls next() or send().
"When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the yield from expression. Furthermore, the subgenerator is allowed to execute a return statement with a value, and that value becomes the value of the yield from expression."
The six points PEP 380 makes about the behavior of yield from:
1.Any values that the subgenerator yields are passed directly to the caller of the delegating generator (i.e., the client code).
2.Any values sent to the delegating generator using send() are passed directly to the subgenerator. If the sent value is None, the subgenerator's __next__() method is called. If the sent value is not None, the subgenerator's send() method is called. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator.
3.return expr in a generator or a subgenerator causes StopIteration(expr) to be raised upon exit from the generator.
4.The value of the yield from expression is the first argument to the StopIteration exception raised by the subgenerator when it terminates.
5.Exceptions other than GeneratorExit thrown into the delegating generator are passed to the throw() method of the subgenerator. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator.
6.If a GeneratorExit exception is thrown into the delegating generator, or the close() method of the delegating generator is called, then the close() method of the subgenerator is called if it has one. If this call results in an exception, it is propagated to the delegating generator. Otherwise, GeneratorExit is raised in the delegating generator.
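Points 5 and 6 can be checked with a sketch like this one. The generators sub and delegating are hypothetical; the log records the order in which things happen.

```python
def sub(log):
    try:
        while True:
            try:
                yield
            except ValueError:
                log.append("sub handled ValueError")
    finally:
        log.append("sub closed")


def delegating(log):
    try:
        yield from sub(log)
    finally:
        log.append("delegating closed")


log = []
gen = delegating(log)
next(gen)
gen.throw(ValueError("bad"))   # point 5: forwarded to sub's throw(); sub handles it
gen.close()                    # point 6: sub is closed first, then GeneratorExit
                               # is raised in the delegating generator
print(log)
```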
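Reality is more complicated, because throw() and close() calls from the client must also be handled. The author speaks highly of the pseudocode in PEP 380; it took me three readings to understand it. It is worth studying carefully, because most real yield from examples live inside asyncio and are not standalone.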
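As a stepping stone towards that pseudocode, here is a deliberately simplified sketch of what "result = yield from iterable" does. It is not the PEP 380 expansion: it ignores throw() and close() entirely, and all names (manual_delegate, real_delegate, running_total, drive) are made up. The result is compared against a real yield from.

```python
def manual_delegate(iterable):
    """Hand-written stand-in for `return (yield from iterable)`, no throw/close."""
    it = iter(iterable)
    try:
        y = next(it)                       # automatic priming
    except StopIteration as exc:
        return exc.value
    while True:
        s = yield y                        # relay the yielded value to the client
        try:
            y = next(it) if s is None else it.send(s)
        except StopIteration as exc:
            return exc.value               # becomes the value of the expression


def real_delegate(iterable):
    return (yield from iterable)


def running_total():
    total = 0
    while True:
        term = yield total
        if term is None:
            return total
        total += term


def drive(gen, inputs):
    """Client: collect yielded values and the final return value."""
    out = [next(gen)]
    try:
        for value in inputs:
            out.append(gen.send(value))
    except StopIteration as exc:
        return out, exc.value
    return out, None


manual = drive(manual_delegate(running_total()), [5, 7, None])
real = drive(real_delegate(running_total()), [5, 7, None])
print(manual, real)
```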
To stress it once more: yield from primes the subgenerator automatically.
This is a form of multitasking: coroutines voluntarily and explicitly yield control to the central scheduler.
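A toy round-robin scheduler illustrates the idea. It is only a sketch under simple assumptions: tasks are plain generators that yield nothing, and the names task and run are made up.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield                    # voluntarily hand control back to the scheduler


def run(tasks):
    """Central scheduler: resume each task in turn until all are finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)        # let the task run until its next yield
        except StopIteration:
            continue             # task finished; do not reschedule it
        queue.append(current)


log = []
run([task("A", 2, log), task("B", 3, log)])
print(log)
```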
Coroutines in the broad sense vs. asyncio coroutines
A broad, informal definition of a coroutine: a generator function driven by a client sending it data through send() calls or yield from. This broad definition is the one used in PEP 342 -- Coroutines via Enhanced Generators and in most existing Python books.
The asyncio coroutine, a stricter definition: asyncio coroutines are (usually) decorated with an @asyncio.coroutine decorator, and they are always driven by yield from, not by calling send() directly on them. Of course, asyncio coroutines are driven by next() and send() under the covers, but in user code, we only use yield from to make them run.