I used a yield expression today and found a good explanation of it, which I'm posting here as a reference for myself and anyone else who needs it:
yield
When you see a function with yield statements, apply this easy trick to understand what will happen:

1. Insert a line result = [] at the start of the function.
2. Replace each yield expr with result.append(expr).
3. Insert a line return result at the bottom of the function.
4. Yay - no more yield statements! Read and figure out the code.

This trick may give you an idea of the logic behind the function, but what actually happens with yield is significantly different from what happens in the list-based approach. In many cases the yield approach will be a lot more memory efficient, and faster too. In other cases this trick will get you stuck in an infinite loop, even though the original function works just fine. Read on to learn more...
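For example, here is a small generator function and the list-based version the trick produces (a sketch; the function names squares_up_to and squares_up_to_list are made up for illustration):

# Hypothetical generator function, names made up for illustration.
def squares_up_to(n):
    for i in range(n):
        yield i * i

# The same function after applying the trick:
# result = [] at the top, result.append(...) for each yield,
# and return result at the bottom.
def squares_up_to_list(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result

print(list(squares_up_to(4)))   # [0, 1, 4, 9]
print(squares_up_to_list(4))    # [0, 1, 4, 9]

Both versions produce the same items, but the first one hands them out one at a time instead of building the whole list up front.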
First, the iterator protocol - when you write

for x in mylist:
    ...loop body...

Python performs the following two steps:

1. Gets an iterator for mylist:
   Call iter(mylist) -> this returns an object with a next() method.
   [This is the step most people forget to tell you about]

2. Uses the iterator to loop over items:
   Keep calling the next() method on the iterator returned from step 1. The return value from next() is assigned to x and the loop body is executed. If an exception StopIteration is raised from within next(), it means there are no more values in the iterator and the loop is exited.
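Spelled out by hand, the two steps look roughly like this (a minimal sketch in Python 3, where the next() method described above is named __next__() and is normally called through the next() built-in):

mylist = [10, 20, 30]

# Step 1: get an iterator for mylist.
it = iter(mylist)

# Step 2: keep asking the iterator for the next item until it is exhausted.
while True:
    try:
        x = next(it)          # in Python 2 this was it.next()
    except StopIteration:     # no more values: leave the loop
        break
    print(x)                  # ...loop body...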
The truth is Python performs the above two steps anytime it wants to loop over the contents of an object - so it could be a for loop, but it could also be code like otherlist.extend(mylist) (where otherlist is a Python list).
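A quick sketch of that: extend() consumes anything that follows the protocol, not just lists.

otherlist = [1, 2]

# extend() loops over its argument using the same iterator protocol,
# so a generator expression works just as well as a list here.
otherlist.extend(x * x for x in range(3))
print(otherlist)   # [1, 2, 0, 1, 4]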
Here mylist is an iterable because it implements the iterator protocol. In a user-defined class, you can implement the __iter__() method to make instances of your class iterable. This method should return an iterator. An iterator is an object with a next() method. It is possible to implement both __iter__() and next() on the same class, and have __iter__() return self. This will work for simple cases, but not when you want two iterators looping over the same object at the same time.
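A minimal sketch of such a class (the name Countdown is made up; __next__() is the Python 3 spelling of the next() method mentioned above):

class Countdown:
    """Iterable and iterator in one class: __iter__() returns self."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):              # called next() in Python 2
        if self.current <= 0:
            raise StopIteration      # tells the for loop to stop
        self.current -= 1
        return self.current + 1

for n in Countdown(3):
    print(n)                         # 3, 2, 1

Because __iter__() returns self, two loops running over the same Countdown object at the same time would share a single position counter, which is exactly the limitation pointed out above.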
So that's the iterator protocol; many objects implement it: built-in lists, dictionaries and files, user-defined classes that implement __iter__(), and generators.

Note that a for loop doesn't know what kind of object it's dealing with - it just follows the iterator protocol, and is happy to get item after item as it calls next(). Built-in lists return their items one by one, dictionaries return the keys one by one, files return the lines one by one, etc. And generators return... well, that's where yield comes in:
def f123():
    yield 1
    yield 2
    yield 3

for item in f123():
    print(item)
Instead of yield statements, if you had three return statements in f123() only the first would get executed, and the function would exit. But f123() is no ordinary function. When f123() is called, it does not return any of the values in the yield statements! It returns a generator object. Also, the function does not really exit - it goes into a suspended state. When the for loop tries to loop over the generator object, the function resumes from its suspended state, runs until the next yield statement and returns that as the next item. This happens until the function exits, at which point the generator raises StopIteration, and the loop exits.
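You can watch the suspend/resume behaviour by driving the generator by hand instead of with a for loop (a sketch in Python 3 spelling, repeating the f123() from above):

def f123():
    yield 1
    yield 2
    yield 3

gen = f123()            # no code in the function body has run yet
print(next(gen))        # runs until the first yield  -> 1
print(next(gen))        # resumes after that yield, runs to the next one -> 2
print(next(gen))        # -> 3
try:
    next(gen)           # the function body finishes...
except StopIteration:
    print("exhausted")  # ...and the generator raises StopIteration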
So the generator object is sort of like an adapter - at one end it exhibits the iterator protocol, by exposing __iter__() and next() methods to keep the for loop happy. At the other end, however, it runs the function just enough to get the next value out of it, and puts it back in suspended mode.
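You can check that iterator-protocol end directly (a small sketch using the f123() defined above, with the Python 3 method names):

gen = f123()                       # the generator object returned by f123()
print(hasattr(gen, '__iter__'))    # True - the for loop can call iter() on it
print(hasattr(gen, '__next__'))    # True - and then ask it for item after item
print(iter(gen) is gen)            # True - the generator is its own iterator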
Usually you can write code that doesn't use generators but implements the same logic. One option is to use the temporary list 'trick' I mentioned before. That will not work in all cases, e.g. if you have infinite loops, or it may make inefficient use of memory when you have a really long list. The other approach is to implement a new iterable class SomethingIter that keeps state in instance members and performs the next logical step in its next() method. Depending on the logic, the code inside the next() method may end up looking very complex and be prone to bugs. Here generators provide a clean and easy solution.
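As a rough sketch of that trade-off (the names EvensIter and evens are made up), here is an infinite sequence of even numbers written both ways; the temporary-list trick cannot express this at all, and the class version has to carry its state around explicitly:

import itertools

# Class-based version: state lives in instance members,
# and __next__() has to advance it by hand.
class EvensIter:
    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = self.n
        self.n += 2
        return value

# Generator version: the local variable and the "where was I"
# bookkeeping are kept by the suspended function itself.
def evens():
    n = 0
    while True:
        yield n
        n += 2

print(list(itertools.islice(EvensIter(), 5)))   # [0, 2, 4, 6, 8]
print(list(itertools.islice(evens(), 5)))       # [0, 2, 4, 6, 8]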