本篇文章翻译自这里
回顾下上篇文章讨论python里面创建iterator.要想创建一个iterator,必须实现一个有__iter__()和__next__()方法的类,类要能够跟踪内部状态并且在没有元素返回的时候引发StopIteration异常.
这个过程很繁琐而且违反直觉.Generator能够解决这个问题.
python generator是一个简单的创建iterator的途径.前面讲的那些繁琐的步骤都可以被generator自动完成.
简单来说,generator是一个能够返回迭代器对象的函数.
# A simple generator function
def my_gen():
n = 1
print('This is printed first')
# Generator function contains yield statements
yield n
n += 1
print('This is printed second')
yield n
n += 1
print('This is printed at last')
yield n
在解释器里面交互运行的结果是:
>>> # It returns an object but does not start execution immediately.
>>> a = my_gen()
>>> # We can iterate through the items using next().
>>> next(a)
This is printed first
1
>>> # Once the function yields, the function is paused and the control is transferred to the caller.
>>> # Local variables and theirs states are remembered between successive calls.
>>> next(a)
This is printed second
2
>>> next(a)
This is printed at last
3
>>> # Finally, when the function terminates, StopIteration is raised automatically on further calls.
>>> next(a)
Traceback (most recent call last):
...
StopIteration
>>> next(a)
Traceback (most recent call last):
...
StopIteration
有趣的是,在这个例子里变量n在每次调用之间都被记住了。和一般函数不同的是,在函数yield之后本地变量没有被销毁,而且,generator对象只能被这样迭代一次。
# A simple generator function
def my_gen():
n = 1
print('This is printed first')
# Generator function contains yield statements
yield n
n += 1
print('This is printed second')
yield n
n += 1
print('This is printed at last')
yield n
# Using for loop
# Output:
# This is printed first
# 1
# This is printed second
# 2
# This is printed at last
# 3
for item in my_gen():
print(item)
def rev_str(my_str):
length = len(my_str)
for i in range(length - 1,-1,-1):
yield my_str[i]
# For loop to reverse the string
# Output:
# o
# l
# l
# e
# h
for char in rev_str("hello"):
print(char)
我们在for循环里面使用range()函数来获取反向顺序的index。
# Initialize the list
my_list = [1, 3, 6, 10]
# square each term using list comprehension
# Output: [1, 9, 36, 100]
[x**2 for x in my_list]
# same thing can be done using generator expression
# Output: at 0x0000000002EBDAF8>
(x**2 for x in my_list)
上面的例子中,generator表达式没有立即产生需要的结果,而是在需要产生item的时候返回一个generator对象。
# Intialize the list
my_list = [1, 3, 6, 10]
a = (x**2 for x in my_list)
# Output: 1
print(next(a))
# Output: 9
print(next(a))
# Output: 36
print(next(a))
# Output: 100
print(next(a))
# Output: StopIteration
next(a)
generator表达式可以在函数内部使用。当这样使用的时候,圆括号可以丢弃。
>>> sum(x**2 for x in my_list)
146
>>> max(x**2 for x in my_list)
100
class PowTwo:
def __init__(self, max = 0):
self.max = max
def __iter__(self):
self.n = 0
return self
def __next__(self):
if self.n > self.max:
raise StopIteration
result = 2 ** self.n
self.n += 1
return result
generator这样实现
def PowTwoGen(max = 0):
n = 0
while n < max:
yield 2 ** n
n += 1
因为generator自动跟踪实现细节,因此更加清晰、简洁。
def all_even():
n = 0
while True:
yield n
n += 2
with open('sells.log') as file:
pizza_col = (line[3] for line in file)
per_hour = (int(x) for x in pizza_col if x != 'N/A')
print("Total pizzas sold = ",sum(per_hour))
这个流水线既高效又易读,并且看起来很酷!:)