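Iterator protocol
: A class implements the iterator protocol if it defines both the __iter__() and __next__() methods.
iterable
: An iterable object: an instance of a class that defines __iter__().
iterator
: An iterator: an instance of a class that defines both __iter__() and __next__(), in other words a class that implements the iterator protocol.
generator
: A generator: a more elegant (special) kind of iterator. Unlike an ordinary iterator, you do not write it as a class; there are usually two ways to create one (see the figure above).
container
: A container: list, set, dict, tuple, and str are all containers, and many containers are iterable.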
>>> a=[1,2,3]
>>> '__iter__' in dir(a)
True
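Any class that implements __iter__() can be called an iterable, which means that most of the containers we know are iterables.
Strictly speaking, a container only "holds" elements; it cannot hand them out one at a time through next(), because it has no __next__() method. That is what we look at next.
An iterable can be turned into an iterator by calling iter() on it. Compared with an iterable, the most direct difference is that an iterator has one extra method: __next__().
# a list is an iterable object (container); it only has '__iter__'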
>>> a=[1,2,3]
>>> '__iter__' in dir(a)
True
>>> '__next__' in dir(a)
False
# an iterator, however, has both '__iter__' and '__next__'
>>> b=iter(a)
>>> '__iter__' in dir(b)
True
>>> '__next__' in dir(b)
True
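Checking dir() works, but the standard library also offers abstract base classes for the same question. The following is a minimal sketch using collections.abc; the variable names are only for illustration.

```python
from collections.abc import Iterable, Iterator

a = [1, 2, 3]
b = iter(a)

# A list is iterable, but it is not itself an iterator.
list_is_iterable = isinstance(a, Iterable)
list_is_iterator = isinstance(a, Iterator)

# The object returned by iter() is both iterable and an iterator.
b_is_iterable = isinstance(b, Iterable)
b_is_iterator = isinstance(b, Iterator)

# Calling iter() on an iterator returns the iterator itself.
same_object = iter(b) is b

print(list_is_iterable, list_is_iterator, b_is_iterable, b_is_iterator, same_object)
```

One caveat: isinstance(obj, Iterable) only detects classes that define __iter__(), so it is a close match for the dir() check above rather than a complete test of everything iter() accepts.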
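In other words, an iterator is always iterable: an iterator is an iterable with an added __next__() method, so it is the narrower concept. Put differently, an iterable that also has a __next__() method implements the iterator protocol and is therefore an iterator.
As mentioned earlier, it is precisely because of __next__() that the built-in next() function can pull elements out of an iterator one by one, which is not possible with a plain iterable.
# you can use the next() function to get items one at a time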
>>> k=iter(['red', 'white', 'blue'])
>>> next(k)
'red'
>>> next(k)
'white'
# it doesn't work on a container
>>> kk=['red', 'white', 'blue']
>>> next(kk)
Traceback (most recent call last):
File "" , line 1, in <module>
TypeError: 'list' object is not an iterator
But iterators also have a drawback: they are single-use. Most iterators are exhausted once all of their elements have been taken:
>>> k=iter(['red', 'white', 'blue'])
>>> next(k)
'red'
>>> next(k)
'white'
>>> next(k)
'blue'
>>> next(k)
Traceback (most recent call last):
File "" , line 1, in <module>
StopIteration
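The StopIteration above is also the signal a for loop relies on to know when to stop. Here is a small sketch of what exhaustion looks like in practice, and of the optional default argument of next(); the variable names are made up for the example.

```python
k = iter(['red', 'white', 'blue'])

first_pass = [color for color in k]   # consumes the iterator
second_pass = [color for color in k]  # already exhausted, nothing is left

# next() accepts a default that is returned instead of raising StopIteration.
fallback = next(k, 'no more items')

# Roughly what a for loop does under the hood.
collected = []
it = iter(['red', 'white', 'blue'])
while True:
    try:
        item = next(it)
    except StopIteration:
        break
    collected.append(item)

print(first_pass, second_pass, fallback, collected)
```

If you need to walk over the data again, call iter() on the original container once more to get a fresh iterator.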
Of course, some iterator classes in Python can hand out their elements in an endless loop, for example cycle:
>>> from itertools import cycle
>>> colors = cycle(['red', 'white', 'blue'])
>>> next(colors)
'red'
>>> next(colors)
'white'
>>> next(colors)
'blue'
>>> next(colors)
'red'
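Because cycle never raises StopIteration, passing it straight to list() would never finish. A common way to take a bounded number of items is itertools.islice, sketched below.

```python
from itertools import cycle, islice

colors = cycle(['red', 'white', 'blue'])

# Take only the first five items from the endless iterator.
first_five = list(islice(colors, 5))

# The cycle keeps its position, so the next item continues from there.
following = next(colors)

print(first_five, following)
```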
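In short, an iterator behaves like a lazy container: unlike an ordinary container, it does not hand you all the elements at once. It produces an element only when you ask for the next one, and it keeps track of where it is: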
>>> k=iter(['red', 'white', 'blue'])
>>> next(k)
'red'
>>> next(k)
'white'
# you can do many other things in between... the iterator still remembers which item is 'next'
>>> next(k)
'blue'
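So, compared with an ordinary container (iterable), an iterator has the following characteristics: it hands out its elements one at a time through next(), it is lazy and remembers its position, and it is usually single-use.
*Tips*: a file object is also an iterator: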
>>> w=open("test.txt","a+")
>>> w
<_io.TextIOWrapper name='test.txt' mode='a+' encoding='cp936'>
# the io wrapper is also an iterator
>>> "__iter__" in dir(w)
True
>>> "__next__" in dir(w)
True
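Since a file object is an iterator, you can read it line by line without loading the whole file. The sketch below writes a throwaway file first so that it is self-contained; the file name and its contents are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file, created only for this demonstration.
    path = os.path.join(tmp_dir, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\nthird\n")

    with open(path, encoding="utf-8") as f:
        first_line = next(f)                            # pull one line by hand
        remaining = [line.rstrip("\n") for line in f]   # the loop continues after it

print(repr(first_line), remaining)
```

The loop picks up where next() left off, which is the same "remembers its state" behaviour shown earlier for list iterators.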
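In terms of what they are:
An iterator is an instance of a class that implements the __iter__() and __next__() methods.
A generator is not written as a class; it looks more like a function, and you do not need to implement __iter__() and __next__() yourself, because the generator object gets them automatically.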
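To make the contrast concrete, here is a sketch of the same countdown written twice: once as a class-based iterator and once as a generator function. Countdown and countdown are hypothetical names chosen for this illustration.

```python
class Countdown:
    """Class-based iterator: counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function with the same behaviour."""
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))
from_generator = list(countdown(3))

# The generator object gets both protocol methods without us writing them.
gen = countdown(3)
has_protocol = hasattr(gen, "__iter__") and hasattr(gen, "__next__")

print(from_class, from_generator, has_protocol)
```

The class needs explicit state and an explicit StopIteration; in the generator version the local variable and the end of the function body take care of both.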
In terms of how they are used:
A generator is very similar to an iterator, but it is simpler, or rather more elegant, to construct:
# there are mainly two ways to construct a generator:
## 1. yield
>>> def g():
...     for i in range(3):
...         yield i
...
>>> gen = g()
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2
>>> gen
<generator object g at 0x000001E5B5743830>
## 2. a generator expression '()'; note that it isn't a tuple!
>>> a=(i**2 for i in range(9999999))
>>> a
<generator object <genexpr> at 0x000001E5B5743F68>
As shown above, you can use yield to build a generator function, or simply use a generator expression '()'. Interestingly, a generator expression looks almost exactly like a list comprehension, but it uses far less memory:
# all elements are loaded into RAM... but what if you only need the first three?
a=[i**2 for i in range(9999999)]
# lazy may be good!
a=(i**2 for i in range(9999999))
Because a generator is a special kind of iterator, it is lazy as well: it produces an element only when you ask for one!
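A rough way to see the difference is to compare the size of the two objects and then take just a few elements. This is a sketch, not a benchmark: n is reduced so it runs quickly, and sys.getsizeof reports only the size of the list or generator object itself, not of the integers it refers to.

```python
import sys
from itertools import islice

n = 100_000  # smaller than the 9999999 used above, so the example runs quickly

squares_list = [i**2 for i in range(n)]  # every element is built up front
squares_gen = (i**2 for i in range(n))   # nothing has been computed yet

list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)

# Only the first three squares are ever computed here.
first_three = list(islice(squares_gen, 3))

print(list_is_bigger, first_three)
```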
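In short, a generator has nearly the same characteristics as an iterator (elements are taken through next(), it is lazy, and it is single-use). Only the way it is constructed differs, and that is what makes it more elegant and easier to use.
So, to make your Python code more pythonic (simple rather than complex, clear and easy to read), generators are a good tool: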
# if you have ever written code like this:
def something(items):
    result = []
    for x in items:
        result.append(x)
    return result

# replace it with:
def iter_something(items):
    for x in items:
        yield x

'''
def something(items): # Only if you really need a list structure
    return list(iter_something(items))
'''
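As a concrete instance of this pattern, the sketch below collects the squares of the even numbers, first by building a list and then with a generator. The function names and the filtering rule are invented for the example.

```python
def even_squares(numbers):
    """List-building version: returns the whole result at once."""
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result


def iter_even_squares(numbers):
    """Generator version: yields each result as it is produced."""
    for n in numbers:
        if n % 2 == 0:
            yield n * n


eager = even_squares(range(10))
lazy = list(iter_even_squares(range(10)))

print(eager, lazy)
```

Both produce the same values; the generator version simply lets the caller decide whether to materialise a list, stop early, or feed the values into another loop.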
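*Tips*: enumerate() wraps an iterable in an enumerate object. Strictly speaking this is an iterator rather than a generator, but it behaves much like one: it hands out the elements one by one and also attaches each element's index in the container. Using enumerate() more often makes your code more pythonic:
>>> a=[1,2,3]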
>>> ab=enumerate(a)
>>> ab
<enumerate object at 0x0000014EAEE0A318>
>>> "__iter__" in dir(ab)
True
>>> "__next__" in dir(ab)
True
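In everyday code enumerate() is mostly used directly in a for loop. A short sketch, with made-up variable names:

```python
colors = ['red', 'white', 'blue']

# Instead of indexing with range(len(colors)):
pairs = []
for index, color in enumerate(colors):
    pairs.append((index, color))

# The start argument changes the first index.
numbered = list(enumerate(colors, start=1))

# next() works on an enumerate object directly.
e = enumerate(colors)
first = next(e)

print(pairs, numbered, first)
```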
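Beautiful is better than ugly, simple is better than complex; the great way is the simplest and the truest.
Concretely, pythonic code follows conventions such as:
- Use sensible names
- Give each function a single responsibility
- Include docstrings
- Return a value
- Separate functions and classes with two blank lines
- Prefer built-in functions where possible
Try running import this in a Python interpreter: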
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
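Blog posts on this topic vary widely in quality. Of the seven or eight I read, most were inconsistent, self-contradictory, or contained incorrect examples. Only the two below (the third is about pythonic style) reach conclusions that I verified by running the code myself; they are essentially correct and match my own understanding of iterable, iterator, and generator. Much of this post draws on the first of them, which I strongly recommend reading.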