1.xrange and enumerate
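enumerate: useful for obtaining an indexed list, that is, (index, item) pairs while looping over a sequence.
xrange: generates the numbers in the range on demand. For looping, this is slightly faster than range() and more memory efficient.
Judging by the timings below, xrange comes out slightly ahead. If the data set is small, either will do, so use whichever better fits your needs. Both enumerate and xrange are driven by next() calls; they differ only in how they package the values they return.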
>>> import datetime
>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i in xrange(len(seq)):
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.066000
>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i in xrange(len(seq)):
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.067000
>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i,item in enumerate(seq):
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.142000
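The timings above come from single runs measured with datetime, so they are fairly noisy. A minimal sketch of the same comparison using the standard timeit module follows. It is written in Python 3 syntax, where range is the lazy equivalent of xrange, and the helper names index_loop and enumerate_loop are made up for illustration.

```python
import timeit

seq = list(range(100000))

def index_loop():
    # Index-only loop, the Python 3 spelling of `for i in xrange(len(seq))`.
    for i in range(len(seq)):
        pass

def enumerate_loop():
    # enumerate yields (index, item) tuples that are unpacked on every step.
    for i, item in enumerate(seq):
        pass

# Take the best of several repeats to reduce noise from other processes.
t_index = min(timeit.repeat(index_loop, number=10, repeat=3))
t_enum = min(timeit.repeat(enumerate_loop, number=10, repeat=3))
print("range(len(seq)): %.4f s" % t_index)
print("enumerate(seq):  %.4f s" % t_enum)
```

Note that the index loop never reads seq[i]. Once the loop body actually needs the item, the gap between the two forms may shrink or disappear, so it is worth measuring the real workload.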
2.xrange and zip
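zip and xrange do different jobs here, so they cannot be compared feature for feature.
Poor code:
for i in xrange(len(seq1)):
    foo(seq1[i], seq2[i])
Recommended code:
for i, j in zip(seq1, seq2):
    foo(i, j)
More efficient:
for i, j in itertools.izip(seq1, seq2):
    foo(i, j)
Here is the definition of zip: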
zip(...)
    zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

    Return a list of tuples, where each tuple contains the i-th element
    from each of the argument sequences.  The returned list is truncated
    in length to the length of the shortest argument sequence.
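Using zip as a drop-in replacement can therefore produce wrong results. In the code below, the 6 from the first list never appears in the output:
>>> zip([4,5,6],[2,3])
[(4, 2), (5, 3)]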
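One way to guard against the silent truncation shown above is to compare lengths first, or to pad the shorter input instead of dropping items. The following is a minimal sketch written for Python 3, where the padding variant is itertools.zip_longest (named izip_longest in Python 2.6+). The helper name strict_pairs is hypothetical.

```python
from itertools import zip_longest  # izip_longest in Python 2

def strict_pairs(seq1, seq2):
    """Pair up two sequences, refusing to drop items silently."""
    if len(seq1) != len(seq2):
        raise ValueError("length mismatch: %d != %d" % (len(seq1), len(seq2)))
    return list(zip(seq1, seq2))

truncated = list(zip([4, 5, 6], [2, 3]))        # the 6 is dropped
padded = list(zip_longest([4, 5, 6], [2, 3]))   # missing slots become None
print(truncated)
print(padded)
```

Which behaviour is right depends on the caller: truncation is fine when the extra items are irrelevant, while an explicit error is usually safer when both sequences are expected to line up.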
zip also performs poorly here, because it builds the complete list of tuples before the loop starts:
>>> import datetime
... seq1 = [i for i in xrange(1000000)]
... seq2 = [j for j in xrange(1000000)]
... n = datetime.datetime.now()
... for i, j in zip(seq1, seq2):
...     pass
... print datetime.datetime.now() - n
0:00:01.713000
>>> import itertools
>>> import datetime
... seq1 = [i for i in xrange(1000000)]
... seq2 = [j for j in xrange(1000000)]
... n = datetime.datetime.now()
... for i, j in itertools.izip(seq1, seq2):
...     pass
... print datetime.datetime.now() - n
0:00:00.159000
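The gap in these timings is most likely explained by laziness: Python 2's zip builds the whole list of one million tuples up front, while izip hands out one tuple at a time. Here is a small sketch of that behaviour, with a fallback import so that it also runs on Python 3, where the built-in zip is already lazy.

```python
import itertools

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip                  # Python 3: zip already returns an iterator

# A lazy zip can even consume an infinite iterator, because it only
# produces pairs on demand and stops when the shortest input runs out.
pairs = izip(itertools.count(), "abc")
first = next(pairs)
rest = list(pairs)
print(first)
print(rest)
```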
3.filter
Still, the style below works well and makes the code more reusable.
Poor code:
for i in seq:
    if pred(i):
        foo(i)
Recommended code:
for i in itertools.ifilter(pred, seq):
    foo(i)
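To make the recommended form concrete, here is a small self-contained sketch. pred and foo are hypothetical stand-ins for the predicate and the per-item action, and the fallback import lets the same code run on Python 3, where the built-in filter is already lazy.

```python
try:
    from itertools import ifilter  # Python 2
except ImportError:
    ifilter = filter               # Python 3: filter returns an iterator

results = []

def pred(n):
    # Hypothetical predicate: keep even numbers only.
    return n % 2 == 0

def foo(n):
    # Hypothetical action: record the square of each accepted item.
    results.append(n * n)

seq = range(10)
for i in ifilter(pred, seq):
    foo(i)
print(results)
```

The reuse benefit is that pred becomes a named, testable function that can be passed to other filtering calls instead of being buried in an if statement.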
4.imap
map and itertools.imap are likewise highly reusable, but imap and map are defined differently:
itertools.imap(function, *iterables)
Make an iterator that computes the function using arguments from each of the iterables. If function is set to None, then imap() returns the arguments as a tuple. Like map() but stops when the shortest iterable is exhausted instead of filling in None for shorter iterables. The reason for the difference is that infinite iterator arguments are typically an error for map() (because the output is fully evaluated) but represent a common and useful way of supplying arguments to imap().
The difference between the two:
>>> for i in map(pow, (2,3,10), (5,2)):
...     print i
Traceback (most recent call last):
  File "<pyshell#30>", line 1, in <module>
    for i in map(pow, (2,3,10), (5,2)):
TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'NoneType'
>>> for i in itertools.imap(pow, (2,3,10), (5,2)):
...     print i
32
9
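The same contrast can be sketched in a form that runs on both major versions. In Python 3 the built-in map behaves like imap: it is lazy and stops at the shortest input, so the fallback import below simply aliases it. The second call illustrates the point from the definition about infinite iterators being a handy way to supply arguments.

```python
import itertools

try:
    from itertools import imap  # Python 2
except ImportError:
    imap = map                  # Python 3: map is lazy, stops at the shortest input

# Stops after two results because the second tuple has only two items.
powers = list(imap(pow, (2, 3, 10), (5, 2)))

# An infinite iterator is a convenient way to supply a constant argument.
squares = list(imap(pow, (1, 2, 3), itertools.repeat(2)))
print(powers)
print(squares)
```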
5.Generators
Since Python 2.2, generators provide an elegant way to write simple and efficient
code for functions that return a list of elements. Based on the yield directive, they
allow you to pause a function and return an intermediate result. The function saves
its execution context and can be resumed later if necessary.
For example (this is the example provided in the PEP about iterators), the Fibonacci
series can be written with an iterator:
>>> def fibonacci():
...     a, b = 0, 1
...     while True:
...         yield b
...         a, b = b, a + b
...
>>> fib = fibonacci()
>>> fib.next()
1
>>> fib.next()
1
>>> fib.next()
2
>>> [fib.next() for i in range(10)]
[3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
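Because fibonacci() never terminates, calling list() on it directly would loop forever. One common way to take a bounded slice is itertools.islice, sketched below. The built-in next(fib) used here is the portable spelling of fib.next() and works on Python 2.6+ as well as Python 3.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b

fib = fibonacci()
first_three = [next(fib) for _ in range(3)]
next_ten = list(islice(fib, 10))   # continues where the generator paused
print(first_three)
print(next_ten)
```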
>>> def my_generator():
...     try:
...         yield 'something'
...     except ValueError:
...         yield 'dealing with the exception'
...     finally:
...         print "ok let's clean"
...
>>> m = my_generator()
>>> m.next()
'something'
>>> m.throw(ValueError('haha'))
'dealing with the exception'
>>> m.close()
ok let's clean
>>> m.next()
Traceback (most recent call last):
  ...
StopIteration
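The session above exercises three generator methods: next() runs to the first yield, throw() raises an exception at the point where the generator is paused, and close() raises GeneratorExit there so that the finally block runs. Below is a sketch of the same flow in a form that runs on Python 3. The log parameter is an addition for illustration, replacing the print so that the cleanup step can be checked.

```python
def my_generator(log):
    try:
        yield 'something'
    except ValueError:
        yield 'dealing with the exception'
    finally:
        # Runs when the generator is closed or finishes.
        log.append("ok let's clean")

log = []
m = my_generator(log)
first = next(m)                        # runs up to the first yield
second = m.throw(ValueError('haha'))   # handled by the except branch
m.close()                              # triggers the finally block

# A closed generator is exhausted, so a further next() raises StopIteration.
try:
    next(m)
    exhausted = False
except StopIteration:
    exhausted = True
print(first, second, log, exhausted)
```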
6.groupby
>>> from itertools import groupby
>>> def compress(data):
...     return ((len(list(group)), name)
...             for name, group in groupby(data))
...
>>> def decompress(data):
...     return (car * size for size, car in data)
...
>>> list(compress('get uuuuuuuuuuuuuuuuuup'))
[(1, 'g'), (1, 'e'), (1, 't'), (1, ' '),
 (18, 'u'), (1, 'p')]
>>> compressed = compress('get uuuuuuuuuuuuuuuuuup')
>>> ''.join(decompress(compressed))
'get uuuuuuuuuuuuuuuuuup'
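One detail worth keeping in mind is that groupby only groups consecutive equal items, which is exactly what run-length encoding needs. Here is a short sketch with a letter that appears in two separate runs. It returns a list and a string rather than generators, a variation on the functions above chosen so that the results are easy to inspect.

```python
from itertools import groupby

def compress(data):
    # One (run_length, item) pair per run of consecutive equal items.
    return [(len(list(group)), name) for name, group in groupby(data)]

def decompress(pairs):
    # Expand each (run_length, item) pair back into repeated characters.
    return ''.join(char * size for size, char in pairs)

runs = compress('aabaaa')
restored = decompress(runs)
print(runs)
print(restored)
```

The two runs of 'a' stay separate in the output. To group all equal items regardless of position, the input would have to be sorted first, which would break the round trip for compression.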