itertools: Iterator Functions

Translated from: https://pymotw.com/3/itertools/index.html

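The functions in itertools are designed to be fast and to use memory efficiently. Data is not produced until it is actually needed, and this "lazy" approach avoids holding large amounts of data in memory.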
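
As a minimal sketch of this lazy behaviour, the generator below records each value at the moment it is produced. The helper name noisy_numbers and the produced list are hypothetical, introduced only for illustration. Only the values that islice() actually requests are ever created, even though the source is unbounded.

```python
from itertools import islice

produced = []  # records every value the generator actually creates


def noisy_numbers():
    """Hypothetical unbounded source; logs each value as it is produced."""
    n = 0
    while True:
        produced.append(n)
        yield n
        n += 1


lazy = islice(noisy_numbers(), 3)
before = list(produced)   # nothing has been produced yet
first_three = list(lazy)  # consuming the iterator triggers production

print(before)       # []
print(first_three)  # [0, 1, 2]
print(produced)     # [0, 1, 2]
```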

Merging and Splitting Iterators

1) chain()
chain() takes several iterables as arguments and returns a single iterator that yields all of their elements in turn.

from itertools import *

for i in chain([1, 2, 3], ['a', 'b', 'c']):   # chain() avoids building one large combined list
    print(i, end=' ')

# Output
1 2 3 a b c

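2) chain.from_iterable()
When the iterables to combine are not all known in advance, use chain.from_iterable(), which takes a single iterable that yields the inputs one at a time.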

from itertools import *


def make_iterables_to_chain():
    yield [1, 2, 3]
    yield ['a', 'b', 'c']


for i in chain.from_iterable(make_iterables_to_chain()):
    print(i, end=' ')
print()

# Output
1 2 3 a b c

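3) The built-in zip()
zip() combines the elements at corresponding positions of several iterators into tuples. Note the difference between zip(), which stops as soon as the shortest input is exhausted, and itertools.zip_longest(), which continues until the longest input is exhausted and fills missing values with None by default.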

for i in zip([1, 2, 3], ['a', 'b', 'c']):
    print(i) 

# Output
(1, 'a')
(2, 'b')
(3, 'c')
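
The contrast with zip_longest() is described above but not shown, so here is a small sketch of it. The variable names are illustrative only; fillvalue is the keyword argument that replaces the default None.

```python
from itertools import zip_longest

r1 = range(3)
r2 = range(2)

# zip() stops when the shortest input is exhausted
zip_result = list(zip(r1, r2))

# zip_longest() continues until the longest input is exhausted;
# missing values are None unless fillvalue is given
longest_result = list(zip_longest(r1, r2))
filled_result = list(zip_longest(r1, r2, fillvalue='-'))

print(zip_result)      # [(0, 0), (1, 1)]
print(longest_result)  # [(0, 0), (1, 1), (2, None)]
print(filled_result)   # [(0, 0), (1, 1), (2, '-')]
```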

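4) islice()
islice() returns an iterator that yields items selected from the input iterator by index. It takes the same arguments as the slice operator for lists (start, stop, and step), where start and step are optional.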

from itertools import *

print('Stop at 5:')
for i in islice(range(100), 5):
    print(i, end=' ')
print('\n')

print('Start at 5, Stop at 10:')
for i in islice(range(100), 5, 10):
    print(i, end=' ')
print('\n')

print('By tens to 100:')
for i in islice(range(100), 0, 100, 10):
    print(i, end=' ')
print('\n')

# Output
Stop at 5:
0 1 2 3 4

Start at 5, Stop at 10:
5 6 7 8 9

By tens to 100:
0 10 20 30 40 50 60 70 80 90

5) tee()
tee() returns several independent iterators (two by default) based on a single input iterator.

from itertools import *

r = islice(count(), 5)
i1, i2 = tee(r)

print('i1:', list(i1))
print('i2:', list(i2))

# Output
i1: [0, 1, 2, 3, 4]
i2: [0, 1, 2, 3, 4]

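The new iterators returned by tee() share their input with the original iterator, so any values consumed from the original will not appear in the new iterators. For that reason the original iterator should not be used once tee() has been called.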

from itertools import *

r = islice(count(), 5)
i1, i2 = tee(r)

print('r:', end=' ')
for i in r:
    print(i, end=' ')
    if i > 1:
        break
print()

print('i1:', list(i1))
print('i2:', list(i2))

# Output
r: 0 1 2
i1: [3, 4]
i2: [3, 4]
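
One way to avoid this pitfall, sketched below as a suggestion rather than a required idiom, is to rebind the original name to one of the iterators returned by tee(), so the source can no longer be advanced by accident. The names source and copy are illustrative.

```python
from itertools import count, islice, tee

source = islice(count(), 5)
# Rebind the name so the original iterator cannot be advanced by accident
source, copy = tee(source)

first = list(source)
second = list(copy)

print(first)   # [0, 1, 2, 3, 4]
print(second)  # [0, 1, 2, 3, 4]
```

Keep in mind that when one of the iterators runs far ahead of the other, tee() has to buffer the values in between.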

Converting Inputs

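1) The built-in map()
map() returns an iterator that calls a function on the values from the input iterators and yields the results. It stops as soon as any input iterator is exhausted.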


def times_two(x):
    return 2 * x


def multiply(x, y):
    return (x, y, x * y)


print('Doubles:')
for i in map(times_two, range(5)):
    print(i)

print('\nMultiples:')
r1 = range(5)
r2 = range(5, 10)
for i in map(multiply, r1, r2):
    print('{:d} * {:d} = {:d}'.format(*i))

print('\nStopping:')
r1 = range(5)
r2 = range(2)
for i in map(multiply, r1, r2):
    print(i)

# Output
Doubles:
0
2
4
6
8

Multiples:
0 * 5 = 0
1 * 6 = 6
2 * 7 = 14
3 * 8 = 24
4 * 9 = 36

Stopping:
(0, 0, 0)
(1, 1, 1)

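2) starmap()
starmap() is similar to map(). Where map() takes several iterators and builds each call's arguments from one item of each, starmap() takes a single iterator and unpacks each of its items into the function's arguments using the * syntax.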

from itertools import *

values = [(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]

for i in starmap(lambda x, y: (x, y, x * y), values):
    print('{} * {} = {}'.format(*i))

# Output
0 * 5 = 0
1 * 6 = 6
2 * 7 = 14
3 * 8 = 24
4 * 9 = 36
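
To make the relationship between the two functions concrete, the sketch below computes the same results both ways. The data and variable names are illustrative; pow is the built-in power function.

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]

# starmap(pow, pairs) calls pow(*pair) for each pair
via_starmap = list(starmap(pow, pairs))

# The same result with map(), after splitting the pairs into two sequences
bases, exponents = zip(*pairs)
via_map = list(map(pow, bases, exponents))

print(via_starmap)  # [32, 9, 1000]
print(via_map)      # [32, 9, 1000]
```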

Producing New Values

1) count()
count() returns an iterator that produces consecutive integers indefinitely. The first value defaults to 0, and there is no upper bound.

from itertools import *

for i in zip(count(1), ['a', 'b', 'c']):     # count() starts at 1 here
    print(i)
# Output

(1, 'a')
(2, 'b')
(3, 'c')
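
count() also accepts an optional step as its second argument, and neither argument has to be an integer. The short sketch below uses islice() to cut off the otherwise endless sequence; the variable names are illustrative.

```python
from itertools import count, islice

# count(start, step): begin at 10 and advance by 5 each time
by_fives = list(islice(count(10, 5), 4))

# The arguments may be non-integers, for example floats
halves = list(islice(count(0.5, 0.5), 3))

print(by_fives)  # [10, 15, 20, 25]
print(halves)    # [0.5, 1.0, 1.5]
```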

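2) cycle()
cycle() returns an iterator that repeats the contents of its argument indefinitely. Because it has to remember everything it reads from the input, it can consume a lot of memory when the input is long.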

from itertools import *

for i in zip(range(7), cycle(['a', 'b', 'c'])):
    print(i)

# Output
(0, 'a')
(1, 'b')
(2, 'c')
(3, 'a')
(4, 'b')
(5, 'c')
(6, 'a')

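3) repeat()
repeat() returns an iterator that produces the same value every time. The optional times argument sets the number of repetitions; without it, the value is repeated forever.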

from itertools import *

for i in repeat('over-and-over', 5):
    print(i)

# Output
over-and-over
over-and-over
over-and-over
over-and-over
over-and-over

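Combining repeat() with zip() or map() is handy when a constant value has to be paired with values from other iterators. With zip(), for example, it produces a constant value alongside a running index: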

from itertools import *

for i, s in zip(count(), repeat('over-and-over', 5)):
    print(i, s)

# Output
0 over-and-over
1 over-and-over
2 over-and-over
3 over-and-over
4 over-and-over

Combined with map(), it generates a multiplication table:

from itertools import *

for i in map(lambda x, y: (x, y, x * y), repeat(2), range(5)):
    print('{:d} * {:d} = {:d}'.format(*i))

# Output

2 * 0 = 0
2 * 1 = 2
2 * 2 = 4
2 * 3 = 6
2 * 4 = 8

Filtering

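1) dropwhile()
dropwhile() returns an iterator that tests each element of the input in turn and discards elements for as long as the test is true. Once the test returns False for an element, that element and every element after it are yielded without further testing.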

from itertools import *


def should_drop(x):
    print('Testing:', x)
    return x < 1


for i in dropwhile(should_drop, [-1, 0, 1, 2, -2]):
    print('Yielding:', i)

# Output
Testing: -1
Testing: 0
Testing: 1
Yielding: 1
Yielding: 2
Yielding: -2

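2) takewhile()
takewhile() is the opposite of dropwhile(). Its iterator yields elements for as long as the test is true and stops at the first element for which the test is false.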

from itertools import *


def should_take(x):
    print('Testing:', x)
    return x < 2


for i in takewhile(should_take, [-1, 0, 1, 2, -2]):
    print('Yielding:', i)

# Output
Testing: -1
Yielding: -1
Testing: 0
Yielding: 0
Testing: 1
Yielding: 1
Testing: 2

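3) The built-in filter()
filter() returns an iterator containing only the elements for which the test is true. Unlike dropwhile() and takewhile(), it tests every element of the input.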

from itertools import *


def check_item(x):
    print('Testing:', x)
    return x < 1


for i in filter(check_item, [-1, 0, 1, 2, -2]):
    print('Yielding:', i)

# Output
Testing: -1
Yielding: -1
Testing: 0
Yielding: 0
Testing: 1
Testing: 2
Testing: -2
Yielding: -2

4) filterfalse()
filterfalse() returns an iterator containing only the elements for which the test is false.
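
No example accompanies this function, so here is a brief sketch that reuses the test and the input from the filter() example above, without the tracing output. The elements that filter() rejected are exactly the ones filterfalse() keeps.

```python
from itertools import filterfalse


def check_item(x):
    """Same test as in the filter() example above."""
    return x < 1


kept = list(filterfalse(check_item, [-1, 0, 1, 2, -2]))
print(kept)  # [1, 2]
```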

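5) compress()
Compared with filter(), compress() offers a different way to filter. Instead of a test function, it takes a second iterable of selectors, and the values in that iterable decide whether each corresponding value of the input iterable is kept.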

from itertools import *

every_third = cycle([False, False, True])
data = range(1, 10)

for i in compress(data, every_third):
    print(i, end=' ')
print()

# Output
3 6 9

Grouping Data

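1) groupby()
groupby() groups the elements of an iterator by a common key. It only groups consecutive elements that share a key, so the input must already be sorted by that key for the groups to come out as expected.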

import functools
from itertools import *
import operator
import pprint


@functools.total_ordering
class Point:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return '({}, {})'.format(self.x, self.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

    def __gt__(self, other):
        return (self.x, self.y) > (other.x, other.y)


# Create a dataset of Point instances
data = list(map(Point,
                cycle(islice(count(), 3)),
                islice(count(), 7)))
print('Data:')
pprint.pprint(data, width=35)
print()

# Try to group the unsorted data based on X values
print('Grouped, unsorted:')
for k, g in groupby(data, operator.attrgetter('x')):
    print(k, list(g))
print()

# Sort the data
data.sort()
print('Sorted:')
pprint.pprint(data, width=35)
print()

# Group the sorted data based on X values
print('Grouped, sorted:')
for k, g in groupby(data, operator.attrgetter('x')):
    print(k, list(g))
print()

# Output

Data:
[(0, 0),
 (1, 1),
 (2, 2),
 (0, 3),
 (1, 4),
 (2, 5),
 (0, 6)]

Grouped, unsorted:
0 [(0, 0)]
1 [(1, 1)]
2 [(2, 2)]
0 [(0, 3)]
1 [(1, 4)]
2 [(2, 5)]
0 [(0, 6)]

Sorted:
[(0, 0),
 (0, 3),
 (0, 6),
 (1, 1),
 (1, 4),
 (2, 2),
 (2, 5)]

Grouped, sorted:
0 [(0, 0), (0, 3), (0, 6)]
1 [(1, 1), (1, 4)]
2 [(2, 2), (2, 5)]
