Python 工匠 (Python Craftsman), Chapter 6: Loops and Iterables

Fundamentals

Iterators and iterables

iter() and next()

>>> iter([1,2,3])
<list_iterator object at 0x0000022FEFEA73A0>
>>> iter('abc')
<str_iterator object at 0x0000022FEFEA7190>
>>> iter(1)
Traceback (most recent call last):
  File "", line 1, in <module>
TypeError: 'int' object is not iterable

Iterator: an object that helps us iterate over another object.

>>> my_list = ['foo', 'bar']
>>> iter_my_list = iter(my_list)
>>> iter_my_list
<list_iterator object at 0x0000022FEFEA73A0>
>>> next(iter_my_list)
'foo'
>>> next(iter_my_list)
'bar'
>>> next(iter_my_list)
Traceback (most recent call last):
  File "", line 1, in <module>
StopIteration

In addition, calling iter() on an iterator (asking for the iterator of an iterator) always returns the iterator itself.

>>> iter(iter_my_list) is iter_my_list
True

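When a for loop traverses an iterable, it first calls iter() to obtain the iterable's iterator, then repeatedly calls next() on that iterator to fetch values until StopIteration is raised. The two snippets below are equivalent: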

# 1: the for loop
names = ['a', 'b', 'c']

for name in names:
    print(name)

# 2: the equivalent while loop
iterator = iter(names)
while True:
    try:
        name = next(iterator)
        print(name)
    except StopIteration:
        break

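Custom iterators

The key is to implement two methods, __iter__ and __next__, which are triggered by calls to iter() and next() respectively.

Requirement: write an iterator object Range7, similar to range(), that returns every integer in a given range that is divisible by 7 or contains the digit 7: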

class Range7:
    """Generate the numbers in a range that are divisible by 7 or contain the digit 7.

    :param start: first number to check (inclusive)
    :param end: upper bound (exclusive)
    """
    
    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.current = start
    
    def __iter__(self):
        return self
    
    def __next__(self):
        while True:
            if self.current >= self.end:
                raise StopIteration
            
            if self.num_is_valid(self.current):
                ret = self.current
                self.current += 1
                return ret
            self.current += 1
    
    def num_is_valid(self, num):
        if num == 0:
            return False
        return num % 7 == 0 or '7' in str(num)

r = Range7(0, 20)
for num in r:
    print(num)

7
14
17

There is a problem: each Range7 object can be fully traversed only once.

r = Range7(0, 20)
print(tuple(r))
print(tuple(r))

(7, 14, 17)
()

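This happens because each Range7 object's current attribute is set once in __init__ and then only grows toward end. It is never reset, unless we change its value by hand: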

r = Range7(0, 20)
print(tuple(r))
r.current = 0
print(tuple(r))

(7, 14, 17)
(7, 14, 17)

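Distinguishing iterators from iterables

An iterator is one kind of iterable. An iterable only needs to implement __iter__, while an iterator must also implement __next__. Splitting Range7 into an iterable and a separate iterator looks like this: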

class Range7:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        return Range7Iterator(self)


class Range7Iterator:
    def __init__(self, range_obj):
        self.range_obj = range_obj
        self.current = range_obj.start
        
    def __iter__(self):
        return self
    
    def __next__(self):
        while True:
            if self.current >= self.range_obj.end:
                raise StopIteration
            
            if self.num_is_valid(self.current):
                ret = self.current
                self.current += 1
                return ret
            self.current += 1
    
    def num_is_valid(self, num):
        if num == 0:
            return False
        return num % 7 == 0 or '7' in str(num)

r = Range7(0, 20)
print(tuple(r))
print(tuple(r))

(7, 14, 17)
(7, 14, 17)

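With the new code, every traversal of a Range7 object creates a brand-new Range7Iterator.

Summary: iterators vs. iterables
Iterators are a subset of iterables (an iterable only needs to implement __iter__).
iter(iterable) returns an iterator; iter(iterator) returns the iterator itself.

In addition, if a type does not define __iter__ but does define __getitem__, Python also treats it as iterable (iteration calls __getitem__ with 0, 1, 2, 3, … until IndexError is raised).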
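The __getitem__ fallback can be illustrated with a minimal sketch. The class name Squares and its count parameter are hypothetical and exist only for this example:

```python
class Squares:
    """Hypothetical example class: defines __getitem__ but not __iter__."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        # Raising IndexError is what tells the fallback protocol to stop.
        if index >= self.count:
            raise IndexError(index)
        return index * index


squares = Squares(4)
# iter() falls back to calling __getitem__(0), __getitem__(1), ...
squares_as_list = list(squares)
print(squares_as_list)
```

Defining __iter__ explicitly is usually clearer; the fallback mainly matters when reading older sequence-like classes.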

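Generators are iterators

def range_7_generator(start, end):
    """Generator version of Range7: yield numbers divisible by 7 or containing 7."""
    num = start
    while num < end:
        if num != 0 and (num % 7 == 0 or '7' in str(num)):
            yield num
        num += 1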
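To check the claim that a generator is an iterator, the sketch below applies iter() and next() to a generator object. The countdown function is a hypothetical example, not part of the original notes:

```python
def countdown(n):
    """Hypothetical generator used only for this illustration."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
is_own_iterator = iter(gen) is gen  # iter() on a generator returns the generator
first = next(gen)                   # generators support next() directly
rest = list(gen)                    # consumes the remaining values
after_exhaustion = list(gen)        # an exhausted generator yields nothing more
print(is_own_iterator, first, rest, after_exhaustion)
```

Like any iterator, a generator object can be traversed only once; calling the generator function again creates a fresh one.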

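Optimizing loops by wrapping iterables

Wrapping an iterable with a generator function

Move part of the original loop body (here, the filtering) out of the loop and into a generator that wraps the iterable: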

def sum_even_only(numbers):
    result = 0
    for num in numbers:
        if num % 2 == 0:
            result += num
    return result

can be rewritten as:

def even_only(numbers):
    for num in numbers:
        if num % 2 == 0:
            yield num

def sum_even_only_v2(numbers):
    result = 0
    for num in even_only(numbers):
        result += num
    return result

Optimizing loops with the itertools module

Flattening nested loops with product()

from itertools import product
list(product([1, 2], [3, 4]))

[(1, 3), (1, 4), (2, 3), (2, 4)]

def find_12(num_list1, num_list2, num_list3):
    """Find three numbers, one from each list, whose sum is 12."""
    for num1 in num_list1:
        for num2 in num_list2:
            for num3 in num_list3:
                if num1 + num2 + num3 == 12:
                    return num1, num2, num3

can be rewritten as:

def find_12_v2(num_list1, num_list2, num_list3):
    for num1, num2, num3 in product(num_list1, num_list2, num_list3):
        if num1 + num2 + num3 == 12:
            return num1, num2, num3

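Processing every other line with islice()

islice() makes it possible to skip lines without index bookkeeping inside the loop body:

from itertools import islice
# islice(iterable, start, stop, step)

for i in islice(range(10), 0, None, 2):  # in real use, range(10) might be an iterable that cannot be sliced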
    print(i)
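As a sketch of the non-sliceable case mentioned in the comment above, the generator below stands in for a file object whose lines alternate between titles and separators. The read_lines function and its sample data are hypothetical:

```python
from itertools import islice


def read_lines():
    """Hypothetical line source; stands in for a file object, which cannot be sliced."""
    yield from ["title 1", "---", "title 2", "---", "title 3"]


# read_lines()[::2] would raise TypeError; islice works on any iterable.
titles = list(islice(read_lines(), 0, None, 2))
print(titles)
```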

Replacing break with takewhile()

from itertools import takewhile
# takewhile(predicate, iterable) iterates over iterable and calls predicate()
# with each value, truth-testing the result: if it is true, the value is
# yielded and iteration continues; otherwise iteration stops immediately.
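A minimal sketch of the replacement, comparing a loop that uses break with the takewhile() version. The predicate is_small and the sample numbers are assumptions made up for this example:

```python
from itertools import takewhile


def is_small(num):
    """Hypothetical predicate for this illustration."""
    return num < 10


numbers = [1, 4, 9, 16, 2, 25]

# Version with break: the stop condition lives inside the loop body.
taken_with_break = []
for num in numbers:
    if not is_small(num):
        break
    taken_with_break.append(num)

# Version with takewhile(): the stop condition moves into the iterable.
taken = list(takewhile(is_small, numbers))
print(taken_with_break, taken)
```

Note that takewhile() stops at the first failing value, so the later 2 is not returned even though it would pass the predicate.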

Other useful tools: chain(), zip_longest(), etc.
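The notes only name these two functions, so the following is a brief sketch of their standard behavior with made-up inputs: chain() concatenates several iterables into one stream, and zip_longest() zips until the longest input is exhausted, padding shorter ones with fillvalue.

```python
from itertools import chain, zip_longest

# chain(): iterate over several iterables as if they were one.
chained = list(chain([1, 2], (3, 4), "ab"))

# zip_longest(): unlike zip(), keep going until the longest input ends.
padded = list(zip_longest("abc", [1, 2], fillvalue=None))

print(chained)
print(padded)
```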

The else clause of loop statements
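The notes leave this heading empty. As a reminder of the language rule only, a loop's else block runs when the loop finishes without hitting break. The function below is a hypothetical example, not taken from the book:

```python
def find_first_negative(numbers):
    """Hypothetical helper showing for/else."""
    for num in numbers:
        if num < 0:
            result = num
            break
    else:
        # Runs only when the loop ended without break (including an empty input).
        result = None
    return result


print(find_first_negative([3, -1, -2]))
print(find_first_negative([1, 2]))
```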

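Case study

Programming tips

The right way to break out of nested loops

The usual approach is to use several break statements.
Recommendation: extract the loops into a new function and exit with return directly.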
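The two approaches can be compared in a small sketch. Both function names and the grid data are hypothetical and chosen only for illustration:

```python
def find_position_with_breaks(grid, target):
    """Exit two nested loops with a state variable and two break statements."""
    position = None
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if value == target:
                position = (i, j)
                break
        if position is not None:
            break
    return position


def find_position(grid, target):
    """Same search, but return leaves both loops at once."""
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if value == target:
                return (i, j)
    return None


grid = [[1, 2], [3, 4]]
print(find_position_with_breaks(grid, 3), find_position(grid, 3))
```

The second version needs no state variable and no second check after the inner loop.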

Making good use of next()

Get the first key of a dictionary:
next(iter(d.keys()))
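A few related uses of next(), sketched with made-up data. The second argument of next() is a default that is returned instead of raising StopIteration, and iterating a dict directly yields its keys:

```python
d = {"a": 1, "b": 2}

first_key = next(iter(d))              # same result as next(iter(d.keys()))
first_of_empty = next(iter({}), None)  # the default avoids StopIteration

# First element matching a condition, or None when nothing matches.
numbers = [3, 7, 8, 11]
first_even = next((n for n in numbers if n % 2 == 0), None)
first_big = next((n for n in numbers if n > 100), None)

print(first_key, first_of_empty, first_even, first_big)
```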

Beware of exhausted iterators
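The notes give no example here, so the following is a minimal sketch of the pitfall with made-up data. An iterator that has been consumed silently produces nothing on later use, and even a membership test with in consumes values:

```python
numbers = [1, 2, 3]

squares = (n * n for n in numbers)  # a generator expression is an iterator
first_pass = list(squares)
second_pass = list(squares)         # already exhausted, so this is empty

fresh = (n * n for n in numbers)
first_check = 4 in fresh            # consumes 1 and 4 while searching
second_check = 4 in fresh           # only 9 is left, so 4 is not found

print(first_pass, second_pass, first_check, second_check)
```

When the data must be traversed more than once, converting it to a list or using an iterable that creates a new iterator per traversal avoids the problem.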
