[notes] Iterator-1: Sentence class, basic concepts of Python iterators

#!/usr/bin/env python3

import re
import reprlib

RE_WORD = re.compile(r"\w+")  # raw string, so \w is not treated as a string escape
class Sentence(object):
    def __init__(self, text):
        self.text = text  # kept because __repr__ reads self.text
        self.words = RE_WORD.findall(text)

    def __getitem__(self, idx):
        return self.words[idx]

    def __len__(self):
        return len(self.words)

    def __repr__(self):
        return "Sentence ({})".format(reprlib.repr(self.text))



# an object is considered iterable if it implements the __iter__ method 

class Foo(object):
    """docstring for Foo"""
    def __iter__(self):
        pass
"""
>>> from collections import abc
>>> issubclass(Foo, abc.Iterable)
True
>>> isinstance(Foo(), abc.Iterable)
True
"""
# However, the most accurate way to check whether an object is iterable is to call iter(x) and handle the TypeError
# exception if it isn't. This is more accurate than using isinstance(x, abc.Iterable), because iter(x) also considers
# the legacy __getitem__ method, while the Iterable ABC does not.
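
# A minimal sketch of the check described above. The names Letters and
# is_iterable are made up for this illustration; they are not part of the
# standard library.

```python
from collections import abc


class Letters:
    """Hypothetical class that implements only the legacy __getitem__ protocol."""

    def __init__(self, text):
        self._text = text

    def __getitem__(self, idx):
        # Raises IndexError past the end, which is what ends the iteration.
        return self._text[idx]


def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


letters = Letters("ABC")
print(isinstance(letters, abc.Iterable))  # False: the class has no __iter__
print(is_iterable(letters))               # True: iter() falls back to __getitem__
print(list(letters))                      # ['A', 'B', 'C']
print(is_iterable(42))                    # False: an int is not iterable
```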

'''
iterable:
    Any object from which the iter built-in function can obtain an iterator.
    Objects implementing an __iter__ method that returns an iterator are iterable.
    Sequences are always iterable, as are objects implementing a __getitem__ method
    that takes 0-based indexes.

'''

'''
It is important to be clear about the relationship between iterables and iterators:
Python obtains iterators from iterables.
'''

'''
The for machinery, reproduced by hand with a while loop.
>>> s = 'ABC'
>>> for e in s:
...     print(e)
This is equivalent to:
>>> s = 'ABC'
>>> it = iter(s)
>>> while True:
...     try:
...         print(next(it))
...     except StopIteration:
...         del it             # release the reference to the iterator
...         break
'''

'''
the standard interface for an iterator object:
first, __next__
        return the next available item, raising StopIteration when there are no more items.
second, __iter__
        return self; this allows iterators to be used where an iterable is expected, for example, in a for loop.
'''
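
# A short sketch of the two-method interface, using the iterator that the
# built-in list type returns. The variable names are only for illustration.

```python
it = iter([10, 20])      # a list is iterable; iter() builds an iterator for it

print(next(it))          # 10
print(next(it))          # 20
print(iter(it) is it)    # True: the iterator's __iter__ returns itself

exhausted = False
try:
    next(it)             # no items left
except StopIteration:
    exhausted = True
print(exhausted)         # True
```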
'''
iterable builds iterators
'''
'''
# abc.Iterator class. __file__ = 'Lib/_collections_abc.py'

class Iterator(Iterable):
    __slots__ = ()  # this class adds no per-instance __dict__

    def __iter__(self):
        return self

    @abstractmethod
    def __next__(self):
        raise StopIteration

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Iterator:
            if any("__next__" in B.__dict__ for B in C.__mro__) and any("__iter__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented

'''

'''
Iterators in Python aren't a matter of type but of protocol. 
A large and changing number of builtin types implement *some* flavor of iterator.
Don't check the type! Use hasattr to check for both "__iter__" and "__next__" attributes instead.

In fact, that's exactly what the __subclasshook__ method of the abc.Iterator ABC does.

'''
# The best way to check if an object x is an iterator is to call isinstance(x, abc.Iterator)
# Thanks to Iterator.__subclasshook__, this test works even if the class of x is not a real or 
# virtual subclass of Iterator.
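
# A sketch of that isinstance check. Countdown is a hypothetical class that
# neither inherits from abc.Iterator nor is registered with it.

```python
from collections import abc


class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


print(isinstance(Countdown(3), abc.Iterator))   # True, thanks to __subclasshook__
print(isinstance([1, 2], abc.Iterator))         # False: a list is iterable, not an iterator
print(isinstance(iter([1, 2]), abc.Iterator))   # True
print(list(Countdown(3)))                       # [3, 2, 1]
```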

'''
How to reset an iterator:
Because the only methods required of an iterator are __next__ and __iter__, there is no way to check whether
there are remaining items, other than to call next() and catch StopIteration. Also, it is not possible
to 'reset' an iterator.
If you need to start over, call iter() on the iterable that built the iterator in the first place.
Calling iter() on the iterator itself won't help, because its __iter__ returns self.
'''
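
# A small sketch of the "no reset" behavior, using a plain list as the
# iterable. The variable names are only for illustration.

```python
words = ["alpha", "beta"]
it = iter(words)

first_pass = list(it)          # consumes the iterator
second_pass = list(it)         # nothing left: the iterator is exhausted
same_object = iter(it) is it   # iter() on an iterator returns the iterator itself
fresh_pass = list(iter(words)) # a new iterator from the iterable starts over

print(first_pass)    # ['alpha', 'beta']
print(second_pass)   # []
print(same_object)   # True
print(fresh_pass)    # ['alpha', 'beta']
```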

''' 
iterator:
    Any object that implements the no-argument __next__ method, which returns the next item in a series or raises
    StopIteration when there are no more items.
    Python iterators also implement the __iter__ method, so they are iterable as well.

'''

# Another version of the Sentence class, in a style closer to Java (classic Iterator pattern)
class Sentence(object):
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __iter__(self):
        # Build a new, independent iterator on every call
        return SentenceIterator(self.words)

    def __repr__(self):
        return "Sentence ({})".format(reprlib.repr(self.words))

class SentenceIterator(object):
    def __init__(self, words):
        self.words = words
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            word = self.words[self.idx]
        except IndexError:
            raise StopIteration
        self.idx += 1
        return word
# this version has no __getitem__,  to make it clear that the class is iterable because it implements __iter__

''' Note!!!
Implementing the __iter__ method in SentenceIterator is not actually needed for this example to work,
but it is the right thing to do:
    iterators are supposed to implement both __next__ and __iter__, and doing so makes our iterator
    pass the issubclass(SentenceIterator, abc.Iterator) test.
'''
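
# A sketch of the point made in the note above. NextOnly, Full and Box are
# hypothetical classes written for this illustration only.

```python
from collections import abc


class NextOnly:
    """Hypothetical iterator that implements __next__ but omits __iter__."""

    def __init__(self, items):
        self._items = list(items)
        self._idx = 0

    def __next__(self):
        if self._idx >= len(self._items):
            raise StopIteration
        item = self._items[self._idx]
        self._idx += 1
        return item


class Full(NextOnly):
    """Same iterator, with the __iter__ method added."""

    def __iter__(self):
        return self


class Box:
    """Hypothetical iterable that hands out NextOnly iterators."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        return NextOnly(self._items)


print(list(Box("ab")))                       # ['a', 'b']: iteration still works
print(issubclass(NextOnly, abc.Iterator))    # False: __iter__ is missing
print(issubclass(Full, abc.Iterator))        # True: both methods are present
```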


'''
A common cause of errors in building iterables and iterators is to confuse the two.
To be clear:
Iterables have an __iter__ method that instantiates a new iterator every time.
Iterators implement a __next__ method that returns individual items, and an __iter__ method that returns self.

Therefore, iterators are also iterable, but iterables are not iterators.

It may be tempting to add a __next__ method to the Sentence class, making each Sentence instance at the same time an
iterable and an iterator over itself. But this is a terrible idea.

It must be possible to obtain multiple independent iterators from the same iterable instance, and each iterator
must keep its own internal state, so a proper implementation of the pattern requires each call to iter(my_iterable)
to create a new, independent iterator. That is why we need the SentenceIterator class in this example.

'''
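
# A sketch of what goes wrong when an iterable is its own iterator.
# SelfIterating, Independent and pairs are hypothetical names made up for this
# illustration. Independent delegates to the built-in iterator of its items
# only to keep the example short.

```python
class SelfIterating:
    """Anti-pattern: an iterable that acts as an iterator over itself."""

    def __init__(self, items):
        self._items = items
        self._idx = 0

    def __iter__(self):
        return self  # every "new" iterator shares the same position

    def __next__(self):
        if self._idx >= len(self._items):
            raise StopIteration
        item = self._items[self._idx]
        self._idx += 1
        return item


class Independent:
    """An iterable that builds a new iterator on every call to iter()."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        return iter(self._items)


def pairs(iterable):
    """Nested loops need two independent iterators over the same object."""
    return [(a, b) for a in iterable for b in iterable]


print(pairs(SelfIterating("ab")))  # [('a', 'b')]: the loops steal items from each other
print(pairs(Independent("ab")))    # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```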

'''
Conclusion:
An iterable should never act as an iterator over itself.
In other words, iterables should implement the __iter__ method, but not __next__.
On the other hand, for convenience, iterators should be iterable.
An iterator's __iter__ should just return self.
'''



# Unfortunately, the versions of Sentence above are bad ideas: they are not Pythonic.
