First, look at the definition of FileProxyMixin:
    class FileProxyMixin(object):
        encoding = property(lambda self: self.file.encoding)
        fileno = property(lambda self: self.file.fileno)
        flush = property(lambda self: self.file.flush)
        isatty = property(lambda self: self.file.isatty)
        newlines = property(lambda self: self.file.newlines)
        read = property(lambda self: self.file.read)
        readinto = property(lambda self: self.file.readinto)
        readline = property(lambda self: self.file.readline)
        readlines = property(lambda self: self.file.readlines)
        seek = property(lambda self: self.file.seek)
        softspace = property(lambda self: self.file.softspace)
        tell = property(lambda self: self.file.tell)
        truncate = property(lambda self: self.file.truncate)
        write = property(lambda self: self.file.write)
        writelines = property(lambda self: self.file.writelines)
        xreadlines = property(lambda self: self.file.xreadlines)

        def __iter__(self):
            return iter(self.file)
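As you can see, it combines the built-in property with lambda functions to proxy most operations to self.file. Each property returns the corresponding attribute or bound method of self.file when it is accessed. It also implements __iter__, so the object supports iteration.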
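To make the proxying concrete, here is a minimal sketch of the same pattern outside Django. The names ProxyMixin and Wrapper are hypothetical and exist only for this illustration; just three methods are proxied.

```python
import io


class ProxyMixin:
    # Each property looks up the attribute on self.file at access time,
    # so wrapper.read(...) ends up calling wrapper.file.read(...).
    read = property(lambda self: self.file.read)
    seek = property(lambda self: self.file.seek)
    tell = property(lambda self: self.file.tell)

    def __iter__(self):
        # Delegate iteration to the wrapped file object.
        return iter(self.file)


class Wrapper(ProxyMixin):
    # Hypothetical wrapper class; it only stores the underlying file.
    def __init__(self, file):
        self.file = file


w = Wrapper(io.BytesIO(b"a\nb\n"))
first = w.read(2)   # forwarded to BytesIO.read
pos = w.tell()      # forwarded to BytesIO.tell
w.seek(0)           # forwarded to BytesIO.seek
lines = list(w)     # iteration is forwarded as well
```

Because the lookup happens on every access, assigning a different object to w.file changes what all proxied names refer to without any further wiring.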
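Next is the definition of File, whose main job is to abstract over different kinds of files. It adds a chunks method for reading large files piece by piece, a closed property that reports whether the file is closed, and a size property.
First, the function that implements the size property: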
    def _get_size_from_underlying_file(self):
        if hasattr(self.file, 'size'):
            return self.file.size
        if hasattr(self.file, 'name'):
            try:
                return os.path.getsize(self.file.name)
            except (OSError, TypeError):
                pass
        if hasattr(self.file, 'tell') and hasattr(self.file, 'seek'):
            pos = self.file.tell()
            self.file.seek(0, os.SEEK_END)
            size = self.file.tell()
            self.file.seek(pos)
            return size
        raise AttributeError("Unable to determine the file's size.")
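It first tries the size attribute of self.file, then falls back to the size of the local file on disk via os.path.getsize, and finally measures the size by seeking to the end of the file: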
    self.file.seek(0, os.SEEK_END)
    size = self.file.tell()
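The seek-based fallback can be tried on its own with an in-memory file. The sketch below is an illustration rather than Django's code: size_by_seeking is a hypothetical helper name, and it assumes the object is seekable.

```python
import io
import os


def size_by_seeking(f):
    # Remember the current position so the caller's read offset is preserved.
    pos = f.tell()
    # Jump to the end; the offset reported there is the total size.
    f.seek(0, os.SEEK_END)
    size = f.tell()
    # Restore the original position.
    f.seek(pos)
    return size


buf = io.BytesIO(b"hello world")
buf.read(5)                   # move the position away from the start
size = size_by_seeking(buf)   # 11 bytes in total
restored = buf.tell()         # still 5, the position was put back
```

Saving and restoring the position matters because the size may be requested in the middle of a read, and the measurement should not disturb it.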
Next, let's see how File supports iteration:
    def __iter__(self):
        # Iterate over this file-like object by newlines
        buffer_ = None
        for chunk in self.chunks():
            for line in chunk.splitlines(True):
                if buffer_:
                    if endswith_cr(buffer_) and not equals_lf(line):
                        # Line split after a \r newline; yield buffer_.
                        yield buffer_
                        # Continue with line.
                    else:
                        # Line either split without a newline (line
                        # continues after buffer_) or with \r\n
                        # newline (line == b'\n').
                        line = buffer_ + line
                    # buffer_ handled, clear it.
                    buffer_ = None

                # If this is the end of a \n or \r\n line, yield.
                if endswith_lf(line):
                    yield line
                else:
                    buffer_ = line

        if buffer_ is not None:
            yield buffer_
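The key to this iteration logic is how buffer_ is used.
It first calls chunks to read the file block by block, then splits each chunk into lines with splitlines(True), which keeps the line endings. A trailing piece that does not end in \n is not yielded right away; it is saved in buffer_ and merged with the beginning of the next chunk. The one exception is a buffer_ ending in \r that is not followed by a lone \n: it is already a complete line, so it is yielded by itself.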
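The buffering can be observed by feeding hand-made chunks into a standalone version of the loop. This is a simplified sketch that only handles bytes: iter_lines is a hypothetical name, and the helpers endswith_cr, equals_lf and endswith_lf are replaced by direct bytes comparisons, which is an assumption about what they do for byte strings.

```python
def iter_lines(chunks):
    # Holds a trailing piece that has not been confirmed as a full line.
    buffer_ = None
    for chunk in chunks:
        for line in chunk.splitlines(True):
            if buffer_:
                if buffer_.endswith(b"\r") and line != b"\n":
                    # The buffered piece was a complete \r-terminated line.
                    yield buffer_
                else:
                    # Either the line continues across the chunk boundary,
                    # or a \r\n pair was split between two chunks.
                    line = buffer_ + line
                buffer_ = None
            if line.endswith(b"\n"):
                yield line
            else:
                buffer_ = line
    # Whatever remains at the end is the last line, without a newline.
    if buffer_ is not None:
        yield buffer_


# Chunk boundaries fall mid-line, inside a \r\n pair, and after a bare \r.
chunks = [b"ab", b"c\nde\r", b"\nfg\rh", b"i"]
result = list(iter_lines(chunks))
# result == [b"abc\n", b"de\r\n", b"fg\r", b"hi"]
```

The four chunks produce four lines whose boundaries do not match the chunk boundaries, which is the behavior buffer_ exists to provide.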