这里介绍Python的高级模块,用于专家级编程的需要,在小的脚本中很少使用。
>>> import reprlib >>> reprlib.repr(set('supercalifragilisticexpialidocious')) "set(['a', 'c', 'd', 'e', 'f', 'g', ...])"pprint模块为打印内建的和用户自定义的对象上提供了更复杂的控制,当结果超过了一行,“美化打印”增加了分行机制以更清晰的展示数据结构:
>>> import pprint >>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta', 'yellow'], 'blue']]] >>> pprint.pprint(t, width=30) [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta', 'yellow'], 'blue']]]textwrap模块格式化文章的段落以适应屏幕宽度:
>>> import textwrap >>> doc = """The wrap() method is just like fill() except that it returns a list of strings instead of one big string with newlines to separate the wrapped lines.""" >>> print(textwrap.fill(doc, width=40)) The wrap() method is just like fill() except that it returns a list of strings instead of one big string with newlines to separate the wrapped lines.locale模块进入一个文化特定数据格式的数据库,locale的格式化函数的grouping属性提供了一个使用组分隔符格式化数字的方法:
>>> import locale >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252') 'English_United States.1252' >>> conv = locale.localeconv() # get a mapping of conventions >>> x = 1234567.8 >>> locale.format("%d", x, grouping=True) '1,234,567' >>> locale.format_string("%s%.*f", (conv['currency_symbol'], conv['frac_digits'], x), grouping=True) '$1,234,567.80'
>>> from string import Template >>> t = Template('${village}folk send $$10 to $cause.') >>> t.substitute(village='Nottingham', cause='the ditch fund') 'Nottinghamfolk send $10 to the ditch fund.'当一个占位符没有被一个字典或者key-value参数提供时,substitute()抛出一个KeyError。为邮件合并类型的应用,用户提供可以是不完整的,因此safe_substitute()方法可能更使用——它会忽略不能匹配的占位符。
>>> t = Template('Return the $item to $owner.') >>> d = dict(item='unladen swallow') >>> t.substitute(d) Traceback (most recent call last): KeyError: 'owner' >>> t.safe_substitute(d) 'Return the unladen swallow to $owner.'Template子类可以指定自己的定界符。例如,图片浏览器的批量重命名工具可以使用%作为占位符:
>>> import time, os.path >>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg'] >>> class BatchRename(Template): delimiter = '%' >>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format): ') Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f >>> t = BatchRename(fmt) >>> date = time.strftime('%d%b%y') >>> for i, filename in enumerate(photofiles): base, ext = os.path.splitext(filename) newname = t.substitute(d=date, n=i, f=ext) print('{0} --> {1}'.format(filename, newname)) img_1074.jpg --> Ashley_0.jpg img_1076.jpg --> Ashley_1.jpg img_1077.jpg --> Ashley_2.jpg使用模板的另一个应用是从多个输出格式的细节中分离程序逻辑。这样就可以替代XML文件、纯文本报表和HTML web报表的自定义模板。
import struct with open('myfile.zip', 'rb') as f: data = f.read() start = 0 for i in range(3): # show the first 3 file headers start += 14 fields = struct.unpack('<IIIHH', data[start:start+16]) crc32, comp_size, uncomp_size, filenamesize, extra_size = fields start += 16 filename = data[start:start+filenamesize] start += filenamesize extra = data[start:start+extra_size] print(filename, hex(crc32), comp_size, uncomp_size) start += extra_size + comp_size # skip to the next header
import threading, zipfile class AsyncZip(threading.Thread): def __init__(self, infile, outfile): threading.Thread.__init__(self) self.infile = infile self.outfile = outfile def run(self): f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED) f.write(self.infile) f.close() print('Finished background zip of:', self.infile) background = AsyncZip('mydata.txt', 'myarchive.zip') background.start() print('The main program continues to run in foreground.') background.join() # Wait for the background task to finish print('Main program waited until background was done.')多线程应用的首要挑战是协调处理多个线程之间的共享数据和其它资源,为了处理它们,threading模块提供了一个同步机制,包括:locks、events、condition variables和semaphores。
import logging logging.debug('Debugging information') logging.info('Informational message') logging.warning('Warning:config file %s not found', 'server.conf') logging.error('Error occurred') logging.critical('Critical error -- shutting down')上面的代码将得到下面的输出:
WARNING:root:Warning:config file server.conf not found ERROR:root:Error occurred CRITICAL:root:Critical error -- shutting down默认情况下,info和debug信息被抑制并且输出到标准错误窗口。其它输出选项包括通过email、数据电报、socket发送信息,也可以发送到一个HTTP server。新的过滤器能选择不同的路径,基于信息优先级:DEBUG、INFO、WARNING、ERROR和CRITICAL。
>>> import weakref, gc >>> class A: ... def __init__(self, value): ... self.value = value ... def __repr__(self): ... return str(self.value) ... >>> a = A(10) # create a reference >>> d = weakref.WeakValueDictionary() >>> d['primary'] = a # does not create a reference >>> d['primary'] # fetch the object if it is still alive 10 >>> del a # remove the one reference >>> gc.collect() # run garbage collection right away 0 >>> d['primary'] # entry was automatically removed Traceback (most recent call last): File "<stdin>", line 1, in <module> d['primary'] # entry was automatically removed File "C:/python34/lib/weakref.py", line 46, in __getitem__ o = self.data[key]() KeyError: 'primary'
>>> from array import array >>> a = array('H', [4000, 10, 700, 22222]) >>> sum(a) 26932 >>> a[1:3] array('H', [10, 700])collections模块提供了deque()对象,能更快的添加数据和从左端弹出数据,但查询中间的数据则更慢。这些对象很适合于作为队列和用于树的广度优先搜索:
>>> from collections import deque >>> d = deque(["task1", "task2", "task3"]) >>> d.append("task4") >>> print("Handling", d.popleft()) Handling task1
unsearched = deque([starting_node]) def breadth_first_search(unsearched): node = unsearched.popleft() for m in gen_moves(node): if is_goal(m): return m unsearched.append(m)还有bisect模块提供了管理排序列表的手段:
>>> import bisect >>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')] >>> bisect.insort(scores, (300, 'ruby')) >>> scores [(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]heapq模块提供了实现基于列表的堆的方法。最小的值总是保存在位置0:
>>> from heapq import heapify, heappop, heappush >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0] >>> heapify(data) # rearrange the list into heap order >>> heappush(data, -5) # add a new entry >>> [heappop(data) for i in range(3)] # fetch the three smallest entries [-5, 0, 1]
>>> from decimal import * >>> round(Decimal('0.70') * Decimal('1.05'), 2) Decimal('0.74') >>> round(.70 * 1.05, 2) 0.73下面是decimal类执行模运算和和运算同二进制浮点数的比较:
>>> Decimal('1.00') % Decimal('.10') Decimal('0.00') >>> 1.00 % 0.10 0.09999999999999995 >>> sum([Decimal('0.1')]*10) == Decimal('1.0') True >>> sum([0.1]*10) == 1.0 FalseDecimal模块也提供了指定精度的办法:
>>> getcontext().prec = 36 >>> Decimal(1) / Decimal(7) Decimal('0.142857142857142857142857142857142857')