Python Study Notes (6) Python Boxes: Modules, Packages, and Programs

About standalone programs:

We can save our code in a plain text file and name it in the form xxx.py. To run the program, type python xxx.py in a terminal or command-prompt window.

Command-line arguments

We write a file test.py containing the following two lines of code:
import sys
print('Program arguments:', sys.argv)
Here is what we get when we run this program in a shell (with Python 3):
$ python test.py
Program arguments: ['test.py']
$ python test.py rtcfvgbh fgh
Program arguments: ['test.py', 'rtcfvgbh', 'fgh']
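As a minimal sketch of how a script might use these values: the first element of sys.argv is the script name, and everything after it is an argument, always as a string. The helper name describe_args is hypothetical and only used for illustration.

```python
import sys


def describe_args(argv):
    """Split an argv-style list into (script_name, arguments)."""
    script_name = argv[0] if argv else ''
    arguments = argv[1:]  # everything after the script name, always strings
    return script_name, arguments


if __name__ == '__main__':
    name, args = describe_args(sys.argv)
    print('Script:', name)
    print('Arguments:', args)
```

Passing the list in as a parameter, instead of reading sys.argv inside the function, makes the helper easy to try out with hand-written lists.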

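Modules and the import statement

A module is simply a file of Python code. Its name is the file name without the .py extension. To use code from another module, use the import statement; the code and variables in the imported module then become visible to your program.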

Importing a module

Here is a small example: the main program prints a report, and a separate module with a single function returns the weather description.
Main program, named weatherman.py:
import report
description = report.get_description()
print("Today's weather:", description)
Code of the weather module, report.py:
def get_description():
    """Return random weather, just like the pros"""
    from random import choice
    possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']
    return choice(possibilities)
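We used import in two different places:

The main program weatherman.py imports the module report;
In the module file report.py, the function get_description() imports the function choice from the Python standard module random.
We also used import in two different ways:
The main program calls import report and then runs report.get_description();
The get_description() function in report.py calls from random import choice and then runs choice(possibilities).
When a whole module is imported, the prefix report. is needed before get_description(). After this import statement, everything in report.py (code and variables) is visible to the main program, as long as the name is prefixed with report. Qualifying a module's contents with the module name avoids naming conflicts.
In get_description(), all the code is inside one function and nothing else there is named choice, so we import the function choice() directly from the random module.
As a rule of thumb, if the imported code is used in many places, consider importing it outside the function; if its use is confined to one place, it can be imported inside the function. Putting imports at the top of the file is generally recommended.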
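The point about naming conflicts can be sketched as follows. This is an illustrative example, not code from the book: the local function choice is hypothetical and deliberately shares its name with random.choice to show that the qualified name keeps the two apart.

```python
import random  # whole module: names must be qualified as random.<name>


def choice(options):
    """A local function that happens to share its name with random.choice."""
    return options[0]  # always picks the first item


possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']

local_pick = choice(possibilities)          # our own function
random_pick = random.choice(possibilities)  # the standard library function

print(local_pick)
print(random_pick in possibilities)
```

Had we written from random import choice after defining our own choice, the imported name would have replaced the local one in this file.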

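Importing a module with an alias

We can give an imported module an alias that is easier to remember or type:

import module_name as alias

From then on, the alias can be used wherever the module name would be.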

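Importing part of a module

In Python you can import just some parts of a module. Each part can keep its original name or get an alias of your choosing. First, import the function get_description() from the report module under its original name:
from report import get_description
description = get_description()
print("Today's weather:", description)

Now import it under the alias do_it:
from report import get_description as do_it
description = do_it()
print("Today's weather:", description)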
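Since report.py may not be at hand, both alias forms can be tried with the standard math module instead. The alias names m and square_root below are arbitrary choices for this sketch.

```python
import math as m                      # alias for a whole module
from math import sqrt as square_root  # alias for a single function

print(m.pi)
print(square_root(16))

# The alias is just another name bound to the same function object.
print(m.sqrt is square_root)
```

An alias changes only the name used in the importing file; the module itself is unaffected.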

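Packages

We can organize multiple modules into a file hierarchy called a package.
Perhaps we need two kinds of weather forecast: one for the next day and one for the next week. One way to structure this is to create a directory sources, and in it create two modules, daily.py and weekly.py. Each module has a function forecast. The daily version returns a string; the weekly version returns a list of seven strings.
Here are the main program and the two modules (the function enumerate() walks through a list and, in the for loop, pairs each item with a running number; the second argument 1 makes the numbering start at 1).
Main program: weather.py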

from sources import daily,weekly
print("daily",daily.forecast())
print("weekly:")
for number,outlook in enumerate(weekly.forecast(),1):
    print(number, outlook)

Module 1: sources/daily.py

def forecast():
    'fake daily forecast'
    return 'like yesterday'

Module 2: sources/weekly.py

def forecast():
    """Fake weekly forecast"""
    return ['snow', 'more snow', 'sleet', 'freezing rain', 'rain', 'fog', 'hail']

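Besides these two modules, one more file is needed in the sources package directory: __init__.py. It can be empty, but Python uses it to treat the directory as a regular package. (Since Python 3.3 a directory without __init__.py can still be imported as a namespace package, but including the file remains the usual practice.)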
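The package layout can be tried without touching your working directory. The sketch below builds a throwaway package in a temporary directory and imports from it. The package name demo_sources is hypothetical, chosen to avoid clashing with a real sources directory, and only the daily module is recreated.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; demo_sources is a hypothetical name.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, 'demo_sources')
os.mkdir(package_dir)

# An empty __init__.py marks the directory as a regular package.
open(os.path.join(package_dir, '__init__.py'), 'w').close()

with open(os.path.join(package_dir, 'daily.py'), 'w') as f:
    f.write("def forecast():\n    return 'like yesterday'\n")

sys.path.insert(0, root)       # make the temporary directory importable
importlib.invalidate_caches()  # let the import system notice the new files

from demo_sources import daily

print(daily.forecast())
print(daily.__name__)  # the module's full name includes the package name
```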

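The Python standard library

Handling missing keys with setdefault() and defaultdict()

Reading the value of a key that is not in the dictionary raises an exception. Using the dictionary method get() to return a default value avoids the exception. The method setdefault() is like get(), but it also adds an item to the dictionary when the key is missing: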

periodic_table = {'Hydrogen': 1, 'Helium': 2}

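If the key is not already in the dictionary, it is added with the new default value:

carbon = periodic_table.setdefault('Carbon', 12)
carbon
12
periodic_table
{'Hydrogen': 1, 'Helium': 2, 'Carbon': 12}

If we try to assign a different default value to a key that already exists, the original value is left unchanged and is what gets returned:

helium = periodic_table.setdefault('Helium', 947)
helium
2
periodic_table
{'Hydrogen': 1, 'Helium': 2, 'Carbon': 12}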
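A common use of setdefault() is grouping items under a key. The following is a small sketch with made-up sample words, and it also shows that get() only reads and never adds the missing key.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = {}
for word in words:
    # setdefault returns the existing list, or stores and returns a new one
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)

# get() only reads: the missing key is not added to the dictionary
print(by_letter.get('z', []))
print('z' in by_letter)
```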

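defaultdict() is similar, but it specifies the default value for any new key up front, when the dictionary is created. Its argument is a function. In the following example we pass the function int, which is called as int() and returns the integer 0: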

from collections import defaultdict
periodic_tab = defaultdict(int)
periodic_tab['Hycds'] = 1 
periodic_tab['Lead']
periodic_tab
defaultdict(int, {'Hycds': 1, 'Lead': 0})

Any missing value is set to the integer 0.

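The argument to defaultdict() is a function that returns the value assigned to a missing key. In the following example, no_idea() is executed when needed and returns a value: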

from collections import defaultdict 
def no_idea():
    return 'Hub'  
bestity = defaultdict(no_idea)
bestity['A'] = 'nearjkl' 
bestity['B'] = 'jvkefk' 
bestity['A'] 
'nearjkl'
bestity['B'] 
'jvkefk'
bestity['C'] 
'Hub'

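Similarly, you can use int(), list(), or dict() to return default values: int() returns 0, list() returns an empty list ([]), and dict() returns an empty dictionary ({}). If you omit the function argument, no default is supplied at all: looking up a missing key raises KeyError, just as with an ordinary dictionary.
We can also use lambda to define the default value:

bestiary = defaultdict(lambda: 'Huh?')
bestiary['C']
'Huh?'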
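Two of the cases mentioned above are easy to check by hand: list as the default factory, and a defaultdict created with no factory at all. This is a small sketch with made-up sample pairs.

```python
from collections import defaultdict

pairs = [('fruit', 'apple'), ('veg', 'leek'), ('fruit', 'pear')]

groups = defaultdict(list)  # missing keys start as an empty list
for kind, name in pairs:
    groups[kind].append(name)
print(dict(groups))

no_factory = defaultdict()  # default_factory is None
try:
    no_factory['missing']
    raised = False
except KeyError:
    raised = True  # behaves like a plain dict for missing keys
print(raised)
```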

Using int is one way to build a counter:

from collections  import  defaultdict 
food_counter = defaultdict(int)
for food in ['spam','spam','eggs','spam']:
    food_counter[food]+=1
for food,count in food_counter.items():
    print(food,count)
Result:
spam 3
eggs 1

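In the example above, if food_counter were an ordinary dictionary instead of a defaultdict, Python would raise an exception (KeyError) the first time we tried to increment food_counter[food] for a new food, because that entry would not have been initialized. With an ordinary dictionary we need to do some extra work, as shown below:

dict_counter = {}
for food in ['spam', 'spam', 'eggs', 'spam']:
    if food not in dict_counter:
        dict_counter[food] = 0
    dict_counter[food] += 1
for food, count in dict_counter.items():
    print(food, count)
Result: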
spam 3
eggs 1

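In the example above, we first create an empty dictionary, then check whether each key is already in it. If it is not, we set its value to 0; either way, we then add 1 to the value. Finally, we iterate over the dictionary.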
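The explicit membership check can also be folded into a single line with the dictionary method get(). This is an alternative sketch, not the version given in the book.

```python
counts = {}
for food in ['spam', 'spam', 'eggs', 'spam']:
    # get() supplies 0 for a food not seen yet, so no explicit check is needed
    counts[food] = counts.get(food, 0) + 1

print(counts)
```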

Counting with Counter()

Python has a standard counter: Counter()

from collections import Counter 
breakfast = ['spam','spam','eggs','spam']
breakfas_counter = Counter(breakfast)
breakfas_counter
Counter({'eggs': 1, 'spam': 3})

most_common() returns all elements in descending order of count; given a number n, it returns only the n most common elements:

breakfas_counter.most_common()
[('spam', 3), ('eggs', 1)]
breakfas_counter.most_common(1)
[('spam', 3)]
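As a further sketch, Counter works on any iterable of hashable items, for instance the words of a sentence. The sample text is made up. Looking up an item that was never counted returns 0 and raises no exception.

```python
from collections import Counter

text = 'the quick brown fox jumps over the lazy dog the end'
word_counts = Counter(text.split())

top = word_counts.most_common(1)
print(top)

print(word_counts['the'])
print(word_counts['cat'])    # a missing item counts as 0, no KeyError
print('cat' in word_counts)  # and the lookup does not add it
```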

Counters can also be combined. First create a list lunch and a counter lunch_counter:

lunch = ['eggs','eggs','bacon']
lunch_counter = Counter(lunch) 
lunch_counter
Counter({'bacon': 1, 'eggs': 2})

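The first way to combine them is with +:
breakfas_counter + lunch_counter
Counter({'bacon': 1, 'eggs': 3, 'spam': 3})

The second way is with -, which shows what breakfast has that lunch does not (counts are subtracted and only positive results are kept):
breakfas_counter - lunch_counter
Counter({'spam': 3})

The third way is the intersection operator &, which gives the items common to both:
breakfas_counter & lunch_counter
Counter({'eggs': 1})

The intersection takes the smaller of the two counts.

The fourth way is the union operator |, which gives all the elements:
breakfas_counter | lunch_counter
Counter({'bacon': 1, 'eggs': 2, 'spam': 3})

The union takes the larger of the two counts.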
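One detail of the - operator is worth a short sketch: it drops any item whose resulting count is zero or negative, which is why eggs and bacon vanish from the difference. The subtract() method, by contrast, keeps negative counts. The variable names here are chosen for this example.

```python
from collections import Counter

breakfast_counter = Counter(['spam', 'spam', 'eggs', 'spam'])
lunch_counter = Counter(['eggs', 'eggs', 'bacon'])

# The - operator keeps only positive counts.
difference = breakfast_counter - lunch_counter
print(dict(difference))

# subtract() works in place and keeps zero and negative counts.
signed = Counter(breakfast_counter)  # copy, so the original is untouched
signed.subtract(lunch_counter)
print(dict(signed))
```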

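Order by key with OrderedDict()

The order of keys in a dictionary used to be unpredictable: before Python 3.7, a plain dict could return items in a different order than they were added. (Since Python 3.7, plain dicts also preserve insertion order.) An OrderedDict remembers the order in which keys were added and returns them in that same order from an iterator. Let's create one from a sequence of (key, value) tuples:

from collections import OrderedDict
quotes = OrderedDict([('Moe', 'bhfsnzkml'), ('Larry', 'Oww'), ('Curly', 'ebanj')])
for stooge in quotes:
    print(stooge)
Moe
Larry
Curly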
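Two behaviours still set OrderedDict apart from a plain dict, sketched below with made-up values: comparing two OrderedDicts takes order into account, and move_to_end() repositions an existing key.

```python
from collections import OrderedDict

a = OrderedDict([('Moe', 1), ('Larry', 2)])
b = OrderedDict([('Larry', 2), ('Moe', 1)])

print(a == b)              # OrderedDict comparison is order-sensitive
print(dict(a) == dict(b))  # plain dict comparison ignores order

a.move_to_end('Moe')       # move an existing key to the end
print(list(a))
```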

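Deque: stack + queue

A deque is a double-ended queue, combining features of a stack and a queue. It can add and remove items at either end of a sequence. Here we work from both ends of a word toward the middle to check whether it is a palindrome. popleft() removes the leftmost item and returns it; pop() removes the rightmost item and returns it. Working inward from both ends, as long as the end characters match, we keep popping until we reach the middle: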

def palindrome(word):
    from collections import deque 
    dq = deque(word)
    while len(dq) > 1 :
        if dq.popleft() != dq.pop():
            return False 
    return True 
palindrome('a')
True
palindrome('anmmna')
True
palindrome('anmdcds')
False
palindrome('')
True
letters = ['a', 'b', 'c']
palindrome(letters)
False
letters = ['a', 'b', 'a']
palindrome(letters)
True
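The deque version is a nice illustration of popleft() and pop(), but the same check can be written more briefly by comparing a sequence with its reversed slice. The sketch below puts the two side by side. The function names palindrome_deque and palindrome_slice are chosen for this example.

```python
from collections import deque


def palindrome_deque(word):
    """Compare items popped from both ends until the middle is reached."""
    dq = deque(word)
    while len(dq) > 1:
        if dq.popleft() != dq.pop():
            return False
    return True


def palindrome_slice(word):
    """Compare the sequence with its reverse; works for strings and lists."""
    return word == word[::-1]


samples = ['', 'a', 'anmmna', 'anmdcds', ['a', 'b', 'a']]
for sample in samples:
    print(repr(sample), palindrome_deque(sample), palindrome_slice(sample))
```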

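Iterating with itertools

itertools contains special-purpose iterator functions. Each returns one item at a time when called within a for ... in loop, and remembers its state between calls.
chain() runs through its arguments as if they were a single iterable (note that iterating over a dictionary yields its keys):

import itertools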
for item in itertools.chain([1,2],['a','b'],{1:'a',2:'b'}):
    print(item)
1
2
a
b
1
2

cycle() is an infinite iterator that cycles through its arguments:

import itertools
for item in itertools.cycle([1,2]):
    print(item)
1
2
1
2

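Friendly reminder: this example never stops on its own, so interrupt it (for example with Ctrl+C) once you have seen enough.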
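One way to use cycle() without an endless loop is to wrap it in itertools.islice(), which stops after a given number of items. A minimal sketch:

```python
import itertools

# islice stops after 5 items, so iterating over cycle() terminates
first_five = list(itertools.islice(itertools.cycle([1, 2]), 5))
print(first_five)
```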

accumulate() calculates accumulated values. By default, it calculates the running sum:

import itertools
for item in itertools.accumulate([1, 2, 3, 4]):
    print(item)     
1
3
6
10

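You can pass a function as the second argument to accumulate() to replace the default addition. That function should take two arguments and return a single result. The following example calculates a running product:

import itertools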
def multiply(a,b):
    return a * b 
for item in itertools.accumulate([1,2,3,4],multiply):
    print(item)    
1
2
6
24
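Any two-argument function works as the second argument, including ones that already exist in the standard library. The sketch below uses operator.mul in place of a hand-written multiply(), and the built-in max to get a running maximum. The input lists are made up.

```python
import itertools
import operator

# operator.mul(a, b) returns a * b, the same as the multiply() helper
products = list(itertools.accumulate([1, 2, 3, 4], operator.mul))
print(products)

# max(a, b) keeps the largest value seen so far
running_max = list(itertools.accumulate([3, 1, 4, 1, 5], max))
print(running_max)
```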

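Pretty printing with pprint()

The earlier examples all used print() (or just the variable name in the interactive interpreter) to display output. Sometimes the result is hard to read, so there is a pretty printer: pprint().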

In [1]: from pprint import  pprint

In [2]: from collections import  OrderedDict 

In [3]: quotes = OrderedDict([    ('Moe','A djnk  ndk'),    ('Larry','Oww'),    ('Curry','jfnkl'),    ]) 

In [4]: print(quotes) 
OrderedDict([('Moe', 'A djnk  ndk'), ('Larry', 'Oww'), ('Curry', 'jfnkl')])

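In [5]: pprint(quotes)
OrderedDict([('Moe', 'A djnk  ndk'), ('Larry', 'Oww'), ('Curry', 'jfnkl')])

Here the output is identical to that of print() because the whole value fits within pprint's default width of 80 characters; pprint() only breaks and aligns the output when a structure is too long for one line.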
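To see the wrapping in action, here is a minimal sketch with a deliberately narrow width. The forecasts dictionary is made-up sample data, and pformat() is used because it returns the same string that pprint() would print, which makes the result easy to inspect.

```python
from pprint import pformat, pprint

forecasts = {
    'daily': 'like yesterday',
    'weekly': ['snow', 'more snow', 'sleet', 'freezing rain',
               'rain', 'fog', 'hail'],
}

one_line = repr(forecasts)              # what print() would show
wrapped = pformat(forecasts, width=40)  # the string pprint() would print

print(one_line)
pprint(forecasts, width=40)             # spread over several aligned lines
```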

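Note: the content of this article comes from 《Python语言及其应用》 (Introducing Python); you are welcome to buy and read the original book.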