A Summary of Key Python Language Features

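Python has many efficient methods and features that make programs convenient to write. This page collects them, and new tricks are added as I pick them up in day-to-day study.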

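List comprehensions and slicing

Python provides the convenient list() function, but a list comprehension is often an even handier way to build a list, and slicing then lets you operate on just the elements you select:

# Build a list of the numbers 0-9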
>>> s = [x for x in range(10)] 
>>> s 
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

A list comprehension can also apply arithmetic to each element:

# Build a list of the squares of the numbers 0-9
>>> squared = [x*x for x in range(10)] 
>>> squared
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

It can also apply logical tests and filtering:

# Keep only the strings in a list and convert them to lowercase
>>> existed = ['Student', 'Catch', 'new', 13, None]
>>> generated = [x.lower() for x in existed if isinstance(x, str)]
>>> generated
    ['student', 'catch', 'new']

Basic list slicing:

# Take the first 10 elements; the start index is included, the stop index is not
>>> s[0: 10] 
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

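Besides the start and stop indices used in basic slicing, a third argument (the step) extracts elements at regular intervals:

# Take every fifth element from the first 50 elements
# (s is rebuilt with 100 elements first, since the earlier s only has 10)
>>> s = list(range(100))
>>> s[0: 50: 5]
    [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]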
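A few more slice forms follow from the same start/stop/step rule. This is a small sketch rather than a complete tour; the variable names are chosen only for illustration, and negative indices go slightly beyond what the text above covers.

```python
s = list(range(10))

last_three = s[-3:]      # negative indices count from the end
evens = s[::2]           # omitted start/stop default to the whole list
reversed_copy = s[::-1]  # a negative step walks backwards

print(last_three)     # [7, 8, 9]
print(evens)          # [0, 2, 4, 6, 8]
print(reversed_copy)  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```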

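Copying a list

Because a list is a mutable data structure, a function that receives a list can modify its values. To preserve the original list, pass a copy when calling the function: function_name(list_name[:]). Note that assigning the list to a second variable name does not create a copy; it only adds another reference to the same list, so a function that receives the new name can still modify the values in the original list.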
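The difference between an alias and a copy can be sketched as follows. The helper clear_first is hypothetical and exists only for this illustration. Keep in mind that [:] makes a shallow copy, so nested mutable elements would still be shared.

```python
def clear_first(items):
    """Hypothetical helper: overwrite the first element in place."""
    items[0] = None

original = [1, 2, 3]
alias = original            # a second name for the same list, not a copy

clear_first(original[:])    # the function only sees a shallow copy
after_copy = list(original)

clear_first(alias)          # the function mutates the list both names share
after_alias = list(original)

print(after_copy)   # [1, 2, 3]
print(after_alias)  # [None, 2, 3]
```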

Comprehension syntax also works when building a set; the only difference is that the expression is wrapped in { } instead of [ ].
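As a minimal sketch of a set comprehension (the word list is made up for the example), duplicates collapse automatically because the result is a set:

```python
words = ['Student', 'student', 'Catch', 'new', 'NEW']

# Same syntax as a list comprehension, but with braces
unique_lower = {w.lower() for w in words}

# A set is unordered, so sort it before printing for a stable result
print(sorted(unique_lower))  # ['catch', 'new', 'student']
```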

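Formatting strings

str.format() has many useful string-formatting features built in. It formats with '{ }'.format(...) in place of the older % operator. The % style was once expected to be deprecated but is still supported; even so, str.format() is the form recommended here for new code. The syntax is template.format(p0, p1, ..., k1=v1, k2=v2): the { } fields in the template specify what to print and in which format, and the arguments passed to format() supply the values substituted into the final string.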

>>> '{:,}'.format(123456)
    '123,456'

The following examples come from the official Python documentation:

>>> '{0}, {1}, {2}'.format('a', 'b', 'c') # this indexing is highly recommended
'a, b, c'
>>> '{}, {}, {}'.format('a', 'b', 'c')  # 3.1+ only
'a, b, c'
>>> '{2}, {1}, {0}'.format('a', 'b', 'c') # you can shuffle the data
'c, b, a'
>>> '{2}, {1}, {0}'.format(*'abc')      # unpacking argument sequence
'c, b, a'
>>> '{0}{1}{0}'.format('abra', 'cad')   # arguments' indices can be repeated
'abracadabra'
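Positional and keyword arguments can be mixed in one template, and each field can carry its own format specification after a colon. The sketch below uses made-up values and names (name, score, ratio) purely for illustration:

```python
# {0} is the first positional argument; the others are keyword arguments
template = '{name} scored {score:.1f} out of {0} ({ratio:.0%})'
message = template.format(50, name='Alice', score=42.5, ratio=0.85)
print(message)  # Alice scored 42.5 out of 50 (85%)

# '>8' right-aligns the value in a field that is 8 characters wide
padded = '[{:>8}]'.format('abc')
print(padded)   # [     abc]
```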

For a complete tutorial on string formatting, see the Python Guru article.

break & continue

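In an ordinary loop, the loop condition is evaluated before every pass to decide whether the loop continues. To control execution from tests inside the loop body, use break and continue:

break terminates the loop, and execution resumes at the statement that follows the loop
continue skips the rest of the current iteration, and execution returns to the loop condition to decide whether to keep looping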
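The two statements can be seen side by side in a short loop. This is only an illustrative sketch; the numbers and the name found are arbitrary.

```python
found = []
for n in range(1, 20):
    if n % 2 == 0:
        continue   # skip even numbers and move on to the next iteration
    if n > 9:
        break      # leave the loop entirely at the first odd number above 9
    found.append(n)

print(found)  # [1, 3, 5, 7, 9]
```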

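Exception handling

A running program often terminates for all sorts of reasons. If the likely exceptions can be anticipated, exception handling keeps the program from stopping the moment one occurs, and the raise statement can be used to raise an exception that reports the cause.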

try:
    # do something
    pass

except ValueError:
    # handle ValueError exception
    pass

except (TypeError, ZeroDivisionError):
    # handle multiple exceptions
    # TypeError and ZeroDivisionError
    pass

except:
    # handle all other exceptions, then re-raise the one being handled
    raise

finally:
    # the finally part is optional; if present, it runs no matter what happened
    pass
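The skeleton above can be filled in as a small runnable sketch. The function safe_divide and the log list are hypothetical names introduced here only to show which branches run; they are not part of any library.

```python
log = []  # records which branches ran, for illustration only

def safe_divide(a, b):
    """Hypothetical helper: divide a by b, handling the anticipated errors."""
    try:
        return a / b
    except ZeroDivisionError:
        log.append('division by zero')
        return None
    except TypeError as exc:
        # Re-raise as a different exception that explains the cause
        raise ValueError('operands must be numbers') from exc
    finally:
        # Runs on every path: normal return, handled error, or re-raise
        log.append('done')

print(safe_divide(6, 3))  # 2.0
print(safe_divide(1, 0))  # None
try:
    safe_divide('a', 2)
except ValueError as exc:
    print(exc)            # operands must be numbers
print(log)  # ['done', 'division by zero', 'done', 'done']
```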

Besides the built-in exceptions, you can define your own. A simple example with user-defined exception classes:

# define Python user-defined exceptions
class Error(Exception):
   """Base class for other exceptions"""
   pass

class ValueTooSmallError(Error):
   """Raised when the input value is too small"""
   pass

class ValueTooLargeError(Error):
   """Raised when the input value is too large"""
   pass

# our main program
# user guesses a number until he/she gets it right

# you need to guess this number
number = 10

while True:
   try:
       i_num = int(input("Enter a number: "))
       if i_num < number:
           raise ValueTooSmallError
       elif i_num > number:
           raise ValueTooLargeError
       break
   except ValueTooSmallError:
       print("This value is too small, try again!")
       print()
   except ValueTooLargeError:
       print("This value is too large, try again!")
       print()

print("Congratulations! You guessed it correctly.")

A sample run:

Enter a number: 12
This value is too large, try again!

Enter a number: 0
This value is too small, try again!

Enter a number: 8
This value is too small, try again!

Enter a number: 10
Congratulations! You guessed it correctly.

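Iterators

In Python, an iterator is an object whose elements can be traversed in a loop; note that it returns one element at a time. Any data structure from which an iterator can be built is called an iterable, for example list, tuple and string.

To build an iterator by hand, add two methods, __iter__() and __next__(), to an ordinary class definition to describe how it iterates:

__iter__() returns the iterator object itself; any initialization can also be done in this method
__next__() returns the next element of the sequence and raises StopIteration once the elements are exhausted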

class PowTwo:
    """An example class for implementing an iterable of powers of two."""
    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n < self.max:
            result = 2 ** self.n
            self.n += 1
            return result
        else:
            raise StopIteration

# Initialize an iterator
example = iter(PowTwo(4))
next(example) # return 1
next(example) # return 2
next(example) # return 4
next(example) # return 8
next(example) # raise exception

The for-loop version of the element access above is:

for element in PowTwo(4):
   print(element)

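The code above shows the two ways of getting elements out of an iterable:

Build an iterator from the iterable with my_iterator = iter(some_iterable); iter() calls __iter__() internally to construct the iterator. Then step through the elements with next(my_iterator). Likewise, next() calls __next__() internally, so next(obj) is equivalent to obj.__next__().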

my_list = [4, 7, 0, 3]
my_iterator = iter(my_list)
print(next(my_iterator)) # the same as `my_iterator.__next__()`
4

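Skip building the iterator yourself and loop over the iterable directly with for element in iterable. Internally this creates the iterator with iter() and fetches the elements one by one with next(); it works roughly as follows: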

# Create an iterator object from that iterable
iter_obj = iter(iterable)
# infinite loop
while True:
    try:
        # Get the next item
        element = next(iter_obj)
        # Do something with the element
    except StopIteration:
        # If StopIteration is raised, break from loop
        break

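Generators

Compared with defining __iter__() and __next__() by hand, a simpler and more common way to build an iterator is a generator function. In the simplest case, you write an ordinary function and replace return with yield.

The PowTwo class from the iterator section can be rewritten as: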

def pow_two(max = 0):
    n = 0
    while n < max:
        yield 2 ** n
        n += 1

A generator function differs from a normal function in the following ways:

A generator function contains one or more yield statements.

When called, it returns an object (iterator) but does not start execution immediately.

Methods like __iter__() and __next__() are implemented automatically. So we can iterate through the items using next().

Once the function yields, the function is paused and the control is transferred to the caller.

Local variables and their states are remembered between successive calls.

Finally, when the function terminates, StopIteration is raised automatically on further calls.
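The points above can be checked against the pow_two generator defined earlier. It is repeated here so that the sketch runs on its own; the variable names are arbitrary.

```python
def pow_two(max=0):
    n = 0
    while n < max:
        yield 2 ** n   # pause here and hand the value to the caller
        n += 1         # resumes from this line on the next call to next()

gen = pow_two(3)           # calling the function runs none of its body yet
is_own_iterator = iter(gen) is gen

first = next(gen)          # runs until the first yield
second = next(gen)         # local variable n was remembered between calls
rest = list(gen)           # consumes whatever is left

try:
    next(gen)
    exhausted = False
except StopIteration:      # raised automatically once the function returns
    exhausted = True

print(is_own_iterator, first, second, rest, exhausted)
# True 1 2 [4] True
```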

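A list comprehension builds a list quickly, but when the list would be very long and its elements can be computed by some rule (possibly without end), a generator expression is recommended instead. It builds a generator, which avoids holding every element in memory, and the elements are then accessed with a loop or with next(). To write a generator expression, change the [ ] of a list comprehension to ( ). The main difference is that a list comprehension builds the whole list at once, whereas a generator expression produces one element at a time: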

# Note the difference from a list comprehension
>>> g = (x for x in range(10))
>>> g
<generator object <genexpr> at 0x10749e308>
>>> next(g)
0
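A rough way to see the difference is to compare the size of the two container objects. This is only a sketch: sys.getsizeof reports the size of the list or generator object itself (not of the integers it refers to), and the exact numbers vary by Python version, so only the comparison is shown.

```python
import sys

n = 100000
squares_list = [x * x for x in range(n)]   # all elements exist right away
squares_gen = (x * x for x in range(n))    # elements are produced on demand

generator_is_smaller = sys.getsizeof(squares_gen) < sys.getsizeof(squares_list)
print(generator_is_smaller)  # True

# A generator can be consumed only once
total = sum(squares_gen)
print(total == sum(squares_list))      # True
print(next(squares_gen, 'exhausted'))  # exhausted
```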

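Lambda functions

A lambda function is what other languages call an anonymous function: it does not need a name of its own. It is typically used for a function that the program will not need a second time. Its form is:

lambda arguments: expression using the arguments
# Quick addition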
>>> addition = lambda x, y: x + y
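A lambda is most useful when passed directly to another function, for example as a sort key. The sketch below uses made-up data; sorted() and its key parameter are standard built-ins.

```python
addition = lambda x, y: x + y
print(addition(2, 3))  # 5

# Sort (name, count) pairs by the count without defining a named function
pairs = [('pear', 3), ('apple', 7), ('fig', 1)]
by_count = sorted(pairs, key=lambda pair: pair[1])
print(by_count)  # [('fig', 1), ('pear', 3), ('apple', 7)]
```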

enumerate() and items()

enumerate() takes any iterable, such as a list, tuple, string or set, and returns each element's index together with its value.

>>> a = [2, 3, 5, 7]
>>> for index, value in enumerate(a):
...     print(index, value)
...
0 2
1 3
2 5
3 7

Applied to a dictionary, enumerate() returns each key's index and the key itself, not the key-value pairs. To loop over a dictionary's keys and values, use dict.items():

>>> b = {'a': 1, 'e': 3}
>>> for key, value in b.items():
...     print(key, value)
...
a 1
e 3
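The two functions can also be combined. The following sketch assumes Python 3.7 or later, where dictionaries keep insertion order; start is a standard optional parameter of enumerate().

```python
b = {'a': 1, 'e': 3}

# enumerate() on a dict walks its keys only
indexed_keys = list(enumerate(b))
print(indexed_keys)  # [(0, 'a'), (1, 'e')]

# Combine with items() to number the key-value pairs, counting from 1
numbered = [(i, key, value) for i, (key, value) in enumerate(b.items(), start=1)]
print(numbered)      # [(1, 'a', 1), (2, 'e', 3)]
```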

Decorators

A decorator is applied to an existing function to modify or extend what it does, a pattern that is very common in the source code of large projects. A typical use case is a function defined early on that no longer meets later requirements: besides editing the function's source directly, you can decorate it with a function defined elsewhere.

A decorator function adds new functionality to the original function. This is similar to packing a gift. The decorator acts as a wrapper. The nature of the object that got decorated (actual gift inside) does not alter. But now, it looks pretty (since it got decorated).

Generally, we decorate a function and reassign it as ordinary = make_pretty(ordinary). This is a common construct and for this reason, Python has a syntax to simplify this.

We can use the @ symbol along with the name of the decorator function and place it above the definition of the function to be decorated. For example,

@make_pretty
def ordinary():
    print("I am ordinary")

is equivalent to

def ordinary():
    print("I am ordinary")
ordinary = make_pretty(ordinary)
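The text refers to make_pretty without showing its definition. One possible definition is sketched below; the body, including the printed message, is an assumption made for illustration, and functools.wraps is added so that the decorated function keeps its original name.

```python
import functools

def make_pretty(func):
    """Assumed definition: print a line, then call the wrapped function."""
    @functools.wraps(func)          # preserve func.__name__ and docstring
    def inner(*args, **kwargs):
        print("I got decorated")
        return func(*args, **kwargs)
    return inner

@make_pretty
def ordinary():
    print("I am ordinary")

ordinary()
# I got decorated
# I am ordinary
print(ordinary.__name__)  # ordinary
```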

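A very common decorator is @property; the Python @property tutorial listed in the references at the end of this page explains it clearly. Note that property is a built-in name in Python (not a reserved keyword), so avoid reusing it as a variable name, which would shadow the built-in.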
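As a brief sketch of what @property does, the class below exposes a validated attribute and a computed one. The Celsius class and its attribute names are illustrative assumptions, not taken from the text.

```python
class Celsius:
    def __init__(self, temperature=0.0):
        self.temperature = temperature   # goes through the setter below

    @property
    def temperature(self):
        """Read access: c.temperature"""
        return self._temperature

    @temperature.setter
    def temperature(self, value):
        """Write access: c.temperature = value, with validation"""
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._temperature = value

    @property
    def fahrenheit(self):
        """Computed, read-only attribute"""
        return self._temperature * 9 / 5 + 32

c = Celsius(25)
print(c.temperature)  # 25
print(c.fahrenheit)   # 77.0
```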

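Assertions

While writing a program, we often need a way for it to check itself as it runs, to reduce hidden bugs or to catch errors as early as possible. In Python this self-check is easy to write as assert condition or assert condition, "message to print when the condition is false". The only advice here: use it!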
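A minimal sketch of both forms follows; the function mean is a made-up example. Note that assert statements are skipped when Python runs with the -O option, so they suit internal self-checks rather than validation of user input.

```python
def mean(values):
    # Fails loudly, with a message, if the precondition does not hold
    assert len(values) > 0, "values must not be empty"
    result = sum(values) / len(values)
    # Form without a message: a sanity check on the result
    assert min(values) <= result <= max(values)
    return result

print(mean([2, 4, 6]))  # 4.0

try:
    mean([])
except AssertionError as exc:
    message = str(exc)
    print(message)      # values must not be empty
```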

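Special methods

To let a user-defined class easily support many of the operations that Python's built-in classes have, such as slicing, arithmetic and custom printing, the designers of Python built in a number of special methods to choose from. These methods usually start and end with a double underscore. Some commonly used special methods and what they do:

__str__(): defines how an instance of the class is printed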

class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
    
    def __str__(self):
        return "({0},{1})".format(self.x,self.y)

p = Point(2, 3)
print(p)
(2,3)

__add__(), __sub__(): let instances of the class be added and subtracted with + and -
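Extending the Point class shown earlier, the arithmetic methods could look like the sketch below. Returning a new Point from each operation is a design choice assumed here, and __eq__ is added only so that results can be compared.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "({0},{1})".format(self.x, self.y)

    def __add__(self, other):
        # Called for: self + other
        return Point(self.x + other.x, self.y + other.y)

    def __sub__(self, other):
        # Called for: self - other
        return Point(self.x - other.x, self.y - other.y)

    def __eq__(self, other):
        # Called for: self == other
        return (self.x, self.y) == (other.x, other.y)

p1 = Point(2, 3)
p2 = Point(1, 1)
print(p1 + p2)  # (3,4)
print(p1 - p2)  # (1,2)
```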

2to3

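As part of defensive programming, Google's Python style guide explicitly recommends placing the following three lines at the top of every source file. They change behavior only on Python 2; in Python 3 these features are already the default: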

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

Further reading

Python string formatting
Python Iterators
Python Generators
Python @property
Python Pickle
Python special method names
