How to Write Elegant Code: Start with the Details

Elegant code all looks alike; every piece of ugly code is ugly in its own way.

The time-saving `defaultdict`

Suppose we need to count how many times each supplier appears in a blacklist:

black_list = [1001, 1002, 1003, 1001, 1005, 1002]

The first idea is to store the result in a dict. That is a sound idea, and it leads to the following code:

result = {}
for supplier_id in black_list:
    if supplier_id in result:
        result[supplier_id] += 1
    else:
        result[supplier_id] = 1

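The if statement makes this tedious. Switch to defaultdict, where `defaultdict(int)` creates any missing key with a value of 0: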

from collections import defaultdict
default_result = defaultdict(int)

for supplier_id in black_list:
    default_result[supplier_id] += 1
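The same trick works with other default factories. Here is a minimal sketch that groups values with `defaultdict(list)`; the `records` data is made up for illustration and is not part of the original requirement.

```python
from collections import defaultdict

# Hypothetical (supplier_id, reason) records, invented for this example.
records = [(1001, "fraud"), (1002, "spam"), (1001, "late delivery")]

reasons_by_supplier = defaultdict(list)
for supplier_id, reason in records:
    # A missing key is created as an empty list on first access.
    reasons_by_supplier[supplier_id].append(reason)
```

No `if supplier_id in ...` check is needed here either.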

The delightful `Counter`

Same requirement as before, and one line of code solves it:

from collections import Counter
counter = Counter(black_list)
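Once the counts exist, `Counter` offers a few conveniences on top of a plain dict. A short sketch using the same `black_list`:

```python
from collections import Counter

black_list = [1001, 1002, 1003, 1001, 1005, 1002]
counter = Counter(black_list)

# most_common(n) returns the n most frequent (element, count) pairs.
top_two = counter.most_common(2)

# A missing key returns 0 instead of raising KeyError.
missing = counter[9999]
```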

Replace lists with generators

A list holds all of its elements in memory at once. When the list is very large, a generator can take its place.

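# On Python 3, guppy is provided by the guppy3 package.
from guppy import hpy

hp = hpy()
l = [x for x in range(1000000)]
for _ in l:
    pass
print(hp.heap())  # heap snapshot while the full list is still alive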


from guppy import hpy

hp = hpy()
def generator(x):
    # Yield one value at a time instead of building the whole list.
    while x < 1000000:
        yield x
        x += 1
for _ in generator(0):
    pass
print(hp.heap())


In this measurement the list version used about 31 MB of memory, while the generator version used only about 3 MB.
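If guppy is not available, a rough comparison is possible with the standard library alone. The sketch below uses a generator expression and `sys.getsizeof`; note that `getsizeof` measures only the container object, not the integers it refers to, so the numbers will differ from the heap figures above.

```python
import sys

numbers_list = [x for x in range(1000000)]   # builds every element now
numbers_gen = (x for x in range(1000000))    # builds elements on demand

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

# A generator can be consumed only once.
total = sum(numbers_gen)
```

The trade-off is that a generator cannot be indexed or iterated a second time.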

List unpacking

A common task at work is parsing log files to analyze user access information. Here is one line from such a log:

[pid: 19069|app 10.10.158.30 GET /attendance/city/list => generated 3189 bytes in 32 msecs HTTP/1.0 200

To extract the IP, request method, API, and status, you could write:

with open('log.txt') as log:
    lines = log.readlines()
    for line in lines:
        ip = line.split(' ')[2]
        method = line.split(' ')[3]
        api = line.split(' ')[4]
        status = line.split(' ')[13]

Written like this (Python 3 only), it is much clearer:

with open('log.txt') as log:
    lines = log.readlines()
    for line in lines:
        # split() with no argument also drops the trailing newline
        _, _, ip, method, api, *other = line.split()
        status = other[-1]  # the last field is the status code
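To see the unpacking in isolation, here is a small sketch that runs on the sample log line directly, without a file. It places the starred name in the middle so that the last field can be captured by name as well.

```python
line = ("[pid: 19069|app 10.10.158.30 GET /attendance/city/list "
        "=> generated 3189 bytes in 32 msecs HTTP/1.0 200\n")

# The starred *_ soaks up every field between api and status.
_, _, ip, method, api, *_, status = line.split()
```

Only one starred target is allowed per assignment, so this works as long as the fields of interest sit at the two ends of the line.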

Sets are a good thing

If you want to compare two lists and find the elements that are missing from the other one, would you do it like this?

new = [11, 22, 33, 44, 55]
old = [22, 55, 66, 77]

d1, d2 = [], []
for item in new:
    if item not in old:
        d1.append(item)
for item in old:
    if item not in new:
        d2.append(item)

Or like this?

new = [11, 22, 33, 44, 55]
old = [22, 55, 66, 77]
d1 = list(set(new).difference(old))
d2 = list(set(old).difference(new))
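The set operators read even more compactly. A small sketch with the same data; `sorted` is used because sets have no guaranteed order.

```python
new = [11, 22, 33, 44, 55]
old = [22, 55, 66, 77]

only_in_new = sorted(set(new) - set(old))      # difference
only_in_old = sorted(set(old) - set(new))
either_not_both = sorted(set(new) ^ set(old))  # symmetric difference
```

Keep in mind that converting to a set drops duplicates, which may or may not be what you want.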

Make good use of `enumerate`

When inserting database records, you often add many rows at once and need to commit them in batches, like this:

index = 0
for order in orders:
    insert = order()
    db.session.add(insert)
    index += 1
    if index % 100 == 0:
        db.session.commit()  # commit every 100 added records
db.session.commit()  # commit the final partial batch

Or like this?

for index, order in enumerate(orders, start=1):
    insert = order()
    db.session.add(insert)
    if index % 100 == 0:
        db.session.commit()  # commit every 100 added records
db.session.commit()  # commit the final partial batch
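With batch commits it is easy to be off by one or to forget the last partial batch. Below is a minimal sketch of the pattern that can be run without a database; `FakeSession` and `insert_in_batches` are hypothetical names invented here, standing in for `db.session` and your own code.

```python
class FakeSession:
    """Hypothetical stand-in for db.session; it only records batch sizes."""

    def __init__(self):
        self.pending = []
        self.batches = []

    def add(self, item):
        self.pending.append(item)

    def commit(self):
        if self.pending:
            self.batches.append(len(self.pending))
            self.pending = []


def insert_in_batches(session, items, batch_size=100):
    # Counting from 1 makes "every batch_size items" a simple modulo check.
    for index, item in enumerate(items, start=1):
        session.add(item)
        if index % batch_size == 0:
            session.commit()
    session.commit()  # commit whatever is left in the final batch


session = FakeSession()
insert_in_batches(session, range(250))
```

For 250 items this should produce two full batches and one partial batch.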

Chained comparisons

x = 10
if 10 < x < 20:
    print(x)

Don't write it like this:

x = 10
if x > 10 and x < 20:
    print(x)
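Besides being shorter, a chained comparison evaluates the middle expression only once. A small sketch to illustrate; `read_value` is a hypothetical helper that records how often it is called.

```python
calls = []

def read_value():
    # Hypothetical helper: remembers each call and returns a fixed value.
    calls.append(1)
    return 15

# Equivalent to (10 < v) and (v < 20), with v computed a single time.
in_range = 10 < read_value() < 20
```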

Ternary expressions

Ha, the classic way:

if a != 0:
    b = a
else:
    b = 1

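In Python you can write it like this (note that `a or 1` falls back to 1 for any falsy `a`, such as `None` or an empty string, not only for `0`):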

b = a or 1
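When the condition has to be exactly `a != 0`, Python's conditional expression is the closer match. A short sketch comparing the two forms; the function names are made up for this example.

```python
def with_ternary(a):
    # Direct translation of the if/else above.
    return a if a != 0 else 1

def with_or(a):
    # Falls back to 1 for every falsy value, not just 0.
    return a or 1
```

The two agree for numbers, but differ for values such as `None` or `""`.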

Reversing a list

Reversing a list is another common need in development. Written like this, it gets more complicated than necessary:

l = [1, 2, 3, 4, 5]
new, length = [], len(l)
for i in range(length - 1, -1, -1):
    new.append(l[i])

Write it like this instead:

new = l[::-1]
# or (reversed() returns an iterator, so wrap it in list())
new = list(reversed(l))
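The three built-in ways to reverse behave slightly differently, which matters when you need an actual list. A quick sketch:

```python
l = [1, 2, 3, 4, 5]

sliced = l[::-1]          # a new list; l is untouched
iterator = reversed(l)    # a lazy iterator, not a list
from_iterator = list(iterator)

l.reverse()               # reverses l in place and returns None
```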

A few more useful tips

`for ... else...`

The else clause runs only when the for loop has gone through every element without hitting a break.

for i in range(1, 10):
    if i == 0:
        break
else:
    print('no 0 found')
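A typical use is a search loop, where the else branch handles the "not found" case without a flag variable. A minimal sketch; `find_first_negative` is a name invented for this example.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Reached only when the loop finished without break.
        result = None
    return result
```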

Repeating a list

a = [1,2,3] * 3
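One caveat worth knowing: `*` copies references, not objects. That is harmless for numbers, but with nested lists every "row" ends up being the same list. A short sketch:

```python
a = [1, 2, 3] * 3

# All three rows are the same inner list object.
shared = [[0] * 2] * 3
shared[0][0] = 9

# A comprehension builds a fresh inner list for each row.
independent = [[0] * 2 for _ in range(3)]
independent[0][0] = 9
```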

Variable arguments

def func(*args, **kwargs):
    print(args, kwargs)

a, b = [1,2], dict(c=1, d=2)
func(*a, **b)
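To make it visible what the function actually receives, here is a small sketch that returns the collected arguments instead of printing them; `describe` is a name made up for this example.

```python
def describe(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict.
    return args, kwargs

a, b = [1, 2], dict(c=1, d=2)
positional, keyword = describe(*a, **b)
```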

Swapping two values

a, b = b, a

If a decorator is a class, it needs to implement the `__call__` method

from functools import wraps

class DecorateClass(object):
    def __init__(self, message):
        self.message = message

    def __call__(self, f):
        @wraps(f)
        def _decorate(*args, **kwargs):
            print(self.message)
            return f(*args, **kwargs)
        return _decorate


@DecorateClass('hello world')
def func(*args, **kwargs):
    print(args, kwargs)

func(1, 2, a=1)
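The example above is a decorator that takes an argument, so `__init__` receives the message and `__call__` receives the function. For a class decorator without arguments the roles shift: `__init__` receives the function and `__call__` runs on every invocation. A minimal sketch, with `CountCalls` as a made-up example:

```python
from functools import update_wrapper


class CountCalls:
    def __init__(self, f):
        # Copy __name__, __doc__ and friends from f onto this instance.
        update_wrapper(self, f)
        self.f = f
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.f(*args, **kwargs)


@CountCalls
def add(x, y):
    return x + y


first = add(1, 2)
second = add(3, 4)
```

Because the decorated name is now an instance, it can carry state such as `add.calls`.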

Feel free to share this and suggest improvements.