Effective Python: Notes and Excerpts 5.2

Metaclasses and Attributes

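Metaclasses are often mentioned, but few people know how to use them in practice. Simply put, a metaclass lets you intercept Python's class statement and provide special behavior each time a class is defined.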

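Dynamic attributes let you override attribute access on objects, which can cause unexpected side effects. Metaclasses can create very strange behavior, so it is best to write code that is easy to understand and does not surprise its readers.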

  • Item 44: Use Plain Attributes Instead of Setter and Getter Methods

Programmers coming from languages such as Java often implement explicit getter and setter methods to access internal attributes:

class OldResistor:
    def __init__(self, ohms):
        self._ohms = ohms
    def get_ohms(self):
        return self._ohms
    def set_ohms(self, ohms):
        self._ohms = ohms

These are simple to use, but they are not Pythonic:

r0 = OldResistor(50e3)
print('Before:', r0.get_ohms())
r0.set_ohms(10e3)
print('After: ', r0.get_ohms())

>>>
Before: 50000.0
After: 10000.0

In-place operations, such as decrementing a value, become especially clumsy:

r0.set_ohms(r0.get_ohms() - 4e3)
assert r0.get_ohms() == 6e3

In Python, you do not need to define explicit getters and setters. Start with simple public attributes instead:

class Resistor:
    def __init__(self, ohms):
        self.ohms = ohms
        self.voltage = 0
        self.current = 0

r1 = Resistor(50e3)
r1.ohms = 10e3

In-place operations on the attribute then read naturally:

r1.ohms += 5e3

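If you later need special behavior when an attribute is set, you can use the @property decorator and its corresponding setter.
The class below subclasses Resistor and maintains its own voltage, updating current whenever voltage is assigned.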

class VoltageResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
        self._voltage = 0
    @property
    def voltage(self):
        return self._voltage
    @voltage.setter
    def voltage(self, voltage):
        self._voltage = voltage
        self.current = self._voltage / self.ohms

The setter then runs on plain attribute assignment:

r2 = VoltageResistance(1e3)
print(f'Before: {r2.current:.2f} amps')
r2.voltage = 10
print(f'After: {r2.current:.2f} amps')

>>>
Before: 0.00 amps
After: 0.01 amps

A setter can also perform type checking and value validation, for example allowing only resistances greater than zero:

class BoundedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
    @property
    def ohms(self):
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError(f'ohms must be > 0; got {ohms}')
        self._ohms = ohms

Assigning an invalid value raises an exception:

r3 = BoundedResistance(1e3)
r3.ohms = 0
>>>
Traceback ...
ValueError: ohms must be > 0; got 0

Passing an invalid value to the constructor fails as well:

BoundedResistance(-5)

>>>
Traceback ...
ValueError: ohms must be > 0; got -5

This happens because BoundedResistance.__init__ calls super().__init__, which in turn assigns self.ohms = ohms. That assignment triggers the @ohms.setter method, which validates the value.

You can even use @property to make an attribute from the parent class immutable:

class FixedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
    @property
    def ohms(self):
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        if hasattr(self, '_ohms'):
            raise AttributeError("Ohms is immutable")
        self._ohms = ohms

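During initialization _ohms does not exist yet, so the first assignment succeeds. Any later assignment finds _ohms already set and raises an error: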

r4 = FixedResistance(1e3)
r4.ohms = 2e3
>>>
Traceback ...
AttributeError: Ohms is immutable

Do not set other attributes in a getter:

class MysteriousResistor(Resistor):
    @property
    def ohms(self):
        self.voltage = self._ohms * self.current
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        self._ohms = ohms

Doing so leads to surprising behavior:

r7 = MysteriousResistor(10)
r7.current = 0.01
print(f'Before: {r7.voltage:.2f}')
r7.ohms
print(f'After: {r7.voltage:.2f}')
>>>
Before: 0.00
After: 0.10

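The best policy is to modify related object state only in the property's setter method.
The biggest shortcoming of @property is that the methods for an attribute can only be shared by subclasses. See Item 46 for more.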
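
As a small sketch of why the setter is the right place for validation and side effects: an augmented assignment such as `r.ohms += 5e3` reads through the getter and writes through the setter, so the check still runs. `CheckedResistor` and its `writes` counter are hypothetical names introduced only for this illustration.

```python
class CheckedResistor:
    """Hypothetical example class; not part of the original notes."""

    def __init__(self, ohms):
        self.writes = 0      # counts how often the setter accepted a value
        self.ohms = ohms     # goes through the setter below

    @property
    def ohms(self):
        return self._ohms

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError(f'ohms must be > 0; got {ohms}')
        self._ohms = ohms
        self.writes += 1


r = CheckedResistor(10e3)
r.ohms += 5e3                # getter first, then setter
print(r.ohms, r.writes)

try:
    r.ohms -= 20e3           # would become negative, so the setter rejects it
except ValueError as e:
    print('Rejected:', e)
```

Because the rejected assignment raises before `_ohms` is updated, the object keeps its last valid value.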


  • Item 45: Consider @property Instead of Refactoring Attributes

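@property also provides an important way to improve an interface over time.
For example, suppose a leaky bucket quota is implemented with plain Python objects.
The Bucket class represents how much quota remains and the period for which that quota is available: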

from datetime import datetime, timedelta

class Bucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.quota = 0
    def __repr__(self):
        return f'Bucket(quota={self.quota})'

The algorithm for filling the bucket:

def fill(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        bucket.quota = 0
        bucket.reset_time = now
    bucket.quota += amount

The algorithm for consuming quota:

def deduct(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        return False # Bucket hasn't been filled this period
    if bucket.quota - amount < 0:
        return False # Bucket was filled, but not enough
    bucket.quota -= amount
    return True # Bucket had enough, quota consumed

Fill the bucket:

bucket = Bucket(60)
fill(bucket, 100)
print(bucket)

>>>
Bucket(quota=100)

Consume some quota:

if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
print(bucket)

>>>
Had 99 quota
Bucket(quota=1)

If there is not enough quota, the level stays unchanged:

if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
print(bucket)
>>>
Not enough for 3 quota
Bucket(quota=1)

The problem with this implementation is that I never know what quota level the bucket started with.
A new bucket that tracks this:

class NewBucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0
    def __repr__(self):
        return (f'NewBucket(max_quota={self.max_quota}, '
                f'quota_consumed={self.quota_consumed})')

The current quota is the maximum quota minus the quota consumed so far:

    # Continues the body of class NewBucket
    @property
    def quota(self):
        return self.max_quota - self.quota_consumed

Assigning to quota goes through a setter that matches the interface used by fill and deduct:

    # Continues the body of class NewBucket
    @quota.setter
    def quota(self, amount):
        delta = self.max_quota - amount
        if amount == 0:
            # Quota being reset for a new period
            self.quota_consumed = 0
            self.max_quota = 0
        elif delta < 0:
            # Quota being filled for the new period
            assert self.quota_consumed == 0
            self.max_quota = amount
        else:
            # Quota being consumed during the period
            assert self.max_quota >= self.quota_consumed
            self.quota_consumed += delta

Running the demo code again:

bucket = NewBucket(60)
print('Initial', bucket)
fill(bucket, 100)
print('Filled', bucket)
if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
print('Now', bucket)
if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
print('Still', bucket)

>>>
Initial NewBucket(max_quota=0, quota_consumed=0)
Filled NewBucket(max_quota=100, quota_consumed=0)
Had 99 quota
Now NewBucket(max_quota=100, quota_consumed=99)
Not enough for 3 quota
Still NewBucket(max_quota=100, quota_consumed=99)

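@property lets you make incremental progress toward a better data model.
This works well while the public interface stays the same, but if you find yourself extending @property methods again and again, it is probably time to refactor the class.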
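
For reference, here is the whole NewBucket example assembled into one runnable piece, as a sketch of how the property and its setter sit inside the class body. The code is the same as in the fragments above; only the variable names `first` and `second` are additions for the demonstration.

```python
from datetime import datetime, timedelta


class NewBucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0

    def __repr__(self):
        return (f'NewBucket(max_quota={self.max_quota}, '
                f'quota_consumed={self.quota_consumed})')

    @property
    def quota(self):
        # Derived on the fly from the two stored values
        return self.max_quota - self.quota_consumed

    @quota.setter
    def quota(self, amount):
        delta = self.max_quota - amount
        if amount == 0:
            # Quota being reset for a new period
            self.quota_consumed = 0
            self.max_quota = 0
        elif delta < 0:
            # Quota being filled for the new period
            assert self.quota_consumed == 0
            self.max_quota = amount
        else:
            # Quota being consumed during the period
            assert self.max_quota >= self.quota_consumed
            self.quota_consumed += delta


def fill(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        bucket.quota = 0
        bucket.reset_time = now
    bucket.quota += amount


def deduct(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        return False  # Bucket hasn't been filled this period
    if bucket.quota - amount < 0:
        return False  # Bucket was filled, but not enough
    bucket.quota -= amount
    return True       # Bucket had enough, quota consumed


bucket = NewBucket(60)
fill(bucket, 100)
first = deduct(bucket, 99)
second = deduct(bucket, 3)
print(first, second, bucket)
```

Note that fill and deduct are unchanged from the original Bucket version, which is the point of the item: the data model improved without touching the callers.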


  • Item 46: Use Descriptors for Reusable @property Methods

For example, here is an implementation of homework grading:

class Homework:
    def __init__(self):
        self._grade = 0
    @property
    def grade(self):
        return self._grade
    @grade.setter
    def grade(self, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._grade = value

(@property makes this easy to implement.)

galileo = Homework()
galileo.grade = 95

For exam grades there may be several subjects, each with its own grade, and reuse becomes tedious:

class Exam:
    def __init__(self):
        self._writing_grade = 0
        self._math_grade = 0
    @staticmethod
    def _check_grade(value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')

Then comes the tedious property boilerplate:

    # Continues the body of class Exam
    @property
    def writing_grade(self):
        return self._writing_grade
    @writing_grade.setter
    def writing_grade(self, value):
        self._check_grade(value)
        self._writing_grade = value
    @property
    def math_grade(self):
        return self._math_grade
    @math_grade.setter
    def math_grade(self, value):
        self._check_grade(value)
        self._math_grade = value

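Reusing this grade check elsewhere means rewriting the same boilerplate every time.
A better approach is a descriptor. The descriptor protocol defines how the language interprets attribute access: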

class Grade:
    def __get__(self, instance, instance_type):
        ...
    def __set__(self, instance, value):
        ...

class Exam:
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()
exam = Exam()
# This assignment
exam.writing_grade = 40
# is interpreted as
Exam.__dict__['writing_grade'].__set__(exam, 40)
# Likewise, this access
exam.writing_grade
# is interpreted as
Exam.__dict__['writing_grade'].__get__(exam, Exam)

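This behavior comes from the __getattribute__ method of object: when an instance has no attribute with the requested name, Python falls back to the class attribute.
If that class attribute is an object with __get__ and __set__ methods, Python assumes you want the descriptor protocol.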

class Grade:
    def __init__(self):
        self._value = 0
    def __get__(self, instance, instance_type):
        return self._value
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._value = value

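However, this implementation is wrong: each Grade instance is a single class attribute, so its value is shared by all Exam instances: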

class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()
first_exam = Exam()
first_exam.writing_grade = 82
first_exam.science_grade = 99
print('Writing', first_exam.writing_grade)
print('Science', first_exam.science_grade)
>>>
Writing 82
Science 99
second_exam = Exam()
second_exam.writing_grade = 75
print(f'Second {second_exam.writing_grade} is right')
print(f'First {first_exam.writing_grade} is wrong; '
f'should be 82')
>>>
Second 75 is right
First 75 is wrong; should be 82

The descriptor should instead keep a separate value for each instance:

class Grade:
    def __init__(self):
        self._values = {}
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._values[instance] = value

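This is simple to implement, but it leaks memory: _values holds a reference to every instance ever passed to __set__, so those instances are never garbage collected.
The fix is the weakref built-in module. With a WeakKeyDictionary, Python manages the references itself and removes an entry once its key, the Exam instance, is no longer referenced anywhere else, so the dictionary ends up empty when no instances remain.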

from weakref import WeakKeyDictionary
class Grade:
    def __init__(self):
        self._values = WeakKeyDictionary()
    def __get__(self, instance, instance_type):
        ...
    def __set__(self, instance, value):
        ...

Now everything works as expected:

class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

first_exam = Exam()
first_exam.writing_grade = 82
second_exam = Exam()
second_exam.writing_grade = 75
print(f'First {first_exam.writing_grade} is right')
print(f'Second {second_exam.writing_grade} is right')
>>>
First 82 is right
Second 75 is right
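
The weakref version above elides the method bodies. As a minimal sketch, assuming the bodies are the same as in the dictionary-based Grade, the following shows the entry for an instance disappearing once that instance is deleted. The names `descriptor`, `tracked_before`, and `tracked_after` are introduced here only for the demonstration.

```python
import gc
from weakref import WeakKeyDictionary


class Grade:
    def __init__(self):
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    writing_grade = Grade()


first_exam = Exam()
second_exam = Exam()
first_exam.writing_grade = 82
second_exam.writing_grade = 75

descriptor = Exam.writing_grade       # class-level access returns the descriptor
tracked_before = len(descriptor._values)

del first_exam                        # drop the only strong reference
gc.collect()                          # usually unnecessary on CPython; kept for clarity
tracked_after = len(descriptor._values)
print(tracked_before, tracked_after)
```

With a plain dict in place of WeakKeyDictionary, the deleted instance would stay alive inside `_values` and the count would not drop.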

  • Item 47: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes

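For example, consider objects that represent records in a database. Code that uses the records must know what the database looks like, but in Python the code that connects objects to the database does not need to specify the record schema explicitly; it can be generic.
Python makes this dynamic behavior possible with the __getattr__ special method. If a class defines __getattr__, that method is called every time an attribute cannot be found in the object's instance dictionary: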

class LazyRecord:
    def __init__(self):
        self.exists = 5
    def __getattr__(self, name):
        value = f'Value for {name}'
        setattr(self, name, value)
        return value

When I access the missing attribute foo, Python calls __getattr__, which populates the instance dictionary:

data = LazyRecord()
print('Before:', data.__dict__)
print('foo: ', data.foo)
print('After: ', data.__dict__)
>>>
Before: {'exists': 5}
foo: Value for foo
After: {'exists': 5, 'foo': 'Value for foo'}

Here I add logging to observe the calls. The subclass uses super().__getattr__ to get the actual value:

class LoggingLazyRecord(LazyRecord):
    def __getattr__(self, name):
        print(f'* Called __getattr__({name!r}), '
                f'populating instance dictionary')
        result = super().__getattr__(name)
        print(f'* Returning {result!r}')
        return result
data = LoggingLazyRecord()
print('exists: ', data.exists)
print('First foo: ', data.foo)
print('Second foo: ', data.foo)
>>>
exists: 5
* Called __getattr__('foo'), populating instance dictionary
* Returning 'Value for foo'
First foo: Value for foo
Second foo: Value for foo

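As shown, __getattr__ is called only once, on the first access to foo.
This kind of lazy loading is especially useful for schemaless data.
Now suppose the database system also needs transactions: the next time a user accesses an attribute, I want to know whether the corresponding row is still valid and whether the transaction is still open.
Python has another object hook for this, called __getattribute__.
It is called every time an attribute is accessed on the object, even when the attribute already exists in the instance dictionary.
Note that this incurs significant overhead and hurts performance.
Here I log inside the method to observe each call: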

class ValidatingRecord:
    def __init__(self):
        self.exists = 5
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        try:
            value = super().__getattribute__(name)
            print(f'* Found {name!r}, returning {value!r}')
            return value
        except AttributeError:
            value = f'Value for {name}'
            print(f'* Setting {name!r} to {value!r}')
            setattr(self, name, value)
            return value

data = ValidatingRecord()
print('exists: ', data.exists)
print('First foo: ', data.foo)
print('Second foo: ', data.foo)

>>>
* Called __getattribute__('exists')
* Found 'exists', returning 5
exists: 5
* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
First foo: Value for foo
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Second foo: Value for foo

When an attribute should not be found, raise AttributeError, which is Python's standard signal for a missing attribute.
For example:

class MissingPropertyRecord:
    def __getattr__(self, name):
        if name == 'bad_name':
            raise AttributeError(f'{name} is missing')
        ...

data = MissingPropertyRecord()
data.bad_name

>>>
Traceback ...
AttributeError: bad_name is missing

Generic code often relies on the hasattr built-in function to check whether an attribute exists and on getattr to retrieve its value:

data = LoggingLazyRecord() # Implements __getattr__
print('Before: ', data.__dict__)
print('Has first foo: ', hasattr(data, 'foo'))
print('After: ', data.__dict__)
print('Has second foo: ', hasattr(data, 'foo'))
>>>
Before: {'exists': 5}
* Called __getattr__('foo'), populating instance dictionary
* Returning 'Value for foo'
Has first foo: True
After: {'exists': 5, 'foo': 'Value for foo'}
Has second foo: True

Again, __getattr__ is called only once.

data = ValidatingRecord() # Implements __getattribute__
print('Has first foo: ', hasattr(data, 'foo'))
print('Has second foo: ', hasattr(data, 'foo'))
>>>
* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
Has first foo: True
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Has second foo: True

Here, by contrast, __getattribute__ is called twice, once per hasattr call.

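Now, with __setattr__ (triggered by direct assignment or by the built-in setattr function), I can lazily push data back to the database whenever a value is assigned to the Python object: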

class SavingRecord:
    def __setattr__(self, name, value):
        # Save some data for the record
        ...
        super().__setattr__(name, value)

A subclass that logs each call:

class LoggingSavingRecord(SavingRecord):
    def __setattr__(self, name, value):
        print(f'* Called __setattr__({name!r}, {value!r})')
        super().__setattr__(name, value)
data = LoggingSavingRecord()
print('Before: ', data.__dict__)
data.foo = 5
print('After: ', data.__dict__)
data.foo = 7
print('Finally:', data.__dict__)

>>>
Before: {}
* Called __setattr__('foo', 5)
After: {'foo': 5}
* Called __setattr__('foo', 7)
Finally: {'foo': 7}
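
The "save some data" step in SavingRecord is elided in the notes. One possible shape for it, shown purely as a sketch, is to remember which attributes changed and hand them over in a batch later. `DirtyTrackingRecord`, `_dirty`, and `flush` are hypothetical names, and no real database API is involved.

```python
class DirtyTrackingRecord:
    """Hypothetical sketch: remember which attributes were assigned."""

    def __init__(self):
        # Bypass our own __setattr__ so '_dirty' itself is not tracked
        super().__setattr__('_dirty', set())

    def __setattr__(self, name, value):
        self._dirty.add(name)
        super().__setattr__(name, value)

    def flush(self):
        # Return the pending changes (a real system would write them out)
        pending = {name: getattr(self, name) for name in sorted(self._dirty)}
        self._dirty.clear()
        return pending


record = DirtyTrackingRecord()
record.foo = 5
record.bar = 1
record.foo = 7
changes = record.flush()
print(changes)
```

Using super().__setattr__ inside __init__ avoids touching `_dirty` before it exists, the same way the notes use super().__getattribute__ to avoid recursion.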

Suppose I want attribute access on my object to look up keys in an associated dictionary:

class BrokenDictionaryRecord:
    def __init__(self, data):
        self._data = {}
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        return self._data[name]

However, this recurses until the program fails:

data = BrokenDictionaryRecord({'foo': 3})
data.foo

>>>
* Called __getattribute__('foo')
* Called __getattribute__('_data')
* Called __getattribute__('_data')
* Called __getattribute__('_data')
...
Traceback ...
RecursionError: maximum recursion depth exceeded while
calling a Python object

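The cause is that accessing self._data inside __getattribute__ calls __getattribute__ again, which recurses without end.
Instead, fetch the attribute through super().__getattribute__ and return the result from that dictionary: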

class DictionaryRecord:
    def __init__(self, data):
        self._data = data
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        data_dict = super().__getattribute__('_data')
        return data_dict[name]

data = DictionaryRecord({'foo': 3})
print('foo: ', data.foo)

>>>
* Called __getattribute__('foo')
foo: 3
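
To tie the pieces of this item together, here is a sketch of a lazily loaded record backed by a plain dictionary that stands in for a database row. It uses __getattr__ for the first access, caches the value in the instance dictionary, and raises AttributeError for unknown names so that hasattr behaves correctly. `LazyRow`, `backing_row`, and `loads` are hypothetical names.

```python
class LazyRow:
    """Hypothetical record that loads each column on first access."""

    def __init__(self, backing_row):
        self._backing_row = backing_row   # stands in for a database row
        self.loads = 0                    # how many lazy loads happened

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has already failed
        if name.startswith('_'):
            raise AttributeError(name)    # never try to load private names
        try:
            value = self._backing_row[name]
        except KeyError:
            raise AttributeError(f'{name} is missing') from None
        self.loads += 1
        setattr(self, name, value)        # cache in the instance dictionary
        return value


row = LazyRow({'foo': 3})
print(row.foo, row.foo, row.loads)
print(hasattr(row, 'bar'))
```

The second access to `row.foo` is served from the instance dictionary, so __getattr__ does not run again.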

  • Item 48: Validate Subclasses with __init_subclass__

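A metaclass is defined by inheriting from type, and it constructs the class at runtime. The metaclass receives the contents of the associated class statement in its __new__ method: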

class Meta(type):
    def __new__(meta, name, bases, class_dict):
        print(f'* Running {meta}.__new__ for {name}')
        print('Bases:', bases)
        print(class_dict)
        return type.__new__(meta, name, bases, class_dict)

class MyClass(metaclass=Meta):
    stuff = 123
    def foo(self):
        pass

class MySubclass(MyClass):
    other = 567
    def bar(self):
        pass

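The metaclass has access to the name of the class, the parent classes it inherits from (bases), and all class attributes defined in the class body: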

>>>
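* Running <class '__main__.Meta'>.__new__ for MyClass
Bases: ()
{'__module__': '__main__',
'__qualname__': 'MyClass',
'stuff': 123,
'foo': <function MyClass.foo at 0x...>}
* Running <class '__main__.Meta'>.__new__ for MySubclass
Bases: (<class '__main__.MyClass'>,)
{'__module__': '__main__',
'__qualname__': 'MySubclass',
'other': 567,
'bar': <function MySubclass.bar at 0x...>}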
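
To make the role of __new__ more concrete: calling a metaclass directly with a name, a tuple of bases, and an attribute dictionary builds a class in roughly the same way a class statement does. The sketch below illustrates this; `created_by`, `Dynamic`, and `Declared` are hypothetical names used only here.

```python
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        cls = super().__new__(meta, name, bases, class_dict)
        cls.created_by = meta.__name__    # hypothetical marker attribute
        return cls


# Calling the metaclass directly with (name, bases, attributes)
Dynamic = Meta('Dynamic', (), {'stuff': 123})


# The class statement goes through the same Meta.__new__
class Declared(metaclass=Meta):
    stuff = 123


print(type(Dynamic), Dynamic.stuff, Dynamic.created_by)
print(type(Declared), Declared.stuff, Declared.created_by)
```

In both cases the resulting class is an instance of Meta, which is why the metaclass can inspect or reject a class before it is ever used.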

A main use is validating subclasses, for example checking that a polygon has enough sides:

class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
        # Only validate subclasses of the Polygon class
        if bases:
            if class_dict['sides'] < 3:
                raise ValueError('Polygons need 3+ sides')
        return type.__new__(meta, name, bases, class_dict)
class Polygon(metaclass=ValidatePolygon):
    sides = None # Must be specified by subclasses
    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180
class Triangle(Polygon):
    sides = 3
class Rectangle(Polygon):
    sides = 4
class Nonagon(Polygon):
    sides = 9

assert Triangle.interior_angles() == 180
assert Rectangle.interior_angles() == 360
assert Nonagon.interior_angles() == 1260

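Classes with three or more sides are accepted, but defining one with two sides raises an error as soon as the class body finishes: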

print('Before class')

class Line(Polygon):
    print('Before sides')
    sides = 2
    print('After sides')

print('After class')
>>>
Before class
Before sides
After sides
Traceback ...
ValueError: Polygons need 3+ sides

Python 3.6 introduced __init_subclass__, which provides the same capability without a metaclass:

class BetterPolygon:
    sides = None # Must be specified by subclasses
    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.sides < 3:
            raise ValueError('Polygons need 3+ sides')
    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180

class Hexagon(BetterPolygon):
    sides = 6

assert Hexagon.interior_angles() == 720

The code is shorter, and sides can be read directly from cls instead of from class_dict['sides'].

print('Before class')
class Point(BetterPolygon):
    sides = 1
print('After class')
>>>
Before class
Traceback ...
ValueError: Polygons need 3+ sides

A class can specify only one metaclass. Suppose I want a second metaclass that validates the fill color:

class ValidateFilled(type):
    def __new__(meta, name, bases, class_dict):
    # Only validate subclasses of the Filled class
        if bases:
            if class_dict['color'] not in ('red', 'green'):
                raise ValueError('Fill color must be supported')
        return type.__new__(meta, name, bases, class_dict)

class Filled(metaclass=ValidateFilled):
    color = None # Must be specified by subclasses

And then expect both validations to combine:

class RedPentagon(Filled, Polygon):
    color = 'red'
    sides = 5
>>>
Traceback ...
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

The only fix is a more complicated metaclass hierarchy:

class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
    # Only validate non-root classes
        if not class_dict.get('is_root'):
            if class_dict['sides'] < 3:
                raise ValueError('Polygons need 3+ sides')
        return type.__new__(meta, name, bases, class_dict)

class Polygon(metaclass=ValidatePolygon):
    is_root = True
    sides = None # Must be specified by subclasses

class ValidateFilledPolygon(ValidatePolygon):
    def __new__(meta, name, bases, class_dict):
    # Only validate non-root classes
        if not class_dict.get('is_root'):
            if class_dict['color'] not in ('red', 'green'):
                raise ValueError('Fill color must be supported')
        return super().__new__(meta, name, bases, class_dict)

class FilledPolygon(Polygon, metaclass=ValidateFilledPolygon):
    is_root = True
    color = None # Must be specified by subclasses

Every such class must then inherit from FilledPolygon:

class GreenPentagon(FilledPolygon):
    color = 'green'
    sides = 5

greenie = GreenPentagon()
assert isinstance(greenie, Polygon)

Both the color and the number of sides are validated:

class OrangePentagon(FilledPolygon):
    color = 'orange'
    sides = 5

>>>
Traceback ...
ValueError: Fill color must be supported

class RedLine(FilledPolygon):
    color = 'red'
    sides = 2

>>>
Traceback ...
ValueError: Polygons need 3+ sides

With __init_subclass__ instead:

class Filled:
    color = None # Must be specified by subclasses
    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.color not in ('red', 'green', 'blue'):
            raise ValueError('Fills need a valid color')

Composability is preserved (though you can still define a multi-level class hierarchy like the one above if you want):

class RedTriangle(Filled, BetterPolygon):
    color = 'red'
    sides = 3

ruddy = RedTriangle()
assert isinstance(ruddy, Filled)
assert isinstance(ruddy, BetterPolygon)

More tests:

print('Before class')
class BlueLine(Filled, BetterPolygon):
    color = 'blue'
    sides = 2
print('After class')

>>>
Before class
Traceback ...
ValueError: Polygons need 3+ sides

print('Before class')
class BeigeSquare(Filled, BetterPolygon):
    color = 'beige'
    sides = 4
print('After class')
>>>
Before class
Traceback ...
ValueError: Fills need a valid color
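
The following sketch puts the two __init_subclass__ validators side by side in runnable form and collects the error messages instead of letting the tracebacks stop the script. The `errors` list is a hypothetical helper for the demonstration.

```python
class BetterPolygon:
    sides = None  # Must be specified by subclasses

    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.sides < 3:
            raise ValueError('Polygons need 3+ sides')

    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180


class Filled:
    color = None  # Must be specified by subclasses

    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.color not in ('red', 'green', 'blue'):
            raise ValueError('Fills need a valid color')


class RedTriangle(Filled, BetterPolygon):
    color = 'red'
    sides = 3


errors = []

try:
    class BlueLine(Filled, BetterPolygon):
        color = 'blue'
        sides = 2      # too few sides
except ValueError as e:
    errors.append(str(e))

try:
    class BeigeSquare(Filled, BetterPolygon):
        color = 'beige'  # unsupported color
        sides = 4
except ValueError as e:
    errors.append(str(e))

print(errors)
```

Each validator calls super().__init_subclass__() before its own check, so both checks run for a class that inherits from both bases.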

It even handles complex cases such as diamond inheritance:

class Top:
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Top for {cls}')

class Left(Top):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Left for {cls}')

class Right(Top):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Right for {cls}')

class Bottom(Left, Right):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Bottom for {cls}')

>>>
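Top for <class '__main__.Left'>
Top for <class '__main__.Right'>
Top for <class '__main__.Bottom'>
Right for <class '__main__.Bottom'>
Left for <class '__main__.Bottom'>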

Top.__init_subclass__ is called only once for each class.


  • Item 49: Register Class Existence with __init_subclass__

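Another common use of metaclasses is to automatically register the types in a program. Registration is useful for reverse lookups, where a simple identifier must be mapped back to the corresponding class.
For example, consider serializing objects to JSON.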

import json
class Serializable:
    def __init__(self, *args):
        self.args = args
    def serialize(self):
        return json.dumps({'args': self.args})

Point objects then serialize without trouble:

class Point2D(Serializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
    def __repr__(self):
        return f'Point2D({self.x}, {self.y})'

point = Point2D(5, 3)
print('Object: ', point)
print('Serialized:', point.serialize())
>>>
Object: Point2D(5, 3)
Serialized: {"args": [5, 3]}

Next, the JSON needs to be deserialized to reconstruct the point:

class Deserializable(Serializable):
    @classmethod
    def deserialize(cls, json_data):
        params = json.loads(json_data)
        return cls(*params['args'])
class BetterPoint2D(Deserializable):
    ...
before = BetterPoint2D(5, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
after = BetterPoint2D.deserialize(data)
print('After: ', after)
>>>
Before: Point2D(5, 3)
Serialized: {"args": [5, 3]}
After: Point2D(5, 3)

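The problem is that this only works if you know the intended class (BetterPoint2D, Point2D) ahead of time. Ideally, a single function would accept a large JSON payload and construct the matching object for each entry: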

class BetterSerializable:
    def __init__(self, *args):
        self.args = args
    def serialize(self):
        return json.dumps({
        'class': self.__class__.__name__,
        'args': self.args,
        })
    def __repr__(self):
        name = self.__class__.__name__
        args_str = ', '.join(str(x) for x in self.args)
        return f'{name}({args_str})'

Classes can then be registered and deserialized by name:

registry = {}

def register_class(target_class):
    registry[target_class.__name__] = target_class

def deserialize(data):
    params = json.loads(data)
    name = params['class']
    target_class = registry[name]
    return target_class(*params['args'])

But register_class must be called for every class:

class EvenBetterPoint2D(BetterSerializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
register_class(EvenBetterPoint2D)

After that, any registered class can be deserialized from its JSON string:

before = EvenBetterPoint2D(5, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
after = deserialize(data)
print('After: ', after)
>>>
Before: EvenBetterPoint2D(5, 3)
Serialized: {"class": "EvenBetterPoint2D", "args": [5, 3]}
After: EvenBetterPoint2D(5, 3)

Forgetting to register a class causes problems:

class Point3D(BetterSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x = x
        self.y = y
        self.z = z
# Forgot to call register_class! Whoops!

Deserializing an instance of the unregistered class fails:

point = Point3D(5, 9, -4)
data = point.serialize()
deserialize(data)
>>>
Traceback ...
KeyError: 'Point3D'

A metaclass can guarantee that registration always happens:

class Meta(type):
    def __new__(meta, name, bases, class_dict):
        cls = type.__new__(meta, name, bases, class_dict)
        register_class(cls)
        return cls
class RegisteredSerializable(BetterSerializable, metaclass=Meta):
    pass

Every subclass now gets registered automatically:

class Vector3D(RegisteredSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x, self.y, self.z = x, y, z

before = Vector3D(10, -7, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
print('After: ', deserialize(data))
>>>
Before: Vector3D(10, -7, 3)
Serialized: {"class": "Vector3D", "args": [10, -7, 3]}
After: Vector3D(10, -7, 3)

Using __init_subclass__ is even better:

class BetterRegisteredSerializable(BetterSerializable):
    def __init_subclass__(cls):
        super().__init_subclass__()
        register_class(cls)
class Vector1D(BetterRegisteredSerializable):
    def __init__(self, magnitude):
        super().__init__(magnitude)
        self.magnitude = magnitude

before = Vector1D(6)
print('Before: ', before)
data = before.serialize()
print('Serialized: ', data)
print('After: ', deserialize(data))
>>>
Before: Vector1D(6)
Serialized: {"class": "Vector1D", "args": [6]}
After: Vector1D(6)

The examples above use __init_subclass__ in place of a metaclass to implement class registration.
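
As a compact end-to-end sketch of the same idea, the version below folds the registry, the base class, and deserialize into one runnable piece. It also turns the bare KeyError for an unknown class name into a more descriptive ValueError; that change, along with the names `RegisteredSerializable` (reused here with a different body) and `Vector2D`, is an assumption of this sketch and not something the notes prescribe.

```python
import json

registry = {}


class RegisteredSerializable:
    def __init__(self, *args):
        self.args = args

    def __init_subclass__(cls):
        super().__init_subclass__()
        registry[cls.__name__] = cls      # every subclass registers itself

    def serialize(self):
        return json.dumps({
            'class': type(self).__name__,
            'args': self.args,
        })

    def __repr__(self):
        args_str = ', '.join(str(x) for x in self.args)
        return f'{type(self).__name__}({args_str})'


def deserialize(data):
    params = json.loads(data)
    name = params['class']
    try:
        target_class = registry[name]
    except KeyError:
        # Assumption of this sketch: report unknown names explicitly
        raise ValueError(f'Unknown class {name!r}') from None
    return target_class(*params['args'])


class Vector2D(RegisteredSerializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x, self.y = x, y


restored = deserialize(Vector2D(1, 2).serialize())
print(restored)
```

The base class itself is not in the registry, because __init_subclass__ runs only for subclasses.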


  • Item 50: Annotate Class Attributes with __set_name__

Here, a descriptor class connects attributes to the column names of a database table:

class Field:
    def __init__(self, name):
        self.name = name
        self.internal_name = '_' + self.name
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)

Then define a customer class:

class Customer:
    # Class attributes
    first_name = Field('first_name')
    last_name = Field('last_name')
    prefix = Field('prefix')
    suffix = Field('suffix')

The attributes can be assigned directly:

cust = Customer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Euclid'
print(f'After: {cust.first_name!r} {cust.__dict__}')
>>>
Before: '' {}
After: 'Euclid' {'_first_name': 'Euclid'}

But this is redundant: the attribute name first_name is already given on the left-hand side, so why pass the same string to the Field constructor?

class Customer:
    # Left side is redundant with right side
    first_name = Field('first_name')
...

A metaclass can eliminate the redundancy:

class Meta(type):
    def __new__(meta, name, bases, class_dict):
        for key, value in class_dict.items():
            if isinstance(value, Field):
                value.name = key
                value.internal_name = '_' + key
        cls = type.__new__(meta, name, bases, class_dict)
        return cls

The metaclass extracts each key and assigns it to the corresponding Field. A base class for database rows then uses the metaclass:

class DatabaseRow(metaclass=Meta):
    pass

Finally, Field takes no constructor arguments:

class Field:
    def __init__(self):
        # These will be assigned by the metaclass.
        self.name = None
        self.internal_name = None
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)

In actual use, classes simply inherit from DatabaseRow:

class BetterCustomer(DatabaseRow):
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()
cust = BetterCustomer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Euler'
print(f'After: {cust.first_name!r} {cust.__dict__}')

>>>
Before: '' {}
After: 'Euler' {'_first_name': 'Euler'}

But forgetting to inherit from DatabaseRow breaks the code:

class BrokenCustomer:
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()

cust = BrokenCustomer()
cust.first_name = 'Mersenne'

>>>
Traceback ...
TypeError: attribute name must be string, not 'NoneType'

Python 3.6 introduced __set_name__, which can do the job of the metaclass's __new__:

class Field:
    def __init__(self):
        self.name = None
        self.internal_name = None
    def __set_name__(self, owner, name):
        # Called on class creation for each descriptor
        self.name = name
        self.internal_name = '_' + name
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)

Now it works without inheriting from a class that uses the metaclass:

class FixedCustomer:
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()
cust = FixedCustomer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Mersenne'
print(f'After: {cust.first_name!r} {cust.__dict__}')

>>>
Before: '' {}
After: 'Mersenne' {'_first_name': 'Mersenne'}
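
  • Metaclasses let you modify a class's attributes before the class is fully defined.
  • Descriptors and metaclasses make a powerful combination for declarative behavior and runtime introspection.
  • Define __set_name__ on descriptor classes so they can take into account the surrounding class and its attribute names.
  • Avoid memory leaks and the weakref built-in module by having descriptors store the data they manipulate directly in the instance dictionary of the class.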
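
One caveat worth a small sketch: __set_name__ is called automatically only while the class is being created. If a descriptor is attached to an existing class afterward, the hook has to be called by hand. The attribute `nickname` below is a hypothetical addition used only to show this.

```python
class Field:
    def __set_name__(self, owner, name):
        # Called automatically for descriptors present in the class body
        self.name = name
        self.internal_name = '_' + name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')

    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)


class Customer:
    first_name = Field()          # __set_name__ runs during class creation


# Attached after class creation: the hook does not run on its own
nickname = Field()
Customer.nickname = nickname
nickname.__set_name__(Customer, 'nickname')

cust = Customer()
cust.first_name = 'Euclid'
cust.nickname = 'Eu'
print(cust.__dict__)
```

Because the values live in each instance's own dictionary, no WeakKeyDictionary is needed here, which matches the last point in the list above.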

  • Item 51: Prefer Class Decorators over Metaclasses for Composable Class Extensions

Suppose I want to trace each function call, recording its name, arguments, and return value:

from functools import wraps

def trace_func(func):
    if hasattr(func, 'tracing'): # Only decorate once
        return func

    @wraps(func)
    def wrapper(*args, **kwargs):
        result = None
        try:
            result = func(*args, **kwargs)
            return result
        except Exception as e:
            result = e
            raise
        finally:
            print(f'{func.__name__}({args!r}, {kwargs!r}) -> '
                  f'{result!r}')

    wrapper.tracing = True
    return wrapper

Every method has to be decorated individually, which is tedious:

class TraceDict(dict):
    @trace_func
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
    @trace_func
    def __setitem__(self, *args, **kwargs):
        return super().__setitem__(*args, **kwargs)
    @trace_func
    def __getitem__(self, *args, **kwargs):
        return super().__getitem__(*args, **kwargs)
...
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass # Expected
>>>
__init__(({'hi': 1}, [('hi', 1)]), {}) -> None
__setitem__(({'hi': 1, 'there': 2}, 'there', 2), {}) -> None
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')

An alternative is to decorate every method automatically:

import types
trace_types = (
    types.MethodType,
    types.FunctionType,
    types.BuiltinFunctionType,
    types.BuiltinMethodType,
    types.MethodDescriptorType,
    types.ClassMethodDescriptorType)
class TraceMeta(type):
    def __new__(meta, name, bases, class_dict):
        klass = super().__new__(meta, name, bases, class_dict)
        for key in dir(klass):
            value = getattr(klass, key)
            if isinstance(value, trace_types):
                wrapped = trace_func(value)
                setattr(klass, key, wrapped)
        return klass

A metaclass can solve the problem this way:

class TraceDict(dict, metaclass=TraceMeta):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass # Expected
>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')

In principle this should work for any subclass, but it conflicts when a parent class already has its own metaclass:

class OtherMeta(type):
    pass
class SimpleDict(dict, metaclass=OtherMeta):
    pass
class TraceDict(SimpleDict, metaclass=TraceMeta):
    pass

>>>
Traceback ...
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

The only workaround is to make OtherMeta inherit from TraceMeta:

class TraceMeta(type):
    ...
class OtherMeta(TraceMeta):
    pass
class SimpleDict(dict, metaclass=OtherMeta):
    pass
class TraceDict(SimpleDict, metaclass=TraceMeta):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass # Expected

>>>
__init_subclass__((), {}) -> None
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')

A class decorator works like a function decorator, but it is applied to a class:

def my_class_decorator(klass):
    klass.extra_param = 'hello'
    return klass
@my_class_decorator
class MyClass:
    pass

print(MyClass)
print(MyClass.extra_param)

>>>
<class '__main__.MyClass'>
hello

Python provides class decorators for exactly this purpose. One can apply trace_func to every method of a class:

def trace(klass):
    for key in dir(klass):
        value = getattr(klass, key)
        if isinstance(value, trace_types):
            wrapped = trace_func(value) # Wrap the function
            setattr(klass, key, wrapped) # Replace the attribute with the wrapped version
    return klass
@trace
class TraceDict(dict):
    pass

trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass # Expected
>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')

Class decorators also work on classes that already have a metaclass:

class OtherMeta(type):
    pass

@trace
class TraceDict(dict, metaclass=OtherMeta):
    pass

trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass # Expected

>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'),{}) -> KeyError('does not exist')
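
  • A class decorator is a simple function that receives a class instance as a parameter and returns either a new class or a modified version of the original class.
  • Class decorators are useful when you want to modify every method or attribute of a class with minimal boilerplate.
  • Metaclasses cannot be composed together easily, while many class decorators can be used to extend the same class without conflicts.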
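
As a narrower sketch of the same technique, the class decorator below wraps only the plain functions defined in a class's own body and records calls in a list instead of printing them. This is a simplification of the trace decorator above, not the same implementation. `CALL_LOG`, `trace_own_methods`, `add_label`, and `Account` are hypothetical names. Stacking a second decorator and using a metaclass at the same time illustrates the composability point.

```python
import types
from functools import wraps

CALL_LOG = []  # hypothetical sink for (function name, repr of result)


def trace_func(func):
    if hasattr(func, 'tracing'):  # Only decorate once
        return func

    @wraps(func)
    def wrapper(*args, **kwargs):
        result = None
        try:
            result = func(*args, **kwargs)
            return result
        except Exception as e:
            result = e
            raise
        finally:
            CALL_LOG.append((func.__name__, repr(result)))

    wrapper.tracing = True
    return wrapper


def trace_own_methods(klass):
    # Simplification: only plain functions from the class's own namespace
    for key, value in list(vars(klass).items()):
        if isinstance(value, types.FunctionType):
            setattr(klass, key, trace_func(value))
    return klass


def add_label(klass):
    klass.label = 'traced'        # a second, independent class decorator
    return klass


class OtherMeta(type):
    pass


@add_label
@trace_own_methods
class Account(metaclass=OtherMeta):
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError('insufficient funds')
        self.balance -= amount
        return self.balance


acct = Account(10)
acct.withdraw(3)
try:
    acct.withdraw(100)
except ValueError:
    pass  # Expected
print(CALL_LOG)
```

Neither decorator needs to know about the other or about OtherMeta, which is what makes this approach easier to combine than several metaclasses.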
