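Metaclasses and Attributes
Metaclasses are often mentioned, but few people know how to use them in practice. Simply put, a metaclass lets you intercept Python's class statement and provide special behavior each time a class is defined.
Dynamic attributes let you override objects in ways that can cause unexpected side effects. Metaclasses can produce very strange behavior, so prefer code that is easy to understand and free of surprises.
- Item 44: Use Plain Attributes Instead of Setter and Getter Methods
Programmers coming from languages like Java often implement explicit getter and setter methods to access internal attributes:
class OldResistor:
    def __init__(self, ohms):
        self._ohms = ohms
    def get_ohms(self):
        return self._ohms
    def set_ohms(self, ohms):
        self._ohms = ohms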
These methods are simple to use, but they are not Pythonic:
r0 = OldResistor(50e3)
print('Before:', r0.get_ohms())
r0.set_ohms(10e3)
print('After: ', r0.get_ohms())
>>>
Before: 50000.0
After: 10000.0
They are especially clumsy for operations such as incrementing in place:
r0.set_ohms(r0.get_ohms() - 4e3)
assert r0.get_ohms() == 6e3
In Python, you do not need to define explicit getters and setters. Start with simple public attributes instead:
class Resistor:
    def __init__(self, ohms):
        self.ohms = ohms
        self.voltage = 0
        self.current = 0
r1 = Resistor(50e3)
r1.ohms = 10e3
Incrementing an attribute in place then reads naturally:
r1.ohms += 5e3
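If you later need special behavior when an attribute is set, you can migrate to the @property decorator and its corresponding setter.
The following class inherits from Resistor, maintains its own voltage, and updates current whenever voltage is assigned:
class VoltageResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
        self._voltage = 0
    @property
    def voltage(self):
        return self._voltage
    @voltage.setter
    def voltage(self, voltage):
        self._voltage = voltage
        self.current = self._voltage / self.ohms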
Assigning voltage like a plain attribute now runs the setter, which keeps current in sync:
r2 = VoltageResistance(1e3)
print(f'Before: {r2.current:.2f} amps')
r2.voltage = 10
print(f'After: {r2.current:.2f} amps')
>>>
Before: 0.00 amps
After: 0.01 amps
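A property setter can also perform type checking and value validation. For example, this class only allows resistances greater than zero:
class BoundedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
    @property
    def ohms(self):
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError(f'ohms must be > 0; got {ohms}')
        self._ohms = ohms
Assigning an invalid resistance raises an exception: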
r3 = BoundedResistance(1e3)
r3.ohms = 0
>>>
Traceback ...
ValueError: ohms must be > 0; got 0
Passing an invalid value to the constructor raises an exception as well:
BoundedResistance(-5)
>>>
Traceback ...
ValueError: ohms must be > 0; got -5
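This happens because BoundedResistance.__init__ calls Resistor.__init__ through super().__init__(), which assigns self.ohms = ohms. That assignment runs the @ohms.setter method, so the value is validated during construction.
You can even use @property to make an attribute from a parent class immutable:
class FixedResistance(Resistor):
    def __init__(self, ohms):
        super().__init__(ohms)
    @property
    def ohms(self):
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        if hasattr(self, '_ohms'):
            raise AttributeError("Ohms is immutable")
        self._ohms = ohms
During initialization _ohms does not exist yet, so the first assignment succeeds. Any later assignment finds _ohms already set and raises an exception: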
r4 = FixedResistance(1e3)
r4.ohms = 2e3
>>>
Traceback ...
AttributeError: Ohms is immutable
Do not set other attributes inside a getter:
class MysteriousResistor(Resistor):
    @property
    def ohms(self):
        self.voltage = self._ohms * self.current
        return self._ohms
    @ohms.setter
    def ohms(self, ohms):
        self._ohms = ohms
Doing so leads to bizarre behavior, where merely reading ohms changes voltage:
r7 = MysteriousResistor(10)
r7.current = 0.01
print(f'Before: {r7.voltage:.2f}')
r7.ohms
print(f'After: {r7.voltage:.2f}')
>>>
Before: 0.00
After: 0.10
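The best policy is to modify related object state only in the property's setter method.
The biggest shortcoming of @property is that the methods for an attribute can only be shared by subclasses; unrelated classes cannot reuse the same implementation. See Item 46 for more.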
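The guidance above can be sketched as follows. This is a minimal illustration, assuming a hypothetical CheckedResistor class that is not part of the original notes: validation and updates to derived state live in the setters, while the getters only return stored values.

```python
class CheckedResistor:
    """Hypothetical example: setters validate and update related state."""

    def __init__(self, ohms):
        self._voltage = 0
        self.current = 0
        self.ohms = ohms  # Goes through the ohms setter, so it is validated

    @property
    def ohms(self):
        return self._ohms  # Getter has no side effects

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError(f'ohms must be > 0; got {ohms}')
        self._ohms = ohms
        self.current = self._voltage / ohms  # Keep derived state in sync

    @property
    def voltage(self):
        return self._voltage  # Getter has no side effects

    @voltage.setter
    def voltage(self, voltage):
        self._voltage = voltage
        self.current = voltage / self._ohms


r = CheckedResistor(1e3)
r.voltage = 10
print(f'{r.current:.4f} amps')
r.ohms = 2e3
print(f'{r.current:.4f} amps')
```

Because both setters recompute current, reading any attribute never changes the object, which is the property the MysteriousResistor example violates.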
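- Item 45: Consider @property Instead of Refactoring Attributes
Over time, @property also gives you an important way to improve an interface without breaking its callers.
For example, suppose you implement a leaky bucket quota using plain Python objects.
The Bucket class represents how much quota remains and the duration for which the quota is available: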
from datetime import datetime, timedelta
class Bucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.quota = 0
    def __repr__(self):
        return f'Bucket(quota={self.quota})'
Filling the bucket resets the quota first if the current period has expired, so quota does not carry over from one period to the next:
def fill(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        bucket.quota = 0
        bucket.reset_time = now
    bucket.quota += amount
A consumer must check that enough quota is available before deducting it:
def deduct(bucket, amount):
    now = datetime.now()
    if (now - bucket.reset_time) > bucket.period_delta:
        return False  # Bucket hasn't been filled this period
    if bucket.quota - amount < 0:
        return False  # Bucket was filled, but not enough
    bucket.quota -= amount
    return True  # Bucket had enough, quota consumed
First, fill the bucket:
bucket = Bucket(60)
fill(bucket, 100)
print(bucket)
>>>
Bucket(quota=100)
Then deduct the quota that is needed:
if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
print(bucket)
>>>
Had 99 quota
Bucket(quota=1)
If the bucket cannot cover a request, the quota level stays unchanged:
if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
print(bucket)
>>>
Not enough for 3 quota
Bucket(quota=1)
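The problem with this implementation is that you never know what quota level the bucket started with.
A new version of the bucket tracks the maximum quota issued in the period and the quota consumed in that period:
class NewBucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0
    def __repr__(self):
        return (f'NewBucket(max_quota={self.max_quota}, '
                f'quota_consumed={self.quota_consumed})')
To stay compatible with the original Bucket interface, a @property method inside NewBucket computes the current quota as the maximum quota minus the quota consumed:
    @property
    def quota(self):
        return self.max_quota - self.quota_consumed
The matching setter, also inside NewBucket, interprets each assignment the way fill and deduct use it:
    @quota.setter
    def quota(self, amount):
        delta = self.max_quota - amount
        if amount == 0:
            # Quota being reset for a new period
            self.quota_consumed = 0
            self.max_quota = 0
        elif delta < 0:
            # Quota being filled for the new period
            assert self.quota_consumed == 0
            self.max_quota = amount
        else:
            # Quota being consumed during the period
            assert self.max_quota >= self.quota_consumed
            # delta is the total consumed so far (max_quota - amount),
            # so assign it rather than accumulate it
            self.quota_consumed = delta
Rerunning the demo code from above produces the same results: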
bucket = NewBucket(60)
print('Initial', bucket)
fill(bucket, 100)
print('Filled', bucket)
if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
print('Now', bucket)
if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
print('Still', bucket)
>>>
Initial NewBucket(max_quota=0, quota_consumed=0)
Filled NewBucket(max_quota=100, quota_consumed=0)
Had 99 quota
Now NewBucket(max_quota=100, quota_consumed=99)
Not enough for 3 quota
Still NewBucket(max_quota=100, quota_consumed=99)
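@property lets you make incremental progress toward a better data model.
It works best when the top-level design stays the same. If you find yourself extending @property methods over and over as the code evolves, it is probably time to refactor the class.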
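As a rough illustration of what such a refactoring might look like, the sketch below moves fill and deduct onto the class and leaves quota as a read-only property. The class name RefactoredBucket and the helper _expired are hypothetical and not part of the original notes.

```python
from datetime import datetime, timedelta


class RefactoredBucket:
    """Hypothetical refactoring: callers use methods, not attribute writes."""

    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0

    @property
    def quota(self):
        # Read-only view of the remaining quota; there is no setter
        return self.max_quota - self.quota_consumed

    def _expired(self, now):
        return (now - self.reset_time) > self.period_delta

    def fill(self, amount):
        now = datetime.now()
        if self._expired(now):
            # New period: discard whatever was left over
            self.max_quota = 0
            self.quota_consumed = 0
            self.reset_time = now
        self.max_quota += amount

    def deduct(self, amount):
        if self._expired(datetime.now()):
            return False  # Bucket hasn't been filled this period
        if self.quota < amount:
            return False  # Bucket was filled, but not enough
        self.quota_consumed += amount
        return True


rb = RefactoredBucket(60)
rb.fill(100)
print(rb.deduct(99), rb.quota)
print(rb.deduct(3), rb.quota)
```

With explicit methods, the intent of each operation is stated by the caller instead of being inferred from the value assigned to quota.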
- Item 46: Use Descriptors for Reusable @property Methods
For example, suppose you want to validate that a homework grade is a percentage:
class Homework:
    def __init__(self):
        self._grade = 0
    @property
    def grade(self):
        return self._grade
    @grade.setter
    def grade(self, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._grade = value
Using @property makes this class easy to use:
galileo = Homework()
galileo.grade = 95
Now suppose you also want to grade an exam that has multiple subjects, each with its own grade. Reusing the validation quickly becomes tedious:
class Exam:
    def __init__(self):
        self._writing_grade = 0
        self._math_grade = 0
    @staticmethod
    def _check_grade(value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
Each subject then needs its own boilerplate @property methods inside the Exam class:
    @property
    def writing_grade(self):
        return self._writing_grade
    @writing_grade.setter
    def writing_grade(self, value):
        self._check_grade(value)
        self._writing_grade = value
    @property
    def math_grade(self):
        return self._math_grade
    @math_grade.setter
    def math_grade(self, value):
        self._check_grade(value)
        self._math_grade = value
Reusing the grade check anywhere else would mean rewriting this boilerplate every time.
A better approach is to use a descriptor. The descriptor protocol defines how the language interprets attribute access:
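class Grade:
    def __get__(self, instance, instance_type):
        ...
    def __set__(self, instance, value):
        ...
class Exam:
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()
exam = Exam()
# When you assign an attribute like this
exam.writing_grade = 40
# it is interpreted as this expression
Exam.__dict__['writing_grade'].__set__(exam, 40)
# Likewise, when you read an attribute like this
exam.writing_grade
# it is interpreted as this expression
Exam.__dict__['writing_grade'].__get__(exam, Exam)
This behavior is driven by the __getattribute__ method of object. When an Exam instance has no attribute named writing_grade, Python falls back to the Exam class attribute.
If that class attribute is an object that defines __get__ and __set__ methods, Python assumes you want to follow the descriptor protocol.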
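A first attempt at the Grade descriptor stores the value on the descriptor itself:
class Grade:
    def __init__(self):
        self._value = 0
    def __get__(self, instance, instance_type):
        return self._value
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._value = value
However, this implementation is wrong. Each Grade object is a class attribute, so it is constructed once and shared by every Exam instance:
class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()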
first_exam = Exam()
first_exam.writing_grade = 82
first_exam.science_grade = 99
print('Writing', first_exam.writing_grade)
print('Science', first_exam.science_grade)
>>>
Writing 82
Science 99
second_exam = Exam()
second_exam.writing_grade = 75
print(f'Second {second_exam.writing_grade} is right')
print(f'First {first_exam.writing_grade} is wrong; '
f'should be 82')
>>>
Second 75 is right
First 75 is wrong; should be 82
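To fix this, the Grade class must keep track of its value for each unique Exam instance:
class Grade:
    def __init__(self):
        self._values = {}
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError(
                'Grade must be between 0 and 100')
        self._values[instance] = value
This implementation is simple, but it leaks memory: _values holds a reference to every Exam instance ever passed to __set__, so those instances are never garbage collected.
To fix that, use WeakKeyDictionary from the weakref built-in module. It holds its keys weakly, so once no other references to an Exam instance remain, its entry is removed from the dictionary.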
from weakref import WeakKeyDictionary
class Grade:
    def __init__(self):
        self._values = WeakKeyDictionary()
    def __get__(self, instance, instance_type):
        ...
    def __set__(self, instance, value):
        ...
With this change, everything works as expected:
class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()
first_exam = Exam()
first_exam.writing_grade = 82
second_exam = Exam()
second_exam.writing_grade = 75
print(f'First {first_exam.writing_grade} is right')
print(f'Second {second_exam.writing_grade} is right')
>>>
First 82 is right
Second 75 is right
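The WeakKeyDictionary version above elides its __get__ and __set__ bodies. The sketch below fills them in with the same logic as the dictionary-based version and checks that an entry disappears once its Exam instance is gone. The variable names descriptor, tracked_before, and tracked_after are illustrative assumptions, and the immediate cleanup relies on CPython's reference counting (gc.collect() is called as a precaution).

```python
import gc
from weakref import WeakKeyDictionary


class Grade:
    def __init__(self):
        # Keys are held weakly, so instances can still be collected
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self  # Accessed on the class: return the descriptor
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    writing_grade = Grade()


descriptor = Exam.writing_grade  # Class access returns the Grade object
exam = Exam()
exam.writing_grade = 82
tracked_before = len(descriptor._values)

del exam       # Drop the only strong reference to the instance
gc.collect()   # Precaution for interpreters without reference counting
tracked_after = len(descriptor._values)
print(tracked_before, tracked_after)
```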
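- Item 47: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes
For example, suppose you want to represent the records in a database as Python objects. The database already has its schema, and code that uses the objects must know what the database looks like. In Python, however, the code that connects objects to the database does not need to specify the record schema explicitly; it can be generic.
Python makes this dynamic behavior possible with the __getattr__ special method. If a class defines __getattr__, that method is called every time an attribute cannot be found in an object's instance dictionary:
class LazyRecord:
    def __init__(self):
        self.exists = 5
    def __getattr__(self, name):
        value = f'Value for {name}'
        setattr(self, name, value)
        return value
Accessing the missing attribute foo causes Python to call __getattr__, which adds foo to the instance dictionary: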
data = LazyRecord()
print('Before:', data.__dict__)
print('foo: ', data.foo)
print('After: ', data.__dict__)
>>>
Before: {'exists': 5}
foo: Value for foo
After: {'exists': 5, 'foo': 'Value for foo'}
Here, logging is added to show when __getattr__ is actually called. The subclass uses super().__getattr__() to obtain the real value:
class LoggingLazyRecord(LazyRecord):
    def __getattr__(self, name):
        print(f'* Called __getattr__({name!r}), '
              f'populating instance dictionary')
        result = super().__getattr__(name)
        print(f'* Returning {result!r}')
        return result
data = LoggingLazyRecord()
print('exists: ', data.exists)
print('First foo: ', data.foo)
print('Second foo: ', data.foo)
>>>
exists: 5
* Called __getattr__('foo'), populating instance dictionary
* Returning 'Value for foo'
First foo: Value for foo
Second foo: Value for foo
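The output shows that __getattr__ is called only once, on the first access to foo. After that, foo is in the instance dictionary.
This kind of lazy loading is especially useful for schemaless data.
Now suppose you also want transactions in this database system: the next time the user accesses an attribute, you want to know whether the corresponding record is still valid and whether the transaction is still open.
Python has another object hook for this, called __getattribute__.
It is called every time an attribute is accessed on an object, even when the attribute already exists in the instance dictionary.
Be aware that this can incur significant overhead and hurt performance.
Here, ValidatingRecord logs each call to __getattribute__:
class ValidatingRecord:
    def __init__(self):
        self.exists = 5
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        try:
            value = super().__getattribute__(name)
            print(f'* Found {name!r}, returning {value!r}')
            return value
        except AttributeError:
            value = f'Value for {name}'
            print(f'* Setting {name!r} to {value!r}')
            setattr(self, name, value)
            return value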
data = ValidatingRecord()
print('exists: ', data.exists)
print('First foo: ', data.foo)
print('Second foo: ', data.foo)
>>>
* Called __getattribute__('exists')
* Found 'exists', returning 5
exists: 5
* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
First foo: Value for foo
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Second foo: Value for foo
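If a dynamically accessed attribute should not exist, raise AttributeError to get Python's standard missing-attribute behavior.
For example:
class MissingPropertyRecord:
    def __getattr__(self, name):
        if name == 'bad_name':
            raise AttributeError(f'{name} is missing')
        ...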
data = MissingPropertyRecord()
data.bad_name
>>>
Traceback ...
AttributeError: bad_name is missing
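Generic Python code often relies on the hasattr built-in function to check whether an attribute exists and on the getattr built-in function to retrieve its value. Both look in the instance dictionary before calling __getattr__: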
data = LoggingLazyRecord() # Implements __getattr__
print('Before: ', data.__dict__)
print('Has first foo: ', hasattr(data, 'foo'))
print('After: ', data.__dict__)
print('Has second foo: ', hasattr(data, 'foo'))
>>>
Before: {'exists': 5}
* Called __getattr__('foo'), populating instance dictionary
* Returning 'Value for foo'
Has first foo: True
After: {'exists': 5, 'foo': 'Value for foo'}
Has second foo: True
Again, __getattr__ is called only once. Compare this with a class that implements __getattribute__:
data = ValidatingRecord() # Implements __getattribute__
print('Has first foo: ', hasattr(data, 'foo'))
print('Has second foo: ', hasattr(data, 'foo'))
>>>
* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
Has first foo: True
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Has second foo: True
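Here, __getattribute__ is called every time hasattr runs, so it is called twice.
Now suppose you want to lazily push data back to the database when values are assigned to the Python object. You can do this with __setattr__, a hook that intercepts every attribute assignment, whether made directly or through the setattr built-in function:
class SavingRecord:
    def __setattr__(self, name, value):
        # Save some data for the record
        ...
        super().__setattr__(name, value)
Here is a subclass that logs each call:
class LoggingSavingRecord(SavingRecord):
    def __setattr__(self, name, value):
        print(f'* Called __setattr__({name!r}, {value!r})')
        super().__setattr__(name, value)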
data = LoggingSavingRecord()
print('Before: ', data.__dict__)
data.foo = 5
print('After: ', data.__dict__)
data.foo = 7
print('Finally:', data.__dict__)
>>>
Before: {}
* Called __setattr__('foo', 5)
After: {'foo': 5}
* Called __setattr__('foo', 7)
Finally: {'foo': 7}
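Suppose attribute access on an object should actually look up keys in an associated dictionary:
class BrokenDictionaryRecord:
    def __init__(self, data):
        self._data = data
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        return self._data[name]
However, this code recurses until it hits the recursion limit and fails: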
data = BrokenDictionaryRecord({'foo': 3})
data.foo
>>>
* Called __getattribute__('foo')
* Called __getattribute__('_data')
* Called __getattribute__('_data')
* Called __getattribute__('_data')
...
Traceback ...
RecursionError: maximum recursion depth exceeded while
calling a Python object
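The problem is that __getattribute__ accesses self._data, which calls __getattribute__ again, which accesses self._data again, and so on without end.
The solution is to fetch _data through super().__getattribute__, which bypasses the override, and then look the name up in that dictionary:
class DictionaryRecord:
    def __init__(self, data):
        self._data = data
    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        data_dict = super().__getattribute__('_data')
        return data_dict[name]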
data = DictionaryRecord({'foo': 3})
print('foo: ', data.foo)
>>>
* Called __getattribute__('foo')
foo: 3
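The hooks from this item can be combined in one small sketch. It assumes a hypothetical DictBackedRecord class (not from the original notes) that reads missing attributes from a backing dictionary with __getattr__, writes through __setattr__, and uses super().__setattr__ once in the constructor to avoid recursion.

```python
class DictBackedRecord:
    """Hypothetical record that proxies attribute access to a dict."""

    def __init__(self, data):
        # Bypass our own __setattr__ so _data lands in the instance
        # dictionary instead of inside itself
        super().__setattr__('_data', dict(data))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so reading self._data
        # here does not recurse (it is found in the instance dictionary)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(f'{name} is missing') from None

    def __setattr__(self, name, value):
        # Every assignment is redirected into the backing dictionary
        self._data[name] = value


record = DictBackedRecord({'foo': 3})
print(record.foo)
record.bar = 7
print(record._data)
print(hasattr(record, 'missing'))
```

Raising AttributeError for unknown keys keeps hasattr and getattr with a default working the way generic code expects.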
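- Item 48: Validate Subclasses with __init_subclass__
A metaclass is defined by inheriting from type, and Python uses it to construct classes at runtime. A metaclass receives the contents of each associated class statement in its __new__ method:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        print(f'* Running {meta}.__new__ for {name}')
        print('Bases:', bases)
        print(class_dict)
        return type.__new__(meta, name, bases, class_dict)
class MyClass(metaclass=Meta):
    stuff = 123
    def foo(self):
        pass
class MySubclass(MyClass):
    other = 567
    def bar(self):
        pass
The metaclass has access to the name of the class, the parent classes it inherits from (bases), and all the class attributes defined in the class body: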
>>>
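* Running <class '__main__.Meta'>.__new__ for MyClass
Bases: ()
{'__module__': '__main__',
 '__qualname__': 'MyClass',
 'stuff': 123,
 'foo': <function MyClass.foo at 0x...>}
* Running <class '__main__.Meta'>.__new__ for MySubclass
Bases: (<class '__main__.MyClass'>,)
{'__module__': '__main__',
 '__qualname__': 'MySubclass',
 'other': 567,
 'bar': <function MySubclass.bar at 0x...>}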
A typical use is validating subclasses. For example, this metaclass checks that every subclass of Polygon declares a valid number of sides:
class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
        # Only validate subclasses of the Polygon class
        if bases:
            if class_dict['sides'] < 3:
                raise ValueError('Polygons need 3+ sides')
        return type.__new__(meta, name, bases, class_dict)
class Polygon(metaclass=ValidatePolygon):
    sides = None  # Must be specified by subclasses
    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180
class Triangle(Polygon):
    sides = 3
class Rectangle(Polygon):
    sides = 4
class Nonagon(Polygon):
    sides = 9
assert Triangle.interior_angles() == 180
assert Rectangle.interior_angles() == 360
assert Nonagon.interior_angles() == 1260
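Classes with three or more sides are accepted. A class with only two sides fails right after its class statement body finishes executing:
print('Before class')
class Line(Polygon):
    print('Before sides')
    sides = 2
    print('After sides')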
print('After class')
>>>
Before class
Before sides
After sides
Traceback ...
ValueError: Polygons need 3+ sides
Python 3.6 introduced the __init_subclass__ special class method, which provides the same behavior without defining a metaclass:
class BetterPolygon:
    sides = None  # Must be specified by subclasses
    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.sides < 3:
            raise ValueError('Polygons need 3+ sides')
    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180
class Hexagon(BetterPolygon):
    sides = 6
assert Hexagon.interior_angles() == 720
The code is shorter, and sides can be read directly from cls instead of from class_dict['sides']. An invalid subclass raises the same exception:
print('Before class')
class Point(BetterPolygon):
    sides = 1
print('After class')
>>>
Before class
Traceback ...
ValueError: Polygons need 3+ sides
Another problem with metaclasses is that each class can specify only one. Here is a second metaclass, meant to validate the fill color:
class ValidateFilled(type):
    def __new__(meta, name, bases, class_dict):
        # Only validate subclasses of the Filled class
        if bases:
            if class_dict['color'] not in ('red', 'green'):
                raise ValueError('Fill color must be supported')
        return type.__new__(meta, name, bases, class_dict)
class Filled(metaclass=ValidateFilled):
    color = None  # Must be specified by subclasses
Trying to use both metaclasses through multiple inheritance fails:
class RedPentagon(Filled, Polygon):
    color = 'red'
    sides = 5
>>>
Traceback ...
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
The only fix is a more complex hierarchy in which one metaclass inherits from the other:
class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
        # Only validate non-root classes
        if not class_dict.get('is_root'):
            if class_dict['sides'] < 3:
                raise ValueError('Polygons need 3+ sides')
        return type.__new__(meta, name, bases, class_dict)
class Polygon(metaclass=ValidatePolygon):
    is_root = True
    sides = None  # Must be specified by subclasses
class ValidateFilledPolygon(ValidatePolygon):
    def __new__(meta, name, bases, class_dict):
        # Only validate non-root classes
        if not class_dict.get('is_root'):
            if class_dict['color'] not in ('red', 'green'):
                raise ValueError('Fill color must be supported')
        return super().__new__(meta, name, bases, class_dict)
class FilledPolygon(Polygon, metaclass=ValidateFilledPolygon):
    is_root = True
    color = None  # Must be specified by subclasses
Every filled polygon must now inherit from FilledPolygon:
class GreenPentagon(FilledPolygon):
    color = 'green'
    sides = 5
greenie = GreenPentagon()
assert isinstance(greenie, Polygon)
Both the color and the number of sides are validated:
class OrangePentagon(FilledPolygon):
    color = 'orange'
    sides = 5
>>>
Traceback ...
ValueError: Fill color must be supported
class RedLine(FilledPolygon):
    color = 'red'
    sides = 2
>>>
Traceback ...
ValueError: Polygons need 3+ sides
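The __init_subclass__ special class method solves this problem too. As long as each implementation calls super().__init_subclass__(), the validations compose under multiple inheritance:
class Filled:
    color = None  # Must be specified by subclasses
    def __init_subclass__(cls):
        super().__init_subclass__()
        if cls.color not in ('red', 'green', 'blue'):
            raise ValueError('Fills need a valid color')
A new class can inherit from both Filled and BetterPolygon without breaking composability (you could also build a multi-level class hierarchy like the one above):
class RedTriangle(Filled, BetterPolygon):
    color = 'red'
    sides = 3
ruddy = RedTriangle()
assert isinstance(ruddy, Filled)
assert isinstance(ruddy, BetterPolygon)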
An invalid number of sides is rejected:
print('Before class')
class BlueLine(Filled, BetterPolygon):
    color = 'blue'
    sides = 2
print('After class')
>>>
Before class
Traceback ...
ValueError: Polygons need 3+ sides
print('Before class')
class BeigeSquare(Filled, BetterPolygon):
    color = 'beige'
    sides = 4
print('After class')
>>>
Before class
Traceback ...
ValueError: Fills need a valid color
You can even use __init_subclass__ in complex cases such as diamond inheritance:
class Top:
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Top for {cls}')
class Left(Top):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Left for {cls}')
class Right(Top):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Right for {cls}')
class Bottom(Left, Right):
    def __init_subclass__(cls):
        super().__init_subclass__()
        print(f'Bottom for {cls}')
>>>
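Top for <class '__main__.Left'>
Top for <class '__main__.Right'>
Top for <class '__main__.Bottom'>
Right for <class '__main__.Bottom'>
Left for <class '__main__.Bottom'>
Top.__init_subclass__ is called only once for each class, even though Bottom reaches Top through both Left and Right.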
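The contrast drawn in this item can be condensed into one runnable sketch. All class names here (MetaA, MetaB, NeedsSides, NeedsColor, RedSquare) are hypothetical stand-ins rather than the classes from the notes, and the explicit None check is an added assumption so that a subclass that forgets sides gets a ValueError instead of a TypeError.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


try:
    class Both(A, B):  # Two unrelated metaclasses cannot be combined
        pass
except TypeError as e:
    conflict_message = str(e)
    print(conflict_message)


class NeedsSides:
    sides = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)  # Keep the chain going
        if cls.sides is None or cls.sides < 3:
            raise ValueError('Polygons need 3+ sides')


class NeedsColor:
    color = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)  # Keep the chain going
        if cls.color not in ('red', 'green', 'blue'):
            raise ValueError('Fills need a valid color')


class RedSquare(NeedsColor, NeedsSides):  # Both validations run
    color = 'red'
    sides = 4
```

The two mix-ins know nothing about each other, yet both checks run because each one forwards to super().__init_subclass__().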
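- Item 49: Register Class Existence with __init_subclass__
Another common use of metaclasses is to automatically register the types in a program. Registration is useful for reverse lookups, where you need to map a simple identifier back to its corresponding class.
For example, suppose you want to serialize Python objects as JSON:
import json
class Serializable:
    def __init__(self, *args):
        self.args = args
    def serialize(self):
        return json.dumps({'args': self.args})
This makes it easy to serialize a simple data structure such as a point:
class Point2D(Serializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
    def __repr__(self):
        return f'Point2D({self.x}, {self.y})'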
point = Point2D(5, 3)
print('Object: ', point)
print('Serialized:', point.serialize())
>>>
Object: Point2D(5, 3)
Serialized: {"args": [5, 3]}
Next, the JSON string needs to be deserialized to reconstruct the point it represents:
class Deserializable(Serializable):
    @classmethod
    def deserialize(cls, json_data):
        params = json.loads(json_data)
        return cls(*params['args'])
class BetterPoint2D(Deserializable):
    ...
before = BetterPoint2D(5, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
after = BetterPoint2D.deserialize(data)
print('After: ', after)
>>>
Before: Point2D(5, 3)
Serialized: {"args": [5, 3]}
After: Point2D(5, 3)
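The problem with this approach is that it only works if you know the intended type of the serialized data ahead of time (for example, Point2D or BetterPoint2D). Ideally, one common function could take a large JSON payload containing many classes and construct the corresponding Python object for each. To make that possible, include the class name in the serialized data:
class BetterSerializable:
    def __init__(self, *args):
        self.args = args
    def serialize(self):
        return json.dumps({
            'class': self.__class__.__name__,
            'args': self.args,
        })
    def __repr__(self):
        name = self.__class__.__name__
        args_str = ', '.join(str(x) for x in self.args)
        return f'{name}({args_str})'
A mapping from class names to classes then supports registration and generic deserialization:
registry = {}
def register_class(target_class):
    registry[target_class.__name__] = target_class
def deserialize(data):
    params = json.loads(data)
    name = params['class']
    target_class = registry[name]
    return target_class(*params['args'])
For deserialize to work, register_class must be called for every class that may be deserialized later:
class EvenBetterPoint2D(BetterSerializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
register_class(EvenBetterPoint2D)
Now any JSON string can be deserialized without knowing which class it contains: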
before = EvenBetterPoint2D(5, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
after = deserialize(data)
print('After: ', after)
>>>
Before: EvenBetterPoint2D(5, 3)
Serialized: {"class": "EvenBetterPoint2D", "args": [5, 3]}
After: EvenBetterPoint2D(5, 3)
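The problem is that you can forget to call register_class:
class Point3D(BetterSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x = x
        self.y = y
        self.z = z
# Forgot to call register_class! Whoops!
Deserializing an instance of a class that was never registered fails at runtime: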
point = Point3D(5, 9, -4)
data = point.serialize()
deserialize(data)
>>>
Traceback ...
KeyError: 'Point3D'
A metaclass can guarantee that register_class is called for every subclass:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        cls = type.__new__(meta, name, bases, class_dict)
        register_class(cls)
        return cls
class RegisteredSerializable(BetterSerializable, metaclass=Meta):
    pass
With this base class, every subclass is registered as soon as it is defined:
class Vector3D(RegisteredSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x, self.y, self.z = x, y, z
before = Vector3D(10, -7, 3)
print('Before: ', before)
data = before.serialize()
print('Serialized:', data)
print('After: ', deserialize(data))
>>>
Before: Vector3D(10, -7, 3)
Serialized: {"class": "Vector3D", "args": [10, -7, 3]}
After: Vector3D(10, -7, 3)
An even better approach is to use the __init_subclass__ special class method:
class BetterRegisteredSerializable(BetterSerializable):
    def __init_subclass__(cls):
        super().__init_subclass__()
        register_class(cls)
class Vector1D(BetterRegisteredSerializable):
    def __init__(self, magnitude):
        super().__init__(magnitude)
        self.magnitude = magnitude
before = Vector1D(6)
print('Before: ', before)
data = before.serialize()
print('Serialized: ', data)
print('After: ', deserialize(data))
>>>
Before: Vector1D(6)
Serialized: {"class": "Vector1D", "args": [6]}
After: Vector1D(6)
This shows how __init_subclass__ can replace a metaclass for class registration.
- Item 50: Annotate Class Attributes with __set_name__
Here, a descriptor class connects attributes to the column names of a database table:
class Field:
    def __init__(self, name):
        self.name = name
        self.internal_name = '_' + self.name
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)
Next, define a class that represents a customer row:
class Customer:
    # Class attributes
    first_name = Field('first_name')
    last_name = Field('last_name')
    prefix = Field('prefix')
    suffix = Field('suffix')
The attributes can then be assigned directly:
cust = Customer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Euclid'
print(f'After: {cust.first_name!r} {cust.__dict__}')
>>>
Before: '' {}
After: 'Euclid' {'_first_name': 'Euclid'}
However, the class definition is redundant. The name first_name already appears on the left side of the assignment, so why must the same string also be passed to the Field constructor?
class Customer:
    # Left side is redundant with right side
    first_name = Field('first_name')
    ...
A metaclass can eliminate the redundancy:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        for key, value in class_dict.items():
            if isinstance(value, Field):
                value.name = key
                value.internal_name = '_' + key
        cls = type.__new__(meta, name, bases, class_dict)
        return cls
The metaclass takes each key from the class dictionary and assigns it to the matching Field. A base class, DatabaseRow, uses the metaclass so that every row class gets this behavior:
class DatabaseRow(metaclass=Meta):
    pass
The Field constructor then no longer needs any arguments:
class Field:
    def __init__(self):
        # These will be assigned by the metaclass.
        self.name = None
        self.internal_name = None
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)
In use, a row class simply inherits from DatabaseRow:
class BetterCustomer(DatabaseRow):
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()
cust = BetterCustomer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Euler'
print(f'After: {cust.first_name!r} {cust.__dict__}')
>>>
Before: '' {}
After: 'Euler' {'_first_name': 'Euler'}
However, if you forget to inherit from DatabaseRow, the code breaks:
class BrokenCustomer:
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()
cust = BrokenCustomer()
cust.first_name = 'Mersenne'
>>>
Traceback ...
TypeError: attribute name must be string, not 'NoneType'
Python 3.6 introduced the __set_name__ special method for descriptors, which can replace the metaclass's __new__ here. It is called on every descriptor instance when its containing class is defined:
class Field:
    def __init__(self):
        self.name = None
        self.internal_name = None
    def __set_name__(self, owner, name):
        # Called on class creation for each descriptor
        self.name = name
        self.internal_name = '_' + name
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)
Now the descriptor works without inheriting from a base class that uses a metaclass:
class FixedCustomer:
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()
cust = FixedCustomer()
print(f'Before: {cust.first_name!r} {cust.__dict__}')
cust.first_name = 'Mersenne'
print(f'After: {cust.first_name!r} {cust.__dict__}')
>>>
Before: '' {}
After: 'Mersenne' {'_first_name': 'Mersenne'}
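- Metaclasses let you modify a class's attributes before the class is fully defined.
- Descriptors and metaclasses make a powerful combination for declarative behavior and runtime introspection.
- Define __set_name__ on your descriptor classes so they can take into account their surrounding class and its attribute names.
- Avoid memory leaks and the weakref built-in module by having descriptors store the data they manipulate directly in the instance dictionary of the owning object.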
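The last point can be illustrated by revisiting the Grade descriptor from Item 46. The following is a minimal sketch, not code from the original notes: it uses __set_name__ to learn its attribute name and stores each value in the owning instance's dictionary, so no WeakKeyDictionary is needed.

```python
class Grade:
    def __set_name__(self, owner, name):
        # Called once per descriptor when the owning class is created
        self.internal_name = '_' + name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        # The value lives on the instance, so it is freed with the instance
        setattr(instance, self.internal_name, value)


class Exam:
    math_grade = Grade()
    writing_grade = Grade()


first_exam = Exam()
first_exam.writing_grade = 82
second_exam = Exam()
second_exam.writing_grade = 75
print(first_exam.writing_grade, second_exam.writing_grade)
print(first_exam.__dict__)
```

Each Exam instance carries its own grades, so the descriptor holds no per-instance state and cannot keep instances alive.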
- Item 51: Prefer Class Decorators over Metaclasses for Composable Class Extensions
Suppose you want to trace every function call, printing the function's name, arguments, and return value:
from functools import wraps
def trace_func(func):
    if hasattr(func, 'tracing'):  # Only decorate once
        return func
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = None
        try:
            result = func(*args, **kwargs)
            return result
        except Exception as e:
            result = e
            raise
        finally:
            print(f'{func.__name__}({args!r}, {kwargs!r}) -> '
                  f'{result!r}')
    wrapper.tracing = True
    return wrapper
Applying this decorator to every method by hand is tedious:
class TraceDict(dict):
    @trace_func
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
    @trace_func
    def __setitem__(self, *args, **kwargs):
        return super().__setitem__(*args, **kwargs)
    @trace_func
    def __getitem__(self, *args, **kwargs):
        return super().__getitem__(*args, **kwargs)
    ...
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass  # Expected
>>>
__init__(({'hi': 1}, [('hi', 1)]), {}) -> None
__setitem__(({'hi': 1, 'there': 2}, 'there', 2), {}) -> None
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')
An alternative is to use a metaclass that decorates every method of a class automatically:
import types
trace_types = (
    types.MethodType,
    types.FunctionType,
    types.BuiltinFunctionType,
    types.BuiltinMethodType,
    types.MethodDescriptorType,
    types.ClassMethodDescriptorType)
class TraceMeta(type):
    def __new__(meta, name, bases, class_dict):
        klass = super().__new__(meta, name, bases, class_dict)
        for key in dir(klass):
            value = getattr(klass, key)
            if isinstance(value, trace_types):
                wrapped = trace_func(value)
                setattr(klass, key, wrapped)
        return klass
Declaring the metaclass on the class is then enough:
class TraceDict(dict, metaclass=TraceMeta):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass  # Expected
>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')
This looks like it should work for any class, but it fails when a superclass already specifies a different metaclass:
class OtherMeta(type):
    pass
class SimpleDict(dict, metaclass=OtherMeta):
    pass
class TraceDict(SimpleDict, metaclass=TraceMeta):
    pass
>>>
Traceback ...
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
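The only way around this is to make OtherMeta inherit from TraceMeta:
class TraceMeta(type):
    ...
class OtherMeta(TraceMeta):
    pass
class SimpleDict(dict, metaclass=OtherMeta):
    pass
class TraceDict(SimpleDict, metaclass=TraceMeta):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass  # Expected
>>>
__init_subclass__((), {}) -> None
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')
To solve this problem, Python supports class decorators. A class decorator works like a function decorator: it is applied with the @ symbol before the class declaration, and the function is expected to modify or re-create the class and return it:
def my_class_decorator(klass):
    klass.extra_param = 'hello'
    return klass
@my_class_decorator
class MyClass:
    pass
print(MyClass)
print(MyClass.extra_param)
>>>
<class '__main__.MyClass'>
hello
A class decorator can apply trace_func to every method and function of a class: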
def trace(klass):
    for key in dir(klass):
        value = getattr(klass, key)
        if isinstance(value, trace_types):
            wrapped = trace_func(value)   # Wrap the function
            setattr(klass, key, wrapped)  # Rebind it on the class
    return klass
@trace
class TraceDict(dict):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass  # Expected
>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')
Class decorators also work when the class being decorated already has a metaclass:
class OtherMeta(type):
    pass
@trace
class TraceDict(dict, metaclass=OtherMeta):
    pass
trace_dict = TraceDict([('hi', 1)])
trace_dict['there'] = 2
trace_dict['hi']
try:
    trace_dict['does not exist']
except KeyError:
    pass  # Expected
>>>
__new__((<class '__main__.TraceDict'>, [('hi', 1)]), {}) -> {}
__getitem__(({'hi': 1, 'there': 2}, 'hi'), {}) -> 1
__getitem__(({'hi': 1, 'there': 2}, 'does not exist'), {}) -> KeyError('does not exist')
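- A class decorator is a simple function that receives a class object as a parameter and returns either a new class or a modified version of the original class.
- Class decorators are useful when you want to modify every method or attribute of a class with minimal boilerplate.
- Metaclasses do not compose easily, whereas many class decorators can extend the same class without conflicts.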
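The composability point can be sketched with two independent class decorators stacked on a class that also has its own metaclass. The decorators add_repr and count_instances and the Point class are hypothetical examples, not part of the original notes.

```python
from functools import wraps


def add_repr(klass):
    """Hypothetical decorator: give the class a generic __repr__."""
    def __repr__(self):
        return f'{type(self).__name__}({vars(self)!r})'
    klass.__repr__ = __repr__
    return klass


def count_instances(klass):
    """Hypothetical decorator: count how many instances are created."""
    original_init = klass.__init__
    klass.instances = 0

    @wraps(original_init)
    def __init__(self, *args, **kwargs):
        klass.instances += 1
        original_init(self, *args, **kwargs)

    klass.__init__ = __init__
    return klass


class OtherMeta(type):
    pass


@add_repr          # Applied second
@count_instances   # Applied first
class Point(metaclass=OtherMeta):
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
q = Point(3, 4)
print(p, Point.instances, type(Point).__name__)
```

Neither decorator needs to know about the other or about OtherMeta, which is what makes this approach easier to combine than stacking metaclasses.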