dataclass
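Data classes arrived in Python 3.7 (PEP 557). The new feature is the `@dataclass` decorator, which simplifies writing data classes by automatically generating special methods such as `__init__()` and `__repr__()`.
A data class is an ordinary class, but one designed to store data: it has a simple structure, groups related values together, and has clearly named fields.
Such classes, sometimes called data structures, are very common. For example, a class that stores a point's coordinates is just a class with 3 fields (x, y, z).
If you do not want to use a class, Python offers other data structures that can stand in.
Suppose we need a data object that holds information about an athlete: name, number, position, and age.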
harden = ('James Harden', 1, 'PG', 34)
print(harden[2]) # PG
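Drawback: inflexible. Creation and access are positional, so you have to remember which index holds which piece of information.
from collections import namedtuple
Player = namedtuple('Player', ['name', 'number', 'position', 'age', 'grade'])
harden = Player('James Harden', 1, 'PG', 34, 'S+')
print(harden)  # Player(name='James Harden', number=1, position='PG', age=34, grade='S+')
print(harden.name)  # James Harden
With `namedtuple` you can read values with `.` attribute access, so the field names are explicit. Some problems remain, though: for example, instances are immutable, so a field cannot be reassigned after creation.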
from typing import NamedTuple
class Player(NamedTuple):
    name: str
    number: int
    position: str
    age: int
    grade: str
harden = Player('James Harden', 1, 'PG', 34, 'S+')
print(harden)  # Player(name='James Harden', number=1, position='PG', age=34, grade='S+')
print(harden.name)  # James Harden
Type hints make the code more readable and maintainable, but `typing.NamedTuple` shares the limitations of `namedtuple`, such as immutability.
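The immutability mentioned above can be checked directly. The following is a minimal sketch with a shortened, two-field `Player` (an assumption made to keep the example small). It also shows that a named tuple still compares equal to a plain tuple with the same values, which may or may not be what you want.

```python
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    number: int


harden = Player('James Harden', 1)

# Fields of a NamedTuple are read-only: assignment raises AttributeError.
try:
    harden.number = 13
    assignment_failed = False
except AttributeError:
    assignment_failed = True
print(assignment_failed)  # True

# A NamedTuple is still a tuple, so it equals a plain tuple with the same values.
print(harden == ('James Harden', 1))  # True

# _replace() returns a modified copy instead of mutating in place.
harden_13 = harden._replace(number=13)
print(harden_13.number)  # 13
```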
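A `dict` is often used to hold parameters or configuration. Compared with a `tuple`, it can represent more complex nested structures.
harden = {'name': 'James Harden', 'number': 1, 'position': 'PG', 'age': 34}
print(harden['position'])  # PG
Drawback: there is no control over the attribute (key) names.
With `TypedDict`, type checking can catch more mistakes, and the declaration helps other developers understand complex data structures.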
from typing import TypedDict
class Player(TypedDict):
    name: str
    number: int
    position: str
    age: int
harden: Player = {'name': 'James Harden', 'number': 1, 'position': 'PG', 'age': 34}
print(harden['position'])  # Output: PG
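One point worth keeping in mind is that `TypedDict` annotations are only read by static type checkers. The sketch below (with a shortened `Player`, an assumption for brevity) shows that at runtime the object is a plain `dict` and a wrongly typed value is accepted without complaint.

```python
from typing import TypedDict


class Player(TypedDict):
    name: str
    age: int


# A static type checker would flag the string passed for 'age',
# but nothing is validated when the program actually runs.
harden: Player = {'name': 'James Harden', 'age': 'thirty-four'}

print(type(harden) is dict)  # True
print(harden['age'])  # thirty-four
```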
In short, `tuple`, `namedtuple`, and `dict` still have their place in simple scenarios, but they become awkward in more complex ones, for example when you need to compare objects or set default values.
For those cases we usually write a custom class.
class Player:
    def __init__(self, name, number, position, age, grade):
        self.name = name
        self.number = number
        self.position = position
        self.age = age
        self.grade = grade
harden = Player('James Harden', 1, 'PG', 34, 'S+')
bryant = Player(name='Kobe Bryant', number=24, position='PG', age=41, grade='S+')
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(harden)  # <__main__.Player object at 0x000002431AFC6E00>
print(harden < bryant)
Result:
James Harden
Kobe Bryant
<__main__.Player object at 0x000002431AFC6E00>
Traceback (most recent call last):
  File "F:\study\django-restframesork-jwt-demo\test\1.py", line 33, in <module>
    print(harden < bryant)
TypeError: '<' not supported between instances of 'Player' and 'Player'
However, a class defined this way still has two problems: printing an instance gives an unreadable description, and instances cannot be compared.
To solve both, we can implement `__repr__` for a custom description and `__gt__` to support comparison.
Assuming the comparison is based on `age`, the code becomes:
class Player:
    def __init__(self, name, number, position, age, grade):
        self.name = name
        self.number = number
        self.position = position
        self.age = age
        self.grade = grade
    def __repr__(self):
        return f'Player: {self.name} : {self.age}'
    def __gt__(self, other):
        return self.age > other.age
    def __eq__(self, other):
        return self.age == other.age
harden = Player('James Harden', 1, 'PG', 34, 'S+')
bryant = Player(name='Kobe Bryant', number=24, position='PG', age=41, grade='S+')
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(harden)  # Player: James Harden : 34
print(harden < bryant)  # True
Now the data object has a more readable description and supports comparison.
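It may be worth noting why `<` works above even though only `__gt__` is defined, and where that stops working. The sketch below uses a shortened `Player` (an assumption for brevity) and, as one possible remedy that is not part of the original example, the standard-library `functools.total_ordering` decorator applied to a hypothetical `OrderedPlayer` subclass.

```python
from functools import total_ordering


class Player:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __gt__(self, other):
        return self.age > other.age

    def __eq__(self, other):
        return self.age == other.age


harden = Player('James Harden', 34)
bryant = Player('Kobe Bryant', 41)

# '<' works because Python falls back to the reflected bryant.__gt__(harden).
print(harden < bryant)  # True

# '<=' has no such fallback here, so it raises TypeError.
try:
    harden <= bryant
    le_supported = True
except TypeError:
    le_supported = False
print(le_supported)  # False


@total_ordering
class OrderedPlayer(Player):
    """Same class, with the missing comparison methods filled in."""


print(OrderedPlayer('James Harden', 34) <= OrderedPlayer('Kobe Bryant', 41))  # True
```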
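We often need to add constructors, representation methods, comparison functions, and so on. Writing them by hand is tedious, and it is exactly the kind of thing the language should handle for us transparently.
from dataclasses import dataclass
@dataclass(order=True)
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18  # default value; as in a function signature, fields with defaults must come last
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+', age=41)
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(harden)  # Player(name='James Harden', number=1, position='PG', grade='S+', age=34)
# Comparison: by default, fields are compared in the order they are defined
print(harden < bryant)  # True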
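`dataclass` has clear advantages over `dict` and `tuple`. It states the type of each member explicitly (as annotations, which are hints and are not enforced at runtime) and checks field names, which greatly reduces the chance of mistakes. Compared with a traditional class definition, `dataclass` is more concise: there is no lengthy `__init__` to write, you just list the members. Data classes are easier to read and understand, and the type hints let readers grasp how the data is organized. When a data class is clear, readers form accurate assumptions more easily and spot and fix potential bugs sooner.
After the switch to `dataclass` the result looks as expected, but we need to understand how it works, otherwise bugs can slip in unnoticed.
Are you curious what the magic methods added by `dataclass` look like? For example, what is the comparison logic?
Next we look at the source code and the official description; then you will know whether the code above has a problem.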
def dataclass(cls=None, /, *, init=True, repr=True, eq=True, order=False,
              unsafe_hash=False, frozen=False, match_args=True,
              kw_only=False, slots=False):
    """Returns the same class as was passed in, with dunder methods
    added based on the fields defined in the class.
    Examines PEP 526 __annotations__ to determine fields.
    If init is true, an __init__() method is added to the class. If
    repr is true, a __repr__() method is added. If order is true, rich
    comparison dunder methods are added. If unsafe_hash is true, a
    __hash__() method function is added. If frozen is true, fields may
    not be assigned to after instance creation. If match_args is true,
    the __match_args__ tuple is added. If kw_only is true, then by
    default all fields are keyword-only. If slots is true, an
    __slots__ attribute is added.
    """
    def wrap(cls):
        return _process_class(cls, init, repr, eq, order, unsafe_hash,
                              frozen, match_args, kw_only, slots)
    # See if we're being called as @dataclass or @dataclass().
    if cls is None:
        # We're called with parens.
        return wrap
    # We're called as @dataclass without parens.
    return wrap(cls)
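`dataclass` accepts a number of parameters. Based on them, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally the other methods described in the Specification section of PEP 557.
- `init`: if true, an `__init__` method is generated.
- `repr`: if true, a `__repr__` method is generated.
- `eq`: if true, an `__eq__` method is generated.
- `order`: if true, `__lt__`, `__le__`, `__gt__`, and `__ge__` methods are generated.
- `unsafe_hash`: if `True`, a `__hash__` method is generated.
- `frozen`: if `True`, instances are immutable (read-only).
Such a class is called a Data Class, but there is really nothing special about it: the decorator adds the generated methods to the class and returns the same class it was given.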
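For example:
@dataclass
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
The `@dataclass` decorator adds equivalents of the following methods to the `InventoryItem` class, depending on its parameters (the ordering methods `__lt__`, `__le__`, `__gt__`, and `__ge__` are only added when `order=True`):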
def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None:
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
def __repr__(self):
    return f'InventoryItem(name={self.name!r}, unit_price={self.unit_price!r}, quantity_on_hand={self.quantity_on_hand!r})'
def __eq__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) == (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ne__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) != (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __lt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) < (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __le__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) <= (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __gt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) > (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ge__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) >= (other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
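With this example in mind we have a fair idea of how it works. `dataclass` simplifies the definition of data classes to a degree, but when we need precise control over behavior we still have to override the relevant magic methods ourselves.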
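The practical consequence of the tuple-based comparison can be seen in a small sketch. The two players below are hypothetical and the class is shortened to two fields; with `order=True` the name is compared first, so the age never gets a say.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Player:
    name: str
    age: int = 18


veteran = Player('Aaron', 40)  # hypothetical player
rookie = Player('Zed', 20)     # hypothetical player

# Compared as ('Aaron', 40) < ('Zed', 20): the name decides, age is never reached.
print(veteran < rookie)  # True
print((veteran.name, veteran.age) < (rookie.name, rookie.age))  # True
```

So an ordering that looks like "younger first" can in fact be alphabetical by name.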
Let's return to the athlete example and rewrite it with `dataclass` for more precise control:
from dataclasses import dataclass
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
    def __eq__(self, other):
        return self.age == other.age  # compare age only
    def __lt__(self, other):
        return self.age < other.age  # compare age only
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+', age=41)
result = harden < bryant  # compared by age
print(result)  # prints True, because 34 < 41
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(bryant == harden)  # False
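Of course, if we had to override everything ourselves, `dataclass` would not look very clever. For the case where not every field should take part, `dataclass` offers other mechanisms that simplify things.
From the examples above we see that `dataclass` implements a batch of magic methods from a template; all we need to do is adjust the `dataclass` parameters, or override selectively, to fit our actual scenario.
As with function parameters, attributes with default values must come after attributes without them.
from dataclasses import dataclass
from typing import Any
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
    team: Any = "nba"
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+')
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(bryant.age)  # 18
print(bryant.team)  # nba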
Data classes can be used as fields of other data classes, so it is easy to build a team of 2 players. Here the Clippers consist of Harden and Kawhi Leonard.
from dataclasses import dataclass
from typing import List
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
@dataclass
class Team:
    name: str
    players: List[Player]
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
leonard = Player(name='Kawhi Leonard', number=2, position='SF', grade='S+')
clippers = Team("clippers", [harden, leonard])
print(harden.name)  # James Harden
print(leonard.name)  # Kawhi Leonard
print(leonard.age)  # 18
print(clippers)  # Team(name='clippers', players=[Player(name='James Harden', number=1, position='PG', grade='S+', age=34), Player(name='Kawhi Leonard', number=2, position='SF', grade='S+', age=18)])
from dataclasses import dataclass
@dataclass(order=True)
class Person:
    name: str
    age: int
@dataclass(order=True)
class Player(Person):
    number: int
    position: str
    grade: str
    team: str = "nba"
# Example usage
harden = Player(name='James Harden', age=34, number=1, position='PG', grade='S+')
bryant = Player(name='Kobe Bryant', age=41, number=24, position='PG', grade='S+')
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(bryant.age)  # 41
print(bryant.team)  # nba
# With the order parameter, objects can be compared (and therefore sorted)
print(harden < bryant)  # True
With inheritance, comparison follows the order in which the fields are defined: the parent class's fields first, then those of the current class.
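The field order of a derived data class can be inspected with `dataclasses.fields()`. The sketch below uses a shortened `Player` (an assumption for brevity) and shows that the parent's fields come first.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


@dataclass
class Player(Person):
    number: int
    team: str = "nba"


# Parent fields first, then the fields declared on the subclass.
field_names = [f.name for f in fields(Player)]
print(field_names)  # ['name', 'age', 'number', 'team']
```

Because of this ordering, a parent field with a default followed by a child field without one runs into the same "defaults must come last" rule as within a single class.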
Data classes are generally expected to declare their attributes explicitly. If you want to accept some extra arguments anyway, the following approach may work for you.
from dataclasses import dataclass, field
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+', args=(1, 2), kwargs={"hello": "world"})
print(bryant)
Output:
Player(name='Kobe Bryant', number=24, position='PG', grade='S+', age=18, args=(1, 2), kwargs={'hello': 'world'})
If an attribute of a data class has an immutable type, a default value can be assigned directly. If the attribute has a mutable type, however, assigning a default directly raises an error.
from dataclasses import dataclass
from typing import List
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
leonard = Player(name='Kawhi Leonard', number=2, position='SF', grade='S+')
@dataclass
class Team:
    name: str
    players: List[Player] = [leonard]  # this line raises the error
clippers = Team("clippers", [harden, leonard])
print(harden.name)
print(leonard.name)
print(leonard.age)
print(clippers)
Output:
ValueError: mutable default <class 'list'> for field players is not allowed: use default_factory
`dataclass` refuses mutable values as defaults. As the error message suggests, this is where `field` comes in.
from dataclasses import dataclass, field, fields
from typing import List
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
leonard = Player(name='Kawhi Leonard', number=2, position='SF', grade='S+')
@dataclass
class Team:
    name: str = field(metadata={'unit': 'name'})
    players: List[Player] = field(default_factory=lambda: [leonard], metadata={'unit': 'players'})
clippers = Team("clippers", [harden])
clippers1 = Team("clippers")
print(harden.name)
print(leonard.name)
print(leonard.age)
print(clippers.players)
print(clippers1.players)
print(fields(clippers))
print(fields(clippers)[1].metadata)
Output:
James Harden
Kawhi Leonard
18
[Player(name='James Harden', number=1, position='PG', grade='S+', age=34)]
[Player(name='Kawhi Leonard', number=2, position='SF', grade='S+', age=18)]
(Field(name='name',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x0000029523A65060>,default_factory=<dataclasses._MISSING_TYPE object at 0x0000029523A65060>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'unit': 'name'}),kw_only=False,_field_type=_FIELD), Field(name='players',type=typing.List[__main__.Player],default=<dataclasses._MISSING_TYPE object at 0x0000029523A65060>,default_factory=<function Team.<lambda> at 0x0000029523B44B80>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'unit': 'players'}),kw_only=False,_field_type=_FIELD))
{'unit': 'players'}
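The reason `default_factory` is required for mutable defaults is that the factory is called once per instance, so instances do not share one list. A minimal sketch, assuming a simplified `Team` whose players are plain strings:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Team:
    name: str
    players: List[str] = field(default_factory=list)


clippers = Team('clippers')
lakers = Team('lakers')

# Each instance received its own list from the factory.
clippers.players.append('James Harden')
print(clippers.players)  # ['James Harden']
print(lakers.players)    # []
```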
Let's look at the signature of `field()`:
def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
          hash=None, compare=True, metadata=None, kw_only=MISSING):
    """Return an object to identify dataclass fields.
    default is the default value of the field. default_factory is a
    0-argument function called to initialize a field's value. If init
    is true, the field will be a parameter to the class's __init__()
    function. If repr is true, the field will be included in the
    object's repr(). If hash is true, the field will be included in the
    object's hash(). If compare is true, the field will be used in
    comparison functions. metadata, if specified, must be a mapping
    which is stored but not otherwise examined by dataclass. If kw_only
    is true, the field will become a keyword-only parameter to
    __init__().
    It is an error to specify both default and default_factory.
    """
    if default is not MISSING and default_factory is not MISSING:
        raise ValueError('cannot specify both default and default_factory')
    return Field(default, default_factory, init, repr, hash, compare,
                 metadata, kw_only)
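Parameter | Description | Default |
---|---|---|
default | The default value of the field. | MISSING |
default_factory | Like default, but a zero-argument callable that supplies the default value. The factory is called again each time an instance is created. | MISSING |
init | Whether the field is included as a parameter of `__init__` | True |
repr | Whether the field is included in `__repr__()` | True |
compare | Whether the field is included when comparing objects | True |
hash | Whether the field is included when computing the hash | None (follows compare) |
metadata | A mapping holding extra information about the field | None |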
If you do not want `name` to take part in comparison, declare it as: name: str = field(compare=False)
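Taken a step further, marking every field except `age` with `compare=False` gives an age-only comparison without writing any magic method by hand. This is a sketch with a shortened `Player`; `same_age` is a hypothetical player added for illustration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Player:
    name: str = field(compare=False)
    number: int = field(compare=False)
    age: int = 18


harden = Player('James Harden', 1, 34)
bryant = Player('Kobe Bryant', 24, 41)
same_age = Player('Someone Else', 99, 34)  # hypothetical player

print(harden < bryant)     # True, because 34 < 41
print(harden == same_age)  # True, only age is compared
```

Note that equality is affected too: two different players of the same age now compare equal.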
Metadata (metadata)
Metadata can be used to drive data validation:
from dataclasses import dataclass, field, fields
from datetime import datetime
class ValidationError(Exception):
    def __init__(self, field_name, condition, actual_value):
        self.field_name = field_name
        self.condition = condition
        self.actual_value = actual_value
        super().__init__(f"{field_name} validation failed: {condition} (Actual value: {actual_value})")
class Color:
    RED = '\033[91m'
    END = '\033[0m'
@dataclass
class Player:
    # Each validator returns True when the value is INVALID.
    name: str = field(default="", metadata={"validation": [lambda x: len(x) == 0]})
    number: int = field(default=0, metadata={"validation": [lambda x: not 0 < x <= 100]})
    position: str = field(default="", metadata={"validation": [lambda x: len(x) == 0]})
    grade: str = field(default="", metadata={"validation": [lambda x: x not in {'S+', 'S', 'A', 'B', 'C'}]})
    age: int = field(default=0, metadata={"validation": [lambda x: not 0 < x <= 150]})
    foundation_date: datetime = field(default_factory=datetime.now)
    def validation(self):
        for field_ in fields(self):
            validations = field_.metadata.get("validation", [])
            for validation in validations:
                if validation(getattr(self, field_.name)):
                    raise ValidationError(field_.name, str(validation), getattr(self, field_.name))
harden = Player(name='James Harden', number=13, position='PG', grade='S+', age=32)
bryant = Player(name='Kobe Bryant', number=24, position='SG', grade='S', age=41)
# Invalid data would raise ValidationError, which is caught and printed in red
try:
    harden.validation()
except ValidationError as e:
    print(f"{Color.RED}{e}{Color.END}")
try:
    bryant.validation()
except ValidationError as e:
    print(f"{Color.RED}{e}{Color.END}")
Both players satisfy every rule, so `validation()` returns without raising and nothing is printed.
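A variation on the same idea is to run the checks automatically from `__post_init__`, so that an invalid object can never be constructed. The sketch below is an assumption-laden simplification: the metadata key `is_valid` is hypothetical, the predicates return True for valid values, and a plain `ValueError` is raised instead of a custom exception.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Player:
    # 'is_valid' is a hypothetical metadata key; the predicate returns True for valid values.
    name: str = field(metadata={"is_valid": lambda x: len(x) > 0})
    age: int = field(default=18, metadata={"is_valid": lambda x: 0 < x <= 150})

    def __post_init__(self):
        # Runs right after the generated __init__ has assigned all fields.
        for field_ in fields(self):
            is_valid = field_.metadata.get("is_valid")
            value = getattr(self, field_.name)
            if is_valid is not None and not is_valid(value):
                raise ValueError(f"{field_.name} validation failed (actual value: {value!r})")


harden = Player('James Harden', 34)  # passes every check

try:
    Player('Nobody', 200)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)  # age validation failed (actual value: 200)
```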
Having looked at `field()` in detail, we can specify per attribute whether it takes part in comparison, whether it takes part in hashing, and so on.
Since we know the default comparison order, we can also add an extra attribute to get comparison on demand. This attribute is placed first in the data class, and its value can be assigned flexibly in the `__post_init__` magic method.
from dataclasses import dataclass, field
@dataclass(order=True)
class Player:
    sort_index: tuple = field(init=False)  # extra sort_index field, excluded from __init__
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
    def __post_init__(self):
        self.sort_index = (self.age, self.grade)  # compute sort_index in __post_init__
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+', age=41)
result = harden < bryant  # compared by age first
print(result)  # prints True, because 34 < 41
print(harden.name)  # James Harden
print(bryant.name)  # Kobe Bryant
print(bryant == harden)  # False
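One side effect of a helper field like `sort_index` is that it shows up in the repr. A possible refinement, sketched here with a shortened `Player` and an integer sort key (both assumptions), is to hide it with `repr=False`; sorting then works through the generated `__lt__`.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Player:
    # Hidden from __init__ and from the repr, but still compared first.
    sort_index: int = field(init=False, repr=False)
    name: str
    age: int = 18

    def __post_init__(self):
        self.sort_index = self.age


players = [Player('Kobe Bryant', 41), Player('James Harden', 34)]
ordered = sorted(players)
print(ordered)
# [Player(name='James Harden', age=34), Player(name='Kobe Bryant', age=41)]
```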
def dataclass(cls=None, /, *, init=True, repr=True, eq=True, order=False,
              unsafe_hash=False, frozen=False, match_args=True,
              kw_only=False, slots=False):
Data classes created with `dataclass` are mutable by default. To make one immutable, set `frozen=True` when defining the class.
from dataclasses import dataclass, field
@dataclass(order=True, frozen=True)
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
harden.age = 33  # dataclasses.FrozenInstanceError: cannot assign to field 'age'
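When a frozen instance does need to "change", the usual route is to build a modified copy. A minimal sketch with a shortened `Player` (an assumption for brevity), catching the error and then using `dataclasses.replace()`:

```python
from dataclasses import dataclass, FrozenInstanceError, replace


@dataclass(frozen=True)
class Player:
    name: str
    age: int = 18


harden = Player('James Harden', 34)

# Assignment on a frozen instance raises FrozenInstanceError.
try:
    harden.age = 33
except FrozenInstanceError as err:
    print(type(err).__name__)  # FrozenInstanceError

# replace() builds a new instance and leaves the original untouched.
younger = replace(harden, age=33)
print(harden.age, younger.age)  # 34 33
```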
With `unsafe_hash=True` a `__hash__` method is generated, so instances can be placed in a set and deduplicated. Which fields take part is again controlled through `field`.
from dataclasses import dataclass, field
@dataclass(order=True, unsafe_hash=True)
class Player:
    name: str
    number: int
    position: str = field(hash=False)  # excluded from the hash
    grade: str
    age: int = 18
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
harden2 = Player('James Harden', 1, 'PG', 'S+', 34)
harden3 = Player('James Harden', 1, 'SG', 'S+', 34)
print({harden, harden2})
print({harden, harden3})
Output:
{Player(name='James Harden', number=1, position='PG', grade='S+', age=34)}
{Player(name='James Harden', number=1, position='PG', grade='S+', age=34), Player(name='James Harden', number=1, position='SG', grade='S+', age=34)}
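For context, it may help to see what happens without `unsafe_hash`. With the default `eq=True` and no `frozen=True`, a data class sets `__hash__` to `None`, so its instances are unhashable; a frozen data class gets a field-based hash. The class names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MutablePlayer:
    name: str


@dataclass(frozen=True)
class FrozenPlayer:
    name: str


# Default (eq=True, frozen=False): instances cannot be hashed.
try:
    hash(MutablePlayer('James Harden'))
    hashable = True
except TypeError:
    hashable = False
print(hashable)  # False

# frozen=True together with eq=True generates a hash from the fields.
unique = {FrozenPlayer('James Harden'), FrozenPlayer('James Harden')}
print(len(unique))  # 1
```

This is why the flag is called "unsafe": it makes a mutable object hashable, and mutating it while it sits in a set or dict can break lookups.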
from dataclasses import dataclass, field, asdict, astuple
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
bryant = Player(name='Kobe Bryant', number=24, position='PG', grade='S+', args=(1, 2), kwargs={"hello": "world"})
alist = [harden, bryant]
print(sorted(alist, key=lambda x: x.age))
print(asdict(bryant))
print(astuple(harden))
Output:
[Player(name='Kobe Bryant', number=24, position='PG', grade='S+', age=18, args=(1, 2), kwargs={'hello': 'world'}), Player(name='James Harden', number=1, position='PG', grade='S+', age=34, args=(), kwargs={})]
{'name': 'Kobe Bryant', 'number': 24, 'position': 'PG', 'grade': 'S+', 'age': 18, 'args': (1, 2), 'kwargs': {'hello': 'world'}}
('James Harden', 1, 'PG', 'S+', 34, (), {})
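`asdict()` also recurses into nested data classes. The sketch below, using shortened `Player` and `Team` classes (an assumption for brevity), shows the nested result and one way to rebuild the objects by hand, since the standard library offers no automatic inverse.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Player:
    name: str
    age: int = 18


@dataclass
class Team:
    name: str
    players: List[Player]


clippers = Team('clippers', [Player('James Harden', 34)])

# Nested data classes inside the list are converted to dicts as well.
as_dict = asdict(clippers)
print(as_dict)
# {'name': 'clippers', 'players': [{'name': 'James Harden', 'age': 34}]}

# Rebuilding must be done level by level; nested dicts are converted by hand.
rebuilt = Team(as_dict['name'], [Player(**p) for p in as_dict['players']])
print(rebuilt == clippers)  # True
```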
`replace()` lets you create a new instance in which some fields are changed while the others keep their values.
from dataclasses import dataclass, field, fields, replace
from typing import List
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: str
    age: int = 18
# Example usage
harden = Player('James Harden', 1, 'PG', 'S+', 34)
leonard = Player(name='Kawhi Leonard', number=2, position='SF', grade='S+')
@dataclass
class Team:
    name: str = field(metadata={'unit': 'name'})
    players: List[Player] = field(default_factory=lambda: [leonard], metadata={'unit': 'players'})
clippers = Team("clippers", [leonard])
# Use replace() to swap field values of the Team instance
new_clippers = replace(clippers, name="new_clippers", players=[leonard, harden])
print("Original Clippers:", clippers)
print("New Clippers:", new_clippers)
Output:
Original Clippers: Team(name='clippers', players=[Player(name='Kawhi Leonard', number=2, position='SF', grade='S+', age=18)])
New Clippers: Team(name='new_clippers', players=[Player(name='Kawhi Leonard', number=2, position='SF', grade='S+', age=18), Player(name='James Harden', number=1, position='PG', grade='S+', age=34)])
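One caveat: fields that are not passed to `replace()` are carried over as the same objects, so mutable values end up shared between the old and the new instance. A minimal sketch, assuming a simplified `Team` whose players are plain strings:

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class Team:
    name: str
    players: List[str] = field(default_factory=list)


clippers = Team('clippers', ['Kawhi Leonard'])
renamed = replace(clippers, name='new_clippers')

# The players list was not replaced, so both teams point at the same list.
print(renamed.players is clippers.players)  # True

renamed.players.append('James Harden')
print(clippers.players)  # ['Kawhi Leonard', 'James Harden']
```

Passing a fresh list to `replace()`, as in the example above, avoids the sharing.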
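Data classes can be combined with validation and extraction libraries to extract data or validate parameters. The following example uses `marshmallow` and `desert` for validation and extraction: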
import requests
from dataclasses import dataclass
import dataclasses
from marshmallow import fields, EXCLUDE, validate
import desert
@dataclass
class Activity:
    activity: str
    participants: int = dataclasses.field(metadata=desert.metadata(
        fields.Int(required=True,
                   validate=validate.Range(min=1, max=50,
                                           error="Participants must be between 1 and 50 people"))
    ))
    price: float = dataclasses.field(metadata=desert.metadata(
        fields.Float(required=True,
                     validate=validate.Range(
                         min=0, max=50,
                         error="Price must be between $1 and $50"))
    ))
    def __post_init__(self):
        self.price = self.price * 100
def get_activity():
    # resp = requests.get("https://www.boredapi.com/api/activity").json()
    resp = {
        "activity": "Improve your touch typing",
        "type": "busywork",
        "participants": 1,
        "price": 1.0,
        # "price": 51,
        "link": "https://en.wikipedia.org/wiki/Touch_typing",
        "key": "2526437",
        "accessibility": 0.8
    }
    # Extract only the parts we care about and ignore unknown keys
    schema = desert.schema(Activity, meta={"unknown": EXCLUDE})
    return schema.load(resp)
print(get_activity())
Output:
Activity(activity='Improve your touch typing', participants=1, price=100.0)
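If you change the value of `resp`, for example making `price` greater than 50, you get a validation failure:
marshmallow.exceptions.ValidationError: {'price': ['Price must be between $1 and $50']}
`dataclasses` works well in many situations, especially for defining simple objects that store data. It is particularly suited to configuration, data transfer objects (DTOs), domain objects, and other data-only structures.
Requirement: automatically persist the configuration object to a config file before the program exits.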
import json
import atexit
import logging
import threading
from pathlib import Path
from dataclasses import dataclass, asdict
@dataclass
class Config(object):
    name: str = "mysql"
    port: int = 3306
    _instance = None
    _lock = threading.Lock()
    _registered = False  # class attribute (not a field): whether the atexit hook is registered
    def __new__(cls, *args, **kw):
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
        return cls._instance
    def load_from_file(self, file_path):
        """Load the config from a file; keep the current values if the file
        does not exist or loading fails.
        """
        if file_path.exists():
            try:
                with file_path.open() as f:
                    json_data = json.load(f)
                for key, value in json_data.items():
                    setattr(self, key, value)
            except Exception as err:
                logging.error(f"Failed to load config from file: {err}")
        else:
            logging.warning(f"Config file '{file_path}' not exists. Using default values.")
    def save_to_file(self, file_path):
        """Save the config to a file.
        """
        json_str = json.dumps(asdict(self), indent=4)
        with file_path.open('w') as f:
            logging.warning(f"Saving configs to '{file_path}'")
            f.write(json_str)
    @classmethod
    def register_atexit(cls):
        """Register saving the config to the config file at program exit."""
        with cls._lock:
            if not cls._registered:
                atexit.register(cls._instance.save_to_file, Path("./config.json"))
                cls._registered = True
    # Loading the config file and saving the config are kept separate
    def __post_init__(self):
        config_file = Path("./config.json")
        # Load the config from the config file
        self.load_from_file(config_file)
        # Register saving the config to the config file at program exit
        self.register_atexit()
if __name__ == "__main__":
    # Create a Config instance
    config_instance = Config(name="redis", port=6379)
    # Print the current config
    print("Current Config:", config_instance)
    # Change the config and print it again
    config_instance.port = 8080
    print("Updated Config:", config_instance)
    # Create another Config instance to demonstrate the singleton pattern
    another_instance = Config()
    print("Another Instance Config:", another_instance)
    # Save the config to a file
    another_instance.save_to_file(Path("./another_config.json"))
    # Load the config from the file
    another_instance.load_from_file(Path("./another_config.json"))
    print("Loaded Config from File:", another_instance)
Output:
Current Config: Config(name='mysql', port=3306)
Updated Config: Config(name='mysql', port=8080)
Another Instance Config: Config(name='mysql', port=3306)
Loaded Config from File: Config(name='mysql', port=3306)
WARNING:root:Saving configs to 'another_config.json'
WARNING:root:Saving configs to 'config.json'
from dataclasses import dataclass
from enum import Enum
class Grade(Enum):
    S_PLUS = 'S+'
    # define other grades...
@dataclass
class Player:
    name: str
    number: int
    position: str
    grade: Grade
    age: int = 18
def create_player(name: str, number: int, position: str, grade: Grade, age: int) -> Player:
    return Player(name, number, position, grade, age)
# Example usage
harden = create_player('James Harden', 1, 'PG', Grade.S_PLUS, 34)
bryant = create_player('Kobe Bryant', 24, 'SG', Grade.S_PLUS, 41)
print(harden)
print(bryant)
Output:
Player(name='James Harden', number=1, position='PG', grade=<Grade.S_PLUS: 'S+'>, age=34)
Player(name='Kobe Bryant', number=24, position='SG', grade=<Grade.S_PLUS: 'S+'>, age=41)
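A benefit of using an `Enum` for `grade` is that raw strings can be converted and checked in one step. In the sketch below the extra member `S` is an assumption added for illustration; looking up a value that is not defined raises `ValueError`.

```python
from enum import Enum


class Grade(Enum):
    S_PLUS = 'S+'
    S = 'S'  # assumed extra level, for illustration only


# Looking up by value returns the matching member.
print(Grade('S+') is Grade.S_PLUS)  # True

# An unknown value is rejected with ValueError.
try:
    Grade('Z')
    accepted = True
except ValueError:
    accepted = False
print(accepted)  # False
```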
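`dataclasses` offers many conveniences, but PEP 557 also mentions an equally capable data class library, `attrs`, which additionally supports features such as validators.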
import attr
@attr.s
class Point:
    x = attr.ib(type=int)
    y = attr.ib(type=int)
p = Point(1, 2)
print(p)  # Output: Point(x=1, y=2)
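Whether to use `dataclasses` or `attrs` depends on the project's needs and personal preference. `dataclasses` is simpler and more direct, while `attrs` is more extensible. If you only need the basic automatically generated special methods, `dataclasses` is a good choice. If you need more advanced features and more customization options, consider `attrs`.
`dataclass` is a powerful tool that makes creating and managing classes simpler and more efficient.
In practice, especially in data processing and object modeling, the `@dataclass` decorator can make code considerably clearer and cut down on redundant boilerplate.
A solid understanding of the features of `dataclass` helps us apply it more flexibly and so improve code quality and development efficiency.
For more usage tips, see the official documentation.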
https://peps.python.org/pep-0557/
https://realpython.com/python-data-classes/#more-flexible-data-classes
https://docs.python.org/zh-cn/3/library/dataclasses.html#module-contents
https://www.pythontutorial.net/python-oop/python-dataclass/
https://github.com/python-desert/desert
https://glyph.twistedmatrix.com/2016/08/attrs.html
https://github.com/pviafore/RobustPython