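Serialization and deserialization are unavoidable in RESTful API development, and the code that implements them usually carries data-validation logic as well. This article introduces Marshmallow, a powerful library for serialization.
Marshmallow is a well-built, reusable tool that handles object -> dict, objects -> list, string -> dict, and string -> list conversions.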
Official site
API
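The rest of this article walks through Marshmallow usage topic by topic. Before starting, we need the classes that will be serialized and deserialized:
class tt():
    def __init__(self, name, age, tchild):
        self.name = name
        self.age = age
        self.tchild = tchild  # list of tchild instances

class tchild():
    def __init__(self, location):
        self.location = location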
Schema
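To serialize and deserialize a class (call it Class_A), first create a corresponding class (call it Class_A’) that implements serialization, deserialization, and data validation for Class_A. Class_A’ is the schema.
A Schema is the carrier of the serialization logic: every class that needs to be serialized or deserialized gets its own Schema. The fields of the Schema mirror the attributes of the class being serialized, so the field names in Class_A and Class_A’ need to match, for example: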
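from marshmallow import Schema, fields, post_load

class city(Schema):
    location = fields.Str()

    @post_load
    def make(self, data, **kwargs):
        # build a tchild object from the validated dict
        return tchild(**data)

class test(Schema):
    name = fields.Str()
    age = fields.Integer()
    tchild = fields.List(fields.Nested(city()))

    @post_load
    def make(self, data, **kwargs):
        # build a tt object from the validated dict
        return tt(**data)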
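Serialization
Serialization uses the schema's dump() or dumps() method: dump() converts obj -> dict, and dumps() converts obj -> string. Because Flask can serialize a dict directly, dump() is usually enough when Flask and Marshmallow are used together.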
from marshmallow import Schema, fields

class tt():
    def __init__(self, name, age, tchild):
        self.name = name
        self.age = age
        self.tchild = tchild

class tchild():
    def __init__(self, location):
        self.location = location

class city(Schema):
    location = fields.Str()

class test(Schema):
    name = fields.Str()
    age = fields.Integer()
    tchild = fields.List(fields.Nested(city()))

# dump()/dumps() accept a plain dict as well as an object
data = {'name': "xx", "age": 100, 'tchild': [{'location': "dl"}, {'location': "dl"}]}
ss = test()
print(ss.dumps(data))  # prints a JSON string
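Deserialization
Deserialization is based on the schema's load() or loads() method. By default, load() takes a dict and, following the rules declared in the schema, converts it into another dict. loads() takes a JSON-formatted string and likewise converts it into a dict that conforms to the schema. Because load() and loads() also run the data validation described below, calling one of them on incoming data is necessary when building a RESTful API. load() is used as follows: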
from marshmallow import Schema, fields

class city(Schema):
    location = fields.Str()

class test(Schema):
    name = fields.Str()
    age = fields.Integer()
    tchild = fields.List(fields.Nested(city()))

data = {'name': "xx", "age": 100, 'tchild': [{'location': "dl"}, {'location': "dl"}]}
ss = test()
# with no post_load hook defined, load() returns a validated dict
print(ss.load(data))
For deserialization, turning the incoming dict into an object is more useful. In Marshmallow you implement the dict -> object conversion yourself and mark that method with the post_load decorator:
from marshmallow import Schema, fields, post_load

class tt():
    def __init__(self, name, age, tchild):
        self.name = name
        self.age = age
        self.tchild = tchild

class tchild():
    def __init__(self, location):
        self.location = location

class city(Schema):
    location = fields.Str()

    @post_load
    def make(self, data, **kwargs):
        return tchild(**data)

class test(Schema):
    name = fields.Str()
    age = fields.Integer()
    tchild = fields.List(fields.Nested(city()))

    @post_load
    def make(self, data, **kwargs):
        return tt(**data)

data = {'name': "xx", "age": 100, 'tchild': [{'location': "dl"}, {'location': "dl"}]}
ss = test()
print(ss.dumps(data))  # dict -> JSON string
s = ss.load(data)      # dict -> tt object, via the post_load hooks
print(s.tchild)        # list of tchild objects
print(ss.dump(s))      # tt object -> dict
Validation
import datetime as dt

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return "<User(name={self.name!r})>".format(self=self)

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()
Validation demo
from marshmallow import ValidationError

try:
    result = UserSchema().load({"name": "John", "email": "foo"})
except ValidationError as err:
    # err.messages maps each failing field to its list of errors,
    # e.g. {"email": ['"foo" is not a valid email address.']}
    # (the exact wording depends on the marshmallow version)
    print(err.messages)
    valid_data = err.valid_data  # => {"name": "John"}
Collection validation demo
from marshmallow import Schema, fields, ValidationError

class BandMemberSchema(Schema):
    name = fields.String(required=True)
    email = fields.Email()

user_data = [
    {"email": "mick@stones.com", "name": "Mick"},
    {"email": "invalid", "name": "Invalid"},  # invalid email
    {"email": "keith@stones.com", "name": "Keith"},
    {"email": "charlie@stones.com"},  # missing "name"
]

try:
    BandMemberSchema(many=True).load(user_data)
except ValidationError as err:
    # errors are keyed by the index of the offending item
    err.messages
    # {1: {'email': ['"invalid" is not a valid email address.']},
    #  3: {'name': ['Missing data for required field.']}}
You can run additional validation on a field by passing a callable (a function, a lambda, or an object that defines __call__) as its validate argument.
from marshmallow import ValidationError

class ValidatedUserSchema(UserSchema):
    # NOTE: This is a contrived example.
    # You could use marshmallow.validate.Range instead of an anonymous function here
    age = fields.Number(validate=lambda n: 18 <= n <= 40)

in_data = {"name": "Mick", "email": "mick@stones.com", "age": 71}
try:
    result = ValidatedUserSchema().load(in_data)
except ValidationError as err:
    err.messages  # => {'age': ['Invalid value.']} on marshmallow 3.x
from marshmallow import Schema, fields, ValidationError

# A validator can also raise ValidationError to supply its own message
def validate_quantity(n):
    if n < 0:
        raise ValidationError("Quantity must be greater than 0.")
    if n > 30:
        raise ValidationError("Quantity must not be greater than 30.")

class ItemSchema(Schema):
    quantity = fields.Integer(validate=validate_quantity)

in_data = {"quantity": 31}
try:
    result = ItemSchema().load(in_data)
except ValidationError as err:
    err.messages  # => {'quantity': ['Quantity must not be greater than 30.']}
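Required fields
You can make a field required by passing required=True. If the value is missing from the input to Schema.load(), an error is recorded.
To customize the error message for a required field, pass a dict with a "required" key as the field's error_messages argument.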
from marshmallow import Schema, fields, ValidationError

class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True, error_messages={"required": "Age is required."})
    city = fields.String(
        required=True,
        error_messages={"required": {"message": "City required", "code": 400}},
    )
    email = fields.Email()

try:
    result = UserSchema().load({"email": "foo@bar.com"})
except ValidationError as err:
    err.messages
    # {'name': ['Missing data for required field.'],
    #  'age': ['Age is required.'],
    #  'city': {'message': 'City required', 'code': 400}}
Partial Loading
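In the RESTful architectural style, data is updated with the HTTP PUT or PATCH method. PUT requires the complete resource to be sent to the server, while PATCH only needs the parts that change. A PATCH payload therefore risks failing Marshmallow's validation because required fields are absent, and Partial Loading exists to avoid that.
To enable Partial Loading, pass the partial argument either to load() or to the schema constructor: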
from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)

# In marshmallow 3, load() returns the data and raises ValidationError on failure
result = UserSchema().load({'age': 42}, partial=True)
# OR UserSchema(partial=True).load({'age': 42})
print(result)  # => {'age': 42}
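Specifying default serialization/deserialization values
A field can be given default values for both serialization and deserialization.
If a field is not found in the input data, missing supplies the value used during deserialization. Likewise, default supplies the value used during serialization when the input value is missing. Since marshmallow 3.13 these arguments are named load_default and dump_default, and the older names are deprecated.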
import datetime as dt
from marshmallow import Schema, fields

class UserSchema(Schema):
    id = fields.Str(missing='dd')
    birthdate = fields.DateTime(default=dt.datetime(2017, 9, 29))

UserSchema().load({})
# {'id': 'dd'}
UserSchema().dump({})
# {'birthdate': '2017-09-29T00:00:00'} on marshmallow 3 (2.x appends '+00:00')