Quickstart
Declaring Schemas
First, create a basic user "model" (for demonstration only; it is not a real model):
import datetime as dt

class User(object):
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()

    def __repr__(self):
        return '<User(name={self.name!r})>'.format(self=self)
Then create a schema by defining a class that maps attribute names to Field objects:
from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()
Serializing Objects ("Dumping")
Pass an object to the schema's dump method. It returns the serialized dictionary (along with a dictionary of errors, covered below):
from marshmallow import pprint
user = User(name="Monty", email="[email protected]")
schema = UserSchema()
result = schema.dump(user)
pprint(result.data)
# {"name": "Monty",
# "email": "[email protected]",
# "created_at": "2014-08-17T14:54:16.049594+00:00"}
You can also use the dumps method to serialize an object to a JSON string:
json_result = schema.dumps(user)
pprint(json_result.data)
# '{"name": "Monty", "email": "[email protected]", "created_at": "2014-08-17T14:54:16.049594+00:00"}'
Filtering output
Use the only parameter to specify which fields to serialize:
summary_schema = UserSchema(only=('name', 'email'))
summary_schema.dump(user).data
# {"name": "Monty", "email": "[email protected]"}
Use the exclude parameter to specify fields that should be left out of the serialized output.
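The text above gives no example for exclude. With the UserSchema defined earlier, the call would presumably look like UserSchema(exclude=('created_at',)).dump(user).data. The following is a minimal plain-Python sketch of the filtering semantics of only and exclude, so it runs without the library. It is not marshmallow's implementation: filter_fields is a hypothetical helper and the sample values are made up.

```python
# Plain-Python model of field filtering; not marshmallow's implementation.
def filter_fields(serialized, only=None, exclude=()):
    """Keep the keys listed in `only` (all keys when None), then drop `exclude`."""
    keys = serialized.keys() if only is None else only
    return {k: serialized[k] for k in keys if k in serialized and k not in exclude}


# Hypothetical serialized user, shaped like the dump() output shown earlier.
full = {
    "name": "Monty",
    "email": "monty@example.com",
    "created_at": "2014-08-17T14:54:16.049594+00:00",
}

summary = filter_fields(full, only=("name", "email"))
without_timestamp = filter_fields(full, exclude=("created_at",))
print(summary)
print(without_timestamp)
```

Both calls yield the same two-key dictionary here, which shows that only is an allow-list and exclude is a deny-list over the same set of fields.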
Deserializing Objects ("Loading")
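The counterpart of the dump method is load, which deserializes an input dictionary into Python data structures.
By default, load returns a dictionary that maps field names to deserialized values: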
from pprint import pprint
user_data = {
    'created_at': '2014-08-11T05:26:03.869245',
    'email': u'[email protected]',
    'name': u'Ken'
}
schema = UserSchema()
result = schema.load(user_data)
pprint(result.data)
# {'name': 'Ken',
# 'email': '[email protected]',
# 'created_at': datetime.datetime(2014, 8, 11, 5, 26, 3, 869245)}
Deserializing to Objects
To deserialize to an object, define a method on your Schema subclass and decorate it with post_load. The method receives the dictionary of deserialized data and returns the original Python object:
from marshmallow import Schema, fields, post_load

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

    @post_load
    def make_user(self, data):
        return User(**data)
Now the load method returns a User object:
user_data = {
    'name': 'Ronnie',
    'email': '[email protected]'
}
schema = UserSchema()
result = schema.load(user_data)
result.data  # => <User(name='Ronnie')>
Handling Collections of Objects
Iterable collections of objects can also be serialized and deserialized. Just set many=True:
user1 = User(name="Mick", email="[email protected]")
user2 = User(name="Keith", email="[email protected]")
users = [user1, user2]
schema = UserSchema(many=True)
result = schema.dump(users) # OR UserSchema().dump(users, many=True)
result.data
# [{'name': u'Mick',
# 'email': u'[email protected]',
# 'created_at': '2014-08-17T14:58:57.600623+00:00'},
# {'name': u'Keith',
# 'email': u'[email protected]',
# 'created_at': '2014-08-17T14:58:57.600623+00:00'}]
Validation
The second element of the value returned by Schema.load() and Schema.loads() is a dictionary of validation errors. Some fields, such as Email and URL, have built-in validators:
data, errors = UserSchema().load({'email': 'foo'})
errors # => {'email': ['"foo" is not a valid email address.']}
# OR, equivalently
result = UserSchema().load({'email': 'foo'})
result.errors # => {'email': ['"foo" is not a valid email address.']}
When validating a collection, the errors dictionary is keyed on the indices of the invalid items:
class BandMemberSchema(Schema):
    name = fields.String(required=True)
    email = fields.Email()

user_data = [
    {'email': '[email protected]', 'name': 'Mick'},
    {'email': 'invalid', 'name': 'Invalid'},  # invalid email
    {'email': '[email protected]', 'name': 'Keith'},
    {'email': '[email protected]'},  # missing "name"
]
result = BandMemberSchema(many=True).load(user_data)
result.errors
# {1: {'email': ['"invalid" is not a valid email address.']},
#  3: {'name': ['Missing data for required field.']}}
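To make the index-keyed error structure concrete, here is a minimal plain-Python sketch of how per-item errors can be collected by position. It is an illustration only, not marshmallow's code: validate_member and validate_many are hypothetical helpers, the "@" test is a deliberately simplified stand-in for real email validation, and the addresses are made-up samples.

```python
def validate_member(item):
    """Return a dict of field errors for one item (empty when the item is valid)."""
    errors = {}
    if "name" not in item:
        errors["name"] = ["Missing data for required field."]
    email = item.get("email")
    # Simplified check; a real email validator is much stricter.
    if email is not None and "@" not in email:
        errors["email"] = ['"{}" is not a valid email address.'.format(email)]
    return errors


def validate_many(items):
    """Collect per-item errors, keyed by the index of each invalid item."""
    all_errors = {}
    for index, item in enumerate(items):
        item_errors = validate_member(item)
        if item_errors:
            all_errors[index] = item_errors
    return all_errors


members = [
    {"email": "mick@example.com", "name": "Mick"},
    {"email": "invalid", "name": "Invalid"},   # invalid email
    {"email": "keith@example.com", "name": "Keith"},
    {"email": "charlie@example.com"},          # missing "name"
]
errors = validate_many(members)
print(errors)
```

Valid items leave no entry at all, so a caller can map each key straight back to the offending element of the input list.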
To perform additional validation, pass a callable to a field's validate parameter:
class ValidatedUserSchema(UserSchema):
    # NOTE: This is a contrived example.
    # You could use marshmallow.validate.Range instead of an anonymous function here
    age = fields.Number(validate=lambda n: 18 <= n <= 40)

in_data = {'name': 'Mick', 'email': '[email protected]', 'age': 71}
result = ValidatedUserSchema().load(in_data)
result.errors  # => {'age': ['Validator <lambda>(71.0) is False']}
Validation functions either return a boolean or raise a ValidationError. If a ValidationError is raised, its message is stored in the errors dictionary:
from marshmallow import Schema, fields, ValidationError

def validate_quantity(n):
    if n < 0:
        raise ValidationError('Quantity must be greater than 0.')
    if n > 30:
        raise ValidationError('Quantity must not be greater than 30.')

class ItemSchema(Schema):
    quantity = fields.Integer(validate=validate_quantity)

in_data = {'quantity': 31}
result, errors = ItemSchema().load(in_data)
errors  # => {'quantity': ['Quantity must not be greater than 30.']}
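The two validator styles described above (return a boolean, or raise) can be modeled with a small runner. This is a plain-Python sketch under stated assumptions, not marshmallow's internals: the ValidationError class is a local stand-in so the snippet runs without the library, run_validator is a hypothetical helper, and the generic "Invalid value." message is an assumed placeholder.

```python
class ValidationError(Exception):
    """Local stand-in for marshmallow.ValidationError so the sketch runs alone."""


def run_validator(validator, value):
    """Return the list of error messages produced by one validator call."""
    try:
        outcome = validator(value)
    except ValidationError as err:
        # A raising validator supplies its own message.
        return [str(err)]
    if outcome is False:
        # A boolean validator only signals failure, so use a generic message.
        return ["Invalid value."]
    return []


def in_range(n):
    return 18 <= n <= 40


def validate_quantity(n):
    if n < 0:
        raise ValidationError("Quantity must be greater than 0.")
    if n > 30:
        raise ValidationError("Quantity must not be greater than 30.")


bool_errors = run_validator(in_range, 71)
raised_errors = run_validator(validate_quantity, 31)
ok_errors = run_validator(validate_quantity, 5)
print(bool_errors, raised_errors, ok_errors)
```

Raising is usually the better choice when the caller should see why a value was rejected, since a bare False carries no explanation.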
Field Validators as Methods
Use the validates decorator to register a method as a field validator:
from marshmallow import fields, Schema, validates, ValidationError

class ItemSchema(Schema):
    quantity = fields.Integer()

    @validates('quantity')
    def validate_quantity(self, value):
        if value < 0:
            raise ValidationError('Quantity must be greater than 0.')
        if value > 30:
            raise ValidationError('Quantity must not be greater than 30.')
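One way to picture how a decorator like validates can register methods is to tag each method with the field it checks and have the schema look those tags up at validation time. The sketch below is plain Python under that assumption. MiniSchema, the _validates_field attribute, and the local ValidationError are all hypothetical names, and marshmallow's actual mechanism may differ.

```python
class ValidationError(Exception):
    """Local stand-in for marshmallow.ValidationError so the sketch runs alone."""


def validates(field_name):
    """Tag a method as the validator for `field_name`."""
    def decorator(method):
        method._validates_field = field_name
        return method
    return decorator


class MiniSchema:
    """Toy base class that runs every tagged validator method."""

    def validate(self, data):
        errors = {}
        for attr_name in dir(self):
            method = getattr(self, attr_name)
            field_name = getattr(method, "_validates_field", None)
            if field_name is None or field_name not in data:
                continue
            try:
                method(data[field_name])
            except ValidationError as err:
                errors.setdefault(field_name, []).append(str(err))
        return errors


class ItemSchema(MiniSchema):
    @validates("quantity")
    def validate_quantity(self, value):
        if value < 0:
            raise ValidationError("Quantity must be greater than 0.")
        if value > 30:
            raise ValidationError("Quantity must not be greater than 30.")


errors = ItemSchema().validate({"quantity": 31})
print(errors)
```

Keeping validators as methods lets them use schema state through self, which a standalone function passed to validate cannot do.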
strict Mode
Set strict=True in the schema constructor or in class Meta to raise an exception when invalid data is passed in. The dictionary of validation errors is available from the ValidationError.messages attribute:
from marshmallow import ValidationError

try:
    UserSchema(strict=True).load({'email': 'foo'})
except ValidationError as err:
    print(err.messages)  # => {'email': ['"foo" is not a valid email address.']}
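The difference between the default behavior (errors returned alongside the data) and strict mode (errors raised) can be sketched in plain Python. This toy load function and the local ValidationError are assumptions made for illustration and are not marshmallow's implementation. The "@" test is a simplified stand-in for email validation.

```python
class ValidationError(Exception):
    """Local stand-in for marshmallow.ValidationError so the sketch runs alone."""

    def __init__(self, messages):
        super().__init__(messages)
        self.messages = messages


def load(data, strict=False):
    """Toy loader: return (valid_data, errors), or raise when strict and errors exist."""
    errors = {}
    email = data.get("email")
    # Simplified check; a real email validator is much stricter.
    if email is not None and "@" not in email:
        errors["email"] = ['"{}" is not a valid email address.'.format(email)]
    if errors and strict:
        raise ValidationError(errors)
    valid = {key: value for key, value in data.items() if key not in errors}
    return valid, errors


# Default mode: the caller must remember to inspect the errors.
lenient_data, lenient_errors = load({"email": "foo"})

# Strict mode: the same errors arrive as an exception.
try:
    load({"email": "foo"}, strict=True)
    strict_messages = None
except ValidationError as err:
    strict_messages = err.messages

print(lenient_errors, strict_messages)
```

Strict mode trades a return value that is easy to ignore for an exception that is not, which is often the safer default in request handlers.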
Required Fields
Set required=True to make a field required. If the field's value is missing when Schema.load() is called, validation fails and an error is stored.
To customize the error message for a required field, pass a dict to the error_messages parameter:
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(
        required=True,
        error_messages={'required': 'Age is required.'}
    )
    city = fields.String(
        required=True,
        error_messages={'required': {'message': 'City required', 'code': 400}}
    )
    email = fields.Email()

data, errors = UserSchema().load({'email': '[email protected]'})
errors
# {'name': ['Missing data for required field.'],
#  'age': ['Age is required.'],
#  'city': {'message': 'City required', 'code': 400}}
Partial Loading
Specify the partial parameter to skip the required check for particular missing fields:
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)

data, errors = UserSchema().load({'age': 42}, partial=('name',))
# OR UserSchema(partial=('name',)).load({'age': 42})
data, errors  # => ({'age': 42}, {})
Or set partial=True to skip the required check for all missing fields:
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)

data, errors = UserSchema().load({'age': 42}, partial=True)
# OR UserSchema(partial=True).load({'age': 42})
data, errors  # => ({'age': 42}, {})
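The two forms of partial (a tuple of field names, or True) can be summarized as a single rule for the required check. The following plain-Python sketch models that rule. missing_required is a hypothetical helper written for this illustration, not part of marshmallow.

```python
def missing_required(data, required, partial=False):
    """Return errors for required fields that are absent from `data`.

    partial=True skips every required check; a tuple of names skips
    only the named fields; the default (False) skips nothing.
    """
    if partial is True:
        return {}
    skipped = set(partial) if partial else set()
    return {
        name: ["Missing data for required field."]
        for name in required
        if name not in data and name not in skipped
    }


required_fields = ("name", "age")

no_partial = missing_required({"age": 42}, required_fields)
tuple_partial = missing_required({"age": 42}, required_fields, partial=("name",))
full_partial = missing_required({}, required_fields, partial=True)
print(no_partial, tuple_partial, full_partial)
```

This pattern suits update endpoints such as HTTP PATCH, where a client sends only the fields it wants to change.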
Schema.validate
Use Schema.validate() to validate input data without deserializing it:
errors = UserSchema().validate({'name': 'Ronnie', 'email': 'invalid-email'})
errors # {'email': ['"invalid-email" is not a valid email address.']}
Specifying Attribute Names
By default, a schema serializes the object attribute whose name matches the field name. When the attribute and field names differ, use the attribute parameter to specify which attribute the field reads from:
class UserSchema(Schema):
    name = fields.String()
    email_addr = fields.String(attribute="email")
    date_created = fields.DateTime(attribute="created_at")

user = User('Keith', email='[email protected]')
ser = UserSchema()
result, errors = ser.dump(user)
pprint(result)
# {'name': 'Keith',
#  'email_addr': '[email protected]',
#  'date_created': '2014-08-17T14:58:57.600623+00:00'}
Specifying Deserialization Keys
By default, a schema deserializes an input dictionary whose keys match its field names. Use the load_from parameter to specify an additional key to look up:
class UserSchema(Schema):
    name = fields.String()
    email = fields.Email(load_from='emailAddress')

data = {
    'name': 'Mike',
    'emailAddress': '[email protected]'
}
s = UserSchema()
result, errors = s.load(data)
# {'name': u'Mike',
#  'email': '[email protected]'}
Specifying Serialization Keys
To serialize a field under a key other than its field name, use the dump_to parameter (the counterpart of load_from):
class UserSchema(Schema):
    name = fields.String(dump_to='TheName')
    email = fields.Email(load_from='CamelCasedEmail', dump_to='CamelCasedEmail')

data = {
    'name': 'Mike',
    'email': '[email protected]'
}
s = UserSchema()
result, errors = s.dump(data)
# {'TheName': u'Mike',
#  'CamelCasedEmail': '[email protected]'}
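The key renaming done by load_from and dump_to amounts to two lookup tables, one per direction. The sketch below models this in plain Python, assuming (as the word "additional" in the load_from description suggests) that the field name itself is tried before the alias. FIELDS, LOAD_FROM, DUMP_TO, and both functions are hypothetical names, and the email address is a made-up sample.

```python
FIELDS = ("name", "email")
LOAD_FROM = {"email": "CamelCasedEmail"}                      # extra input key per field
DUMP_TO = {"name": "TheName", "email": "CamelCasedEmail"}     # output key per field


def load(data):
    """Map input keys to field names, trying the field name before its alias."""
    result = {}
    for field in FIELDS:
        alias = LOAD_FROM.get(field)
        if field in data:
            result[field] = data[field]
        elif alias is not None and alias in data:
            result[field] = data[alias]
    return result


def dump(obj):
    """Map field names to their output keys, defaulting to the field name."""
    return {DUMP_TO.get(field, field): obj[field] for field in FIELDS if field in obj}


loaded = load({"name": "Mike", "CamelCasedEmail": "mike@example.com"})
dumped = dump(loaded)
print(loaded)
print(dumped)
```

Using the same alias in both directions, as the email field does here, lets a camelCase wire format round-trip while the Python side keeps snake_case names.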
Refactoring: Implicit Field Creation
When a schema has many attributes, specifying a field type for each one is repetitive, especially when most attributes are native Python data types.
The class Meta options let you specify which attributes to serialize. marshmallow then chooses an appropriate field type based on each attribute's type:
# Refactored UserSchema
class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())

    class Meta:
        fields = ("name", "email", "created_at", "uppername")

user = User(name="erika", email="[email protected]")
schema = UserSchema()
result = schema.dump(user)
print(result.data)
# {'created_at': '2019-05-20T15:45:27.760000+00:00', 'uppername': 'ERIKA', 'name': 'erika', 'email': '[email protected]'}
In addition to the explicitly declared fields, the additional option specifies which other fields to include. The following code is equivalent to the code above:
class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())

    class Meta:
        # No need to include 'uppername'
        additional = ("name", "email", "created_at")
Ordering Output
Set ordered=True to preserve field order in the serialized output. The serialized dictionary is then a collections.OrderedDict:
from collections import OrderedDict
from marshmallow import pprint

class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())

    class Meta:
        fields = ("name", "email", "created_at", "uppername")
        ordered = True

u = User('Charlie', '[email protected]')
schema = UserSchema()
result = schema.dump(u)
assert isinstance(result.data, OrderedDict)
# marshmallow's pprint function maintains order
pprint(result.data, indent=2)
# {
#   "name": "Charlie",
#   "email": "[email protected]",
#   "created_at": "2014-10-30T08:27:48.515735+00:00",
#   "uppername": "CHARLIE"
# }
"Read-only" and "Write-only" Fields
In the context of a web API, the dump_only and load_only parameters correspond to "read-only" and "write-only" fields, respectively:
class UserSchema(Schema):
    name = fields.Str()
    # password is "write-only"
    password = fields.Str(load_only=True)
    # created_at is "read-only"
    created_at = fields.DateTime(dump_only=True)
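The effect of load_only and dump_only can be pictured as each direction skipping the fields reserved for the other. The plain-Python sketch below models that behavior. It assumes that a dump_only key present in the input is simply ignored on load. FIELDS, LOAD_ONLY, DUMP_ONLY, and both functions are hypothetical names, and the sample values are made up.

```python
FIELDS = ("name", "password", "created_at")
LOAD_ONLY = {"password"}      # accepted as input, never emitted ("write-only")
DUMP_ONLY = {"created_at"}    # emitted as output, never accepted ("read-only")


def dump(obj):
    """Serialize every field except the load-only ones."""
    return {f: obj[f] for f in FIELDS if f in obj and f not in LOAD_ONLY}


def load(data):
    """Deserialize every field except the dump-only ones."""
    return {f: data[f] for f in FIELDS if f in data and f not in DUMP_ONLY}


stored = {"name": "Monty", "password": "secret", "created_at": "2014-08-17T14:54:16"}
request_body = {"name": "Monty", "password": "secret", "created_at": "1999-01-01T00:00:00"}

response = dump(stored)        # password is left out of the response
accepted = load(request_body)  # client-supplied created_at is ignored
print(response)
print(accepted)
```

This keeps secrets out of API responses and stops clients from overwriting server-managed values such as timestamps.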
More tutorials
marshmallow: Nesting Schemas
marshmallow: Custom Fields
marshmallow: Extending Schemas