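[Learn Rust Programming with Xiaojia] 1. Rust Programming Basics
[Learn Rust Programming with Xiaojia] 2. Using Rust's Package Management Tools
[Learn Rust Programming with Xiaojia] 3. Basic Programming Concepts in Rust
[Learn Rust Programming with Xiaojia] 4. Understanding Ownership in Rust
[Learn Rust Programming with Xiaojia] 5. Using Structs to Structure Related Data
[Learn Rust Programming with Xiaojia] 6. Enums and Pattern Matching
[Learn Rust Programming with Xiaojia] 7. Managing Projects with Packages, Crates, and Modules
[Learn Rust Programming with Xiaojia] 8. Common Collections
[Learn Rust Programming with Xiaojia] 9. Error Handling
[Learn Rust Programming with Xiaojia] 11. Writing Automated Tests
[Learn Rust Programming with Xiaojia] 12. Building a Command-Line Program
[Learn Rust Programming with Xiaojia] 13. Functional Language Features: Iterators and Closures
[Learn Rust Programming with Xiaojia] 14. More About Cargo and Crates.io
[Learn Rust Programming with Xiaojia] 15. Smart Pointers
[Learn Rust Programming with Xiaojia] 16. Fearless Concurrency
[Learn Rust Programming with Xiaojia] 17. Object-Oriented Language Features
[Learn Rust Programming with Xiaojia] 18. Patterns and Matching
[Learn Rust Programming with Xiaojia] 19. Advanced Features
[Learn Rust Programming with Xiaojia] 20. Advanced Extensions
[Learn Rust Programming with Xiaojia] 21. Network Programming
[Learn Rust Programming with Xiaojia] 23. Cargo Usage Guide
[Learn Rust Programming with Xiaojia] 24. Inline Assembly
[Learn Rust Programming with Xiaojia] 25. Rust Command-Line Argument Parsing Library (clap)
[Learn Rust Programming with Xiaojia] 26. Rust's Serialization Solution (Serde)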
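This chapter covers Rust's serialization solution, which is centered on serde, including the use of third-party crates such as serde, serde_json, and serde_yaml.
Main reference: "The Rust Programming Language"
Main reference: "Rust For Rustaceans"
Main reference: "The Rustonomicon"
Main reference: 《Rust 高级编程》 (Advanced Rust Programming)
Main reference: 《Cargo 指南》 (The Cargo Guide)
Serde is built on top of Rust's trait system: a type that implements Serialize and Deserialize can be serialized to and deserialized from any supported data format.
Supported data formats include JSON, YAML, TOML, and others.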
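To make the trait-based design concrete, here is a minimal std-only sketch of the idea. It is not Serde's API: the `Encode` and `Format` traits and the `JsonLike` and `YamlLike` types are hypothetical stand-ins for the roles that `Serialize` and `Serializer` play. It only shows why a data structure described once can be written out in several formats.

```rust
/// Hypothetical stand-in for a data format (the role `Serializer` plays in Serde).
trait Format {
    fn begin_struct(&mut self);
    fn field(&mut self, name: &str, value: i64);
    fn end_struct(&mut self);
    fn finish(self) -> String;
}

/// Hypothetical stand-in for `Serialize`: a type describes itself once, for any format.
trait Encode {
    fn encode<F: Format>(&self, format: &mut F);
}

struct Point {
    x: i32,
    y: i32,
}

impl Encode for Point {
    fn encode<F: Format>(&self, format: &mut F) {
        format.begin_struct();
        format.field("x", i64::from(self.x));
        format.field("y", i64::from(self.y));
        format.end_struct();
    }
}

/// Writes fields as a JSON-style object.
#[derive(Default)]
struct JsonLike {
    out: String,
    fields: usize,
}

impl Format for JsonLike {
    fn begin_struct(&mut self) {
        self.out.push('{');
        self.fields = 0;
    }
    fn field(&mut self, name: &str, value: i64) {
        // A comma separates every field from the previous one.
        if self.fields > 0 {
            self.out.push(',');
        }
        self.out.push_str(&format!("\"{}\":{}", name, value));
        self.fields += 1;
    }
    fn end_struct(&mut self) {
        self.out.push('}');
    }
    fn finish(self) -> String {
        self.out
    }
}

/// Writes fields as YAML-style `key: value` lines.
#[derive(Default)]
struct YamlLike {
    out: String,
}

impl Format for YamlLike {
    fn begin_struct(&mut self) {}
    fn field(&mut self, name: &str, value: i64) {
        self.out.push_str(&format!("{}: {}\n", name, value));
    }
    fn end_struct(&mut self) {}
    fn finish(self) -> String {
        self.out
    }
}

/// Encodes `value` with whichever format `F` the caller picks.
fn encode_with<T: Encode, F: Format + Default>(value: &T) -> String {
    let mut format = F::default();
    value.encode(&mut format);
    format.finish()
}
```

Calling `encode_with::<Point, JsonLike>(&point)` and `encode_with::<Point, YamlLike>(&point)` reuses the same `Encode` impl. Serde applies the same separation with a much richer data model.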
cargo add serde --features derive
cargo add serde_json serde_yaml
use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize, Debug)]
struct Point {
x: i32,
y: i32,
}
fn main() {
let point = Point { x: 1, y: 2 };
let serialized = serde_json::to_string(&point).unwrap();
println!("serialized = {}", serialized);
let deserialized: Point = serde_json::from_str(&serialized).unwrap();
println!("deserialized = {:?}", deserialized);
}
Output:
cargo run
serialized = {"x":1,"y":2}
deserialized = Point { x: 1, y: 2 }
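Attributes let us customize serialization and deserialization. Commonly used container attributes include the following.
#[serde(rename = "name")]: serialize and deserialize this struct or enum under the given name. Separate names for serialization and deserialization are supported.
#[serde(rename_all = "...")]: rename all struct fields or enum variants according to the given case convention. Possible values are "lowercase", "UPPERCASE", "PascalCase", "camelCase", "snake_case", "SCREAMING_SNAKE_CASE", "kebab-case", and "SCREAMING-KEBAB-CASE".
Separate conventions for serialization and deserialization are supported.
#[serde(deny_unknown_fields)]: always report an error when an unknown field is encountered during deserialization. When this attribute is absent, unknown fields are ignored by default for self-describing formats such as JSON.
Note: this attribute is not supported in combination with flatten.
#[serde(tag = "type")]: on an enum, use the internally tagged representation with the given tag. On a struct, serialize the struct's name as a field with the given key, placed before all of the struct's real fields.
#[serde(tag = "t", content = "c")]: use the adjacently tagged representation for the enum, with the given field names for the tag and the content.
#[serde(untagged)]: use the untagged representation for the enum.
#[serde(bound = "T: MyTrait")]: set the trait bounds used for the Serialize and Deserialize impls.
#[serde(default)]: when deserializing a struct, fill in any missing fields from the struct's Default implementation.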
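The enum representations mentioned above differ only in where the variant name ends up in the output. The following std-only sketch prints the JSON shape of each one for a struct-like variant. It does not use Serde itself: `Repr` and `render` are hypothetical helpers written for illustration, and `fields` is assumed to be a non-empty, already serialized field list.

```rust
/// The four enum representations, as plain data (hypothetical helper type).
enum Repr<'a> {
    /// Serde's default when no attribute is given.
    External,
    /// Corresponds to a single tag attribute.
    Internal { tag: &'a str },
    /// Corresponds to a tag attribute plus a content attribute.
    Adjacent { tag: &'a str, content: &'a str },
    /// Corresponds to the untagged attribute.
    Untagged,
}

/// Renders a struct-like variant as JSON text.
/// `fields` is the already serialized, non-empty field list, e.g. `"id":1`.
fn render(repr: &Repr<'_>, variant: &str, fields: &str) -> String {
    match repr {
        // The variant name wraps the content: {"Request":{...}}
        Repr::External => format!(r#"{{"{}":{{{}}}}}"#, variant, fields),
        // The variant name sits next to the fields: {"type":"Request",...}
        Repr::Internal { tag } => {
            format!(r#"{{"{}":"{}",{}}}"#, tag, variant, fields)
        }
        // Tag and content are two sibling keys: {"t":"Request","c":{...}}
        Repr::Adjacent { tag, content } => {
            format!(r#"{{"{}":"{}","{}":{{{}}}}}"#, tag, variant, content, fields)
        }
        // No variant name at all: {...}
        Repr::Untagged => format!("{{{}}}", fields),
    }
}
```

For a variant named `Request` with the single field `"id":1`, the four representations render as `{"Request":{"id":1}}`, `{"type":"Request","id":1}`, `{"t":"Request","c":{"id":1}}`, and `{"id":1}`. Because the untagged form carries no variant name, a deserializer has to try each variant in turn to find one that matches.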
The Serialize trait is defined as follows:
pub trait Serialize {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer;
}
Example:
impl Serialize for i32 {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
serializer.serialize_i32(*self)
}
}
use serde::ser::{Serialize, Serializer, SerializeStruct};
struct Color {
r: u8,
g: u8,
b: u8,
}
impl Serialize for Color {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
// 3 is the number of fields in the struct.
let mut state = serializer.serialize_struct("Color", 3)?;
state.serialize_field("r", &self.r)?;
state.serialize_field("g", &self.g)?;
state.serialize_field("b", &self.b)?;
state.end()
}
}
pub trait Deserialize<'de>: Sized {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>;
}
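A Visitor is instantiated by a Deserialize impl and passed to a Deserializer. The Deserializer then calls the Visitor method that matches what it found in the input.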
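The hand-off between the two sides can be sketched without Serde. In the std-only example below, `MiniVisitor`, `drive`, and `PortVisitor` are hypothetical names and a heavily simplified stand-in for Serde's `Visitor` and `Deserializer`: the driver inspects the input and picks the visitor method, and the visitor decides what value to build from it.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified counterpart of a visitor: the type being built
/// says what it can accept from each kind of input.
trait MiniVisitor {
    type Value;
    fn visit_i64(self, v: i64) -> Result<Self::Value, String>;
    fn visit_str(self, v: &str) -> Result<Self::Value, String>;
}

/// Hypothetical counterpart of a deserializer: looks at the input, decides
/// what kind of value it is, and calls the matching visitor method.
fn drive<V: MiniVisitor>(input: &str, visitor: V) -> Result<V::Value, String> {
    let input = input.trim();
    if let Some(inner) = input.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
        // Quoted text is handed over as a string.
        visitor.visit_str(inner)
    } else if let Ok(n) = input.parse::<i64>() {
        // Bare digits are handed over as an integer.
        visitor.visit_i64(n)
    } else {
        Err(format!("unsupported input: {}", input))
    }
}

/// A visitor that builds a port number and accepts both `8080` and `"8080"`.
struct PortVisitor;

impl MiniVisitor for PortVisitor {
    type Value = u16;

    fn visit_i64(self, v: i64) -> Result<u16, String> {
        u16::try_from(v).map_err(|_| format!("port out of range: {}", v))
    }

    fn visit_str(self, v: &str) -> Result<u16, String> {
        v.parse::<u16>()
            .map_err(|e| format!("invalid port {:?}: {}", v, e))
    }
}
```

Both `drive("8080", PortVisitor)` and `drive("\"8080\"", PortVisitor)` produce the port 8080, while `drive("70000", PortVisitor)` is rejected by the visitor, not by the driver.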
use linked_hash_map::LinkedHashMap;
use serde_test::{Token, assert_tokens};
#[test]
fn test_ser_de_empty() {
let map = LinkedHashMap::<char, u32>::new();
assert_tokens(&map, &[
Token::Map { len: Some(0) },
Token::MapEnd,
]);
}
#[test]
fn test_ser_de() {
let mut map = LinkedHashMap::new();
map.insert('b', 20);
map.insert('a', 10);
map.insert('c', 30);
assert_tokens(&map, &[
Token::Map { len: Some(3) },
Token::Char('b'),
Token::I32(20),
Token::Char('a'),
Token::I32(10),
Token::Char('c'),
Token::I32(30),
Token::MapEnd,
]);
}
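During serialization, the Serialize trait maps a Rust data structure into Serde's data model, and the Serializer trait maps the data model into the output format. During deserialization, the Deserializer maps the input data into Serde's data model, and the Deserialize and Visitor traits map the data model into the resulting data structure. Any of these steps can fail.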
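As a rough illustration of how failures from different steps can share one error type, consider the following std-only sketch. `DemoError`, `parse_int`, `to_percent`, and `percent_from_str` are hypothetical names, and the two steps are deliberately tiny: one variant is produced by the format step, the other by the data-structure step.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum DemoError {
    /// Raised by the format step: the input is not well-formed.
    Syntax,
    /// Raised by the data-structure step: well-formed input, unacceptable value.
    Message(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::Syntax => f.write_str("syntax error"),
            DemoError::Message(msg) => f.write_str(msg),
        }
    }
}

impl std::error::Error for DemoError {}

/// Step 1 (input text -> intermediate value): parse the text as an integer.
fn parse_int(input: &str) -> Result<i64, DemoError> {
    input.trim().parse::<i64>().map_err(|_| DemoError::Syntax)
}

/// Step 2 (intermediate value -> data structure): only 0..=100 is accepted.
fn to_percent(value: i64) -> Result<u8, DemoError> {
    if (0..=100).contains(&value) {
        Ok(value as u8)
    } else {
        Err(DemoError::Message(format!("{} is not a percentage", value)))
    }
}

/// Runs both steps; `?` forwards a failure from either one to the caller.
fn percent_from_str(input: &str) -> Result<u8, DemoError> {
    to_percent(parse_int(input)?)
}
```

The error type that follows has the same split: a `Message` variant for errors raised by data structures, plus format-specific variants raised by the serializer and deserializer themselves.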
use std;
use std::fmt::{self, Display};
use serde::{de, ser};
pub type Result<T> = std::result::Result<T, Error>;
// This is a bare-bones implementation. A real library would provide additional
// information in its error type, for example the line and column at which the
// error occurred, the byte offset into the input, or the current key being
// processed.
#[derive(Debug)]
pub enum Error {
// One or more variants that can be created by data structures through the
// `ser::Error` and `de::Error` traits. For example the Serialize impl for
// Mutex might return an error because the mutex is poisoned, or the
// Deserialize impl for a struct may return an error because a required
// field is missing.
Message(String),
// Zero or more variants that can be created directly by the Serializer and
// Deserializer without going through `ser::Error` and `de::Error`. These
// are specific to the format, in this case JSON.
Eof,
Syntax,
ExpectedBoolean,
ExpectedInteger,
ExpectedString,
ExpectedNull,
ExpectedArray,
ExpectedArrayComma,
ExpectedArrayEnd,
ExpectedMap,
ExpectedMapColon,
ExpectedMapComma,
ExpectedMapEnd,
ExpectedEnum,
TrailingCharacters,
}
impl ser::Error for Error {
fn custom<T: Display>(msg: T) -> Self {
Error::Message(msg.to_string())
}
}
impl de::Error for Error {
fn custom<T: Display>(msg: T) -> Self {
Error::Message(msg.to_string())
}
}
impl Display for Error {
fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
match self {
Error::Message(msg) => formatter.write_str(msg),
Error::Eof => formatter.write_str("unexpected end of input"),
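// The remaining variants fall back to their `Debug` name so that the match
// is exhaustive. A real library would give each one its own message.
other => write!(formatter, "{:?}", other),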
}
}
}
impl std::error::Error for Error {}
use serde::{ser, Serialize};
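// Assumes the error type above lives in a module declared as `mod error;`
// at the crate root.
use crate::error::{Error, Result};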
pub struct Serializer {
// This string starts empty and JSON is appended as values are serialized.
output: String,
}
// By convention, the public API of a Serde serializer is one or more `to_abc`
// functions such as `to_string`, `to_bytes`, or `to_writer` depending on what
// Rust types the serializer is able to produce as output.
//
// This basic serializer supports only `to_string`.
pub fn to_string<T>(value: &T) -> Result<String>
where
T: Serialize,
{
let mut serializer = Serializer {
output: String::new(),
};
value.serialize(&mut serializer)?;
Ok(serializer.output)
}
impl<'a> ser::Serializer for &'a mut Serializer {
// The output type produced by this `Serializer` during successful
// serialization. Most serializers that produce text or binary output should
// set `Ok = ()` and serialize into an `io::Write` or buffer contained
// within the `Serializer` instance, as happens here. Serializers that build
// in-memory data structures may be simplified by using `Ok` to propagate
// the data structure around.
type Ok = ();
// The error type when some error occurs during serialization.
type Error = Error;
// Associated types for keeping track of additional state while serializing
// compound data structures like sequences and maps. In this case no
// additional state is required beyond what is already stored in the
// Serializer struct.
type SerializeSeq = Self;
type SerializeTuple = Self;
type SerializeTupleStruct = Self;
type SerializeTupleVariant = Self;
type SerializeMap = Self;
type SerializeStruct = Self;
type SerializeStructVariant = Self;
// Here we go with the simple methods. The following 12 methods receive one
// of the primitive types of the data model and map it to JSON by appending
// into the output string.
fn serialize_bool(self, v: bool) -> Result<()> {
self.output += if v { "true" } else { "false" };
Ok(())
}
// JSON does not distinguish between different sizes of integers, so all
// signed integers will be serialized the same and all unsigned integers
// will be serialized the same. Other formats, especially compact binary
// formats, may need independent logic for the different sizes.
fn serialize_i8(self, v: i8) -> Result<()> {
self.serialize_i64(i64::from(v))
}
fn serialize_i16(self, v: i16) -> Result<()> {
self.serialize_i64(i64::from(v))
}
fn serialize_i32(self, v: i32) -> Result<()> {
self.serialize_i64(i64::from(v))
}
// Not particularly efficient but this is example code anyway. A more
// performant approach would be to use the `itoa` crate.
fn serialize_i64(self, v: i64) -> Result<()> {
self.output += &v.to_string();
Ok(())
}
fn serialize_u8(self, v: u8) -> Result<()> {
self.serialize_u64(u64::from(v))
}
fn serialize_u16(self, v: u16) -> Result<()> {
self.serialize_u64(u64::from(v))
}
fn serialize_u32(self, v: u32) -> Result<()> {
self.serialize_u64(u64::from(v))
}
fn serialize_u64(self, v: u64) -> Result<()> {
self.output += &v.to_string();
Ok(())
}
fn serialize_f32(self, v: f32) -> Result<()> {
self.serialize_f64(f64::from(v))
}
fn serialize_f64(self, v: f64) -> Result<()> {
self.output += &v.to_string();
Ok(())
}
// Serialize a char as a single-character string. Other formats may
// represent this differently.
fn serialize_char(self, v: char) -> Result<()> {
self.serialize_str(&v.to_string())
}
// This only works for strings that don't require escape sequences but you
// get the idea. For example it would emit invalid JSON if the input string
// contains a '"' character.
fn serialize_str(self, v: &str) -> Result<()> {
self.output += "\"";
self.output += v;
self.output += "\"";
Ok(())
}
// Serialize a byte array as an array of bytes. Could also use a base64
// string here. Binary formats will typically represent byte arrays more
// compactly.
fn serialize_bytes(self, v: &[u8]) -> Result<()> {
use serde::ser::SerializeSeq;
let mut seq = self.serialize_seq(Some(v.len()))?;
for byte in v {
seq.serialize_element(byte)?;
}
seq.end()
}
// An absent optional is represented as the JSON `null`.
fn serialize_none(self) -> Result<()> {
self.serialize_unit()
}
// A present optional is represented as just the contained value. Note that
// this is a lossy representation. For example the values `Some(())` and
// `None` both serialize as just `null`. Unfortunately this is typically
// what people expect when working with JSON. Other formats are encouraged
// to behave more intelligently if possible.
fn serialize_some<T>(self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
value.serialize(self)
}
// In Serde, unit means an anonymous value containing no data. Map this to
// JSON as `null`.
fn serialize_unit(self) -> Result<()> {
self.output += "null";
Ok(())
}
// Unit struct means a named value containing no data. Again, since there is
// no data, map this to JSON as `null`. There is no need to serialize the
// name in most formats.
fn serialize_unit_struct(self, _name: &'static str) -> Result<()> {
self.serialize_unit()
}
// When serializing a unit variant (or any other kind of variant), formats
// can choose whether to keep track of it by index or by name. Binary
// formats typically use the index of the variant and human-readable formats
// typically use the name.
fn serialize_unit_variant(
self,
_name: &'static str,
_variant_index: u32,
variant: &'static str,
) -> Result<()> {
self.serialize_str(variant)
}
// As is done here, serializers are encouraged to treat newtype structs as
// insignificant wrappers around the data they contain.
fn serialize_newtype_struct<T>(
self,
_name: &'static str,
value: &T,
) -> Result<()>
where
T: ?Sized + Serialize,
{
value.serialize(self)
}
// Note that newtype variant (and all of the other variant serialization
// methods) refer exclusively to the "externally tagged" enum
// representation.
//
// Serialize this to JSON in externally tagged form as `{ NAME: VALUE }`.
fn serialize_newtype_variant<T>(
self,
_name: &'static str,
_variant_index: u32,
variant: &'static str,
value: &T,
) -> Result<()>
where
T: ?Sized + Serialize,
{
self.output += "{";
variant.serialize(&mut *self)?;
self.output += ":";
value.serialize(&mut *self)?;
self.output += "}";
Ok(())
}
// Now we get to the serialization of compound types.
//
// The start of the sequence, each value, and the end are three separate
// method calls. This one is responsible only for serializing the start,
// which in JSON is `[`.
//
// The length of the sequence may or may not be known ahead of time. This
// doesn't make a difference in JSON because the length is not represented
// explicitly in the serialized form. Some serializers may only be able to
// support sequences for which the length is known up front.
fn serialize_seq(self, _len: Option<usize>) -> Result<Self::SerializeSeq> {
self.output += "[";
Ok(self)
}
// Tuples look just like sequences in JSON. Some formats may be able to
// represent tuples more efficiently by omitting the length, since tuple
// means that the corresponding `Deserialize` implementation will know the
// length without needing to look at the serialized data.
fn serialize_tuple(self, len: usize) -> Result<Self::SerializeTuple> {
self.serialize_seq(Some(len))
}
// Tuple structs look just like sequences in JSON.
fn serialize_tuple_struct(
self,
_name: &'static str,
len: usize,
) -> Result<Self::SerializeTupleStruct> {
self.serialize_seq(Some(len))
}
// Tuple variants are represented in JSON as `{ NAME: [DATA...] }`. Again
// this method is only responsible for the externally tagged representation.
fn serialize_tuple_variant(
self,
_name: &'static str,
_variant_index: u32,
variant: &'static str,
_len: usize,
) -> Result<Self::SerializeTupleVariant> {
self.output += "{";
variant.serialize(&mut *self)?;
self.output += ":[";
Ok(self)
}
// Maps are represented in JSON as `{ K: V, K: V, ... }`.
fn serialize_map(self, _len: Option<usize>) -> Result<Self::SerializeMap> {
self.output += "{";
Ok(self)
}
// Structs look just like maps in JSON. In particular, JSON requires that we
// serialize the field names of the struct. Other formats may be able to
// omit the field names when serializing structs because the corresponding
// Deserialize implementation is required to know what the keys are without
// looking at the serialized data.
fn serialize_struct(
self,
_name: &'static str,
len: usize,
) -> Result<Self::SerializeStruct> {
self.serialize_map(Some(len))
}
// Struct variants are represented in JSON as `{ NAME: { K: V, ... } }`.
// This is the externally tagged representation.
fn serialize_struct_variant(
self,
_name: &'static str,
_variant_index: u32,
variant: &'static str,
_len: usize,
) -> Result<Self::SerializeStructVariant> {
self.output += "{";
variant.serialize(&mut *self)?;
self.output += ":{";
Ok(self)
}
}
// The following 7 impls deal with the serialization of compound types like
// sequences and maps. Serialization of such types is begun by a Serializer
// method and followed by zero or more calls to serialize individual elements of
// the compound type and one call to end the compound type.
//
// This impl is SerializeSeq so these methods are called after `serialize_seq`
// is called on the Serializer.
impl<'a> ser::SerializeSeq for &'a mut Serializer {
// Must match the `Ok` type of the serializer.
type Ok = ();
// Must match the `Error` type of the serializer.
type Error = Error;
// Serialize a single element of the sequence.
fn serialize_element<T>(&mut self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('[') {
self.output += ",";
}
value.serialize(&mut **self)
}
// Close the sequence.
fn end(self) -> Result<()> {
self.output += "]";
Ok(())
}
}
// Same thing but for tuples.
impl<'a> ser::SerializeTuple for &'a mut Serializer {
type Ok = ();
type Error = Error;
fn serialize_element<T>(&mut self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('[') {
self.output += ",";
}
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "]";
Ok(())
}
}
// Same thing but for tuple structs.
impl<'a> ser::SerializeTupleStruct for &'a mut Serializer {
type Ok = ();
type Error = Error;
fn serialize_field<T>(&mut self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('[') {
self.output += ",";
}
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "]";
Ok(())
}
}
// Tuple variants are a little different. Refer back to the
// `serialize_tuple_variant` method above:
//
// self.output += "{";
// variant.serialize(&mut *self)?;
// self.output += ":[";
//
// So the `end` method in this impl is responsible for closing both the `]` and
// the `}`.
impl<'a> ser::SerializeTupleVariant for &'a mut Serializer {
type Ok = ();
type Error = Error;
fn serialize_field<T>(&mut self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('[') {
self.output += ",";
}
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "]}";
Ok(())
}
}
// Some `Serialize` types are not able to hold a key and value in memory at the
// same time so `SerializeMap` implementations are required to support
// `serialize_key` and `serialize_value` individually.
//
// There is a third optional method on the `SerializeMap` trait. The
// `serialize_entry` method allows serializers to optimize for the case where
// key and value are both available simultaneously. In JSON it doesn't make a
// difference so the default behavior for `serialize_entry` is fine.
impl<'a> ser::SerializeMap for &'a mut Serializer {
type Ok = ();
type Error = Error;
// The Serde data model allows map keys to be any serializable type. JSON
// only allows string keys so the implementation below will produce invalid
// JSON if the key serializes as something other than a string.
//
// A real JSON serializer would need to validate that map keys are strings.
// This can be done by using a different Serializer to serialize the key
// (instead of `&mut **self`) and having that other serializer only
// implement `serialize_str` and return an error on any other data type.
fn serialize_key<T>(&mut self, key: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('{') {
self.output += ",";
}
key.serialize(&mut **self)
}
// It doesn't make a difference whether the colon is printed at the end of
// `serialize_key` or at the beginning of `serialize_value`. In this case
// the code is a bit simpler having it here.
fn serialize_value<T>(&mut self, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
self.output += ":";
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "}";
Ok(())
}
}
// Structs are like maps in which the keys are constrained to be compile-time
// constant strings.
impl<'a> ser::SerializeStruct for &'a mut Serializer {
type Ok = ();
type Error = Error;
fn serialize_field<T>(&mut self, key: &'static str, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('{') {
self.output += ",";
}
key.serialize(&mut **self)?;
self.output += ":";
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "}";
Ok(())
}
}
// Similar to `SerializeTupleVariant`, here the `end` method is responsible for
// closing both of the curly braces opened by `serialize_struct_variant`.
impl<'a> ser::SerializeStructVariant for &'a mut Serializer {
type Ok = ();
type Error = Error;
fn serialize_field<T>(&mut self, key: &'static str, value: &T) -> Result<()>
where
T: ?Sized + Serialize,
{
if !self.output.ends_with('{') {
self.output += ",";
}
key.serialize(&mut **self)?;
self.output += ":";
value.serialize(&mut **self)
}
fn end(self) -> Result<()> {
self.output += "}}";
Ok(())
}
}
#[test]
fn test_struct() {
#[derive(Serialize)]
struct Test {
int: u32,
seq: Vec<&'static str>,
}
let test = Test {
int: 1,
seq: vec!["a", "b"],
};
let expected = r#"{"int":1,"seq":["a","b"]}"#;
assert_eq!(to_string(&test).unwrap(), expected);
}
#[test]
fn test_enum() {
#[derive(Serialize)]
enum E {
Unit,
Newtype(u32),
Tuple(u32, u32),
Struct { a: u32 },
}
let u = E::Unit;
let expected = r#""Unit""#;
assert_eq!(to_string(&u).unwrap(), expected);
let n = E::Newtype(1);
let expected = r#"{"Newtype":1}"#;
assert_eq!(to_string(&n).unwrap(), expected);
let t = E::Tuple(1, 2);
let expected = r#"{"Tuple":[1,2]}"#;
assert_eq!(to_string(&t).unwrap(), expected);
let s = E::Struct { a: 1 };
let expected = r#"{"Struct":{"a":1}}"#;
assert_eq!(to_string(&s).unwrap(), expected);
}
use std::ops::{AddAssign, MulAssign, Neg};
use serde::Deserialize;
use serde::de::{
self, DeserializeSeed, EnumAccess, IntoDeserializer, MapAccess, SeqAccess,
VariantAccess, Visitor,
};
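// Assumes the error type above lives in a module declared as `mod error;`
// at the crate root.
use crate::error::{Error, Result};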
pub struct Deserializer<'de> {
// This string starts with the input data and characters are truncated off
// the beginning as data is parsed.
input: &'de str,
}
impl<'de> Deserializer<'de> {
// By convention, `Deserializer` constructors are named like `from_xyz`.
// That way basic use cases are satisfied by something like
// `serde_json::from_str(...)` while advanced use cases that require a
// deserializer can make one with `serde_json::Deserializer::from_str(...)`.
pub fn from_str(input: &'de str) -> Self {
Deserializer { input }
}
}
// By convention, the public API of a Serde deserializer is one or more
// `from_xyz` methods such as `from_str`, `from_bytes`, or `from_reader`
// depending on what Rust types the deserializer is able to consume as input.
//
// This basic deserializer supports only `from_str`.
pub fn from_str<'a, T>(s: &'a str) -> Result<T>
where
T: Deserialize<'a>,
{
let mut deserializer = Deserializer::from_str(s);
let t = T::deserialize(&mut deserializer)?;
if deserializer.input.is_empty() {
Ok(t)
} else {
Err(Error::TrailingCharacters)
}
}
// SERDE IS NOT A PARSING LIBRARY. This impl block defines a few basic parsing
// functions from scratch. More complicated formats may wish to use a dedicated
// parsing library to help implement their Serde deserializer.
impl<'de> Deserializer<'de> {
// Look at the first character in the input without consuming it.
fn peek_char(&mut self) -> Result<char> {
self.input.chars().next().ok_or(Error::Eof)
}
// Consume the first character in the input.
fn next_char(&mut self) -> Result<char> {
let ch = self.peek_char()?;
self.input = &self.input[ch.len_utf8()..];
Ok(ch)
}
// Parse the JSON identifier `true` or `false`.
fn parse_bool(&mut self) -> Result<bool> {
if self.input.starts_with("true") {
self.input = &self.input["true".len()..];
Ok(true)
} else if self.input.starts_with("false") {
self.input = &self.input["false".len()..];
Ok(false)
} else {
Err(Error::ExpectedBoolean)
}
}
// Parse a group of decimal digits as an unsigned integer of type T.
//
// This implementation is a bit too lenient, for example `001` is not
// allowed in JSON. Also the various arithmetic operations can overflow and
// panic or return bogus data. But it is good enough for example code!
fn parse_unsigned<T>(&mut self) -> Result<T>
where
T: AddAssign<T> + MulAssign<T> + From<u8>,
{
let mut int = match self.next_char()? {
ch @ '0'..='9' => T::from(ch as u8 - b'0'),
_ => {
return Err(Error::ExpectedInteger);
}
};
loop {
match self.input.chars().next() {
Some(ch @ '0'..='9') => {
self.input = &self.input[1..];
int *= T::from(10);
int += T::from(ch as u8 - b'0');
}
_ => {
return Ok(int);
}
}
}
}
// Parse a possible minus sign followed by a group of decimal digits as a
// signed integer of type T.
fn parse_signed<T>(&mut self) -> Result<T>
where
T: Neg<Output = T> + AddAssign<T> + MulAssign<T> + From<i8>,
{
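// Optional minus sign, then decimal digits, negated at the end if needed.
// `T` is only known to be `From<i8>`, so the digits are accumulated here
// instead of delegating to `parse_unsigned`, which requires `From<u8>`.
// Like `parse_unsigned`, this can overflow. In particular the minimum value
// of `T` (for example `-128` as `i8`) cannot be parsed this way.
let negative = self.input.starts_with('-');
if negative {
self.input = &self.input[1..];
}
let mut int = match self.next_char()? {
ch @ '0'..='9' => T::from((ch as u8 - b'0') as i8),
_ => {
return Err(Error::ExpectedInteger);
}
};
loop {
match self.input.chars().next() {
Some(ch @ '0'..='9') => {
self.input = &self.input[1..];
int *= T::from(10);
int += T::from((ch as u8 - b'0') as i8);
}
_ => {
return Ok(if negative { -int } else { int });
}
}
}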
}
// Parse a string until the next '"' character.
//
// Makes no attempt to handle escape sequences. What did you expect? This is
// example code!
fn parse_string(&mut self) -> Result<&'de str> {
if self.next_char()? != '"' {
return Err(Error::ExpectedString);
}
match self.input.find('"') {
Some(len) => {
let s = &self.input[..len];
self.input = &self.input[len + 1..];
Ok(s)
}
None => Err(Error::Eof),
}
}
}
impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
type Error = Error;
// Look at the input data to decide what Serde data model type to
// deserialize as. Not all data formats are able to support this operation.
// Formats that support `deserialize_any` are known as self-describing.
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
match self.peek_char()? {
'n' => self.deserialize_unit(visitor),
't' | 'f' => self.deserialize_bool(visitor),
'"' => self.deserialize_str(visitor),
'0'..='9' => self.deserialize_u64(visitor),
'-' => self.deserialize_i64(visitor),
'[' => self.deserialize_seq(visitor),
'{' => self.deserialize_map(visitor),
_ => Err(Error::Syntax),
}
}
// Uses the `parse_bool` parsing function defined above to read the JSON
// identifier `true` or `false` from the input.
//
// Parsing refers to looking at the input and deciding that it contains the
// JSON value `true` or `false`.
//
// Deserialization refers to mapping that JSON value into Serde's data
// model by invoking one of the `Visitor` methods. In the case of JSON and
// bool that mapping is straightforward so the distinction may seem silly,
// but in other cases Deserializers sometimes perform non-obvious mappings.
// For example the TOML format has a Datetime type and Serde's data model
// does not. In the `toml` crate, a Datetime in the input is deserialized by
// mapping it to a Serde data model "struct" type with a special name and a
// single field containing the Datetime represented as a string.
fn deserialize_bool<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_bool(self.parse_bool()?)
}
// The `parse_signed` function is generic over the integer type `T` so here
// it is invoked with `T=i8`. The next 8 methods are similar.
fn deserialize_i8<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_i8(self.parse_signed()?)
}
fn deserialize_i16<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_i16(self.parse_signed()?)
}
fn deserialize_i32<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_i32(self.parse_signed()?)
}
fn deserialize_i64<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_i64(self.parse_signed()?)
}
fn deserialize_u8<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_u8(self.parse_unsigned()?)
}
fn deserialize_u16<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_u16(self.parse_unsigned()?)
}
fn deserialize_u32<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_u32(self.parse_unsigned()?)
}
fn deserialize_u64<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_u64(self.parse_unsigned()?)
}
// Float parsing is stupidly hard.
fn deserialize_f32<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
unimplemented!()
}
// Float parsing is stupidly hard.
fn deserialize_f64<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
unimplemented!()
}
// The `Serializer` implementation above serialized chars as
// single-character strings so handle that representation here.
fn deserialize_char<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
// Parse a string, check that it is one character, call `visit_char`.
unimplemented!()
}
// Refer to the "Understanding deserializer lifetimes" page of the Serde
// documentation for information
// about the three deserialization flavors of strings in Serde.
fn deserialize_str<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_borrowed_str(self.parse_string()?)
}
fn deserialize_string<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_str(visitor)
}
// The `Serializer` implementation above serialized byte
// arrays as JSON arrays of bytes. Handle that representation here.
fn deserialize_bytes<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
unimplemented!()
}
fn deserialize_byte_buf<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
unimplemented!()
}
// An absent optional is represented as the JSON `null` and a present
// optional is represented as just the contained value.
//
// As commented in `Serializer` implementation, this is a lossy
// representation. For example the values `Some(())` and `None` both
// serialize as just `null`. Unfortunately this is typically what people
// expect when working with JSON. Other formats are encouraged to behave
// more intelligently if possible.
fn deserialize_option<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
if self.input.starts_with("null") {
self.input = &self.input["null".len()..];
visitor.visit_none()
} else {
visitor.visit_some(self)
}
}
// In Serde, unit means an anonymous value containing no data.
fn deserialize_unit<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
if self.input.starts_with("null") {
self.input = &self.input["null".len()..];
visitor.visit_unit()
} else {
Err(Error::ExpectedNull)
}
}
// Unit struct means a named value containing no data.
fn deserialize_unit_struct<V>(
self,
_name: &'static str,
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_unit(visitor)
}
// As is done here, deserializers are encouraged to treat newtype structs as
// insignificant wrappers around the data they contain. That means not
// parsing anything other than the contained value.
fn deserialize_newtype_struct<V>(
self,
_name: &'static str,
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_newtype_struct(self)
}
// Deserialization of compound types like sequences and maps happens by
// passing the visitor an "Access" object that gives it the ability to
// iterate through the data contained in the sequence.
fn deserialize_seq<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
// Parse the opening bracket of the sequence.
if self.next_char()? == '[' {
// Give the visitor access to each element of the sequence.
let value = visitor.visit_seq(CommaSeparated::new(self))?;
// Parse the closing bracket of the sequence.
if self.next_char()? == ']' {
Ok(value)
} else {
Err(Error::ExpectedArrayEnd)
}
} else {
Err(Error::ExpectedArray)
}
}
// Tuples look just like sequences in JSON. Some formats may be able to
// represent tuples more efficiently.
//
// As indicated by the length parameter, the `Deserialize` implementation
// for a tuple in the Serde data model is required to know the length of the
// tuple before even looking at the input data.
fn deserialize_tuple<V>(self, _len: usize, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
// Tuple structs look just like sequences in JSON.
fn deserialize_tuple_struct<V>(
self,
_name: &'static str,
_len: usize,
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
// Much like `deserialize_seq` but calls the visitors `visit_map` method
// with a `MapAccess` implementation, rather than the visitor's `visit_seq`
// method with a `SeqAccess` implementation.
fn deserialize_map<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
// Parse the opening brace of the map.
if self.next_char()? == '{' {
// Give the visitor access to each entry of the map.
let value = visitor.visit_map(CommaSeparated::new(self))?;
// Parse the closing brace of the map.
if self.next_char()? == '}' {
Ok(value)
} else {
Err(Error::ExpectedMapEnd)
}
} else {
Err(Error::ExpectedMap)
}
}
// Structs look just like maps in JSON.
//
// Notice the `fields` parameter - a "struct" in the Serde data model means
// that the `Deserialize` implementation is required to know what the fields
// are before even looking at the input data. Any key-value pairing in which
// the fields cannot be known ahead of time is probably a map.
fn deserialize_struct<V>(
self,
_name: &'static str,
_fields: &'static [&'static str],
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_map(visitor)
}

    fn deserialize_enum<V>(
        self,
        _name: &'static str,
        _variants: &'static [&'static str],
        visitor: V,
    ) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        if self.peek_char()? == '"' {
            // Visit a unit variant.
            visitor.visit_enum(self.parse_string()?.into_deserializer())
        } else if self.next_char()? == '{' {
            // Visit a newtype variant, tuple variant, or struct variant.
            let value = visitor.visit_enum(Enum::new(self))?;
            // Parse the matching close brace.
            if self.next_char()? == '}' {
                Ok(value)
            } else {
                Err(Error::ExpectedMapEnd)
            }
        } else {
            Err(Error::ExpectedEnum)
        }
    }

    // An identifier in Serde is the type that identifies a field of a struct or
    // the variant of an enum. In JSON, struct fields and enum variants are
    // represented as strings. In other formats they may be represented as
    // numeric indices.
    fn deserialize_identifier<V>(self, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        self.deserialize_str(visitor)
    }

    // Like `deserialize_any` but indicates to the `Deserializer` that it makes
    // no difference which `Visitor` method is called because the data is
    // ignored.
    //
    // Some deserializers are able to implement this more efficiently than
    // `deserialize_any`, for example by rapidly skipping over matched
    // delimiters without paying close attention to the data in between.
    //
    // Some formats are not able to implement this at all. Formats that can
    // implement `deserialize_any` and `deserialize_ignored_any` are known as
    // self-describing.
    fn deserialize_ignored_any<V>(self, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        self.deserialize_any(visitor)
    }
}

// In order to handle commas correctly when deserializing a JSON array or map,
// we need to track whether we are on the first element or past the first
// element.
struct CommaSeparated<'a, 'de: 'a> {
    de: &'a mut Deserializer<'de>,
    first: bool,
}

impl<'a, 'de> CommaSeparated<'a, 'de> {
    fn new(de: &'a mut Deserializer<'de>) -> Self {
        CommaSeparated {
            de,
            first: true,
        }
    }
}

// `SeqAccess` is provided to the `Visitor` to give it the ability to iterate
// through elements of the sequence.
impl<'de, 'a> SeqAccess<'de> for CommaSeparated<'a, 'de> {
    type Error = Error;

    fn next_element_seed<T>(&mut self, seed: T) -> Result<Option<T::Value>>
    where
        T: DeserializeSeed<'de>,
    {
        // Check if there are no more elements.
        if self.de.peek_char()? == ']' {
            return Ok(None);
        }
        // Comma is required before every element except the first.
        if !self.first && self.de.next_char()? != ',' {
            return Err(Error::ExpectedArrayComma);
        }
        self.first = false;
        // Deserialize an array element.
        seed.deserialize(&mut *self.de).map(Some)
    }
}

// `MapAccess` is provided to the `Visitor` to give it the ability to iterate
// through entries of the map.
impl<'de, 'a> MapAccess<'de> for CommaSeparated<'a, 'de> {
    type Error = Error;

    fn next_key_seed<K>(&mut self, seed: K) -> Result<Option<K::Value>>
    where
        K: DeserializeSeed<'de>,
    {
        // Check if there are no more entries.
        if self.de.peek_char()? == '}' {
            return Ok(None);
        }
        // Comma is required before every entry except the first.
        if !self.first && self.de.next_char()? != ',' {
            return Err(Error::ExpectedMapComma);
        }
        self.first = false;
        // Deserialize a map key.
        seed.deserialize(&mut *self.de).map(Some)
    }

    fn next_value_seed<V>(&mut self, seed: V) -> Result<V::Value>
    where
        V: DeserializeSeed<'de>,
    {
        // It doesn't make a difference whether the colon is parsed at the end
        // of `next_key_seed` or at the beginning of `next_value_seed`. In this
        // case the code is a bit simpler having it here.
        if self.de.next_char()? != ':' {
            return Err(Error::ExpectedMapColon);
        }
        // Deserialize a map value.
        seed.deserialize(&mut *self.de)
    }
}

struct Enum<'a, 'de: 'a> {
    de: &'a mut Deserializer<'de>,
}

impl<'a, 'de> Enum<'a, 'de> {
    fn new(de: &'a mut Deserializer<'de>) -> Self {
        Enum { de }
    }
}

// `EnumAccess` is provided to the `Visitor` to give it the ability to determine
// which variant of the enum is supposed to be deserialized.
//
// Note that all enum deserialization methods in Serde refer exclusively to the
// "externally tagged" enum representation.
impl<'de, 'a> EnumAccess<'de> for Enum<'a, 'de> {
    type Error = Error;
    type Variant = Self;

    fn variant_seed<V>(self, seed: V) -> Result<(V::Value, Self::Variant)>
    where
        V: DeserializeSeed<'de>,
    {
        // The `deserialize_enum` method parsed a `{` character so we are
        // currently inside of a map. The seed will be deserializing itself from
        // the key of the map.
        let val = seed.deserialize(&mut *self.de)?;
        // Parse the colon separating map key from value.
        if self.de.next_char()? == ':' {
            Ok((val, self))
        } else {
            Err(Error::ExpectedMapColon)
        }
    }
}

// `VariantAccess` is provided to the `Visitor` to give it the ability to see
// the content of the single variant that it decided to deserialize.
impl<'de, 'a> VariantAccess<'de> for Enum<'a, 'de> {
    type Error = Error;

    // If the `Visitor` expected this variant to be a unit variant, the input
    // should have been the plain string case handled in `deserialize_enum`.
    fn unit_variant(self) -> Result<()> {
        Err(Error::ExpectedString)
    }

    // Newtype variants are represented in JSON as `{ NAME: VALUE }` so
    // deserialize the value here.
    fn newtype_variant_seed<T>(self, seed: T) -> Result<T::Value>
    where
        T: DeserializeSeed<'de>,
    {
        seed.deserialize(self.de)
    }

    // Tuple variants are represented in JSON as `{ NAME: [DATA...] }` so
    // deserialize the sequence of data here.
    fn tuple_variant<V>(self, _len: usize, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        de::Deserializer::deserialize_seq(self.de, visitor)
    }

    // Struct variants are represented in JSON as `{ NAME: { K: V, ... } }` so
    // deserialize the inner map here.
    fn struct_variant<V>(
        self,
        _fields: &'static [&'static str],
        visitor: V,
    ) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        de::Deserializer::deserialize_map(self.de, visitor)
    }
}

#[test]
fn test_struct() {
    #[derive(Deserialize, PartialEq, Debug)]
    struct Test {
        int: u32,
        seq: Vec<String>,
    }

    let j = r#"{"int":1,"seq":["a","b"]}"#;
    let expected = Test {
        int: 1,
        seq: vec!["a".to_owned(), "b".to_owned()],
    };
    assert_eq!(expected, from_str(j).unwrap());
}

#[test]
fn test_enum() {
    #[derive(Deserialize, PartialEq, Debug)]
    enum E {
        Unit,
        Newtype(u32),
        Tuple(u32, u32),
        Struct { a: u32 },
    }

    let j = r#""Unit""#;
    let expected = E::Unit;
    assert_eq!(expected, from_str(j).unwrap());

    let j = r#"{"Newtype":1}"#;
    let expected = E::Newtype(1);
    assert_eq!(expected, from_str(j).unwrap());

    let j = r#"{"Tuple":[1,2]}"#;
    let expected = E::Tuple(1, 2);
    assert_eq!(expected, from_str(j).unwrap());

    let j = r#"{"Struct":{"a":1}}"#;
    let expected = E::Struct { a: 1 };
    assert_eq!(expected, from_str(j).unwrap());
}
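
The comma handling in CommaSeparated is easier to see in isolation. The following is a minimal sketch using only the standard library, not part of the deserializer above: parse_int_list and ListError are hypothetical names invented for this illustration, and the input is assumed to be a bracketed list of unsigned integers with no whitespace. It applies the same rule as next_element_seed: check for the closing bracket first, then require a comma before every element except the first.

```rust
/// Hypothetical error type for this sketch (not the `Error` enum used above).
#[derive(Debug, PartialEq)]
pub enum ListError {
    ExpectedArray,
    ExpectedArrayComma,
    ExpectedArrayEnd,
    ExpectedInteger,
    TrailingCharacters,
}

/// Hypothetical helper: parses input such as "[1,2,3]" into a vector,
/// tracking a `first` flag the same way `CommaSeparated` does.
pub fn parse_int_list(input: &str) -> Result<Vec<u32>, ListError> {
    // Parse the opening bracket of the list.
    let mut rest = input.strip_prefix('[').ok_or(ListError::ExpectedArray)?;
    let mut items = Vec::new();
    let mut first = true;

    loop {
        // Check if there are no more elements.
        if rest.starts_with(']') {
            rest = &rest[1..];
            break;
        }
        // Input ended before the closing bracket was seen.
        if rest.is_empty() {
            return Err(ListError::ExpectedArrayEnd);
        }
        // Comma is required before every element except the first.
        if !first {
            rest = rest
                .strip_prefix(',')
                .ok_or(ListError::ExpectedArrayComma)?;
        }
        first = false;

        // Parse one unsigned integer made of ASCII digits.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            return Err(ListError::ExpectedInteger);
        }
        let value: u32 = rest[..digits]
            .parse()
            .map_err(|_| ListError::ExpectedInteger)?;
        items.push(value);
        rest = &rest[digits..];
    }

    // Nothing may follow the closing bracket.
    if rest.is_empty() {
        Ok(items)
    } else {
        Err(ListError::TrailingCharacters)
    }
}
```

With this sketch, parse_int_list("[1,2,3]") yields the three integers, while "[1 2]" is rejected with ExpectedArrayComma because the second element is not preceded by a comma.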
Both the Deserialize and Deserializer traits carry a lifetime parameter, 'de.
trait Deserialize<'de>: Sized {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>;
}
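The 'de lifetime is what lets Serde safely perform efficient zero-copy deserialization across a variety of data formats, something that is impossible or unsafe in languages other than Rust.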
#[derive(Deserialize)]
struct User<'a> {
    id: u32,
    name: &'a str,
    screen_name: &'a str,
    location: &'a str,
}
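Zero-copy deserialization means deserializing into a data structure, such as the User struct above, that borrows its string or byte-array data from the string or byte array holding the input.
This avoids allocating memory to store a string for each individual field and then copying the string data from the input into the newly allocated field.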
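
To make the borrowing concrete without pulling in Serde itself, here is a minimal standard-library sketch. The pipe-separated "id|name|location" input format, the parse_user function, and the borrows_from helper are all assumptions made up for this illustration; the point is only that each &str field is a slice of the input buffer rather than a fresh allocation.

```rust
/// Hypothetical record whose string fields borrow from the input buffer.
#[derive(Debug, PartialEq)]
pub struct User<'a> {
    pub id: u32,
    pub name: &'a str,
    pub location: &'a str,
}

/// Parses "id|name|location" without allocating any strings: every `&str`
/// field in the result is a subslice of `input`.
pub fn parse_user<'a>(input: &'a str) -> Option<User<'a>> {
    let mut parts = input.splitn(3, '|');
    let id = parts.next()?.parse().ok()?;
    let name = parts.next()?;
    let location = parts.next()?;
    Some(User { id, name, location })
}

/// Returns true when `field` points into the memory occupied by `input`,
/// which is what "borrowed from the input" means at the pointer level.
pub fn borrows_from(input: &str, field: &str) -> bool {
    let start = input.as_ptr() as usize;
    let end = start + input.len();
    let field_start = field.as_ptr() as usize;
    field_start >= start && field_start + field.len() <= end
}
```

Because User<'a> holds references into the input, the compiler refuses to let the input buffer be dropped or mutated while a User built from it is still alive. That compile-time check is what makes this kind of borrowing safe.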
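There are two main ways to write a Deserialize trait bound: T: Deserialize<'de>, where the value may borrow from the input, and T: DeserializeOwned, where it may not.
During deserialization, string data reaches the Visitor in one of three forms, each through its own method:
visit_str accepts a &str (transient data).
visit_borrowed_str accepts a &'de str (data borrowed from the input).
visit_string accepts a String (owned data).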
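
The difference between the three methods can be sketched with a hand-rolled trait. StrVisitor below is a simplified stand-in written for this illustration, not Serde's actual Visitor trait: it returns plain values instead of Result, and BorrowVisitor and OwnVisitor are hypothetical names. The default bodies forward visit_borrowed_str and visit_string to visit_str, which mirrors how Serde's defaults behave.

```rust
/// Simplified stand-in for the string-related part of a visitor.
/// This is NOT `serde::de::Visitor`; it only models the three entry points.
pub trait StrVisitor<'de>: Sized {
    type Value;

    /// Transient data: the slice is only valid during this call.
    fn visit_str(self, v: &str) -> Self::Value;

    /// Borrowed data: the slice lives as long as the input (`'de`).
    /// By default it is treated like transient data.
    fn visit_borrowed_str(self, v: &'de str) -> Self::Value {
        self.visit_str(v)
    }

    /// Owned data: the visitor receives the allocation itself.
    /// By default it is treated like transient data.
    fn visit_string(self, v: String) -> Self::Value {
        self.visit_str(&v)
    }
}

/// Hypothetical visitor that wants a zero-copy `&'de str`.
pub struct BorrowVisitor;

impl<'de> StrVisitor<'de> for BorrowVisitor {
    type Value = Option<&'de str>;

    // Transient data cannot be kept for `'de`, so borrowing fails.
    fn visit_str(self, _v: &str) -> Self::Value {
        None
    }

    // Only data borrowed from the input can be returned as-is.
    fn visit_borrowed_str(self, v: &'de str) -> Self::Value {
        Some(v)
    }
}

/// Hypothetical visitor that wants an owned `String`.
pub struct OwnVisitor;

impl<'de> StrVisitor<'de> for OwnVisitor {
    type Value = String;

    // Transient (and, via the default, borrowed) data must be copied.
    fn visit_str(self, v: &str) -> String {
        v.to_owned()
    }

    // Owned data is taken over directly, with no extra copy.
    fn visit_string(self, v: String) -> String {
        v
    }
}
```

In this sketch a visitor producing a String accepts all three forms, while a visitor producing a &'de str succeeds only when the data arrives through visit_borrowed_str. This is the reason a borrowed field can only be filled by a deserializer that hands out data borrowed from its input.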