Reference: the pickle module
According to the official documentation, this module lets you "create portable serialized representations of Python objects."
There are fundamental differences between the pickle protocols and JSON (JavaScript Object Notation):
JSON is a text serialization format (it outputs unicode text, although most of the time it is then encoded to utf-8), while pickle is a binary serialization format;
JSON is human-readable, while pickle is not;
JSON is interoperable and widely used outside of the Python ecosystem, while pickle is Python-specific;
JSON, by default, can only represent a subset of the Python built-in types, and no custom classes; pickle can represent an extremely large number of Python types (many of them automatically, by clever usage of Python’s introspection facilities; complex cases can be tackled by implementing specific object APIs).
1. A small example
import pickle
import json
if __name__ == '__main__':
    d1 = dict(zip('frank', range(5)))
    print(d1)
    json_str = json.dumps(d1)
    pickle_str = pickle.dumps(d1)
    print(f'json_str: {json_str}')
    print(f'pickle_str: {pickle_str}')
The output is:
{'f': 0, 'r': 1, 'a': 2, 'n': 3, 'k': 4}
json_str: {"f": 0, "r": 1, "a": 2, "n": 3, "k": 4}
pickle_str: b'\x80\x03}q\x00(X\x01\x00\x00\x00fq\x01K\x00X\x01\x00\x00\x00rq\x02K\x01X\x01\x00\x00\x00aq\x03K\x02X\x01\x00\x00\x00nq\x04K\x03X\x01\x00\x00\x00kq\x05K\x04u.'
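As you can see, the JSON output is human-readable, while the pickle output is not, because it is a binary format. (The leading \x80\x03 shows that this output was produced with pickle protocol 3; other protocol versions produce different bytes.)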
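To make the difference concrete, here is a minimal sketch that round-trips data through both modules. The names `json_text`, `pickle_bytes` and `mixed` are illustrative and not part of the original example.

```python
import json
import pickle

d1 = dict(zip('frank', range(5)))

json_text = json.dumps(d1)        # str: text that can be read or edited by hand
pickle_bytes = pickle.dumps(d1)   # bytes: an opaque binary stream

# Both formats restore this simple dict unchanged.
assert json.loads(json_text) == d1
assert pickle.loads(pickle_bytes) == d1

# JSON object keys are always strings, so integer keys come back as str.
mixed = {1: 'one', 2: 'two'}
json_restored = json.loads(json.dumps(mixed))        # keys become '1' and '2'
pickle_restored = pickle.loads(pickle.dumps(mixed))  # keys stay 1 and 2

print(type(json_text).__name__, type(pickle_bytes).__name__)
print(json_restored)
print(pickle_restored)
```

The integer-key case is a small instance of the point quoted above: JSON covers only a subset of Python's built-in types, while pickle preserves them as they are.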
Here is another example.
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
"""
@author: Frank
@contact: [email protected]
@file: test_pickle.py
@time: 2018/7/22 9:06 AM
"""
import pickle
import json


class Person:
    __tablename__ = 'person'
    table_flag = 'online'

    def __init__(self, name):
        self.name = name

    @classmethod
    def pickup(cls, *args, **kwargs):
        print('pickup() is running.')
        kwargs.update({"name": "frank", "hobby": "swim"})
        return kwargs


def test_pickle():
    # Serialize the class itself (not an instance)
    p1 = pickle.dumps(Person)
    # Deserialize
    P1 = pickle.loads(p1)
    # Print the Person class
    print(P1)
    p2 = P1('frank')
    print(p2.pickup())


def test_json():
    # json.dumps raises TypeError here, so the lines after it never run
    p1 = json.dumps(Person)
    P1 = json.loads(p1)
    print(P1)
    p2 = P1('frank')
    print(p2.pickup())


if __name__ == '__main__':
    # test_pickle()
    test_json()
Running it raises the following error:
TypeError: Object of type 'type' is not JSON serializable
In other words, a class (an object of type 'type') cannot be serialized to JSON.
Calling test_pickle() instead works.
The output is:
<class '__main__.Person'>
pickup() is running.
{'name': 'frank', 'hobby': 'swim'}
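The same limitation applies to many ordinary objects, not only to classes. As an illustrative sketch (the `datetime.date` value is just a convenient standard-library type and is not part of the original example), JSON rejects it while pickle round-trips it:

```python
import datetime
import json
import pickle

day = datetime.date(2018, 7, 22)

# json.dumps has no built-in encoding for date objects.
try:
    json.dumps(day)
    json_ok = True
except TypeError as exc:
    json_ok = False
    print(f'json failed: {exc}')

# pickle restores an equal date object.
restored = pickle.loads(pickle.dumps(day))
print(restored == day, type(restored).__name__)
```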
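##### 3. Common API overview
The module provides the usual functions for serialization and deserialization.
dumps and dump: the former returns a bytes object; the latter writes the serialized data directly to a file.
loads and load: the former reads an object from a bytes object; the latter reads an object from a file.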
pickle.dump(obj, file, protocol=None, *, fix_imports=True)
pickle.dumps(obj, protocol=None, *, fix_imports=True)
pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
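The four functions can be sketched side by side as follows. This is a minimal illustration; the sample `data` dict and the file name `data.pkl` are made up for the example.

```python
import os
import pickle
import tempfile

data = {'name': 'frank', 'hobbies': ['swim', 'read'], 'scores': (90, 85)}

# dumps/loads work on bytes objects held in memory.
payload = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
from_bytes = pickle.loads(payload)

# dump/load work on file objects opened in binary mode.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pkl')
    with open(path, 'wb') as f:
        pickle.dump(data, f)
    with open(path, 'rb') as f:
        from_file = pickle.load(f)

print(from_bytes == data, from_file == data)
```

Leaving `protocol` at its default of None selects the default protocol of the running Python version; `pickle.HIGHEST_PROTOCOL` asks for the newest one that version supports.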
The pickle module also exports two classes, Pickler and Unpickler. If you need more control over serialization and deserialization, you can create a Pickler or an Unpickler object, respectively.
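One common reason to reach for these classes is to limit which classes an unpickler is allowed to load, since unpickling data from an untrusted source can execute arbitrary code. The sketch below follows the approach of overriding `Unpickler.find_class`; the `ALLOWED` set, `RestrictedUnpickler` and `restricted_loads` are hypothetical names chosen for this illustration.

```python
import collections
import io
import pickle

# Hypothetical allow-list for this sketch: (module, name) pairs we accept.
ALLOWED = {('collections', 'OrderedDict')}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals named in ALLOWED."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f'global {module}.{name} is forbidden')


def restricted_loads(data):
    """Like pickle.loads, but using RestrictedUnpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Pickler writes to any binary file-like object and lets us choose the protocol.
buffer = io.BytesIO()
pickle.Pickler(buffer, protocol=4).dump(collections.OrderedDict(a=1, b=2))
allowed_result = restricted_loads(buffer.getvalue())

# Counter is not on the allow-list, so loading it is refused.
try:
    restricted_loads(pickle.dumps(collections.Counter('frank')))
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print(f'blocked: {exc}')

print(allowed_result)
```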
Exceptions defined by the pickle module
The pickle module defines three exceptions:
exception pickle.PickleError
exception pickle.PicklingError
exception pickle.UnpicklingError
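As a rough sketch of how these exceptions show up in practice: `BadReduce`, `try_dumps` and `try_loads` below are hypothetical helpers written for this illustration. Note that unpickling can also raise other exceptions, such as AttributeError, EOFError, ImportError and IndexError, which this sketch does not catch.

```python
import pickle


class BadReduce:
    """Hypothetical class whose __reduce__ returns an invalid value."""

    def __reduce__(self):
        return 42  # must be a string or a tuple, so pickling fails


def try_dumps(obj):
    """Return the pickled bytes, or None if pickling fails."""
    try:
        return pickle.dumps(obj)
    except pickle.PicklingError as exc:
        print(f'cannot pickle: {exc}')
        return None


def try_loads(data):
    """Return the unpickled object, or None if the data is not a valid pickle."""
    try:
        return pickle.loads(data)
    except pickle.UnpicklingError as exc:
        print(f'cannot unpickle: {exc}')
        return None


dump_result = try_dumps(BadReduce())
load_result = try_loads(b'not a pickle')
good_result = try_loads(pickle.dumps([1, 2, 3]))
```

Both PicklingError and UnpicklingError inherit from PickleError, so catching `pickle.PickleError` handles either one.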
Let's look at an example.
import pickle


class Person:
    __tablename__ = 'person'
    table_flag = 'online'

    def __init__(self, name):
        self.name = name

    @classmethod
    def pickup(cls, *args, **kwargs):
        print('pickup() is running.')
        # Return the keyword arguments, updated with two fixed fields.
        kwargs.update({"name": "frank", "hobby": "swim"})
        return kwargs

    @classmethod
    def extract(cls, value='frank'):
        """Get the data needed for writing to the database.

        :param value: return value of pickup()
        :return: value, unchanged
        """
        print('extract() is running.')
        return value


class Serialization:
    def __init__(self, obj):
        self.myclass = obj

    def serialize(self):
        # pickle data is binary, so the file is opened in binary mode
        with open('pickle.txt', 'wb') as f:
            # Serialize to the file
            pickle.dump(self.myclass, f)

    def deserialize(self):
        # Deserialize from the file
        with open('pickle.txt', 'rb') as f:
            data = pickle.load(f)
        return data


if __name__ == '__main__':
    ser = Serialization(Person)
    ser.serialize()
    person = ser.deserialize()
    print(f'person.table_flag: {person.table_flag}')
    print(person.pickup())
    print(person.extract())
The output is:
person.table_flag: online
pickup() is running.
{'name': 'frank', 'hobby': 'swim'}
extract() is running.
frank
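This example serializes the class to a file and then reads it back from that file.
So what is this module actually useful for? Suppose a system needs to load classes dynamically (by which I mean the classes are generated by code and then have to be loaded into memory).
One day I started to worry: if the program crashes because of an unexpected bug or for some other reason, the classes loaded so far are lost, and after a restart every dynamically generated class is gone. The pickle module offers a convenient way to handle this: a class can be pickled to a file or to a bytes object, and after a restart it can be unpickled back into memory. One caveat: pickle stores a class by reference (its module and qualified name), not its code, so unpickling only works if that module still defines the class when the data is loaded.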
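The by-reference behaviour can be checked with a short sketch. It uses `collections.OrderedDict` as a stand-in for an importable class, and `GeneratedPerson` is a hypothetical class built at run time with `type()`.

```python
import collections
import pickle

# Pickling a class stores only its module and name, not its code.
payload = pickle.dumps(collections.OrderedDict)
print(payload)
same_object = pickle.loads(payload) is collections.OrderedDict

# A class generated at run time; 'GeneratedPerson' is a hypothetical name.
# It is never bound to a module-level name, so pickle cannot look it up.
generated = type('GeneratedPerson', (), {'table_flag': 'online'})

try:
    pickle.dumps(generated)
    generated_pickled = True
except pickle.PicklingError as exc:
    generated_pickled = False
    print(f'cannot pickle generated class: {exc}')
```

For classes that really are generated by code, this suggests persisting whatever is needed to regenerate them (for example, the generating parameters or source) and making sure they are bound to an importable name before unpickling.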
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
"""
@author: Frank
@contact: [email protected]
@file: serialization.py
@time: 2018/7/22 12:18 AM
"""
import pickle


class Person:
    __tablename__ = 'person'
    table_flag = 'online'

    def __init__(self, name):
        self.name = name

    @classmethod
    def pickup(cls, *args, **kwargs):
        print('pickup() is running.')
        # Return the keyword arguments, updated with two fixed fields.
        kwargs.update({"name": "frank", "hobby": "swim"})
        return kwargs

    @classmethod
    def extract(cls, value='frank'):
        """
        :param value: return value of pickup()
        :return: value, unchanged
        """
        print('extract() is running.')
        return value


class Serialization:
    def __init__(self):
        self.myclasses = []

    def register(self, obj):
        self.myclasses.append(obj)

    def serialize(self):
        # Serialize every registered class to a bytes object
        pickle_strings = []
        for myclass in self.myclasses:
            pickle_string = pickle.dumps(myclass)
            pickle_strings.append(pickle_string)
        return pickle_strings

    @staticmethod
    def deserialize(bytes_object):
        # Deserialize from a bytes object
        return pickle.loads(bytes_object)


if __name__ == '__main__':
    serialization = Serialization()
    serialization.register(Person)
    # Serialize the Person class
    strings = serialization.serialize()
    # Print the serialized result
    print(strings)
    for bytes_obj in strings:
        # Deserialize to get the Person class back
        P = serialization.deserialize(bytes_obj)
        print(P)
        # Construct the p1 instance
        p1 = P('frank')
        print(p1.pickup())
The output is:
[b'\x80\x03c__main__\nPerson\nq\x00.']
<class '__main__.Person'>
pickup() is running.
{'name': 'frank', 'hobby': 'swim'}
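The Serialization class pickles the Person class and then unpickles it again, completing the round trip. Note that the pickled bytes contain only the reference __main__ and Person, not the body of the class.
This article briefly introduced common uses of the pickle module and its main API, and compared it with the json module. If you need customized serialization, you can use the two classes the module provides, Pickler and Unpickler. They are needed less often, so see the official documentation for details.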
https://docs.python.org/3/library/pickle.html