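I wrote a test script that saves the same object with different pickle protocols, with and without gzip. gzip + protocol 3 produced the smallest file, plain protocol 3 came next, and plain protocol 2 was the largest.
The test code: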
import pickle, gzip
import numpy as np

x = np.random.randn(16, 224, 224)
# print(x)
# Define an object to pickle
data = {
    'name': 'Alice',
    'age': 30,
    'cities_visited': ['New York', 'Paris', 'Tokyo'],
    'datax': x,
}
# Pickle with protocol 2
with open('compressed_pickle2.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=2)
# Pickle with protocol 3
with open('compressed_pickle3.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=3)
# Pickle with protocol 0
with open('compressed_pickle0.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=0)
# Pickle with gzip and protocol 2
with gzip.open('compressed_picklezip2.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=2)
# Pickle with gzip and protocol 3
with gzip.open('compressed_picklezip3.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=3)
# Load the pickled data back from a file
with open('compressed_pickle2.pkl', 'rb') as f:
    loaded_data = pickle.load(f)
# print(loaded_data)
print(len(loaded_data), "ok")
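The script above only reads back the uncompressed protocol 2 file. Reading a gzip-compressed pickle works the same way, just with gzip.open in place of open. A minimal sketch, assuming nothing beyond the standard library; the helper names save_gzip_pickle and load_gzip_pickle, the file name, and the small record are made up for illustration:

```python
import gzip
import os
import pickle
import tempfile


def save_gzip_pickle(obj, path, protocol=3):
    """Pickle obj into a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)


def load_gzip_pickle(path):
    """Load an object from a gzip-compressed pickle file (hypothetical helper)."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Round trip through a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_zip3.pkl")
    record = {"name": "Alice", "age": 30, "cities_visited": ["New York", "Paris", "Tokyo"]}
    save_gzip_pickle(record, path)
    restored = load_gzip_pickle(path)

print(restored == record)  # True
```

pickle.load does not need to be told the protocol; it is detected from the data, so the same loader handles files written with protocol 2 or 3.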
File sizes from ls -l compressed_pickle* (size in bytes is the third column):
-rw-r--r-- 1 6945817 2 14 00:01 compressed_pickle0.pkl
-rw-r--r-- 1 9989418 2 14 00:01 compressed_pickle2.pkl
-rw-r--r-- 1 6422805 2 14 00:01 compressed_pickle3.pkl
-rw-r--r-- 1 7483752 2 14 00:01 compressed_picklezip2.pkl
-rw-r--r-- 1 6170076 2 14 00:01 compressed_picklezip3.pkl
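A likely explanation for protocol 2 being so much larger: on Python 3, protocols below 3 have no opcode for bytes objects, so binary data such as the array buffer is written out as a Latin-1 decoded string, and in protocol 2 that string is stored as UTF-8, where every byte value of 0x80 or above takes two bytes. Protocol 3 writes the bytes as they are. The sketch below is one way to check the same comparison in memory, without writing files. It is an approximation under stated assumptions: it uses only the standard library, substitutes a packed block of 20,000 Gaussian float64 values for the NumPy array, and pickle_sizes is a made-up helper name.

```python
import gzip
import pickle
import random
import struct


def pickle_sizes(obj, protocols=(0, 2, 3)):
    """Return {label: size in bytes} for plain and gzip-compressed pickles."""
    sizes = {}
    for proto in protocols:
        raw = pickle.dumps(obj, protocol=proto)
        # Round-trip check: the pickle must load back to an equal object.
        assert pickle.loads(raw) == obj
        sizes[f"protocol {proto}"] = len(raw)
        sizes[f"gzip + protocol {proto}"] = len(gzip.compress(raw))
    return sizes


# Stand-in for the NumPy array (an assumption of this sketch):
# 20,000 Gaussian float64 values packed into raw little-endian bytes.
rng = random.Random(0)
payload = struct.pack("<20000d", *(rng.gauss(0.0, 1.0) for _ in range(20000)))
data = {"name": "Alice", "age": 30, "datax": payload}

sizes = pickle_sizes(data)
# Print from smallest to largest.
for label, size in sorted(sizes.items(), key=lambda kv: kv[1]):
    print(f"{size:>8}  {label}")
```

The absolute numbers will differ from the ls output above because the payload is much smaller and is not a real ndarray, but protocol 2 should again come out clearly larger than protocol 3.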