[博学谷 Study Log] A Thorough Summary, Shared with Care | AI Project: Notes on Model Hardware Optimization

Contents

    • Recording an Error
    • GPU Training + CPU Deployment
    • CPU Optimization: Model Quantization

Recording an Error

model.load_state_dict(torch.load(origin_model_path))


Error message:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Fix:
Add a map_location argument.

model.load_state_dict(torch.load(origin_model_path,
                                 map_location=lambda storage, loc: storage))
# or, equivalently:
model.load_state_dict(torch.load(origin_model_path,
                                 map_location=torch.device('cpu')))
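The two map_location forms are interchangeable: both map every storage to the CPU no matter which device the checkpoint was saved from. A minimal self-contained sketch (using a tiny nn.Linear as a stand-in for the real model, and a hypothetical file name demo_model.pt) that saves a state dict and reloads it on the CPU:

```python
import torch
import torch.nn as nn

# Tiny stand-in model; substitute your own network class.
model = nn.Linear(4, 2)
path = "demo_model.pt"
torch.save(model.state_dict(), path)

restored = nn.Linear(4, 2)
# Either call works; each maps all storages to the CPU:
restored.load_state_dict(torch.load(path, map_location=torch.device('cpu')))
restored.load_state_dict(torch.load(path, map_location=lambda storage, loc: storage))

print(all(p.device.type == 'cpu' for p in restored.parameters()))
```

After loading, all parameters live on the CPU and match the saved weights.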

GPU Training + CPU Deployment

# Load a model trained on a GPU onto the CPU
model.load_state_dict(torch.load(origin_model_path, map_location=lambda storage, loc:storage))

CPU Optimization: Model Quantization

Quantizing a network means converting it to use a reduced precision integer representation for the weights and/or activations. This saves on model size and allows the use of higher throughput math operations on your CPU or GPU.


model.load_state_dict(torch.load(origin_model_path, map_location=config.DEVICE))

# Use torch.quantization.quantize_dynamic to obtain a dynamically quantized model.
# The layers quantized are the weights of every nn.Linear, converted to int8.
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

print_size_of_model(model)
print_size_of_model(quantized_model)

Size (MB): 159.766928
Size (MB): 120.379636
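print_size_of_model is not defined in the snippet above; the name comes from the PyTorch quantization tutorials. A hedged sketch of one possible definition, which serializes the state dict to a temporary file and reports its size, plus a small demo model (the layer widths here are illustrative, not from the original project):

```python
import os
import torch
import torch.nn as nn

def print_size_of_model(model: nn.Module) -> float:
    """Save the model's state dict to disk, print its size in MB, return it."""
    tmp = "temp_size_check.p"
    torch.save(model.state_dict(), tmp)
    size_mb = os.path.getsize(tmp) / 1e6
    os.remove(tmp)
    print(f"Size (MB): {size_mb}")
    return size_mb

# Demo: dynamic int8 quantization shrinks nn.Linear weights from float32 to int8.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

fp32_size = print_size_of_model(model)
int8_size = print_size_of_model(quantized)
```

Because the Linear weights drop from 4 bytes to roughly 1 byte each, the quantized checkpoint should come out noticeably smaller, mirroring the 159.77 MB to 120.38 MB reduction shown above.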
