While running the test code for a PyTorch model, I got the following errors:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp:51
At first glance the messages made no sense to me.
Details:
Traceback (most recent call last):
File "../models/arc_face.py", line 44, in Arcface
learner.load_state(conf, 'ir_se50.pth', model_only=True, from_save_folder=False)
File "../arcface/Learner.py", line 86, in load_state
pretrained_dict = torch.load(weights)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 387, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 574, in _load
result = unpickler.load()
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 537, in persistent_load
deserialized_objects[root_key] = restore_location(obj, location)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 119, in default_restore_location
result = fn(storage, location)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 95, in _cuda_deserialize
device = validate_cuda_device(location)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 79, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
Loading ir_se50.pth weights failed.
Use CelebA+Lfw-a train set weights
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp line=51 error=3 : initialization error
Traceback (most recent call last):
File "test_models.py", line 164, in <module>
main()
File "test_models.py", line 76, in main
criterions = [nn.BCEWithLogitsLoss(weight=attrWeights[i], reduction='none').cuda() for i in range(40)]
File "test_models.py", line 76, in <listcomp>
criterions = [nn.BCEWithLogitsLoss(weight=attrWeights[i], reduction='none').cuda() for i in range(40)]
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 205, in _apply
self._buffers[key] = fn(buf)
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp:51
Solution:
From my own testing, at least the first two of the following methods work:
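1. In arcface/Learner.py, change
pretrained_dict = torch.load(weights)
to
pretrained_dict = torch.load(weights, map_location='cuda:0')
and rerun the test program. The weights are then loaded onto GPU 0, which requires that a GPU is actually visible to PyTorch.
2. Map the weights to the CPU when loading:
self._model.load_state_dict(torch.load(filelike, map_location=torch.device('cpu')))
3. Change the default value of map_location in the definition of load in torch/serialization.py:
def load(f, map_location='cpu', pickle_module=pickle, **pickle_load_args):
Reference: https://stackoverflow.com/questions/56369030/runtimeerror-attempting-to-deserialize-object-on-a-cuda-device
Because I could not reproduce the error reliably, I have not tried the third method yet.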
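The map_location variants above can be folded into a single helper that falls back to the CPU when CUDA is unavailable. This is only a minimal sketch: the names pick_map_location and load_weights are hypothetical and do not come from the original project, and it assumes the checkpoint is an ordinary file saved with torch.save.

```python
def pick_map_location(cuda_available: bool, gpu_index: int = 0) -> str:
    """Return a map_location string suitable for torch.load.

    Hypothetical helper: picks "cuda:<gpu_index>" when CUDA is usable,
    otherwise "cpu".
    """
    return f"cuda:{gpu_index}" if cuda_available else "cpu"


def load_weights(path: str):
    """Load a checkpoint onto GPU 0 if CUDA is usable, otherwise onto the CPU."""
    # Imported inside the function so that pick_map_location can be used
    # (and tested) on a machine without PyTorch installed.
    import torch

    location = pick_map_location(torch.cuda.is_available())
    # map_location remaps storages that were saved on a CUDA device.
    return torch.load(path, map_location=location)
```

In Learner.load_state, the failing line could then read pretrained_dict = load_weights(weights), so the same code runs on both GPU and CPU-only machines.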
Also:
A similar error:
RuntimeError: cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:50