RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is Fal

While running the test code for a PyTorch model, I got the following errors:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp:51

I had no idea what these meant at first.
Details:

Traceback (most recent call last):
  File "../models/arc_face.py", line 44, in Arcface
    learner.load_state(conf, 'ir_se50.pth', model_only=True, from_save_folder=False)
  File "../arcface/Learner.py", line 86, in load_state
    pretrained_dict = torch.load(weights)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 537, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 119, in default_restore_location
    result = fn(storage, location)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 95, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/serialization.py", line 79, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
Loading ir_se50.pth weights failed.
Use CelebA+Lfw-a train set weights
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp line=51 error=3 : initialization error
Traceback (most recent call last):
  File "test_models.py", line 164, in <module>
    main()
  File "test_models.py", line 76, in main
    criterions = [nn.BCEWithLogitsLoss(weight=attrWeights[i], reduction='none').cuda() for i in range(40)]
  File "test_models.py", line 76, in <listcomp>
    criterions = [nn.BCEWithLogitsLoss(weight=attrWeights[i], reduction='none').cuda() for i in range(40)]
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 205, in _apply
    self._buffers[key] = fn(buf)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/user1/miniconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCGeneral.cpp:51

Solution:
From my own testing, at least the first two methods below work:

  1. Run the test on a different machine.
  2. Following annabeth-h @ https://github.com/computationalmedia/semstyle/issues/3
    I changed pretrained_dict = torch.load(weights) to pretrained_dict = torch.load(weights, map_location='cuda:0') and reran the test program. The weights are then loaded onto GPU 0.
  3. I have not tried this method: following the link below, change the model.load_state_dict() call
	self._model.load_state_dict(torch.load(filelike, map_location=torch.device('cpu')))
def load(f, map_location='cpu', pickle_module=pickle, **pickle_load_args):

https://stackoverflow.com/questions/56369030/runtimeerror-attempting-to-deserialize-object-on-a-cuda-device

Because I cannot reproduce the error reliably, I have not tried it yet.
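
Methods 2 and 3 differ only in the map_location value passed to torch.load. As a minimal sketch, assuming you want one call that works on both GPU and CPU-only machines, the choice can be made at runtime. The helper names pick_map_location and load_weights are hypothetical and do not come from the original code.

```python
def pick_map_location(cuda_available: bool, gpu_index: int = 0) -> str:
    """Return a map_location string for torch.load (hypothetical helper).

    Targets one GPU when CUDA is usable, otherwise falls back to the CPU.
    """
    return f"cuda:{gpu_index}" if cuda_available else "cpu"


def load_weights(path: str, gpu_index: int = 0):
    """Load a checkpoint onto a device this machine actually has."""
    # Imported here so pick_map_location can be used without torch installed.
    import torch

    location = pick_map_location(torch.cuda.is_available(), gpu_index)
    # map_location remaps storages that were saved on a CUDA device
    # to the device named by `location`.
    return torch.load(path, map_location=location)
```

With this in place, the failing line in Learner.py could be written as pretrained_dict = load_weights(weights). This only addresses the deserialization error. Later calls such as .cuda() would still fail on a machine without a usable GPU.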

Also:
A similar error:
RuntimeError: cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:50
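
Before trying any of the fixes, it may help to check what the Python process can actually see. The following is a rough diagnostic sketch, not a fix. The function name cuda_report is hypothetical, and the script only reads an environment variable and a few standard torch attributes.

```python
import importlib.util
import os


def cuda_report() -> dict:
    """Collect basic facts about CUDA visibility (hypothetical helper)."""
    report = {
        # None means the variable is unset, so no GPUs are hidden by it.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        report["built_with_cuda"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


print(cuda_report())
```

If cuda_available comes back False, loading with map_location='cuda:0' cannot work on that machine, and mapping to 'cpu' is the remaining option.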
