[debug] RuntimeError: CUDA error: no kernel image is available for execution on the device

Problem description

Running the program fails with the following error:

RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/ATen/native/cuda/Loops.cuh:72)

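Cause

The installed PyTorch build contains no compiled kernels for the compute capability of the current GPU, so the kernel cannot be launched on this device. The fix is to switch to a PyTorch version whose build covers the GPU's compute capability.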

Solution

  1. Check the compute capability of the current device
    Look it up for your GPU model on NVIDIA's official website.

  2. Check which compute capabilities the current PyTorch build supports

    torch.cuda.get_arch_list()
    
    Python 3.7.10 (default, Jun  4 2021, 14:48:32) 
    [GCC 7.5.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>> torch.cuda.get_arch_list()
    ['sm_37', 'sm_50', 'sm_60', 'sm_70', 'sm_75']
    
    
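  3. Choose a PyTorch version whose build supports the current GPU's compute capability. For example, the build shown in step 2 only goes up to sm_75, so a GPU with compute capability 8.x would need a newer build.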
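
The three steps above can be roughly sketched as one small script. This is a minimal sketch, not an official tool: `parse_arch`, `is_supported` and `check_current_device` are hypothetical helper names introduced here for illustration. It assumes the usual CUDA compatibility rule: an `sm_XY` binary runs on devices with the same major version and a minor version of at least Y, while a `compute_XY` (PTX) entry can be JIT-compiled for any device with capability XY or higher. Entries with other shapes (for example a trailing letter) are skipped rather than guessed at. `torch.cuda.get_arch_list()` may be missing in older PyTorch releases, so the script checks for it before calling.

```python
import re


def parse_arch(arch):
    """Split 'sm_75' or 'compute_80' into (kind, major, minor); None if unrecognized."""
    m = re.fullmatch(r"(sm|compute)_(\d+)", arch)
    if m is None:
        return None  # e.g. suffixed variants: skipped in this sketch
    number = int(m.group(2))
    return m.group(1), number // 10, number % 10


def is_supported(capability, arch_list):
    """Return True if a device with `capability` (major, minor) can run this build."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        # Binary code: same major version, device minor >= compiled minor.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: can be JIT-compiled for any equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def check_current_device():
    """Compare the local GPU against the installed PyTorch build, if both exist."""
    try:
        import torch  # imported lazily so the helpers above work without PyTorch
    except ImportError:
        print("PyTorch is not installed")
        return
    if not torch.cuda.is_available():
        print("CUDA is not available")
        return
    get_arch_list = getattr(torch.cuda, "get_arch_list", None)
    if get_arch_list is None:
        print("this PyTorch release has no torch.cuda.get_arch_list()")
        return
    cap = torch.cuda.get_device_capability(0)
    archs = get_arch_list()
    print(f"device capability: {cap[0]}.{cap[1]}")
    print(f"PyTorch {torch.__version__} built for: {archs}")
    if is_supported(cap, archs):
        print("this build covers the device")
    else:
        print("this build does NOT cover the device; install another PyTorch version")


if __name__ == "__main__":
    check_current_device()
```

With the arch list printed in step 2, `is_supported((7, 5), ...)` returns `True`, while `is_supported((8, 6), ...)` returns `False`, which is the situation that produces the error above.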

Appendix: full error message

Traceback (most recent call last):
  File "work/main.py", line 171, in <module>
    alg(args.input_dir, args.output_dir,args)
  File "work/main.py", line 127, in alg
    outputs = single_gpu_test(model, data_loader,save_path =split_result )
  File "/workspace/work/tools/TianZhi_test.py", line 29, in single_gpu_test
    result = model(return_loss=False, rescale=not show, **data)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/work/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/workspace/work/mmdet/models/detectors/base.py", line 88, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/workspace/work/mmdet/models/detectors/base.py", line 79, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/workspace/work/mmdet/models/detectors/polarmask.py", line 79, in simple_test
    bbox_list = self.bbox_head.get_bboxes(*bbox_inputs)
  File "/workspace/work/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/workspace/work/mmdet/models/anchor_heads/polarmask_last_head.py", line 603, in get_bboxes
    scale_factor, cfg, rescale)
  File "/workspace/work/mmdet/models/anchor_heads/polarmask_last_head.py", line 691, in get_bboxes_single
    cfg.max_per_img,)
  File "/workspace/work/mmdet/core/post_processing/bbox_nms.py", line 111, in multiclass_nms_with_mask
    cls_dets, index = nms_op(cls_dets, **nms_cfg_)
  File "/workspace/work/mmdet/ops/nms/nms_wrapper.py", line 43, in nms
    inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/ATen/native/cuda/Loops.cuh:72)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f38c8b12dc5 in /opt/conda/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<__nv_dl_wrapper_t<__nv_dl_tag, c10::ArrayRef), &(void at::native::index_kernel_impl >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef, __nv_dl_wrapper_t<__nv_dl_tag, c10::ArrayRef), &(void at::native::index_kernel_impl >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> const&) + 0x33e (0x7f38ce75013e in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2_gpu.so)
frame #2:  + 0x27e9cca (0x7f38ce74bcca in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2_gpu.so)
frame #3:  + 0x27ea555 (0x7f38ce74c555 in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2_gpu.so)
frame #4:  + 0x6cb2aa (0x7f38c93f72aa in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #5: at::native::index(at::Tensor const&, c10::ArrayRef) + 0x3e8 (0x7f38c93f4ed8 in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #6: at::TypeDefault::index(at::Tensor const&, c10::ArrayRef) const + 0x6c (0x7f38c97beb0c in /opt/conda/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #7: torch::autograd::VariableType::index(at::Tensor const&, c10::ArrayRef) const + 0x6f9 (0x7f38c5ec5289 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #8: at::Tensor::index(c10::ArrayRef) const + 0x59 (0x7f3849a5599d in /workspace/work/mmdet/ops/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #9: nms_cuda(at::Tensor, float) + 0x779 (0x7f3849a5414d in /workspace/work/mmdet/ops/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #10: nms(at::Tensor const&, float) + 0x130 (0x7f3849a46e10 in /workspace/work/mmdet/ops/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #11:  + 0x2084d (0x7f3849a5284d in /workspace/work/mmdet/ops/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #12:  + 0x1df0a (0x7f3849a4ff0a in /workspace/work/mmdet/ops/nms/nms_cuda.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
