RuntimeError: CUDA error: unknown error

1. Problem description

Running the yolov1 project fails with RuntimeError: CUDA error: unknown error. The full error output is:

root@bcc5071417cf:/home/zhou/pytorch/yolo_1_pytorch# python train.py
epoch = 0
Traceback (most recent call last):
  File "train.py", line 75, in <module>
    train()
  File "train.py", line 71, in train
    train_step(epochs, model, train_loader, test_loader, optimizer, classes, device=device)
  File "train.py", line 23, in train_step
    loss_dict = model(img, gt_info)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zhou/pytorch/yolo_1_pytorch/yolo/yolov1.py", line 103, in forward
    output = self.local_layer(output)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zhou/pytorch/yolo_1_pytorch/yolo/darknet.py", line 65, in forward
    out = self.conv(x)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
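
The last line of the error output suggests setting CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous so that the traceback points at the call that actually failed. A minimal sketch of doing this from Python is shown here; the helper names blocking_env and run_with_blocking are hypothetical and not part of the original project, and the script name train.py is taken from the traceback above.

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set.

    Hypothetical helper: the variable must be set before the CUDA
    context is created, so it is passed to a freshly started process.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(script="train.py"):
    """Run the training script in a child process with synchronous launches."""
    completed = subprocess.run([sys.executable, script], env=blocking_env())
    return completed.returncode
```

Calling run_with_blocking("train.py") from the project directory should be equivalent to running CUDA_LAUNCH_BLOCKING=1 python train.py in the shell. Training will likely run slower in this mode, so it is meant for debugging only.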

2. Solution

Reducing batch_size in the training config from 16 to 2 resolved the problem:

train_cfg['batch_size'] = 2
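
The value 2 was found by hand. A rough sketch of searching for a usable batch size by repeated halving is shown here. It assumes the failure really is tied to batch size, and both function names and the try_run callback are hypothetical: try_run stands for whatever runs one short training attempt with a given batch size and raises RuntimeError on failure.

```python
def halving_candidates(start):
    """Yield start, start // 2, ... down to 1."""
    size = start
    while size >= 1:
        yield size
        size //= 2


def find_working_batch_size(try_run, start=16):
    """Return the first batch size for which try_run succeeds, or None.

    try_run is a hypothetical callback taking a batch size; it is
    expected to raise RuntimeError when the attempt fails.
    """
    for size in halving_candidates(start):
        try:
            try_run(size)
        except RuntimeError as err:
            print(f"batch_size={size} failed: {err}")
            continue
        return size
    return None
```

After a CUDA error the current process may not be able to use the GPU again, so in practice try_run would probably need to launch each attempt in a fresh Python process rather than retry in place.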
