报错For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

.报错For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

/aten/src/ATen/native/cuda/NLLLoss2d.cu:103: nll_loss2d_forward_kernel: block: [29,0,0], thread: [707,0,0] Assertion t >= 0 && t < n_classes failed.

报错信息如下:

./aten/src/ATen/native/cuda/NLLLoss2d.cu:103: nll_loss2d_forward_kernel: block: [29,0,0], thread: [707,0,0] Assertion t >= 0 && t < n_classes failed.
。。。。。。

。。。。。。
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so
the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

模型运行训练,可到epoch=9 ,报错
删除models/__pycache__下的缓存文件,重新运行数据集,还是会报错。

解决方案:
是标签有问题,有一张图片标签坏了,某张图片的label标签个数超过了设定的类别数。

你可能感兴趣的:(常见问题,bug)