Training a model in PyTorch fails with "Target 8 is out of bounds"

While debugging a model, if you cannot pin down the exact error, try running the model on the CPU; this makes the error easier to locate.
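One way to make the switch to the CPU painless is to route every tensor and module through a single device variable. The snippet below is only a minimal sketch: the `DEBUG_ON_CPU` flag and the tiny `torch.nn.Linear` model are hypothetical stand-ins, not part of the original training code.

```python
import torch

# Hypothetical flag: set to True while hunting for a bug, False for normal training.
DEBUG_ON_CPU = True

# Fall back to the CPU when debugging or when no GPU is available.
use_cuda = torch.cuda.is_available() and not DEBUG_ON_CPU
device = torch.device("cuda" if use_cuda else "cpu")

# Stand-in model and batch (assumed shapes: 16 input features, 9 classes).
model = torch.nn.Linear(16, 9).to(device)
inputs = torch.randn(4, 16, device=device)

outputs = model(inputs)
print(device, outputs.shape)
```

Because everything is created on `device`, flipping the flag is the only change needed to move between CPU and GPU runs.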

Problem description

I was training a 9-class classifier. When debugging on the CPU, the error reported was:

return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
IndexError: Target 9 is out of bounds.

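The message makes it clear that the failure occurs while computing the loss. When debugging on the GPU, however, the same problem is much harder to pinpoint, as the output below shows: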

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.

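To a non-expert this output is almost unreadable. The only real hint is the assertion `t >= 0 && t < n_classes`, which says that some target value lies outside the valid range of class indices.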
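A cheap way to get a readable message regardless of the device is to validate the labels before they reach the loss function. The helper below is a sketch, not part of PyTorch: `check_targets` is a hypothetical name, and it assumes the targets are class indices and that no `ignore_index` value (such as -100) is in use.

```python
import torch

def check_targets(targets: torch.Tensor, num_classes: int) -> None:
    """Raise a readable error if any class index is outside [0, num_classes - 1].

    Hypothetical helper; assumes targets hold class indices and that no
    ignore_index value is present in the batch.
    """
    t_min = int(targets.min())
    t_max = int(targets.max())
    if t_min < 0 or t_max >= num_classes:
        raise ValueError(
            f"targets span [{t_min}, {t_max}], expected [0, {num_classes - 1}]"
        )

# Hypothetical batch with 1-based labels for a 9-class problem.
labels = torch.tensor([1, 5, 9])

message = ""
try:
    check_targets(labels, num_classes=9)
except ValueError as err:
    message = str(err)
    print(message)
```

Calling the check once per batch during debugging costs a device-to-host sync, so it is probably best removed or disabled once the labels are known to be correct.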


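Solution

The error says that a label is out of bounds. For a 9-class problem the loss function expects class indices from 0 to 8, but my labels started at 1 (that is, 1 to 9), so label 9 triggered the error. Changing the labels to start at 0 fixed the problem.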
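The failure and the fix can be reproduced on the CPU in a few lines. This is a minimal sketch with made-up data: the random `logits` and the `labels` tensor are placeholders for the real model outputs and dataset labels.

```python
import torch
import torch.nn.functional as F

num_classes = 9
logits = torch.randn(4, num_classes)   # placeholder model outputs
labels = torch.tensor([1, 3, 9, 2])    # 1-based labels, so 9 is out of range

failed = False
try:
    F.cross_entropy(logits, labels)
except IndexError as err:
    # On the CPU the message names the offending target value.
    failed = True
    print(err)

# Shift the labels to the 0-based range [0, num_classes - 1] expected by the loss.
loss = F.cross_entropy(logits, labels - 1)
print(loss.item())
```

In a real project the shift is usually better done once, where the dataset is loaded, rather than at every loss call.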
