PyTorch error: RuntimeError: CUDA error: device-side assert triggered

Training the network fails with: RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorScatterGather.cu:380
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: device-side assert triggered (insert_events at /pytorch/c10/cuda/CUDACachingAllocator.cpp:569)

Cause: a label is out of range.

Diagnosis: run

CUDA_LAUNCH_BLOCKING=1 python train.py

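CUDA kernels normally launch asynchronously, so the original traceback can point at an unrelated line. With CUDA_LAUNCH_BLOCKING=1 each kernel runs synchronously, and the output shows exactly where the error was raised: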

/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo, TensorInfo, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = -1]: block: [72,0,0], thread: [32,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.

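The failing check is the assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]`. In other words, some label is less than 0, or greater than or equal to the number of classes, so it is out of range. After debugging, I found that one of the settings used when assigning class labels did exceed the preset number of classes. Correcting that label solved the problem.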
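To find such labels before training starts, a quick range check over the dataset's labels can help. The following is only a minimal sketch: the function name `find_out_of_range_labels`, the sample `labels` list, and `num_classes=5` are hypothetical and not taken from the original training script. It uses plain Python so that it runs without a GPU.

```python
from collections import Counter


def find_out_of_range_labels(labels, num_classes):
    """Count every label value that falls outside [0, num_classes)."""
    bad = Counter()
    for value in labels:
        value = int(value)
        # Same condition as the CUDA assertion, negated:
        # valid labels satisfy 0 <= value < num_classes.
        if value < 0 or value >= num_classes:
            bad[value] += 1
    return bad


# Hypothetical labels for a 5-class problem; valid values are 0..4.
labels = [0, 3, 4, 5, 2, -1, 5]
bad = find_out_of_range_labels(labels, num_classes=5)

if bad:
    print("out-of-range labels (value: count):", dict(bad))
else:
    print("all labels are within range")
```

For labels stored in a PyTorch tensor, the same function can be called with `target.flatten().tolist()`, assuming `target` is the integer label tensor.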
