PyTorch runtime error: CUDA out of memory. [Solved]

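I hit this error while running a classification model on a 2080 Ti. I checked the model itself and found nothing wrong, and eventually confirmed that the tensor computations in the validation-set evaluation stage were taking up a large amount of GPU memory.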
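A likely reason evaluation is so expensive is that, by default, every forward pass records an autograd graph and keeps intermediate results alive for a backward pass that never happens during validation. The sketch below illustrates this on a tiny stand-in model (`nn.Linear` and the random inputs are assumptions for illustration, not the original ResNet-18 setup); it runs on the CPU, since the graph tracking behaves the same way there.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 3)              # hypothetical stand-in for the classifier
loss_func = nn.CrossEntropyLoss()
x = torch.randn(8, 4)                # fake batch of 8 samples
y = torch.randint(0, 3, (8,))        # fake class labels

# Default mode: the loss is attached to an autograd graph.
loss_tracked = loss_func(model(x), y)

# Inside no_grad(): no graph is recorded for the same computation.
with torch.no_grad():
    loss_untracked = loss_func(model(x), y)

print(loss_tracked.requires_grad, loss_tracked.grad_fn)
print(loss_untracked.requires_grad, loss_untracked.grad_fn)
```

Accumulating a tensor like `loss_tracked` across validation batches keeps each batch's graph reachable, which is one way memory can grow until the GPU runs out.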
Method 1. Convert the tensors to NumPy arrays with tensor.detach().cpu().numpy() and compute the loss and accuracy on the CPU.
Method 2. Wrap the evaluation stage directly in with torch.no_grad():

        for step, (img, label) in enumerate(dataloader):
            # ......
            if (step + 1) % opt.print_interval_steps == 0:
                # Evaluate the model on the validation set. no_grad() stops autograd
                # from recording a graph, so activations are freed after each forward pass.
                with torch.no_grad():
                    print("evaluate the performance on validate data")
                    total_loss_val = 0.0   # Python scalars: nothing accumulates on the GPU
                    total_correct_val = 0

                    for img_val, label_val in tqdm(val_dataloader):
                        img_val = img_val.to(device)
                        label_val = label_val.to(device)

                        y_pred_val = resnet18_model(img_val)
                        # Assumes loss_func returns the batch mean, so weight it by the batch size.
                        total_loss_val += loss_func(y_pred_val, label_val).item() * label_val.size(0)

                        # argmax over the class dimension gives one predicted class per sample
                        y_pred_class = torch.argmax(y_pred_val, dim=1)
                        total_correct_val += (y_pred_class == label_val).sum().item()

                    loss_val = total_loss_val / len(val_db)
                    acc_val = total_correct_val / len(val_db)
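For comparison, Method 1 might look roughly like the following. This is only a sketch: `evaluate_on_cpu`, the `nn.Linear` stand-in model, and the random batches are hypothetical names and data introduced for illustration, and the cross-entropy is recomputed by hand in NumPy instead of calling the original `loss_func`.

```python
import numpy as np
import torch
import torch.nn as nn

def evaluate_on_cpu(model, batches, device):
    """Return (mean loss, accuracy), computing both metrics with NumPy on the CPU."""
    total_loss, total_correct, total_seen = 0.0, 0, 0
    for img_val, label_val in batches:
        logits = model(img_val.to(device))
        # detach() drops the autograd graph; cpu().numpy() copies the data off the GPU.
        logits_np = logits.detach().cpu().numpy()
        labels_np = label_val.detach().cpu().numpy()

        # Numerically stable log-softmax, then summed per-sample cross-entropy.
        shifted = logits_np - logits_np.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        total_loss += float(-log_probs[np.arange(len(labels_np)), labels_np].sum())

        total_correct += int((logits_np.argmax(axis=1) == labels_np).sum())
        total_seen += len(labels_np)
    return total_loss / total_seen, total_correct / total_seen

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 3).to(device)   # hypothetical stand-in for resnet18_model
batches = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(3)]

loss_val, acc_val = evaluate_on_cpu(model, batches, device)
print(loss_val, acc_val)
```

Note that the forward pass here still records a graph for each batch until `logits` goes out of scope, so Method 2 (or combining both) should generally save more memory than Method 1 alone.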
