Problem: while running Faster R-CNN with single-GPU training, computing mAP fails with the following error:
logs/logs_1/ep150-loss1.017-val_loss1.106.pth model, anchors, and classes loaded.
0%| | 0/4952 [00:00, ?it/s]Load model done.
Get predict result.
0%| | 0/4952 [00:00, ?it/s]
Traceback (most recent call last):
File "/home/PycharmProjects/compare-programs/faster-rcnn-pytorch-master/get_map.py", line 76, in <module>
frcnn.get_map_txt(image_id, image, class_names, map_out_path)
File "/home/PycharmProjects/compare-programs/faster-rcnn-pytorch-master/frcnn.py", line 278, in get_map_txt
roi_cls_locs, roi_scores, rois, _ = self.net(images)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
return self.gather(outputs, self.output_device)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
return gather(outputs, output_device, dim=self.dim)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
res = gather_map(outputs)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
return Gather.apply(target_device, dim, *outputs)
File "/home/.conda/envs/ultralytics_yolov3/lib/python3.7/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
assert all(map(lambda i: i.is_cuda, inputs))
AssertionError
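Analysis: the machine has two GPUs, but the code is meant to run on a single one. When device_ids is not given, nn.DataParallel uses every visible GPU, and with more than one device it gathers the replicas' outputs onto the output device. That gather step asserts that every output tensor is on a CUDA device, which is the assertion that fails here, so at least one of the model's outputs was not a CUDA tensor. With a single device, nn.DataParallel calls the wrapped module directly and skips the gather.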
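To find out which output trips the assertion, one could walk the model's return value and list every tensor that is not on CUDA. The following is a minimal sketch under stated assumptions: find_non_cuda_outputs is a hypothetical helper, not part of the repository, and it relies only on the is_cuda attribute that PyTorch tensors expose.

```python
def find_non_cuda_outputs(outputs, path="out"):
    """Return the paths of tensor-like items whose is_cuda attribute is False.

    Hypothetical diagnostic helper: it walks nested tuples/lists the same way
    a model's multi-value return would be structured.
    """
    if isinstance(outputs, (list, tuple)):
        found = []
        for i, item in enumerate(outputs):
            found.extend(find_non_cuda_outputs(item, "{}[{}]".format(path, i)))
        return found
    # Objects without an is_cuda attribute (None, ints, ...) are ignored.
    if getattr(outputs, "is_cuda", True) is False:
        return [path]
    return []
```

It could be called on the result of the unwrapped model, for example find_non_cuda_outputs(net(images)) before net is passed to nn.DataParallel; any path it returns names an output that the gather step would reject.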
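Fix: the traceback points to line 278 of frcnn.py, where self.net(images) is called. The model is wrapped in nn.DataParallel in the generate method, whose original code is: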
    #---------------------------------------------------#
    #   Load the model
    #---------------------------------------------------#
    def generate(self):
        #-------------------------------#
        #   Load the model and its weights
        #-------------------------------#
        self.net = FasterRCNN(self.num_classes, "predict", anchor_scales = self.anchors_size, backbone = self.backbone)
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.net.load_state_dict(torch.load(self.model_path, map_location=device))
        self.net = self.net.eval()
        print('{} model, anchors, and classes loaded.'.format(self.model_path))
        if self.cuda:
            self.net = nn.DataParallel(self.net)
            self.net = self.net.cuda()
To tell the model which GPU to use, change the final if block to:
        if self.cuda:
            self.net = nn.DataParallel(self.net, device_ids=[0])
            self.net = self.net.cuda()
With this change the error no longer occurs.
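An alternative that is sometimes used instead of editing the model code is to hide the second GPU from the process, so that nn.DataParallel only ever sees one device. The sketch below was not tested in the original post and rests on assumptions: restrict_visible_gpus is a hypothetical helper, and it only takes effect if it runs before CUDA is initialized (in practice, before torch is imported).

```python
import os


def restrict_visible_gpus(gpu_ids):
    """Expose only the listed GPU indices to CUDA for this process.

    Hypothetical helper: it sets the CUDA_VISIBLE_DEVICES environment
    variable and returns the value that was written.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this at the very top of the entry script, before importing torch.
restrict_visible_gpus([0])
```

The same effect can be had from the shell by setting CUDA_VISIBLE_DEVICES=0 when launching get_map.py. The remaining GPU is then seen by the process as device 0.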
Reference:
assert all(map(lambda i: i.is_cuda, inputs)) AssertionError, 学无止境、积少成多、厚积薄发, CSDN blog