IDE & Framework: mmdetection usage notes, 20200707-

Table of Contents

    • Reading the mmdetection code
      • Commit messages under Git version control
        • mmdet_v2.7
        • MMDet v2.7.0
        • MMDet v2.3.0
      • Similarities and differences between {Config} and {ConfigDict}
      • Translating and understanding code comments
      • Code worth borrowing
      • Code I don't understand yet
      • Details you only learn by reading the code
      • !!!Code to think about and possibly improve!!!
      • (H, W, C), [x, y, w, h], [tl_x, tl_y, br_x, br_y]
      • When does MMDet execute loss.backward() and optimizer.step()?
      • Anchor generation
      • How NMS is invoked in the RPN during training
      • How multiclass_nms is invoked in the RCNN during testing
    • Installing mmdetection
      • Preparing datasets
      • Downloading pretrained models
      • Testing with pretrained models
        • Test a dataset
        • Image demo
        • Webcam demo
        • High-level APIs for testing images
    • Using mmdetection
      • The Hook mechanism in OpenMMLab; runner in mmcv
      • Modifications to output mAP_60, mAP_70, mAP_80, mAP_90 during evaluation
      • The .so files under mmdetection/build
    • Issues encountered during model training
      • Killing processes to force-release the GPU
      • Finding a process's parent with `ps -ef|grep <PID>`
      • Interpreting the output of `nvidia-smi`
        • `nvidia-smi: command not found`, but GPU works fine
      • /usr/lib/xorg/Xorg using too much GPU memory on Linux
      • Multi-GPU training
      • On the use of torch.distributed.all_reduce()
      • mAP fluctuation: models trained repeatedly with the same config perform differently
      • Model convergence issues during training
    • CUDA, cuDNN
      • Installing CUDA 10.0 and CUDA 10.1 side by side and switching versions
    • TODO
    • Issue log
      • RuntimeError: xxx must be contiguous
      • Instance cannot be constructed when executing return obj_cls(**args)
      • xxxDataset is not in the dataset registry
      • Can't Evaluate a Model Trained With Pytorch 1.6 under Pytorch 1.4 env; the torch.save() serialization format;
      • “RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one”
    • To be added
      • To be added

Reading the mmdetection code

Easily Master the Overall Build Flow of MMDetection (Part 1) - an OpenMMLab article on Zhihu, 20210512

Commit messages under Git version control

mmdet_v2.7

# ---------------------------20210518---------------------------
the initial project version of mmdet_v2.7;

add my code for calculating COCO-style mAP to the following code:
- tools/train.py
- mmdet/datasets/voc.py
- mmdet/datasets/__init__.py
- mmdet/core/evaluation/mean_ap.py

add the following file to the project:
- mmdet/configs/_base_/datasets/voc07_mini_cocostyle.py
- mmdet/configs/_base_/datasets/voc0712_cocostyle.py
- mmdet/datasets/voc_mini.py
- tools/hcy_train.py
- configs/faster_rcnn/bl_01_faster_rcnn_r50_fpn_1x_voc.py
- configs/retinanet/bl_01_retinanet_r50_fpn_1x_voc.py
# ---------------------------20210519---------------------------
write the code for the following schemes:
- hhid

add the following file to the project:
- configs/hhid/hhid_r50_fpn_1x_voc.py
- mmdet/models/detectors/hhid.py

MMDet v2.7.0

# ---------------------------20201203---------------------------
the initial project version of MMDet v2.7.0;

add my code for calculating COCO-style mAP to the following code:
- tools/train.py
- mmdet/datasets/voc.py
- mmdet/core/evaluation/mean_ap.py
# ---------------------------20201215---------------------------
add comments to the official code;
hc-y_modifybug:... assigned_gt_inds ... max_overlaps in mmdet/core/bbox/assigners/max_iou_assigner.py;

add the following file to mmdet/configs/_base_/datasets/:
- coco_detection_multiscale.py
- voc07_mini_cocostyle.py
- voc0712_cocostyle.py

add the following file to mmdet/datasets/:
- voc_mini.py

write the code for the following schemes:
- v4v5_11_retinanet_r50_fpn_1x_voc
- v4v5_11_faster_rcnn_r50_fpn_1x_voc.py
- v4v6_11_retinanet_r50_fpn_1x_voc
- v4v7_11_retinanet_r50_fpn_1x_voc
- v4v8_11_retinanet_r50_fpn_1x_voc
- v4v9_11_retinanet_r50_fpn_1x_voc
- v5v1_11_retinanet_r50_fpn_1x_voc
- v5v2_11_retinanet_r50_fpn_1x_voc
# ---------------------------20210308---------------------------
write the code for the following schemes:
- los_rcnn_r50_fpn_1x_carton_mini55_fourcat
- los_rcnn_r50_fpn_1x_carton_onecat

add comments to the official code about the following model:
- SABL RetinaNet;
- YOLACT;

add comments to the official code:
- mmdet/models/detectors/single_stage.py
- mmdet/models/roi_heads/mask_heads/fcn_mask_head.py
- mmdet/datasets/pipelines/loading.py
- mmdet/datasets/pipelines/formating.py
- mmdet/datasets/custom.py

add the following file to mmdet/configs/_base_/datasets/:
- carton_instance_onecat.py
- carton_instance_mini55_onecat.py
- carton_detection_onecat.py
- carton_detection_fourcat.py
- carton_detection_mini55_onecat.py

add the following file to mmdet/datasets/:
- carton.py, including the classes CartonDatasetOneCat, CartonDatasetFourCat, CartonminiDatasetOneCat;

modify the official code of the following file:
- mmdet/models/detectors/base.py def show_result(), in order to visualize bbox and mask on image, \
  modify the outline thickness of bbox, modify the fill color of the mask from the random color \
  to the specified color, add some codes for drawing the contour of the mask;
# ---------------------------20210622---------------------------
add the following file to the project:
- carton_detection_fourcat.py
- carton_detection_mini55_onecat.py

add or modify the following file related to the LOS R-CNN Model:
- mmdet/models/roi_heads/standard_roi_head.py

add or modify the following file related to the FCOS Model on carton:
- configs/fcos/bl_01_fcos_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- configs/fcos/bl_02_fcos_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- configs/fcos/bl_01_fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- configs/fcos/bl_02_fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- configs/fcos/bl_02_fcos_center-normbbox-centeronreg_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- configs/fcos/v2v1_11_fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_6x2_1x_carton.py
- mmdet/models/dense_heads/v1_fcos_head.py

add or modify the following file related to the SKU110K dataset:
- configs/_base_/datasets/sku_detection_onecat.py
- mmdet/datasets/sku.py, including the class SkuDatasetOneCat;
- configs/fcos/bl_01_fcos_r50_caffe_fpn_gn-head_6x2_1x_sku.py
- configs/fcos/bl_02_fcos_r50_caffe_fpn_gn-head_6x2_1x_sku.py
- configs/fcos/bl_02_fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_6x2_1x_sku.py
- configs/cascade_rcnn/bl_01_cascade_rcnn_r50_fpn_1x_sku.py

add or modify the following file related to the LSNet Model:
- configs/_base_/datasets/carton_instance_onecat_dpp.py
- configs/_base_/datasets/coco_lsvr.py
- configs/lsnet/xxx.py
- mmdet/models/dense_heads/lsnet_head.py
- mmdet/models/dense_heads/lscpvnet_head.py
- mmdet/datasets/pipelines/loading_plus.py
- mmdet/core/bbox/transforms_plus.py 
- mmdet/core/bbox/assigners/centroid_assigner.py
- mmdet/ops/corner_pool/xxx.py
- mmdet/ops/dcn/xxx.py
- mmdet/models/losses/cross_iou_loss.py
- mmdet/core/post_processing/bbox_nms_plus.py
# ---------------------------20210706---------------------------
add or modify the following file related to the LSNet Model:
- configs/_base_/datasets/carton_instance_onecat_dpp.py
- configs/_base_/datasets/coco_lsvr.py
- configs/_base_/datasets/carton_onecat_lsvr_dpp.py
- configs/_base_/datasets/carton_onecat_lsvr_dpp_mini55.py

  # in order to load the "extreme_points" field in the annotation json file
- mmdet/datasets/custom.py, results['extreme_fields'] = [], results['keypoint_fields'] = []
- mmdet/datasets/coco.py, gt_extremes_ann = []

- mmdet/datasets/pipelines/loading_plus.py
- mmdet/datasets/pipelines/transforms_plus.py class ResizeV1v1(), class RandomFlipV1v1()
- mmdet/core/mask/structures.py class PolygonMasks() def flip(), related to class RandomFlipV1v1()

- mmdet/datasets/pipelines/formating.py class DefaultFormatBundle()

- mmdet/core/evaluation/eval_hooks.py class EvalHook(), class DistEvalHook()
- mmdet/apis/test.py, single_gpu_test(), multi_gpu_test()
- mmdet/core/mask/utils.py, def get_rle(), def encode_poly_results()
- mmdet/core/mask/__init__.py encode_poly_results

- tools/test.py, single_gpu_test(), multi_gpu_test()
- mmdet/apis/test.py, single_gpu_test(), multi_gpu_test()

- setup_mmdet_v221.py
- mmdet/ops/corner_pool/xxx.py
- mmdet/ops/dcn/xxx.py
- mmdet/models/dense_heads/lsnet_head.py
- mmdet/models/dense_heads/lscpvnet_head.py
- mmdet/models/losses/cross_iou_loss.py

- mmdet/core/post_processing/bbox_nms_plus.py

- mmdet/models/detectors/lsnet.py def show_result()
- mmcv/visualization/image.py def imshow_extremes(), def imshow_polygons_v2(), def imshow_pose()


add or modify the following file which were copied from 20210507_LSNet-main_Duankaiwen:
- mmdet/ops/chamfer_2d/xxx.py
- mmdet/models/losses/chamfer_loss.py
- mmdet/models/losses/focal_loss.py

- mmdet/core/bbox/assigners/fcos_assigner.py
- mmdet/core/bbox/assigners/point_assigner_v2.py
- mmdet/core/bbox/assigners/point_ct_assigner.py
- mmdet/core/bbox/assigners/point_hm_assigner.py

- mmdet/datasets/pipelines/loading_reppointsv2.py
- mmdet/datasets/pipelines/formating_reppointsv2.py
- mmdet/datasets/coco_pose.py

- mmdet/models/backbones/mobilenet.py

- mmdet/models/dense_heads/reppoints_v2_head.py
- mmdet/models/dense_heads/dense_reppoints_head.py
- mmdet/models/dense_heads/dense_reppoints_v2_head.py
- mmdet/models/detectors/reppoints_v2_detector.py
- mmdet/models/detectors/dense_reppoints_detector.py
- mmdet/models/detectors/dense_reppoints_v2_detector.py
# ---------------------------2020xxxx---------------------------
to be added
# ---------------------------2020xxxx---------------------------

MMDet v2.3.0

# ---------------------------mmdetection_v2.1.0---------------------------
the initial project version;
try to train mask_rcnn_r50_fpn_1x on Poker dataset;
add comments to the official code;
try to train faster_rcnn_r50_fpn_1x on Poker dataset;
read the code of faster_rcnn_r50_fpn_1x on Poker dataset;
# ---------------------------mmdetection_v2.1.0---------------------------

# ---------------------------mmdetection_v2.3.0---------------------------
# ---------------------------20200817---------------------------
the initial project version
# ---------------------------20200817---------------------------
try to train faster_rcnn_r50_fpn_1x on Poker dataset
# ---------------------------20200830---------------------------
add comments to the official code;
hc-y_modifybug:... assigned_gt_inds ... max_overlaps in mmdet/core/bbox/assigners/max_iou_assigner.py;

learn the use of pycocotools/coco.py and pycocotools/cocoeval.py;

modify my code in tools/usr_train.py;
add my code 'record data' to mmdet/models/dense_heads/anchor_head.py;
# ---------------------------20200904---------------------------
add comments to the official code;
hc-y_modifybug:... if unmap_outputs ... in mmdet/models/dense_heads/anchor_head.py;

add my code 'record data' to mmdet/models/dense_heads/anchor_head.py and mmdet/models/dense_heads/rpn_head.py;
record data of {img_meta, anchors, rpn_preds, rpn_loss, proposals} and save to /work_dirs/usr_recorddata with flag_record_data=True;

modify my code in mmcv/runner/epoch_based_runner.py;

complete usr_recorddata_analysis.py for single image;
# ---------------------------20200908---------------------------
add comments to the official code;
hc-y_modifybug:def loss():avg_factor=bbox_targets.size(0) in mmdet/models/roi_heads/bbox_heads/bbox_head.py;

add my code 'record data' to mmdet/models/dense_heads/anchor_head.py and mmdet/models/dense_heads/rpn_head.py;
record data of {img_meta, anchors, rpn_preds, rpn_loss, proposals} and save to /work_dirs/usr_recorddata with flag_record_data=True;

add my code 'record data' to mmdet/models/roi_heads/standard_roi_head.py;
record data of {q0, assign_pg_cls_labels, rcnn_preds, rcnn_loss} and save to /work_dirs/usr_recorddata with flag_record_data=True;

modify my code in mmcv/runner/epoch_based_runner.py;

add code to usr_recorddata_analysis.py for single image;
# ---------------------------20200908---------------------------
replace flag_record_data with flag_train_record_data in mmdet/models/dense_heads/anchor_head.py and mmdet/models/dense_heads/rpn_head.py mmdet/models/roi_heads/standard_roi_head.py;
# ---------------------------20200911---------------------------
add comments to the official code;

add my code 'record data' to mmdet/models/roi_heads/standard_roi_head.py;
record data of {q0, assign_pg_cls_labels, rcnn_preds, rcnn_loss} and save to /work_dirs/usr_recorddata with flag_train_record_data=True;

read code by debugging my_workspace/test_demo_model/usr_demo_detectobject.py and add my code 'record data';

add my code 'record data' to mmdet/core/post_processing/bbox_nms.py and mmdet/models/roi_heads/bbox_heads/bbox_head.py and mmdet/models/roi_heads/standard_roi_head.py;
record data of {q0, q1, q2, q3} and save to /work_dirs/usr_recorddata with flag_train_record_data=True;

add code to usr_recorddata_analysis.py for single image;
# ---------------------------20200912---------------------------
modify my code in mmdet/core/post_processing/bbox_nms.py;

data can be collected successfully on {Poker, coco, VOCdevkit} dataset, but whether the data is desired and available remains to be confirmed; 

- record data: {img_meta, A, G, assign_G_ind, IoU(A,assign_G), pos_inds, neg_inds}
- record data: {rpn_cls_scores, rpn_bbox_preds, rpn_cls_labels, rpn_bbox_targets}
- record data: {p0, p0_scores, p1, p1_scores, p1_inds, p2, p2_scores, p2_inds, p3, p3_scores, p3_inds}

- record data: {p3_inrcnn, G, assign_G_ind, IoU(p3_inrcnn,assign_G), assign_G_cls_label, pos_inds, neg_inds}
- record data: {rcnn_cls_scores, rcnn_bbox_preds, rcnn_cls_labels, rcnn_bbox_targets}
- record data: {q0, q0_scores_after_softmax, q1, q1_scores_after_softmax, q1_inds, q2, q2_scores_after_softmax, q2_inds, q3, q3_scores_after_softmax, q3_inds}
# ---------------------------20200912---------------------------
revert the code of {anchor_head.py, rpn_head.py, standard_roi_head.py, bbox_head.py, bbox_nms.py, test_mixins.py} to the official code;
# ---------------------------20200917---------------------------
data can be collected successfully on {Poker, coco, VOCdevkit} dataset, and whether the data excluding {p0, p1, p2, p3, q0, q1, q2, q3} is desired and available has been confirmed; 

- record data: {img_meta, all_a, G, inside_a_flags, assign_G_ind, IoU(inside_a,assign_G), inside_a_pos_bbox_targets, sampling_a_pos_inds, sampling_a_neg_inds}
- record data: {all_a_cls_scores, all_a_bbox_preds, sampling_a_cls_labels, sampling_a_bbox_targets}
- record data: {p0, p0_scores, p1, p1_scores, p1_inds, p2, p2_scores, p2_inds, p3, p3_scores, p3_inds}
- record data: {p3_inrcnn, G, assign_G_ind, IoU(p3_inrcnn,assign_G), assign_G_cls_label, sampling_p3_inrcnn_pos_inds, sampling_p3_inrcnn_neg_inds, all_p3_inrcnn_cls_scores, all_p3_inrcnn_bbox_preds, all_p3_inrcnn_pos_bbox_targets}
- record data: {sampling_p3_inrcnn_cls_scores, sampling_p3_inrcnn_bbox_preds, sampling_p3_inrcnn_cls_labels, sampling_p3_inrcnn_bbox_targets}
- record data: {q0, q0_scores_after_softmax, q1, q1_scores_after_softmax, q1_inds, q2, q2_scores_after_softmax, q2_inds, q3, q3_scores_after_softmax, q3_inds}
# ---------------------------20201007---------------------------
review the inheritance and polymorphism of classes in Python, and perform code review;

write the code for the following two schemes:
- v2v1_11_faster_rcnn_r50_fpn_1x_voc
- v2v2_11_faster_rcnn_r50_fpn_1x_voc
# ---------------------------20201008---------------------------
add comments to the code in the following file:
- mmdet/core/bbox/assigners/v2v1_max_iof_assigner.py
- mmdet/core/bbox/obtain_gt_bgs.py
- mmdet/models/roi_heads/test_mixins.py
# ---------------------------20201113---------------------------
add comments to the official code;

read the code about the following papers:
- mmdet/configs/ghm
- mmdet/configs/pisa
- mmdet/configs/gfl

revise the config of learning rate in my code;

write the code for the following schemes:
- v3v1_00_retinanet_r50_fpn_1x_voc
- v3v1_11_retinanet_r50_fpn_1x_poker
- v4v1_11_retinanet_r50_fpn_1x_voc

add my code for calculating COCO-style mAP to the following code:
- mmdet/datasets/poker.py
- mmdet/datasets/voc.py
- mmdet/core/evaluation/mean_ap.py
# ---------------------------20201203---------------------------
add comments to the official code;

write the code for the following schemes:
- etc
- v4v5_11_retinanet_r50_fpn_1x_voc
- v4v6_11_retinanet_r50_fpn_1x_voc

add the following file to mmdet/configs/_base_/datasets/:
- coco_detection_multiscale.py
- voc07_mini_cocostyle.py
- voc0712_cocostyle.py

submit this commit message before updating the MMDet version from v2.3.0 to v2.7.0;
# ---------------------------2020xxxx---------------------------
to be added
# ---------------------------2020xxxx---------------------------

Similarities and differences between {Config} and {ConfigDict}

cfg = {Config} Config (path: ../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py): {'model': {}, 'key': {}, }
	_cfg_dict = {ConfigDict} {'model': {}, 'key': {}, }
	_filename = {str} '../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py'
	_text = {str} 'stores the content of ../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py itself plus the 4 files it includes'
	filename = {str} '../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py'
	pretty_text = {str} 'identical to the content of usr_dumpconfig_mask_rcnn_r50_fpn_1x_poker.py'
	text = {str} 'stores the content of ../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py itself plus the 4 files it includes'
(Figure: 20200802 similarities and differences between Config and ConfigDict.jpg)
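
A minimal sketch of how the two relate (assuming mmcv is installed; the config path is illustrative): Config wraps a ConfigDict and forwards attribute access to it.

from mmcv import Config

# Config stores the parsed file in cfg._cfg_dict (a ConfigDict) and forwards
# attribute/item access to it, so cfg.model and cfg['model'] are equivalent.
cfg = Config.fromfile('../../configs/mask_rcnn/usr_mask_rcnn_r50_fpn_1x_poker.py')
print(type(cfg))              # <class 'mmcv.utils.config.Config'>
print(type(cfg._cfg_dict))    # <class 'mmcv.utils.config.ConfigDict'>
print(cfg.filename)           # the path passed to fromfile()
print(cfg.pretty_text[:200])  # the merged config rendered as Python source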

Translating and understanding code comments

  • we set FG labels to [0, num_class-1] and BG label to num_class in other heads since mmdet v2.0; however, we keep BG label as 0 and FG label as 1 in the RPN head;

  • gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are labelled as ignored, e.g., crowd boxes in COCO.

  • imgs (List[Tensor]): the outer list indicates test-time augmentations and inner Tensor should have a shape NxCxHxW, which contains all images in the batch.
      mmdet/models/detectors/base.py
      Meaning: each Tensor holds N=batchsize images (each CxHxW); applying a different image transform to this group of images produces another Tensor, and collecting these Tensors in a list gives List[Tensor].
      The number of Tensors equals the number of image transforms applied; N in each Tensor is the number of images.

  • img_metas (List[List[dict]]): the outer list indicates test-time augs (multiscale, flip, etc.) and the inner list indicates images in a batch.
      mmdet/models/detectors/base.py
      Meaning: each List[dict] holds the img_metas of the batchsize images; applying different image transforms to this group yields multiple List[dict], and collecting them in a list gives List[List[dict]].
      The number of List[dict] equals the number of image transforms; the number of dicts equals the number of images; each dict stores one image's img_metas.

  • proposals (List[List[Tensor]]): the outer list indicates test-time augs (multiscale, flip, etc.) and the inner list indicates images in a batch. The Tensor should have a shape Px4, where P is the number of proposals.
      mmdet/models/detectors/base.py
      Meaning: each List[Tensor] corresponds to the batchsize images; applying different image transforms yields multiple List[Tensor], which combine into List[List[Tensor]].
      The number of List[Tensor] equals the number of image transforms; the number of Tensors equals the number of images; each Tensor stores one image's proposals.

  • A feature map is usually a 4D tensor with shape torch.Size([N, C, H, W]);

  • bbox_targets_list (list[Tensor]): BBox targets of each level.
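
A small sketch that builds the nesting described in the bullets above (shapes and meta keys are invented for illustration):

import torch

batch_size, num_augs = 2, 3  # 2 images per batch, 3 test-time augmentations
# imgs: one NxCxHxW Tensor per augmentation
imgs = [torch.randn(batch_size, 3, 800, 1333) for _ in range(num_augs)]
# img_metas: one list of per-image meta dicts per augmentation
img_metas = [[dict(img_shape=(800, 1333, 3), flip=(aug == 1))
              for _ in range(batch_size)] for aug in range(num_augs)]
assert len(imgs) == len(img_metas) == num_augs             # one entry per aug
assert imgs[0].size(0) == len(img_metas[0]) == batch_size  # N images per entry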

Code worth borrowing

  • logger.info
logger.info('Environment info:\n' + dash_line + env_info + '\n' + dash_line)
# c-y_note: logger.info() writes the log message "2020-08-01 20:41:12,985 - mmdet - INFO - Environment info: ..." to the .log file
  • Check whether a variable var with name 'var_name' is of type list
# check whether a variable var with name 'var_name' is of type list
for var, name in [(imgs, 'imgs'), (img_metas, 'img_metas')]:
	if not isinstance(var, list):
		raise TypeError(f'{name} must be a list, but got {type(var)}')

Code I don't understand yet

  • What does self(**data) do?
mmdet/models/detectors/base.py

class BaseDetector(nn.Module, metaclass=ABCMeta):
    """Base class for detectors."""
	def train_step(self, data, optimizer):
		losses = self(**data)  # hc-y_Q20200903: what kind of call is this?

Similarly there is self(x). Calling a module instance this way is standard PyTorch: it invokes nn.Module.__call__, which runs any registered hooks and then dispatches to the module's forward(); see the toy example below.
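
A minimal demonstration of the mechanism (a toy module, not mmdet code):

import torch
import torch.nn as nn

class Toy(nn.Module):
    def forward(self, x, scale=1.0):
        return x * scale

m = Toy()
data = dict(x=torch.ones(2), scale=3.0)
# m(**data) goes through nn.Module.__call__, which runs any registered hooks
# and then calls forward(**data); losses = self(**data) in train_step() works
# the same way, with data holding imgs, img_metas, gt_bboxes, etc.
print(m(**data))  # tensor([3., 3.])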

mmdet/models/dense_heads/rpn_test_mixin.py

class RPNTestMixin(object):
    """Test methods of RPN."""
	def simple_test_rpn(self, x, img_metas):
		rpn_outs = self(x)  # hc-y_note: self(x) calls the module's def forward(self, feats)
        proposal_list = self.get_bboxes(*rpn_outs, img_metas)  # hc-y_note: this line executes exactly like
        return proposal_list  # proposal_list = self.get_bboxes(*outs, img_metas, cfg=proposal_cfg) in mmdet/models/dense_heads/base_dense_head.py during training
mmdet/models/dense_heads/base_dense_head.py

class BaseDenseHead(nn.Module, metaclass=ABCMeta):
    """Base class for DenseHeads."""
	def forward_train(self, x, img_metas, gt_bboxes, gt_labels=None, gt_bboxes_ignore=None, proposal_cfg=None, **kwargs):
		outs = self(x)  # hc-y_note: tuple[2 Tensors], the two Tensors being cls_scores and bbox_preds; gets the preds produced by the RPN, i.e. {cls_scores, bbox_preds};
		if gt_labels is None:
			loss_inputs = outs + (gt_bboxes, img_metas)
		else:
			loss_inputs = outs + (gt_bboxes, gt_labels, img_metas)
		losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)  # hc-y_note:mmdet/models/dense_heads/anchor_head.py def loss()
		if proposal_cfg is None:
			return losses
		else:
			proposal_list = self.get_bboxes(*outs, img_metas, cfg=proposal_cfg)  # hc-y_note:mmdet/models/dense_heads/anchor_head.py def get_bboxes()
			return losses, proposal_list
  • Where does the value of scale_factor come from?
    hc-y_Q20200910: where does the value of scale_factor come from? (It is recorded by the Resize transform in the data pipeline as results['scale_factor'] = [w_scale, h_scale, w_scale, h_scale], the ratio of the resized image to the original.)
type(scale_factor)  # numpy.ndarray, dtype=float32, [1.5873016 1.5873016 1.5873016 1.5873016]
bboxes  # torch.Size([rpn_nms_post, num_cls*4])

mmdet/models/roi_heads/bbox_heads/bbox_head.py

class BBoxHead(nn.Module):
	def get_bboxes():
		if rescale:
			if isinstance(scale_factor, float):
				bboxes /= scale_factor
			else:
				scale_factor = bboxes.new_tensor(scale_factor)  # torch.Tensor.new_tensor(): Returns a new Tensor with data as the tensor data. By default, the returned Tensor has the same torch.dtype and torch.device as this tensor.
				bboxes = (bboxes.view(bboxes.size(0), -1, 4) /
						  scale_factor).view(bboxes.size()[0], -1)
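
A numeric sketch of the rescale branch above (numbers invented; 600/378 reproduces the 1.5873016 seen in the debugger): dividing by scale_factor maps boxes from the resized image back to original-image coordinates.

import torch

scale_factor = [600 / 378] * 4  # ~1.5873016, as (w_scale, h_scale, w_scale, h_scale)
bboxes = torch.tensor([[158.73, 317.46, 476.19, 634.92]])  # in the resized image
scale = bboxes.new_tensor(scale_factor)
orig = (bboxes.view(bboxes.size(0), -1, 4) / scale).view(bboxes.size(0), -1)
print(orig)  # ~[[100., 200., 300., 400.]] in the original image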

Details you only learn by reading the code

  • The upper bound on the number of bboxes NMS can handle
      Q: batched_nms in mmdetection has some upper limit on the number of bboxes it can handle; what is that limit?
      A: personally I would keep it under 3000; in the COCO dataset the number of instances per image is 1-15, with an average of 7.7;
  • NMS is executed separately on each scale level;
mmdet/models/dense_heads/rpn_head.py
class RPNHead(RPNTestMixin, AnchorHead):
	def _get_bboxes_single():
		for idx in range(len(cls_scores)):  # hc-y_note: within a single image, process scale levels one by one
			pass
		dets, keep = batched_nms(proposals, scores, ids, nms_cfg)  # hc-y_highlight: NMS is executed separately on each scale level;
  • RoIAlign is applied scale level by scale level to the rois sampled from all images in a batch;
mmdet/models/roi_heads/roi_extractors/single_level_roi_extractor.py
class SingleRoIExtractor(BaseRoIExtractor):
	def forward():
		for i in range(num_levels):  # hc-y_note: apply RoIAlign level by level to the rois sampled from all images in the batch;

!!!Code to think about and possibly improve!!!

  • Note that the same bbox may be predicted with multiple class labels at once;
mmdet/core/post_processing/bbox_nms.py
def multiclass_nms():
	dets, keep = batched_nms(bboxes, scores, labels, nms_cfg)  # hc-y_note: NMS is executed per class; !!!note that the same bbox may be predicted with multiple class labels at once; this could probably be improved!!!
  • train_cfg.rcnn.sampler.pos_fraction: the positive/negative sampling ratio in the RCNN stage
  • If anchor_scale were kept consistent with the scale at which rois are mapped to levels, would performance improve?

Why the default value of finest_scale is 56 in mmdet/models/roi_heads/roi_extractors/single_level_roi_extractor.py?

Why the default value of finest_scale is 56? · Issue #2843 · open-mmlab/mmdetection · GitHub 20200529

The corelation between anchor_scale and map_roi_levels ? · Issue #2387 · open-mmlab/mmdetection · GitHub 20200403

It seems strange as below
P2 anchor_scale set for rpn_head: 32 mapped_roi_scale for roi_extractor: 0-112
P3 anchor_scale set for rpn_head: 64 mapped_roi_scale for roi_extractor: 112-224
P4 anchor_scale set for rpn_head: 128 mapped_roi_scale for roi_extractor: 224-448
P5 anchor_scale set for rpn_head: 256 mapped_roi_scale for roi_extractor: 448-

So why the mapped_roi_scale for roi_extractor don’t approximately match the anchor_scale for rpn_head?

hc-y_Q20200907: if anchor_scale were kept consistent with the scale at which rois are mapped, would performance improve?

  • How MMDet sums the model's total loss
It follows that, if the detector's total loss is loss = lambda_1*rpn_loss + lambda_2*roi_loss, then rpn_losses
(mmdet/models/detectors/two_stage.py) corresponds to lambda_1*rpn_loss and roi_losses corresponds to lambda_2*roi_loss;

mmdet/models/detectors/base.py: loss, log_vars = self._parse_losses(losses)
mmdet/models/detectors/base.py: def _parse_losses(self, losses): sums the classification and localization losses into the total loss
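
A minimal sketch of the summation, modeled on _parse_losses (simplified; the real function also handles list-of-Tensor values and distributed reduction): every entry whose key contains 'loss' is summed, unweighted, into the total.

import torch
from collections import OrderedDict

def parse_losses_sketch(losses):
    log_vars = OrderedDict()
    for name, value in losses.items():
        log_vars[name] = value.mean()
    # sum every item whose key contains 'loss' into the total loss
    loss = sum(v for k, v in log_vars.items() if 'loss' in k)
    return loss, log_vars

losses = dict(loss_rpn_cls=torch.tensor(0.3), loss_rpn_bbox=torch.tensor(0.2),
              loss_cls=torch.tensor(0.8), loss_bbox=torch.tensor(0.5))
loss, _ = parse_losses_sketch(losses)
print(loss)  # tensor(1.8000): any per-branch weighting lives inside each loss term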


(H, W, C), [x, y, w, h], [tl_x, tl_y, br_x, br_y]

img_metas[img_id].img_shape = <class 'tuple'> (800, 600, 3), i.e. (H, W, C)
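
For reference, a small sketch converting between the two box formats in this section's title (hypothetical helper names; [x, y, w, h] here uses the COCO top-left convention):

import torch

def xywh_to_xyxy(boxes):
    # [x, y, w, h] (top-left + size) -> [tl_x, tl_y, br_x, br_y]
    x, y, w, h = boxes.unbind(-1)
    return torch.stack([x, y, x + w, y + h], dim=-1)

def xyxy_to_xywh(boxes):
    # [tl_x, tl_y, br_x, br_y] -> [x, y, w, h]
    x1, y1, x2, y2 = boxes.unbind(-1)
    return torch.stack([x1, y1, x2 - x1, y2 - y1], dim=-1)

b = torch.tensor([[100., 200., 50., 80.]])
print(xywh_to_xyxy(b))  # tensor([[100., 200., 150., 280.]])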


mmdet/core/bbox/coder/delta_xywh_bbox_coder.py
def bbox2delta(proposals, gt, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.)):
    """Compute deltas of proposals w.r.t. gt.

    We usually compute the deltas of x, y, w, h of proposals w.r.t ground
    truth bboxes to get regression target.
    This is the inverse function of :func:`delta2bbox`.

    Args:
        proposals (Tensor): Boxes to be transformed, shape (N, ..., 4)
        gt (Tensor): Gt bboxes to be used as base, shape (N, ..., 4)
        means (Sequence[float]): Denormalizing means for delta coordinates
        stds (Sequence[float]): Denormalizing standard deviation for delta
            coordinates

    Returns:
        Tensor: deltas with shape (N, 4), where columns represent 
		dx, dy, dw, dh.
    """
	pass


def delta2bbox(rois, deltas, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.), max_shape=None, wh_ratio_clip=16 / 1000):
    """Apply deltas to shift/scale base boxes.

    Typically the rois are anchor or proposed bounding boxes and the deltas are
    network outputs used to shift/scale those boxes.
    This is the inverse function of :func:`bbox2delta`.

    Args:
        rois (Tensor): Boxes to be transformed. Has shape (N, 4)
        deltas (Tensor): Encoded offsets with respect to each roi.
            Has shape (N, 4 * num_classes). Note N = num_anchors * W * H when
            rois is a grid of anchors. Offset encoding follows [1]_.
        means (Sequence[float]): Denormalizing means for delta coordinates
        stds (Sequence[float]): Denormalizing standard deviation for delta
            coordinates
        max_shape (tuple[int, int]): Maximum bounds for boxes. specifies (H, W)
        wh_ratio_clip (float): Maximum aspect ratio for boxes.

    Returns:
        Tensor: Boxes with shape (N, 4), where columns represent 
		tl_x, tl_y, br_x, br_y.
	"""
	pass
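
The elided bodies implement the standard delta encoding; here is a runnable sketch of bbox2delta for plain (N, 4) inputs under the default means/stds (consistent with the docstring above but simplified, so treat it as an illustration rather than the official code):

import torch

def bbox2delta_sketch(proposals, gt, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.)):
    # centers and sizes of the proposals ([tl_x, tl_y, br_x, br_y] format)
    px = (proposals[:, 0] + proposals[:, 2]) * 0.5
    py = (proposals[:, 1] + proposals[:, 3]) * 0.5
    pw = proposals[:, 2] - proposals[:, 0]
    ph = proposals[:, 3] - proposals[:, 1]
    gx = (gt[:, 0] + gt[:, 2]) * 0.5
    gy = (gt[:, 1] + gt[:, 3]) * 0.5
    gw = gt[:, 2] - gt[:, 0]
    gh = gt[:, 3] - gt[:, 1]
    # center offsets normalized by proposal size, plus log size ratios
    deltas = torch.stack([(gx - px) / pw, (gy - py) / ph,
                          torch.log(gw / pw), torch.log(gh / ph)], dim=-1)
    return (deltas - deltas.new_tensor(means)) / deltas.new_tensor(stds)

p = torch.tensor([[0., 0., 10., 10.]])
g = torch.tensor([[1., 1., 11., 11.]])
print(bbox2delta_sketch(p, g))  # tensor([[0.1000, 0.1000, 0.0000, 0.0000]])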

When does MMDet execute loss.backward() and optimizer.step()?

Noted 20200907:
  the "backward" and "update weights" operations happen in:

  • EpochBasedRunner.train() fires the after_train_iter hook point:
# mmcv/runner/epoch_based_runner.py

class EpochBasedRunner(BaseRunner):
	def train():
		self.call_hook('after_train_iter')
  • BaseRunner.call_hook() dispatches to every registered hook:
# mmcv/runner/base_runner.py

class BaseRunner(metaclass=ABCMeta):
	def call_hook(self, fn_name): 
		# when hook = <mmcv.runner.hooks.optimizer.OptimizerHook object at 0x7f2ac0811250>, this invokes OptimizerHook.after_train_iter()
  • OptimizerHook.after_train_iter() runs backward and the optimizer step:
# mmcv/runner/hooks/optimizer.py

class OptimizerHook(Hook): 
    def after_train_iter(self, runner):
        runner.optimizer.zero_grad()
        runner.outputs['loss'].backward()  # c-y_note:backward
        if self.grad_clip is not None:
            grad_norm = self.clip_grads(runner.model.parameters())
            if grad_norm is not None:
                # Add grad norm to the logger
                runner.log_buffer.update({'grad_norm': float(grad_norm)},
                                         runner.outputs['num_samples'])
        runner.optimizer.step()  # c-y_note:update weights

Anchor generation

mmdet/core/anchor/anchor_generator.py def single_level_grid_anchors()

shift_xx = torch.tensor([ 0,  4,  0,  4,  0,  4])
shift_yy = torch.tensor([ 0,  0,  4,  4,  8,  8])
shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1)
>>> shifts
Out[13]: 
tensor([[0, 0, 0, 0],
        [4, 0, 4, 0],
        [0, 4, 0, 4],
        [4, 4, 4, 4],
        [0, 8, 0, 8],
        [4, 8, 4, 8]])
>>> shifts.shape
Out[14]: torch.Size([6, 4])
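
Continuing the example: the grid anchors come from broadcasting the base anchors over these shifts, the same pattern single_level_grid_anchors uses (the base anchor values here are made up):

import torch

base_anchors = torch.tensor([[-8., -8., 8., 8.],       # 2 base anchors per location,
                             [-16., -16., 16., 16.]])  # centered at the origin
shift_xx = torch.tensor([0, 4, 0, 4, 0, 4])
shift_yy = torch.tensor([0, 0, 4, 4, 8, 8])
shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1).float()

# (1, num_base, 4) + (num_shifts, 1, 4) -> (num_shifts, num_base, 4)
all_anchors = (base_anchors[None, :, :] + shifts[:, None, :]).view(-1, 4)
print(all_anchors.shape)  # torch.Size([12, 4]) = 6 grid points x 2 base anchors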

How NMS is invoked in the RPN during training

Noted 20200903:

# mmcv/ops/nms.py
# the NMS implementation used by MMDet


def batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False):
    """Performs non-maximum suppression in a batched fashion.

    Modified from https://github.com/pytorch/vision/blob
    /505cd6957711af790211896d32b40291bea1bc21/torchvision/ops/boxes.py#L39.
    In order to perform NMS independently per class, we add an offset to all
    the boxes. The offset is dependent only on the class idx, and is large
    enough so that boxes from different classes do not overlap.

    Arguments:
        boxes (torch.Tensor): boxes in shape (N, 4).
        scores (torch.Tensor): scores in shape (N, ).
        idxs (torch.Tensor): each index value correspond to a bbox cluster,
            and NMS will not be applied between elements of different idxs,
            shape (N, ).
        nms_cfg (dict): specify nms type and other parameters like iou_thr.
        class_agnostic (bool): if true, nms is class agnostic,
            i.e. IoU thresholding happens over all boxes,
            regardless of the predicted class

    Returns:
        tuple: kept dets and indice.
    """
    nms_cfg_ = nms_cfg.copy()
    class_agnostic = nms_cfg_.pop('class_agnostic', class_agnostic)
    if class_agnostic:
        boxes_for_nms = boxes
    else:
        max_coordinate = boxes.max()  # c-y_note: returns a single scalar value
        offsets = idxs.to(boxes) * (max_coordinate + 1)  # idxs.dtype:torch.int64, boxes.dtype:torch.float32
        boxes_for_nms = boxes + offsets[:, None]
    nms_type = nms_cfg_.pop('type', 'nms')
    nms_op = eval(nms_type)  # c-y_Q20200903: what does the builtin eval() do here? It evaluates the string nms_type (e.g. 'nms') and returns the function object of that name
    dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)  # c-y_note:kept dets (boxes and scores) and indice, which is always the same data type as the input.
    boxes = boxes[keep]  # torch.Size([num_boxes, 4])
    scores = dets[:, -1]  # torch.Size([num_boxes])
    return torch.cat([boxes, scores[:, None]], -1), keep  # c-y_note: dim=-1 here, so .cat() concatenates along the last dimension
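
A usage sketch for batched_nms (random data; assumes a recent mmcv where the underlying nms op takes iou_threshold). With idxs set to class labels this performs per-class NMS; with class_agnostic=True it degenerates to plain NMS.

import torch
from mmcv.ops import batched_nms

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
idxs = torch.tensor([0, 0, 1])  # class (or scale-level) id of each box
dets, keep = batched_nms(boxes, scores, idxs, dict(type='nms', iou_threshold=0.5))
print(dets)  # kept [tl_x, tl_y, br_x, br_y, score] rows, sorted by score
print(keep)  # indices of the kept boxes in the input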

How multiclass_nms is invoked in the RCNN during testing

Noted 20200910:
  Key points: boolean indexing on torch.Tensor and the use of torch.masked_select();
  code snippets that help while reading mmdet/core/post_processing/bbox_nms.py:

# mmdet/core/post_processing/bbox_nms.py


temp_bboxes = torch.randn(6, 3, 4)  # torch.Size([6, 3, 4])
temp_scores = torch.rand(6, 3)
temp_valid_mask = temp_scores > 0.5  # torch.Size([6, 3])

temp_bboxes[temp_valid_mask].shape  # torch.Size([11, 4])
print('value of temp_valid_mask.nonzero():\n', temp_valid_mask.nonzero())  # torch.Size([11, 2])
temp_valid_mask_inds = temp_valid_mask.nonzero()
temp_label = temp_valid_mask.nonzero()[:, 1]  # same values as temp_cls_inds


print('value of torch.where(temp_valid_mask == True):\n', torch.where(temp_valid_mask == True))  # tuple of len(temp_valid_mask.size())=2 Tensors, each of shape torch.Size([11])
temp_q0_inds = torch.where(temp_valid_mask == True)[0]  # torch.Size([11])
temp_cls_inds = torch.where(temp_valid_mask == True)[1]  # torch.Size([11])

# the following three lines all produce the same result
torch.masked_select(temp_bboxes, torch.stack((temp_valid_mask, temp_valid_mask, temp_valid_mask, temp_valid_mask), -1)).view(-1, 4)
temp_bboxes[temp_valid_mask]  # torch.Size([11, 4])
temp_bboxes[temp_q0_inds, temp_cls_inds]
torch.masked_select(temp_scores, temp_valid_mask)  # torch.Size([11])

Installing mmdetection

Preparing datasets

  Feel free to put the dataset at any place you want, and then soft link the dataset under the data/ folder:

cd data/coco
ln -s /media/your_username/A42C33A02C336D04/dataset/coco2017/annotations_trainval2017/annotations annotations
ln -s /media/your_username/A42C33A02C336D04/dataset/coco2017/train_image_2017/train2017 train2017
ln -s /media/your_username/A42C33A02C336D04/dataset/coco2017/val_image_2017 val2017
ln -s /media/your_username/A42C33A02C336D04/dataset/coco2017/test_image_2017/test2017 test2017

cd data/cityscapes
ln -s /media/your_username/A42C33A02C336D04/dataset/cityscapes/annotations annotations
ln -s /media/your_username/A42C33A02C336D04/dataset/cityscapes/leftImg8bit leftImg8bit
ln -s /media/your_username/A42C33A02C336D04/dataset/cityscapes/gtFine gtFine

cd data/VOCdevkit
ln -s "/media/your_username/A42C33A02C336D04/dataset/Pascal VOC Dataset/VOC2007" VOC2007
ln -s "/media/your_username/A42C33A02C336D04/dataset/Pascal VOC Dataset/VOC2012" VOC2012

  This step has not been performed yet:
  The cityscapes annotations have to be converted into the coco format using tools/convert_datasets/cityscapes.py:
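
For reference, the conversion command as I recall it from the mmdetection docs (check the docs for your version; the --nproc value is up to you):

pip install cityscapesscripts  # required by the conversion script
python tools/convert_datasets/cityscapes.py ./data/cityscapes \
    --nproc 8 --out-dir ./data/cityscapes/annotations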

Downloading pretrained models

mmdetection/model_zoo.md at master · open-mmlab/mmdetection · GitHub
  Download the pretrained model you want to test from the link above, e.g. faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth, create a checkpoints directory under the mmdetection root, and put the model there.

wget "https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth"

  For how to download files in the terminal, see
Download Files From Google Drive With curl/wget - DEV

Testing with pretrained models

Test a dataset

You can use the following commands to test a dataset.

# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]

Examples:
Assume that you have already downloaded the checkpoints to the directory checkpoints/.

# Test Faster R-CNN and visualize the results. Press any key for the next image.
python tools/test.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show

# Test Faster R-CNN and save the painted images for later visualization.
python tools/test.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show-dir my_workspace/faster_rcnn_r50_fpn_1x_results

# Test Faster R-CNN on PASCAL VOC (without saving the test results) and evaluate the mAP.
python tools/test.py configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py \
    checkpoints/SOME_CHECKPOINT.pth \
    --eval mAP


# ------ the commands below still need adapting ------
# Test Mask R-CNN with 8 GPUs, and evaluate the bbox and mask AP.
./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    8 --out results.pkl --eval bbox segm

# Test Mask R-CNN with 8 GPUs, and evaluate the classwise bbox and mask AP.
./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    8 --out results.pkl --eval bbox segm --options "classwise=True"

# Test Mask R-CNN on COCO test-dev with 8 GPUs, and generate the json file to be submitted to the official evaluation server.
./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    8 --format-only --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
# You will get two json files mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.

# Test Mask R-CNN on Cityscapes test with 8 GPUs, and generate the txt and png files to be submitted to the official evaluation server.
./tools/dist_test.sh configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py \
    checkpoints/mask_rcnn_r50_fpn_1x_cityscapes_20200227-afe51d5a.pth \
    8  --format-only --options "txtfile_prefix=./mask_rcnn_cityscapes_test_results"
# The generated png and txt would be under ./mask_rcnn_cityscapes_test_results directory.

Image demo

We provide a demo script to test a single image.

python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${GPU_ID}] [--score-thr ${SCORE_THR}]

Examples:

python demo/image_demo.py demo/demo.jpg configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth --device cpu

Webcam demo

We provide a webcam demo to illustrate the results.

python demo/webcam_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${GPU_ID}] [--camera-id ${CAMERA-ID}] [--score-thr ${SCORE_THR}]

Examples:

python demo/webcam_demo.py configs/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

High-level APIs for testing images

Synchronous interface
Here is an example of building a model and testing given images.

from mmdet.apis import init_detector, inference_detector
import mmcv

config_file = 'configs/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)
# visualize the results in a new window
model.show_result(img, result)
# or save the visualization results to image files
model.show_result(img, result, out_file='result.jpg')

# test a video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1)

A notebook demo can be found in demo/inference_demo.ipynb.

Using mmdetection

The Hook mechanism in OpenMMLab; runner in mmcv

  • mmcv/runner.md at master · open-mmlab/mmcv
  • (to read) Object Detection (MMDetection) - Runner - Zhihu, 20201025
  • Object Detection (MMDetection) - the HOOK mechanism - an article by 努力的伍六七 on Zhihu, 20200912
  • The Hook mechanism in OpenMMLab | Yimian's Blog

Method call hierarchy of class EpochBasedRunner(BaseRunner): run() --> train()/val() --> run_iter();
Method call hierarchy of class IterBasedRunner(BaseRunner): run() --> train()/val() --> run_iter();
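
A minimal custom hook, as a sketch of the mechanism (hook name and log messages invented; the registration pattern follows mmcv's documented HOOKS registry):

from mmcv.runner import HOOKS, Hook

@HOOKS.register_module()
class SimpleLogHook(Hook):
    # the runner invokes these methods via call_hook('<fn_name>'), the same
    # entry point through which OptimizerHook runs backward() and step()

    def before_run(self, runner):
        runner.logger.info('SimpleLogHook: training is about to start')

    def after_train_iter(self, runner):
        if self.every_n_iters(runner, 50):
            runner.logger.info(f'SimpleLogHook: finished iter {runner.iter}')

# enable it in a config file with: custom_hooks = [dict(type='SimpleLogHook')]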

Modifications to output mAP_60, mAP_70, mAP_80, mAP_90 during evaluation

  Evaluating the same model with different IMS_PER_BATCH values, or on different GPU devices, yields identical AP values;


Average Precision  (AP) @[ IoU=0.60      | area=   all | maxDets=1000 ] = 0.945
Average Precision  (AP) @[ IoU=0.70      | area=   all | maxDets=1000 ] = 0.926
Average Precision  (AP) @[ IoU=0.80      | area=   all | maxDets=1000 ] = 0.880
Average Precision  (AP) @[ IoU=0.90      | area=   all | maxDets=1000 ] = 0.731

cd ~/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmpycocotools-12.0.3-py3.8-linux-x86_64.egg/pycocotools
vim cocoeval.py 
# cd ~/anaconda3/envs/mmlab/lib/python3.9/site-packages/pycocotools/cocoeval.py

line563
def setDetParams(self):
	self.imgIds = []
	self.catIds = []
	# np.arange causes trouble.  the data point on arange is slightly
	# larger than the true value
	# originally: self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
    # originally: self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
	self.iouThrs = np.linspace(50,
							   95,
							   int(np.round((0.95 - .5) / .05)) + 1,
							   endpoint=True) / 100
	self.recThrs = np.linspace(.0,
							   100,
							   int(np.round((1.00 - .0) / .01)) + 1,
							   endpoint=True) / 100

line582
def setKpParams(self):
	self.imgIds = []
	self.catIds = []
	# np.arange causes trouble.  the data point on arange is slightly
	# larger than the true value
	# originally: self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
    # originally: self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
	self.iouThrs = np.linspace(50,
							   95,
							   int(np.round((0.95 - .5) / .05)) + 1,
							   endpoint=True) / 100
	self.recThrs = np.linspace(.0,
							   100,
							   int(np.round((1.00 - .0) / .01)) + 1,
							   endpoint=True) / 100


def _summarize( ap=1, iouThr=None, areaRng='all', maxDets=100 ):
	# pick one: either change self.iouThrs and self.recThrs, or change t = np.where(iouThr == p.iouThrs)[0];
	t = np.where(abs(iouThr - p.iouThrs) < 1e-5)[0]  # usr_modify0110: p.iouThrs[-2]=0.8999999999999999; originally: t = np.where(iouThr == p.iouThrs)[0];

line498
def _summarizeDets():
	stats = np.zeros((16, ))
	stats[0] = _summarize(1)
	stats[1] = _summarize(1, iouThr=.5, maxDets=self.params.maxDets[2])
	stats[2] = _summarize(1,
						  iouThr=.75,
						  maxDets=self.params.maxDets[2])
	stats[3] = _summarize(1,
						  areaRng='small',
						  maxDets=self.params.maxDets[2])
	stats[4] = _summarize(1,
						  areaRng='medium',
						  maxDets=self.params.maxDets[2])
	stats[5] = _summarize(1,
						  areaRng='large',
						  maxDets=self.params.maxDets[2])
	stats[6] = _summarize(0, maxDets=self.params.maxDets[0])
	stats[7] = _summarize(0, maxDets=self.params.maxDets[1])
	stats[8] = _summarize(0, maxDets=self.params.maxDets[2])
	stats[9] = _summarize(0,
						  areaRng='small',
						  maxDets=self.params.maxDets[2])
	stats[10] = _summarize(0,
						   areaRng='medium',
						   maxDets=self.params.maxDets[2])
	stats[11] = _summarize(0,
						   areaRng='large',
						   maxDets=self.params.maxDets[2])
	stats[12] = _summarize(1, iouThr=.6, maxDets=self.params.maxDets[2])
	stats[13] = _summarize(1, iouThr=.7, maxDets=self.params.maxDets[2])
	stats[14] = _summarize(1, iouThr=.8, maxDets=self.params.maxDets[2])
	stats[15] = _summarize(1, iouThr=.9, maxDets=self.params.maxDets[2])
	return stats



mmdet_20211213/mmdet/datasets/coco.py
def evaluate():  # mapping of cocoEval.stats
        coco_metric_names = {
            'mAP': 0,
            'mAP_50': 1,
            'mAP_75': 2,
            'mAP_s': 3,
            'mAP_m': 4,
            'mAP_l': 5,
            'AR@100': 6,
            'AR@300': 7,
            'AR@1000': 8,
            'AR_s@1000': 9,
            'AR_m@1000': 10,
            'AR_l@1000': 11,
            'mAP_60': 12,  # usr_add1231:
            'mAP_70': 13,  # usr_add1231:
            'mAP_80': 14,  # usr_add1231:
            'mAP_90': 15,  # usr_add1231:
        }
        # ... (omitted)
                metric_items = [
                    'mAP', 'mAP_50', 'mAP_75', 'mAP_s', 'mAP_m', 'mAP_l',
                    'mAP_60', 'mAP_70', 'mAP_80', 'mAP_90',  # usr_add1231:
                ]
        # ... (omitted)
            ap = cocoEval.stats[:6]
            ap_extra = cocoEval.stats[-4:]  # usr_add1231:
            eval_results[f'{metric}_mAP_copypaste'] = (
                f'{ap[0]:.3f} {ap[1]:.3f} {ap[2]:.3f} {ap[3]:.3f} '
                f'{ap[4]:.3f} {ap[5]:.3f} '
                f'{ap_extra[-4]:.3f} {ap_extra[-3]:.3f} {ap_extra[-2]:.3f} {ap_extra[-1]:.3f}')  # usr_add1231:

The .so files under mmdetection/build

  A .so file on Linux is a dynamic link library, similar in function and purpose to a .dll file on Windows.
  Normally, linking against library functions is done at compile time: all relevant object files and the libraries involved are linked into one executable file. At runtime the program no longer depends on the libraries, because every function it needs has already been copied in. Such libraries are called static libraries, usually named like "libxxx.a".
  Alternatively, the linking and loading of some library functions can be deferred to runtime, which is the well-known dynamic link library technique.
Excerpted from: .so and .a files in Linux - 心田居士 - cnblogs, 20190616

  The .so files under mmdetection/build are generated by compiling the related files (.py, .so, .cpp, .cu) under mmdetection/mmdet/ops/;
  therefore, if you modify files under mmdetection/mmdet/ops/, you must recompile mmdetection with python setup.py develop or pip install -v -e .; without recompiling, the old dynamic libraries are still the ones being loaded.

Noted 20200729:
mmdetection v2.1.0 99a31d2 on 9 Jun 2020

Folder PATH listing (output of tree):
mmdetection/build

│  
├─lib.linux-x86_64-3.8
│  └─mmdet
│      └─ops
│          ├─carafe
│          │      carafe_ext.cpython-38-x86_64-linux-gnu.so
│          │      carafe_naive_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─corner_pool
│          │      corner_pool_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─dcn
│          │      deform_conv_ext.cpython-38-x86_64-linux-gnu.so
│          │      deform_pool_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─masked_conv
│          │      masked_conv2d_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─nms
│          │      nms_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─roi_align
│          │      roi_align_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─roi_pool
│          │      roi_pool_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          ├─sigmoid_focal_loss
│          │      sigmoid_focal_loss_ext.cpython-38-x86_64-linux-gnu.so
│          │      
│          └─utils
│                  compiling_info.cpython-38-x86_64-linux-gnu.so
│                  
└─temp.linux-x86_64-3.8
    └─mmdet
        └─ops
            ├─carafe
            │  └─src
            │      │  carafe_ext.o
            │      │  carafe_naive_ext.o
            │      │  
            │      └─cuda
            │              carafe_cuda.o
            │              carafe_cuda_kernel.o
            │              carafe_naive_cuda.o
            │              carafe_naive_cuda_kernel.o
            │              
            ├─corner_pool
            │  └─src
            │          corner_pool.o
            │          
            ├─dcn
            │  └─src
            │      │  deform_conv_ext.o
            │      │  deform_pool_ext.o
            │      │  
            │      └─cuda
            │              deform_conv_cuda.o
            │              deform_conv_cuda_kernel.o
            │              deform_pool_cuda.o
            │              deform_pool_cuda_kernel.o
            │              
            ├─masked_conv
            │  └─src
            │      │  masked_conv2d_ext.o
            │      │  
            │      └─cuda
            │              masked_conv2d_cuda.o
            │              masked_conv2d_kernel.o
            │              
            ├─nms
            │  └─src
            │      │  nms_ext.o
            │      │  
            │      ├─cpu
            │      │      nms_cpu.o
            │      │      
            │      └─cuda
            │              nms_cuda.o
            │              nms_kernel.o
            │              
            ├─roi_align
            │  └─src
            │      │  roi_align_ext.o
            │      │  
            │      ├─cpu
            │      │      roi_align_v2.o
            │      │      
            │      └─cuda
            │              roi_align_kernel.o
            │              roi_align_kernel_v2.o
            │              
            ├─roi_pool
            │  └─src
            │      │  roi_pool_ext.o
            │      │  
            │      └─cuda
            │              roi_pool_kernel.o
            │              
            ├─sigmoid_focal_loss
            │  └─src
            │      │  sigmoid_focal_loss_ext.o
            │      │  
            │      └─cuda
            │              sigmoid_focal_loss_cuda.o
            │              
            └─utils
                └─src
                        compiling_info.o
                        


Issues encountered during model training

Killing processes to force-release the GPU

  After force-stopping training with Ctrl+C in the terminal, if the GPU has not been released, you can kill the processes with these commands;

nvidia-smi  # check GPU utilization
fuser -v /dev/nvidia*  # use fuser to show the PIDs of all processes occupying nvidia devices
kill -9 13754  # kill the process with PID 13754
ps -A -ostat,ppid,pid,cmd | grep -e '^[Zz]'  # show hidden (zombie) processes

What to do when deep learning training has stopped but GPU memory is still occupied - u014264373's blog, 20200819

Finding a process's parent with ps -ef | grep <PID>

Some processes on the GPU cannot be killed. Reportedly this is because even after a process is killed, its memory is not released while its parent remains; so find the parent with ps -ef | grep <PID>, then kill it with kill -9 <parent PID>.
On unkillable GPU processes - geter_CS's blog - CSDN, 20190214

How to accurately locate the program occupying the GPU - XCCCCZ's blog - CSDN
Querying GPU usage and killing multiple useless processes on the GPU - 南国那片枫叶's blog - CSDN, 20190507

Interpreting the output of nvidia-smi

Explained Output of nvidia-smi Utility | by Shachi Kaul | Analytics Vidhya | Medium 20191216
GPU status monitoring: the nvidia-smi command explained - 黄飞's blog column - CSDN, 20180201
Viewing and analyzing GPU usage with nvidia-smi - 薰衣草PK向日葵's blog - CSDN, 20191113
Viewing GPU info and usage on Linux - Oops!# - cnblogs, 20181127

nvidia-smi: command not found, but GPU works fine

Problem:
  nvidia-smi: command not found, but GPU works fine;
Cause and solution:
  First run sudo apt purge nvidia-*, then under "System Settings | Software & Updates | Additional Drivers" select a driver and click "Apply Changes";

  • drivers - nvidia-smi command not found Ubuntu 16.04 - Ask Ubuntu
  • 16.04 - nvidia-smi: command not found, but GPU works fine - Ask Ubuntu

/usr/lib/xorg/Xorg using too much GPU memory on Linux

  • Option 1: close the GUI with Ctrl+Alt+F1~F7; reopen it with Ctrl+Alt+F8;
    Linux: Xorg using too much GPU memory - Hz_xi's blog - CSDN, 20210715
    Note: after closing the GUI with Ctrl+Alt+F1~F7, GPU-Util does drop. But if you control the machine via Sunflower (向日葵) remote desktop, once the GUI is closed and the screen is black, there is no way to reopen the GUI remotely.

  • Option 2 (untested): the problem is mainly that the system uses the discrete GPU by default; change xorg's GPU selection so that it switches to the integrated GPU;
    Solving /usr/lib/xorg/Xorg occupying GPU memory - Jianshu, 20210706

    usrname@usrname:~$ cat /etc/X11/xorg.conf
    cat: /etc/X11/xorg.conf: No such file or directory
    
  • Option 3: Just run sudo prime-select on-demand and reboot

    How to prevent Xorg process from using the GPU? on Ubuntu 20.04.3 LTS (with a RTX 3050 Ti) - Graphics / Linux / Linux - NVIDIA Developer Forums
    generix Top Contributor 20220121:
    Just run sudo prime-select on-demand and reboot. This will run everything on the intel igpu, only leaving a 4MB process on the nvidia left.
    If you also want to get of that, additionally create the file you mentioned in your first post, you’ll then have to make sure the nvidia-persistenced daemon is started on boot.
      
    user142861 20220121:
    Hey @generix , Well done! It was much simpler than I thought. Indeed, I have now my /usr/lib/xorg/Xorg process with only 4MB and nothing else. Exactly the kind of GPU optimization I was looking for.
    Thanks a lot for your quick support.

    I tried running sudo prime-select on-demand, but that usage is not accepted on this machine;

    usrname@usrname:/usr/share/X11/xorg.conf.d$ sudo prime-select on-demand
    Usage: /usr/bin/prime-select nvidia|intel|query
    

Multi-GPU training

(usr_mmlab) usrname@usrname-System-Product-Name:~/usrname_workdir/usr/Pytorch_WorkSpace/OpenSourcePlatform/mmdetection$ ./tools/dist_train.sh configs/faster_rcnn/v2v3_11_faster_rcnn_r50_fpn_1x_voc.py 2
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
2020-10-10 11:39:12,029 - mmdet - INFO - Environment info:

  Prefix the command in the terminal with CUDA_VISIBLE_DEVICES=0,1, e.g. CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=1 python train_net.py (see the combined example below)
Training on specified GPUs with CUDA_VISIBLE_DEVICES - xkx_07_10's blog - CSDN, 20190819
(to read) PyTorch multi-GPU training on a single node - All you need - walter_xh - cnblogs, 20190926
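
Combining the two, a typical launch on GPUs 0 and 1 looks like this (config path illustrative):

CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=1 \
    ./tools/dist_train.sh configs/faster_rcnn/v2v3_11_faster_rcnn_r50_fpn_1x_voc.py 2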

On the use of torch.distributed.all_reduce()

PyTorch Distributed Developers
Distributed communication package - torch.distributed — PyTorch 1.8.1 documentation

mmdet/models/dense_heads/gfl_head.py
def reduce_mean(tensor):
    if not (dist.is_available() and dist.is_initialized()):
        return tensor
    tensor = tensor.clone()
	# first divide by the number of processes in the current process group;
	# then reduce the tensor data across all machines in such a way that all get the final result;
	# after the call, ``tensor`` is going to be bitwise identical in all processes;
    dist.all_reduce(tensor.div_(dist.get_world_size()), op=dist.ReduceOp.SUM)
    return tensor

num_total_samples = reduce_mean(torch.tensor(num_total_pos).cuda()).item()
num_total_samples = max(num_total_samples, 1.0)



mmdet/models/detectors/base.py
log_vars['loss'] = loss
for loss_name, loss_value in log_vars.items():
	# reduce loss when distributed training
	if dist.is_available() and dist.is_initialized():
		loss_value = loss_value.data.clone()
		dist.all_reduce(loss_value.div_(dist.get_world_size()))
	log_vars[loss_name] = loss_value.item()

mAP fluctuation: models trained repeatedly with the same config perform differently

model training is not reproducible · Issue #2773 · open-mmlab/mmdetection · GitHub 20200524
@ZwwWayne sure, here are the 4 mAPs obtained with the command above: 0.6258, 0.6283, 0.6226, 0.6197.
tmp_a = [0.6258, 0.6283, 0.6226, 0.6197], fluctuation range max(tmp_a)-min(tmp_a)=0.0086

model training is not reproducible · Issue #2773 · open-mmlab/mmdetection · GitHub 20200524
Hi @aabramovrepo ,
I also try to run the same config for four times, and obtain 0.8086, 0.7974, 0.7985, and 0.8009 AP. So it seems that the performance on VOC is indeed more unstable than the detectors on COCO dataset.
tmp_b = [0.8086, 0.7974, 0.7985, 0.8009], fluctuation range max(tmp_b)-min(tmp_b)=0.0112

Model convergence issues during training

Divergence while training Mask RCNN with ResNet (50 or 101 backbone) on a custom COCO type format dataset · Issue #3557 · open-mmlab/mmdetection · GitHub 20200814

CUDA, cuDNN

  • Checking the GPU model on Ubuntu: NVIDIA Corporation [10DE:1E82] -display UNCLAIMED - u011622434's blog - CSDN, 20190718
  • How to view NVIDIA GPU info on Linux - Baidu Zhidao, 20171006
  • How NVIDIA GPUs and CUDA versions relate - Jianshu, 20210322
  • *** How PyTorch, GPU, GPU driver, and CUDA versions correspond - Jianshu, 20210206

Installing CUDA 10.0 and CUDA 10.1 side by side and switching versions

  How to install CUDA 10.0 and CUDA 10.1 side by side?
  With CUDA 10.0 already installed, just install CUDA 10.1 on top of it: download and install CUDA 10.1 and cuDNN 7.6.1 from the official site;
  With both CUDA 10.0 and CUDA 10.1 installed, how to switch between them on Linux?
  When CUDA 10.0 is needed,

sudo rm -rf /usr/local/cuda
sudo ln -s /usr/local/cuda-10.0 /usr/local/cuda
nvcc --version

  When CUDA 10.1 is needed,

sudo rm -rf /usr/local/cuda  
sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda  
nvcc --version

TODO

  • TODO: how is AP computed in MMDet?
  • TODO: how to filter out the test-set images on which the detector performs poorly?
  • TODO: draw the gt bboxes and predicted boxes on the image together;

Issue log

RuntimeError: xxx must be contiguous

Problem:

det_bboxes, det_scores = (pred_bboxes[:, :4], pred_bboxes[:, 4])

batched_nms(det_bboxes, det_scores, det_labels, nms_cfg)

Traceback (most recent call last):
  File "/home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in 
    batched_nms(det_bboxes, det_scores, det_labels, nms_cfg)
  File "/home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/ops/nms.py", line 259, in batched_nms
    dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
  File "/home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/utils/misc.py", line 310, in new_func
    output = old_func(*args, **kwargs)
  File "/home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/ops/nms.py", line 113, in nms
    inds = NMSop.apply(boxes, scores, iou_threshold, offset)
  File "/home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/ops/nms.py", line 18, in forward
    inds = ext_module.nms(
RuntimeError: scores must be contiguous (nms at ./mmcv/ops/csrc/pytorch/nms.cpp:67)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f788a3d1627 in /home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: nms(at::Tensor, at::Tensor, float, int) + 0x5b3 (0x7f7865bb7f53 in /home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so)
frame #2:  + 0xc7e09 (0x7f7865ae5e09 in /home/usrname/miniconda2/envs/usr_mmlab/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so)
frame #3: ...

Cause and solution:
  The tensor det_scores is not contiguous (it is a column slice of pred_bboxes); the fix is to call .contiguous(), i.e. change the call to batched_nms(det_bboxes, det_scores.contiguous(), det_labels, nms_cfg);

Instance cannot be constructed when executing return obj_cls(**args)

  mmcv/utils/registry.py def build_from_cfg(): the line return obj_cls(**args) kept failing to construct the instance; it finally turned out to be because the class defined two def __init__() functions;
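
Why this breaks: Python keeps only the last def of a given name, so the second __init__ silently replaces the first and its signature no longer matches the config args. A toy sketch (class name invented):

class Broken:
    def __init__(self, a):
        self.a = a

    def __init__(self):  # silently replaces the __init__ above
        pass

obj_cls, args = Broken, dict(a=1)
try:
    obj_cls(**args)  # what build_from_cfg effectively does
except TypeError as e:
    print(e)  # __init__() got an unexpected keyword argument 'a'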

xxxDataset is not in the dataset registry

Problem:
  Even after registering my own dataset, the error xxxDataset is not in the dataset registry persists;
Cause and solution:
  Because mmdet v2.18.0 had been installed with mim install mmdet, the program was actually still executing the previously installed scripts; uninstalling with pip uninstall mmdet and then reinstalling mmdet with the commands below solved the problem;

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"
  • jshilong commented on 7 Aug 2021
    Please make sure that the mmdet you use is modified which adding the SunrgbdDataset,I am afraid you are using the version installed in the environment before
    SUNRGBDDataset is not in the dataset registry · Issue #5800 · open-mmlab/mmdetection · GitHub

Can’t Evaluate a Model Trained With Pytorch 1.6 under Pytorch 1.4 env; the torch.save() serialization format;

  • Can’t Evaluate a Model Trained With Pytorch 1.6 under Pytorch 1.4 env, but the opposite works · Issue #3636 · open-mmlab/mmdetection · GitHub 20200828
  • failed to load binary files saved from nightly build in version 1.5 · Issue #40140 · pytorch/pytorch · GitHub 20200617
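
Background: PyTorch 1.6 switched torch.save() to a zip-based serialization format that 1.4 cannot read. A sketch of the usual workaround, re-saving in the legacy format from a PyTorch >= 1.6 environment (file names invented):

import torch

ckpt = torch.load('model_trained_with_pt16.pth', map_location='cpu')
# the _use_new_zipfile_serialization flag exists since PyTorch 1.6
torch.save(ckpt, 'model_legacy_format.pth', _use_new_zipfile_serialization=False)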

“RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one”

Problem:
  The training log began as follows:


2022-05-18 09:41:08,390 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
2022-05-18 09:41:22,162 - mmdet - INFO - Epoch [290][50/9846]	lr: 4.621e-04, eta: 8 days, 2:48:26, time: 0.345, data_time: 0.145, memory: 7110, loss_cls: 1.0795, loss_bbox: 2.1580, loss_obj: 1.6240, loss: 4.8616
2022-05-18 09:41:36,016 - mmdet - INFO - Epoch [290][100/9846]	lr: 4.620e-04, eta: 7 days, 7:37:21, time: 0.277, data_time: 0.098, memory: 7110, loss_cls: 0.7758, loss_bbox: 2.1439, loss_obj: 1.4736, loss: 4.3933
2022-05-18 09:41:50,135 - mmdet - INFO - Epoch [290][150/9846]	lr: 4.620e-04, eta: 7 days, 2:13:19, time: 0.282, data_time: 0.101, memory: 7110, loss_cls: 0.7527, loss_bbox: 2.1367, loss_obj: 1.4209, loss: 4.3103
2022-05-18 09:42:04,179 - mmdet - INFO - Epoch [290][200/9846]	lr: 4.620e-04, eta: 6 days, 23:18:18, time: 0.281, data_time: 0.112, memory: 7110, loss_cls: 0.7673, loss_bbox: 2.1797, loss_obj: 1.4003, loss: 4.3473
Traceback (most recent call last):
  File "/media/user/USER/amax/USER_users/usrname/OpenSourcePlatform/mmdet_2.23.0/./tools/train.py", line 223, in <module>
    main()
  File "/media/user/USER/amax/USER_users/usrname/OpenSourcePlatform/mmdet_2.23.0/./tools/train.py", line 212, in main
    train_detector(
  File "/media/user/USER/amax/USER_users/usrname/OpenSourcePlatform/mmdet_2.23.0/mmdet/apis/train.py", line 208, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/mmcv/parallel/distributed.py", line 42, in train_step
    and self.reducer._rebuild_buckets()):
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by 
making sure all `forward` function outputs participate in calculating loss. 
If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
Parameter indices which did not receive grad for rank 2: 372 373 374 375 376 377
 In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 13219 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 13220 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 13222 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 2 (pid: 13221) of binary: /home/user/anaconda3/envs/usr_mmlab1213/bin/python
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/user/anaconda3/envs/usr_mmlab1213/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
./tools/train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------

Cause analysis:

  • (to read) RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. · Issue #43259 · pytorch/pytorch · GitHub

Solution:

  • Frequently Asked Questions — MMDetection 2.24.1 documentation
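
Per the MMDetection FAQ, the usual fix is to enable unused-parameter detection in the config (a one-line sketch):

# add to the training config file; mmdet passes it through to MMDistributedDataParallel
find_unused_parameters = True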

To be added

  

To be added

  


