Create a virtual environment
conda create -n open-mmlab python=3.7
Install PyTorch (including torchvision and cudatoolkit)
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda install pytorch==1.7.0 cudatoolkit=10.1 torchvision==0.8.0
Install openmim
pip install openmim
Install mmcv-full, mmdet, and mmrotate via openmim
mim install mmcv-full
mim install mmdet
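# Installing mmrotate is a special case: try mim install mmrotate first, and fall back to a manual install if that fails.
# For a manual install, download mmrotate from the official site or clone it:
git clone https://github.com/open-mmlab/mmrotate.git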
cd mmrotate
pip install -r requirements/build.txt
pip install -v -e .  # may need C++ build tools; on Windows, installing the Visual Studio build tools is enough
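After the installation steps, it can help to confirm that each package is visible to the interpreter of the active conda environment. The following is a minimal sketch using only the standard library; the helper name check_installed is hypothetical and not part of any OpenMMLab tool.

```python
import importlib.util


def check_installed(packages):
    """Return {package_name: bool} telling whether each package can be located.

    find_spec only looks the top-level package up; it does not import it,
    so this is cheap and does not need a GPU.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    wanted = ["torch", "torchvision", "mmcv", "mmdet", "mmrotate"]
    for name, found in check_installed(wanted).items():
        print(f"{name:12s} {'found' if found else 'MISSING'}")
```

A package reported as MISSING usually means the install command ran in a different environment than the one currently activated.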
Build an image with the Dockerfile (an alternative installation method)
# build an image with PyTorch 1.6, CUDA 10.1
docker build -t mmrotate docker/
# start a container from the image
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmrotate/data mmrotate
# cd mmrotate
python demo/image_demo.py \
demo/demo.jpg \
work_dirs1/oriented_rcnn_r50_fpn_1x_dota_v3/oriented_rcnn_r50_fpn_1x_dota_v3.py \
work_dirs2/oriented_rcnn_r50_fpn_1x_dota_v3/epoch_12.pth
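# Note: work_dirs1 is the directory that holds your config; work_dirs2 is the directory that holds your weights.
MMRotate currently provides dataset definitions only for DOTA, SSDD, HRSC, and HRSID. For these datasets you only need to change the path to the images and annotations (data_root) in the corresponding config. Any other dataset has to be built by yourself.
DOTA download: https://captain-whu.github.io/DOTA/dataset.html
SSDD download: https://pan.baidu.com/s/1_uezALB6eZ7DiPIozFoGJQ (password: 0518)
HRSC download: https://aistudio.baidu.com/aistudio/datasetdetail/54106
HRSID download: https://pan.baidu.com/share/init?surl=vks9fj64Bb06U170GNL7mw (password: 0518)
DOTA dataset directory structure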
# DOTA
mmrotate
├── mmrotate
├── tools
├── configs
├── data
│ ├── DOTA
│ │ ├── train
│ │ ├── val
│ │ ├── test
SSDD dataset directory structure
# ssdd
mmrotate
├── mmrotate
├── tools
├── configs
├── data
│ ├── ssdd
│ │ ├── train
│ │ ├── test
HRSC dataset directory structure
mmrotate
├── mmrotate
├── tools
├── configs
├── data
│ ├── hrsc
│ │ ├── FullDataSet
│ │ │ ├─ AllImages
│ │ │ ├─ Annotations
│ │ │ ├─ LandMask
│ │ │ ├─ Segmentations
│ │ ├── ImageSets
HRSID dataset directory structure
mmrotate
├── mmrotate
├── tools
├── configs
├── data
│ ├── hrsid
│ │ ├── trainsplit
│ │ ├── valsplit
│ │ ├── testsplit
Change data_root in configs/_base_/datasets/dotav1.py to the path of the split DOTA dataset.
Change data_root in configs/_base_/datasets/ssdd.py to data/ssdd/.
Change data_root in configs/_base_/datasets/hrsc.py to data/hrsc/.
Change data_root in configs/_base_/datasets/hrsid.py to data/hrsid/.
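Before pointing data_root at a dataset, the directory trees shown above can be checked programmatically. This is a small sketch under the assumption that the layouts are exactly those listed in this post; the names EXPECTED and missing_dirs are hypothetical helpers, not MMRotate functions.

```python
from pathlib import Path

# Sub-directories expected directly under data/<dataset>, copied from the trees above.
EXPECTED = {
    "DOTA": ["train", "val", "test"],
    "ssdd": ["train", "test"],
    "hrsc": ["FullDataSet", "ImageSets"],
    "hrsid": ["trainsplit", "valsplit", "testsplit"],
}


def missing_dirs(data_dir, dataset):
    """List the expected sub-directories of data_dir/dataset that do not exist."""
    root = Path(data_dir) / dataset
    return [str(root / sub) for sub in EXPECTED[dataset] if not (root / sub).is_dir()]


if __name__ == "__main__":
    for name in EXPECTED:
        print(name, "missing:", missing_dirs("data", name))
```

An empty list means the layout matches; anything else names the folders that still have to be created or moved.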
Crop the original DOTA images into 1024×1024 patches with an overlap of 200 pixels by running:
python tools/data/dota/split/img_split.py --base-json \
tools/data/dota/split/split_configs/ss_trainval.json
python tools/data/dota/split/img_split.py --base-json \
tools/data/dota/split/split_configs/ss_test.json
If you want to get a multiple scale dataset, you can run the following command.
python tools/data/dota/split/img_split.py --base-json \
tools/data/dota/split/split_configs/ms_trainval.json
python tools/data/dota/split/img_split.py --base-json \
tools/data/dota/split/split_configs/ms_test.json
Please change img_dirs and ann_dirs in the JSON files.
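To make the "1024×1024 patches with an overlap of 200" setting concrete, the sketch below computes where patches start along one image axis. It is an illustration of the sliding-window idea under the assumption that the last window is shifted back to stay inside the image; it is not a copy of img_split.py, and window_starts is a hypothetical name.

```python
import math


def window_starts(length, size=1024, gap=200):
    """Start offsets of sliding windows along one axis.

    Consecutive windows overlap by `gap` pixels, so the stride is size - gap.
    The last window is moved back so that it ends exactly at the image border.
    """
    step = size - gap
    if length <= size:
        return [0]  # the image fits in a single patch
    num = math.ceil((length - size) / step) + 1
    starts = [step * i for i in range(num)]
    if starts[-1] + size > length:
        starts[-1] = length - size
    return starts


if __name__ == "__main__":
    print(window_starts(2000))  # [0, 824, 976]
```

For a 2000-pixel side this gives three patches; the last one overlaps its neighbour by more than 200 pixels because it was shifted back to the border.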
Testing covers three cases: single GPU, single node with multiple GPUs, and multiple nodes.
# single-gpu
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
# multi-gpu
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [optional arguments]
# multi-node in slurm environment
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments] --launcher slurm
Example: following RotatedRetinaNet on the DOTA-1.0 dataset, the commands below generate a compressed file for online submission. (Change data_root first.)
# single GPU
python ./tools/test.py \
configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py \
checkpoints/SOME_CHECKPOINT.pth --format-only \
--eval-options submission_dir=work_dirs/Task1_results
# single node with multiple GPUs; here the number of GPUs is set to 1
./tools/dist_test.sh \
configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py \
checkpoints/SOME_CHECKPOINT.pth 1 --format-only \
--eval-options submission_dir=work_dirs/Task1_results
You can change the test set path in data_root to the val set or trainval set for offline evaluation.
# single GPU
python ./tools/test.py \
configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py \
checkpoints/SOME_CHECKPOINT.pth --eval mAP
# single node with multiple GPUs; here the number of GPUs is set to 1
./tools/dist_test.sh \
configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py \
checkpoints/SOME_CHECKPOINT.pth 1 --eval mAP
Visualize the results
python ./tools/test.py \
configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py \
checkpoints/SOME_CHECKPOINT.pth \
--show-dir work_dirs/vis
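There are five ways to train a model: training on a single GPU, training on multiple GPUs, training on multiple machines, managing jobs with Slurm, and launching multiple jobs on a single machine.
# single GPU; to specify the working directory in the command, add the argument --work-dir ${YOUR_WORK_DIR}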
python tools/train.py ${CONFIG_FILE} [optional arguments]
# single node with multiple GPUs
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
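Optional arguments:
--no-validate (not recommended): by default, the codebase evaluates the model during training. Use --no-validate to disable this.
--work-dir ${WORK_DIR}: override the working directory specified in the config file.
--resume-from ${CHECKPOINT_FILE}: resume from a previous checkpoint file.
Difference between resume-from and load-from: resume-from loads both the model weights and the optimizer state, and the epoch is also inherited from the specified checkpoint. It is usually used to resume a training run that was interrupted unexpectedly. load-from loads only the model weights, and training starts from epoch 0. It is usually used for fine-tuning.
To launch on multiple machines that are connected only by Ethernet, simply run the following commands:
# on the first machine: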
NNODES=2 NODE_RANK=0 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
# on the second machine:
NNODES=2 NODE_RANK=1 PORT=$MASTER_PORT MASTER_ADDR=$MASTER_ADDR sh tools/dist_train.sh $CONFIG $GPUS
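Note: without a high-speed network such as InfiniBand, this is usually slow.
If you launch multiple jobs on a single machine, e.g. two 4-GPU training jobs on a machine with 8 GPUs, you need to specify a different port for each job (29500 by default) to avoid communication conflicts.
(1) If you launch training jobs with dist_train.sh, you can set the port in the command.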
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
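(2) If you launch training jobs with Slurm, you need to modify the config files (usually the sixth line from the bottom of the config file) to set different communication ports.
# in config1.py: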
dist_params = dict(backend='nccl', port=29500)
# in config2.py:
dist_params = dict(backend='nccl', port=29501)
# launch the two jobs
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
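When several jobs share one machine, picking ports by hand (29500, 29501, ...) can still collide with other processes. One possible way to obtain a currently unused port is to let the operating system choose it, as in the sketch below. The helper find_free_port is hypothetical and not part of MMRotate; the port could in principle be taken by another process between this call and the start of training.

```python
import socket


def find_free_port():
    """Ask the OS for a TCP port that is unused right now on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


if __name__ == "__main__":
    # The printed value can be passed as PORT=... to dist_train.sh
    # or written into dist_params in the config.
    print(find_free_port())
```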
Backbone | mAP | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
---|---|---|---|---|---|---|---|---|---|
ResNet50 (1024,1024,200) | 59.44 | oc | 1x | 3.45 | 15.6 | - | 2 | [rotated_reppoints_r50_fpn_1x_dota_oc](configs/rotated_reppoints/rotated_reppoints_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 64.55 | oc | 1x | 3.38 | 15.7 | - | 2 | [rotated_retinanet_hbb_r50_fpn_1x_dota_oc](configs/rotated_retinanet/rotated_retinanet_hbb_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 65.59 | oc | 1x | 3.12 | 18.5 | - | 2 | [rotated_atss_hbb_r50_fpn_1x_dota_oc](configs/rotated_atss/rotated_atss_hbb_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 66.45 | oc | 1x | 3.53 | 15.3 | - | 2 | [sasm_reppoints_r50_fpn_1x_dota_oc](configs/sasm/sasm_reppoints_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 68.42 | le90 | 1x | 3.38 | 16.9 | - | 2 | [rotated_retinanet_obb_r50_fpn_1x_dota_le90](configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 68.79 | le90 | 1x | 2.36 | 22.4 | - | 2 | [rotated_retinanet_obb_r50_fpn_fp16_1x_dota_le90](configs/rotated_retinanet_obb_r50_fpn_fp16_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 69.49 | le135 | 1x | 4.05 | 8.6 | - | 2 | [g_reppoints_r50_fpn_1x_dota_le135](configs/g_reppoints/g_reppoints_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 69.51 | le90 | 1x | 4.40 | 24.0 | - | 2 | [rotated_retinanet_obb_csl_gaussian_r50_fpn_fp16_1x_dota_le90](configs/csl/rotated_retinanet_obb_csl_gaussian_r50_fpn_fp16_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 69.55 | oc | 1x | 3.39 | 15.5 | - | 2 | [rotated_retinanet_hbb_gwd_r50_fpn_1x_dota_oc](configs/gwd/rotated_retinanet_hbb_gwd_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 69.60 | le90 | 1x | 3.38 | 15.1 | - | 2 | [rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_le90](configs/kfiou/rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 69.63 | le135 | 1x | 3.45 | 16.1 | - | 2 | [cfa_r50_fpn_1x_dota_le135](configs/cfa/cfa_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 69.76 | oc | 1x | 3.39 | 15.6 | - | 2 | [rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_oc](configs/kfiou/rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 69.77 | le135 | 1x | 3.38 | 15.3 | - | 2 | [rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_le135](configs/kfiou/rotated_retinanet_hbb_kfiou_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 69.79 | le135 | 1x | 3.38 | 17.2 | - | 2 | [rotated_retinanet_obb_r50_fpn_1x_dota_le135](configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 69.80 | oc | 1x | 3.54 | 12.4 | - | 2 | [r3det_r50_fpn_1x_dota_oc](configs/r3det/r3det_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 69.94 | oc | 1x | 3.39 | 15.6 | - | 2 | [rotated_retinanet_hbb_kld_r50_fpn_1x_dota_oc](configs/kld/rotated_retinanet_hbb_kld_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 70.18 | oc | 1x | 3.23 | 15.6 | - | 2 | [r3det_tiny_r50_fpn_1x_dota_oc](configs/r3det/r3det_tiny_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 70.64 | le90 | 1x | 3.12 | 18.2 | - | 2 | [rotated_atss_obb_r50_fpn_1x_dota_le90](configs/rotated_atss/rotated_atss_obb_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 71.83 | oc | 1x | 3.54 | 12.4 | - | 2 | [r3det_kld_r50_fpn_1x_dota_oc](configs/kld/r3det_kld_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 72.29 | le135 | 1x | 3.19 | 18.8 | - | 2 | [rotated_atss_obb_r50_fpn_1x_dota_le135](configs/rotated_atss/rotated_atss_obb_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 72.68 | oc | 1x | 3.62 | 12.2 | - | 2 | [r3det_kfiou_ln_r50_fpn_1x_dota_oc](configs/kfiou/r3det_kfiou_ln_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 72.76 | oc | 1x | 3.44 | 14.0 | - | 2 | [r3det_tiny_kld_r50_fpn_1x_dota_oc](configs/kld/r3det_tiny_kld_r50_fpn_1x_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 73.23 | le90 | 1x | 8.45 | 16.4 | - | 2 | [gliding_vertex_r50_fpn_1x_dota_le90](configs/gliding_vertex/gliding_vertex_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 73.40 | le90 | 1x | 8.46 | 16.5 | - | 2 | [rotated_faster_rcnn_r50_fpn_1x_dota_le90](configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 73.45 | oc | 40e | 3.45 | 16.1 | - | 2 | [cfa_r50_fpn_40e_dota_oc](configs/cfa/cfa_r50_fpn_40e_dota_oc.py) | model / log |
ResNet50 (1024,1024,200) | 73.91 | le135 | 1x | 3.14 | 15.5 | - | 2 | [s2anet_r50_fpn_1x_dota_le135](configs/s2anet/s2anet_r50_fpn_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 74.19 | le135 | 1x | 2.17 | 17.4 | - | 2 | [s2anet_r50_fpn_fp16_1x_dota_le135](configs/s2anet_r50_fpn_fp16_1x_dota_le135.py) | model / log |
ResNet50 (1024,1024,200) | 75.63 | le90 | 1x | 7.37 | 21.2 | - | 2 | [oriented_rcnn_r50_fpn_fp16_1x_dota_le90](configs/oriented_rcnn_r50_fpn_fp16_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 75.69 | le90 | 1x | 8.46 | 16.2 | - | 2 | [oriented_rcnn_r50_fpn_1x_dota_le90](configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 75.75 | le90 | 1x | 7.56 | 19.3 | - | 2 | [roi_trans_r50_fpn_fp16_1x_dota_le90](configs/roi_trans_r50_fpn_fp16_1x_dota_le90.py) | model / log |
ReResNet50 (1024,1024,200) | 75.99 | le90 | 1x | 7.71 | 13.3 | - | 2 | [redet_re50_refpn_fp16_1x_dota_le90](configs/redet_re50_refpn_fp16_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 76.08 | le90 | 1x | 8.67 | 14.4 | - | 2 | [roi_trans_r50_fpn_1x_dota_le90](configs/roi_trans/roi_trans_r50_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 76.50 | le90 | 1x | | 17.5 | MS+RR | 2 | [rotated_retinanet_obb_r50_fpn_1x_dota_ms_rr_le90](configs/rotated_retinanet/rotated_retinanet_obb_r50_fpn_1x_dota_ms_rr_le90.py) | model / log |
ReResNet50 (1024,1024,200) | 76.68 | le90 | 1x | 9.32 | 10.9 | - | 2 | [redet_re50_refpn_1x_dota_le90](configs/redet/redet_re50_refpn_1x_dota_le90.py) | model / log |
Swin-tiny (1024,1024,200) | 77.51 | le90 | 1x | | 10.9 | - | 2 | [roi_trans_swin_tiny_fpn_1x_dota_le90](configs/roi_trans/roi_trans_swin_tiny_fpn_1x_dota_le90.py) | model / log |
ResNet50 (1024,1024,200) | 79.66 | le90 | 1x | | 14.4 | MS+RR | 2 | [roi_trans_r50_fpn_1x_dota_ms_rr_le90](configs/roi_trans/roi_trans_r50_fpn_1x_dota_ms_rr_le90.py) | model / log |
ReResNet50 (1024,1024,200) | 79.87 | le90 | 1x | | 10.9 | MS+RR | 2 | [redet_re50_refpn_1x_dota_ms_rr_le90](configs/redet/redet_re50_refpn_1x_dota_ms_rr_le90.py) | model / log |
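MS means multi-scale image split.
RR means random rotation.
Config files are named according to the following convention:
{model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{dataset}_{data setting}_{angle version}
{xxx} is a required field and [yyy] is optional.
- {model}: model type, such as rotated_faster_rcnn, rotated_retinanet, etc.
- [model setting]: a setting specific to some models, such as hbb for rotated_retinanet.
- {backbone}: backbone type, such as r50 (ResNet-50) or swin_tiny (Swin-tiny).
- {neck}: neck type, such as fpn or refpn.
- [norm_setting]: bn (Batch Normalization) is used unless specified otherwise; other norm layer types can be gn (Group Normalization) or syncbn (Synchronized Batch Normalization). gn-head/gn-neck means GN is applied only to the head/neck, while gn-all means GN is applied to the entire model, e.g. backbone, neck, and head.
- [misc]: miscellaneous settings or plugins of the model, such as dconv, gcb, attention, albu, mstrain.
- [gpu x batch_per_gpu]: number of GPUs and samples per GPU; 1xb2 is used by default.
- {dataset}: dataset, such as dota.
- {angle version}: such as oc, le135, or le90.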
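Because the angle version is always the last field of a config name, it can be read off mechanically. The sketch below is only an illustration of the naming convention described above; angle_version is a hypothetical helper, and the set of valid values is taken from the list in this post.

```python
ANGLE_VERSIONS = ("oc", "le135", "le90")


def angle_version(config_name):
    """Return the trailing {angle version} field of a config file name."""
    stem = config_name[:-3] if config_name.endswith(".py") else config_name
    last_field = stem.rsplit("_", 1)[-1]
    if last_field not in ANGLE_VERSIONS:
        raise ValueError(f"{config_name!r} does not end with a known angle version")
    return last_field


if __name__ == "__main__":
    print(angle_version("rotated_retinanet_obb_r50_fpn_1x_dota_le90.py"))  # le90
```

The remaining fields are harder to split automatically, since model names such as rotated_faster_rcnn contain underscores themselves.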
Configuration of RotatedRetinaNet with ResNet50 and FPN
angle_version = 'oc' # The angle version
model = dict(
type='RotatedRetinaNet', # The name of detector
backbone=dict( # The config of backbone
type='ResNet', # The type of the backbone
depth=50, # The depth of backbone
num_stages=4, # Number of stages of the backbone.
out_indices=(0, 1, 2, 3), # The indices of the output feature maps produced in each stage
frozen_stages=1, # The weights in the first stage are frozen
zero_init_residual=False, # Whether to use zero init for last norm layer in resblocks to let them behave as identity.
norm_cfg=dict( # The config of normalization layers.
type='BN', # Type of norm layer, usually it is BN or GN
requires_grad=True), # Whether to train the gamma and beta in BN
norm_eval=True, # Whether to freeze the statistics in BN
style='pytorch', # The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv, 'caffe' means stride 2 layers are in 1x1 convs.
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')), # The ImageNet pretrained backbone to be loaded
neck=dict(
type='FPN', # The neck of detector is FPN. We also support 'ReFPN'
in_channels=[256, 512, 1024, 2048], # The input channels, this is consistent with the output channels of backbone
out_channels=256, # The output channels of each level of the pyramid feature map
start_level=1, # Index of the start input backbone level used to build the feature pyramid
add_extra_convs='on_input', # It specifies the source feature map of the extra convs
num_outs=5), # The number of output scales
bbox_head=dict(
type='RotatedRetinaHead', # The type of bbox head is 'RotatedRetinaHead'
num_classes=15, # Number of classes for classification
in_channels=256, # Input channels for bbox head
stacked_convs=4, # Number of stacking convs of the head
feat_channels=256, # Number of hidden channels
assign_by_circumhbbox='oc', # The angle version of obb2hbb
anchor_generator=dict( # The config of anchor generator
type='RotatedAnchorGenerator', # The type of anchor generator
octave_base_scale=4, # The base scale of octave.
scales_per_octave=3, # Number of scales for each octave.
ratios=[1.0, 0.5, 2.0], # The ratio between height and width.
strides=[8, 16, 32, 64, 128]), # The strides of the anchor generator. This is consistent with the FPN feature strides.
bbox_coder=dict( # Config of box coder to encode and decode the boxes during training and testing
type='DeltaXYWHAOBBoxCoder', # Type of box coder.
angle_range='oc', # The angle version of box coder.
norm_factor=None, # The norm factor of box coder.
edge_swap=False, # The edge swap flag of box coder.
proj_xy=False, # The project flag of box coder.
target_means=(0.0, 0.0, 0.0, 0.0, 0.0), # The target means used to encode and decode boxes
target_stds=(1.0, 1.0, 1.0, 1.0, 1.0)), # The standard deviations used to encode and decode boxes
loss_cls=dict( # Config of loss function for the classification branch
type='FocalLoss', # Type of loss for classification branch
use_sigmoid=True, # Whether the prediction is used for sigmoid or softmax
gamma=2.0, # The gamma for calculating the modulating factor
alpha=0.25, # A balanced form for Focal Loss
loss_weight=1.0), # Loss weight of the classification branch
loss_bbox=dict( # Config of loss function for the regression branch
type='L1Loss', # Type of loss
loss_weight=1.0)), # Loss weight of the regression branch
train_cfg=dict( # Config of training hyperparameters
assigner=dict( # Config of assigner
type='MaxIoUAssigner', # Type of assigner
pos_iou_thr=0.5, # IoU >= threshold 0.5 will be taken as positive samples
neg_iou_thr=0.4, # IoU < threshold 0.4 will be taken as negative samples
min_pos_iou=0, # The minimal IoU threshold to take boxes as positive samples
ignore_iof_thr=-1, # IoF threshold for ignoring bboxes
iou_calculator=dict(type='RBboxOverlaps2D')), # Type of Calculator for IoU
allowed_border=-1, # The border allowed after padding for valid anchors.
pos_weight=-1, # The weight of positive samples during training.
debug=False), # Whether to set the debug mode
test_cfg=dict( # Config of testing hyperparameters
nms_pre=2000, # The number of boxes before NMS
min_bbox_size=0, # The allowed minimal box size
score_thr=0.05, # Threshold to filter out boxes
nms=dict(iou_thr=0.1), # NMS threshold
max_per_img=2000)) # The number of boxes to be kept after NMS.
dataset_type = 'DOTADataset' # Dataset type, this will be used to define the dataset
data_root = '../datasets/split_1024_dota1_0/' # Root path of data
img_norm_cfg = dict( # Image normalization config to normalize the input images
mean=[123.675, 116.28, 103.53], # Mean values used when pre-training the backbone models
std=[58.395, 57.12, 57.375], # Standard deviations used when pre-training the backbone models
to_rgb=True) # The channel order of the images used when pre-training the backbone models
train_pipeline = [ # Training pipeline
dict(type='LoadImageFromFile'), # First pipeline to load images from file path
dict(type='LoadAnnotations', # Second pipeline to load annotations for current image
with_bbox=True), # Whether to use bounding box, True for detection
dict(type='RResize', # Augmentation pipeline that resize the images and their annotations
img_scale=(1024, 1024)), # The largest scale of image
dict(type='RRandomFlip', # Augmentation pipeline that flip the images and their annotations
flip_ratio=0.5, # The ratio or probability to flip
version='oc'), # The angle version
dict(
type='Normalize', # Augmentation pipeline that normalize the input images
mean=[123.675, 116.28, 103.53], # These keys are the same of img_norm_cfg since the
std=[58.395, 57.12, 57.375], # keys of img_norm_cfg are used here as arguments
to_rgb=True),
dict(type='Pad', # Padding config
size_divisor=32), # The padded image size should be divisible by this number
dict(type='DefaultFormatBundle'), # Default format bundle to gather data in the pipeline
dict(type='Collect', # Pipeline that decides which keys in the data should be passed to the detector
keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
dict(type='LoadImageFromFile'), # First pipeline to load images from file path
dict(
type='MultiScaleFlipAug', # An encapsulation that encapsulates the testing augmentations
img_scale=(1024, 1024), # Decides the largest scale for testing, used for the Resize pipeline
flip=False, # Whether to flip images during testing
transforms=[
dict(type='RResize'), # Use resize augmentation
dict(
type='Normalize', # Normalization config, the values are from img_norm_cfg
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', # Padding config to pad images divisible by 32.
size_divisor=32),
dict(type='DefaultFormatBundle'), # Default format bundle to gather data in the pipeline
dict(type='Collect', # Collect pipeline that collect necessary keys for testing.
keys=['img'])
])
]
data = dict(
samples_per_gpu=2, # Batch size of a single GPU
workers_per_gpu=2, # Worker to pre-fetch data for each single GPU
train=dict( # Train dataset config
type='DOTADataset', # Type of dataset
ann_file=
'../datasets/split_1024_dota1_0/trainval/annfiles/', # Path of annotation file
img_prefix=
'../datasets/split_1024_dota1_0/trainval/images/', # Prefix of image path
pipeline=[ # pipeline, this is passed by the train_pipeline created before.
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='RResize', img_scale=(1024, 1024)),
dict(type='RRandomFlip', flip_ratio=0.5, version='oc'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
],
version='oc'),
val=dict( # Validation dataset config
type='DOTADataset',
ann_file=
'../datasets/split_1024_dota1_0/trainval/annfiles/',
img_prefix=
'../datasets/split_1024_dota1_0/trainval/images/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='RResize'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img'])
])
],
version='oc'),
test=dict( # Test dataset config, modify the ann_file for test-dev/test submission
type='DOTADataset',
ann_file=
'../datasets/split_1024_dota1_0/test/images/',
img_prefix=
'../datasets/split_1024_dota1_0/test/images/',
pipeline=[ # Pipeline is passed by test_pipeline created before
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='RResize'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img'])
])
],
version='oc'))
evaluation = dict( # The config to build the evaluation hook
interval=12, # Evaluation interval
metric='mAP') # Metrics used during evaluation
optimizer = dict( # Config used to build optimizer
type='SGD', # Type of optimizers
lr=0.0025, # Learning rate of optimizers
momentum=0.9, # Momentum
weight_decay=0.0001) # Weight decay of SGD
optimizer_config = dict( # Config used to build the optimizer hook
grad_clip=dict(
max_norm=35,
norm_type=2))
lr_config = dict( # Learning rate scheduler config used to register LrUpdater hook
policy='step', # The policy of scheduler
warmup='linear', # The warmup policy, also support `exp` and `constant`.
warmup_iters=500, # The number of iterations for warmup
warmup_ratio=0.3333333333333333, # The ratio of the starting learning rate used for warmup
step=[8, 11]) # Steps to decay the learning rate
runner = dict(
type='EpochBasedRunner', # Type of runner to use (i.e. IterBasedRunner or EpochBasedRunner)
max_epochs=12) # Runner that runs the workflow in total max_epochs. For IterBasedRunner use `max_iters`
checkpoint_config = dict( # Config to set the checkpoint hook
interval=12) # The save interval is 12
log_config = dict( # config to register logger hook
interval=50, # Interval to print the log
hooks=[
# dict(type='TensorboardLoggerHook') # The Tensorboard logger is also supported
dict(type='TextLoggerHook')
]) # The logger used to record the training process.
dist_params = dict(backend='nccl') # Parameters to setup distributed training, the port can also be set.
log_level = 'INFO' # The level of logging.
load_from = None # load models as a pre-trained model from a given path. This will not resume training.
resume_from = None # Resume from the checkpoint at the given path; training resumes from the epoch at which the checkpoint was saved.
workflow = [('train', 1)] # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once. The workflow trains the model for 12 epochs according to max_epochs.
work_dir = './work_dirs/rotated_retinanet_hbb_r50_fpn_1x_dota_oc' # Directory to save the model checkpoints and logs for the current experiments.
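To see what the optimizer and lr_config settings above amount to, the following sketch computes the learning rate at a given epoch and global iteration. It assumes a step decay factor of 0.1 (gamma is not set in the config above, so this value is an assumption) and a linear warmup that starts at warmup_ratio times the regular rate. The function learning_rate is a hypothetical illustration, not the hook used by the training code.

```python
def learning_rate(epoch, cur_iter, base_lr=0.0025, steps=(8, 11), gamma=0.1,
                  warmup_iters=500, warmup_ratio=1.0 / 3):
    """Learning rate for a step schedule with linear warmup.

    epoch    -- zero-based epoch index
    cur_iter -- zero-based global iteration index (counted across epochs)
    gamma    -- ASSUMED decay factor applied at every entry of `steps`
    """
    # Step policy: multiply by gamma once for every milestone already reached.
    regular_lr = base_lr * gamma ** sum(epoch >= s for s in steps)
    if cur_iter < warmup_iters:
        # Linear warmup from warmup_ratio * regular_lr up to regular_lr.
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    return regular_lr


if __name__ == "__main__":
    for epoch, it in [(0, 0), (0, 250), (0, 500), (8, 10000), (11, 20000)]:
        print(epoch, it, learning_rate(epoch, it))
```

With the values from the config, training starts at one third of 0.0025, reaches 0.0025 after 500 iterations, and is divided by ten at epochs 8 and 11 under the assumed gamma.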
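Some intermediate variables are used in the config files, such as train_pipeline/test_pipeline in datasets. Note that when you modify an intermediate variable in a child config, you need to pass it into the corresponding fields again. For example, suppose we want to train RoI-Trans with an offline multi-scale strategy; train_pipeline is the intermediate variable we want to modify.
# We first define the new train_pipeline/test_pipeline and then pass them into data.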
_base_ = ['./roi_trans_r50_fpn_1x_dota_le90.py']
data_root = '../datasets/split_ms_dota1_0/'
angle_version = 'le90'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='RResize', img_scale=(1024, 1024)),
dict(
type='RRandomFlip',
flip_ratio=[0.25, 0.25, 0.25],
direction=['horizontal', 'vertical', 'diagonal'],
version=angle_version),
dict(
type='PolyRandomRotate',
rotate_ratio=0.5,
angles_range=180,
auto_bound=False,
version=angle_version),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
data = dict(
train=dict(
pipeline=train_pipeline,
ann_file=data_root + 'trainval/annfiles/',
img_prefix=data_root + 'trainval/images/'),
val=dict(
ann_file=data_root + 'trainval/annfiles/',
img_prefix=data_root + 'trainval/images/'),
test=dict(
ann_file=data_root + 'test/images/',
img_prefix=data_root + 'test/images/'))
# Similarly, to switch from SyncBN to BN or MMSyncBN, we need to replace every norm_cfg in the config.
_base_ = './roi_trans_r50_fpn_1x_dota_le90.py'
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
backbone=dict(norm_cfg=norm_cfg),
neck=dict(norm_cfg=norm_cfg),
...)
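The reason an intermediate variable has to be passed again can be shown with plain dictionaries. The sketch below uses a simplified recursive merge as a stand-in for config inheritance; it is an assumption-laden illustration, not the actual merging code of the config system (which supports more features), and merge_config is a hypothetical name.

```python
import copy


def merge_config(base, child):
    """Recursively overlay `child` on `base`; non-dict values are replaced."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Base config: data.train.pipeline received the VALUE of train_pipeline
# at the moment the base file was written.
old_pipeline = [dict(type='LoadImageFromFile')]
base = dict(
    train_pipeline=old_pipeline,
    data=dict(train=dict(type='DOTADataset', pipeline=old_pipeline)))

new_pipeline = [dict(type='LoadImageFromFile'), dict(type='PolyRandomRotate')]

# Child config that only redefines the variable: data.train.pipeline is unchanged.
only_variable = merge_config(base, dict(train_pipeline=new_pipeline))

# Child config that also passes the variable into data: the new pipeline is used.
also_field = merge_config(
    base,
    dict(train_pipeline=new_pipeline,
         data=dict(train=dict(pipeline=new_pipeline))))

if __name__ == "__main__":
    print(len(only_variable['data']['train']['pipeline']))  # 1
    print(len(also_field['data']['train']['pipeline']))     # 2
```

Only the second child changes the pipeline that the dataset actually sees, while fields it does not mention (such as type) are inherited from the base.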
# The simplest way is to convert your dataset to an existing dataset format (DOTA).
# An annotation txt file in DOTA format:
184 2875 193 2923 146 2932 137 2885 plane 0
66 2095 75 2142 21 2154 11 2107 plane 0
...
# Each line represents one object and records it as a 10-element array A.
# A[0:8]: the polygon, in the format (x1, y1, x2, y2, x3, y3, x4, y4).
# A[8]: category.
# A[9]: difficulty.
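When converting a custom dataset, a small parser helps to verify that every annotation line really has the ten fields described above. This is a minimal sketch; parse_dota_line is a hypothetical helper and not the loader used by DOTADataset.

```python
def parse_dota_line(line):
    """Split one DOTA annotation line into (polygon, category, difficulty).

    polygon    -- [x1, y1, x2, y2, x3, y3, x4, y4] as floats (A[0:8])
    category   -- class name string (A[8])
    difficulty -- integer flag (A[9])
    """
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    polygon = [float(value) for value in fields[:8]]
    return polygon, fields[8], int(fields[9])


if __name__ == "__main__":
    print(parse_dota_line("184 2875 193 2923 146 2932 137 2885 plane 0"))
```

Running it over every line of a converted file catches missing difficulty flags or class names that contain spaces before training starts.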
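Here is an example that illustrates the two steps: it uses a custom 5-class dataset in DOTA format to train an existing RotatedRetinaNet R50-FPN detector.
The config modification involves two aspects:
- The data field. Specifically, you need to explicitly add the classes field in data.train, data.val, and data.test.
- The num_classes field in the model part. Explicitly overwrite every default num_classes value (e.g. 80 in COCO) with your number of classes.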
#configs/my_custom_config.py:
# the new config inherits the base configs to highlight the necessary modification
_base_ = './rotated_retinanet_hbb_r50_fpn_1x_dota_oc.py'
# 1. dataset settings
dataset_type = 'DOTADataset'
classes = ('a', 'b', 'c', 'd', 'e')
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
# explicitly add your class names to the field `classes`
classes=classes,
ann_file='path/to/your/train/annotation_data',
img_prefix='path/to/your/train/image_data'),
val=dict(
type=dataset_type,
# explicitly add your class names to the field `classes`
classes=classes,
ann_file='path/to/your/val/annotation_data',
img_prefix='path/to/your/val/image_data'),
test=dict(
type=dataset_type,
# explicitly add your class names to the field `classes`
classes=classes,
ann_file='path/to/your/test/annotation_data',
img_prefix='path/to/your/test/image_data'))
# 2. model settings
model = dict(
bbox_head=dict(
type='RotatedRetinaHead',
# explicitly over-write all the `num_classes` field from default 15 to 5.
num_classes=5))
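Assuming your custom dataset is in DOTA format, make sure its annotations are correct:
The classes field in the config file should contain exactly the same elements, in the same order, as A[8] in the txt annotations. MMRotate automatically maps the non-contiguous ids in categories to contiguous label indices, so the string order of the names in the categories field affects the order of the label indices. The string order of classes in the config also affects the label text shown when predicted bounding boxes are visualized.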
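The effect of class order can be illustrated with a plain mapping from class name to label index. The sketch below assumes labels are assigned by position in the classes tuple; build_label_map is a hypothetical helper used only for illustration.

```python
def build_label_map(classes):
    """Map each class name to its contiguous label index (its position)."""
    return {name: index for index, name in enumerate(classes)}


if __name__ == "__main__":
    print(build_label_map(('a', 'b', 'c', 'd', 'e')))
    # Reordering the tuple changes which index each name receives.
    print(build_label_map(('e', 'd', 'c', 'b', 'a')))
```

A checkpoint trained with one order and visualized with another would therefore show the wrong label text on its boxes.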
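MMRotate also supports a number of dataset wrappers to mix datasets or modify the dataset distribution for training. It currently supports three dataset wrappers:
- RepeatDataset: simply repeats the whole dataset.
- ClassBalancedDataset: repeats the dataset in a class-balanced manner.
- ConcatDataset: concatenates datasets.
# We use `RepeatDataset` as a wrapper to repeat a dataset. For example, if the original dataset is `Dataset_A`, the config to repeat it looks like this: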
dataset_A_train = dict(
type='RepeatDataset',
times=N,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
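# We use the ClassBalancedDataset wrapper to repeat a dataset based on category frequency. The dataset to repeat needs to implement
# self.get_cat_ids(idx) to support ClassBalancedDataset. For example, the config to repeat Dataset_A with oversample_thr=1e-3 looks like this: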
dataset_A_train = dict(
type='ClassBalancedDataset',
oversample_thr=1e-3,
dataset=dict( # This is the original config of Dataset_A
type='Dataset_A',
...
pipeline=train_pipeline
)
)
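The idea behind class-balanced repetition can be sketched in a few lines: categories that appear in fewer than oversample_thr of the images get a repeat factor above 1, and each image is repeated according to its rarest category. The code below follows the commonly used rule r(c) = max(1, sqrt(t / f(c))); it is a simplified illustration, not the wrapper's actual implementation, and repeat_factors is a hypothetical name.

```python
import math
from collections import Counter


def repeat_factors(image_cat_ids, oversample_thr):
    """Per-image repeat counts for class-balanced sampling.

    image_cat_ids  -- one list of category ids per image
                      (what get_cat_ids(idx) would return)
    oversample_thr -- frequency threshold t; categories rarer than t are oversampled
    """
    num_images = len(image_cat_ids)
    # f(c): fraction of images that contain category c at least once.
    counts = Counter(c for cats in image_cat_ids for c in set(cats))
    cat_repeat = {
        c: max(1.0, math.sqrt(oversample_thr / (n / num_images)))
        for c, n in counts.items()
    }
    # An image is repeated as often as its rarest category requires.
    return [
        math.ceil(max((cat_repeat[c] for c in set(cats)), default=1.0))
        for cats in image_cat_ids
    ]


if __name__ == "__main__":
    # Category 0 is in every image, category 1 in only one of four.
    print(repeat_factors([[0], [0], [0], [0, 1]], oversample_thr=0.5))  # [1, 1, 1, 2]
```

In the toy example only the image containing the rare category 1 is duplicated.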
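There are three ways to concatenate datasets.
1. If the datasets you want to concatenate are of the same type and differ only in their annotation files, you can concatenate the dataset configs as follows.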
dataset_A_train = dict(
type='Dataset_A',
ann_file = ['anno_file_1', 'anno_file_2'],
pipeline=train_pipeline
)
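# If the concatenated dataset is used for testing or evaluation, this approach supports evaluating each dataset separately. To evaluate the concatenated datasets as a whole, set separate_eval=False as follows.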
dataset_A_train = dict(
type='Dataset_A',
ann_file = ['anno_file_1', 'anno_file_2'],
separate_eval=False,
pipeline=train_pipeline
)
2. If the datasets you want to concatenate are of different types, you can concatenate the dataset configs as follows.
dataset_A_train = dict()
dataset_B_train = dict()
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
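# Note: if the concatenated dataset is used for testing or evaluation, this approach also supports evaluating each dataset separately.
3. We also support defining ConcatDataset explicitly, as follows.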
dataset_A_val = dict()
dataset_B_val = dict()
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dataset_A_train,
val=dict(
type='ConcatDataset',
datasets=[dataset_A_val, dataset_B_val],
separate_eval=False))
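# This approach allows users to evaluate all the datasets as a single one by setting separate_eval=False.
4. Notes
- separate_eval=False assumes the datasets use self.data_infos during evaluation. COCO datasets therefore do not support this behavior, since they do not fully rely on self.data_infos for evaluation. Combining datasets of different types and evaluating them as a whole has not been tested and is not recommended.
- Evaluating ClassBalancedDataset and RepeatDataset is not supported, so evaluating concatenated datasets of these types is not supported either.
A more complex example repeats Dataset_A and Dataset_B N and M times, respectively, and then concatenates the repeated datasets, as follows.
dataset_A_train = dict(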
type='RepeatDataset',
times=N,
dataset=dict(
type='Dataset_A',
...
pipeline=train_pipeline
)
)
dataset_A_val = dict(
...
pipeline=test_pipeline
)
dataset_A_test = dict(
...
pipeline=test_pipeline
)
dataset_B_train = dict(
type='RepeatDataset',
times=M,
dataset=dict(
type='Dataset_B',
...
pipeline=train_pipeline
)
)
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train = [
dataset_A_train,
dataset_B_train
],
val = dataset_A_val,
test = dataset_A_test
)
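To make the combined "repeat, then concatenate" layout concrete, here is a minimal plain-Python sketch of how a global sample index could be mapped back to a dataset and a local index. The helper `locate` and the dataset sizes are hypothetical; the real wrappers may organize this differently.

```python
import bisect
from itertools import accumulate


def locate(sample_idx, dataset_sizes):
    """Map a global index to (dataset position, index inside that dataset)."""
    cumulative = list(accumulate(dataset_sizes))
    if not 0 <= sample_idx < cumulative[-1]:
        raise IndexError(sample_idx)
    dataset_pos = bisect.bisect_right(cumulative, sample_idx)
    start = cumulative[dataset_pos - 1] if dataset_pos > 0 else 0
    return dataset_pos, sample_idx - start


# Hypothetical sizes: Dataset_A has 3 images repeated N=2 times,
# Dataset_B has 2 images repeated M=3 times.
len_a, n_times = 3, 2
len_b, m_times = 2, 3
sizes = [len_a * n_times, len_b * m_times]

dataset_pos, local_idx = locate(7, sizes)
# Undo the repetition to recover the index in the original Dataset_B.
original_idx = local_idx % len_b
print(dataset_pos, local_idx, original_idx)
```

The training set seen by the runner therefore has N * len(Dataset_A) + M * len(Dataset_B) samples, which is how the repeat counts change the sampling ratio between the two datasets.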
Model components are basically categorized into 5 types.
# Create a new file mmrotate/models/backbones/mobilenet.py.
import torch.nn as nn
from mmrotate.models.builder import ROTATED_BACKBONES


@ROTATED_BACKBONES.register_module()
class MobileNet(nn.Module):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass

# You can add the following line to mmrotate/models/backbones/__init__.py
from .mobilenet import MobileNet
# Or add the following to the config file to avoid modifying the original code.
custom_imports = dict(
    imports=['mmrotate.models.backbones.mobilenet'],
    allow_failed_imports=False)
# Then use the backbone in the config file.
model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...
)
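The entries in `custom_imports` are dotted module paths rather than file paths. The sketch below uses only the standard library to show the difference, under the assumption that the entries are resolved in the same way as `importlib.import_module`; `json.decoder` merely stands in for a custom module.

```python
import importlib

# A dotted module path imports fine (json.decoder is a stdlib module).
module = importlib.import_module('json.decoder')
print(module.__name__)

# Appending '.py' makes Python look for a submodule named 'py', which fails.
try:
    importlib.import_module('json.decoder.py')
    suffix_import_failed = False
except ImportError:
    suffix_import_failed = True
print(suffix_import_failed)
```

This is why an entry such as 'mmrotate.models.backbones.mobilenet' carries no file extension.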
# Create a new file mmrotate/models/necks/pafpn.py.
import torch.nn as nn
from mmrotate.models.builder import ROTATED_NECKS


@ROTATED_NECKS.register_module()
class PAFPN(nn.Module):

    def __init__(self,
                 in_channels,
                 out_channels,
                 num_outs,
                 start_level=0,
                 end_level=-1,
                 add_extra_convs=False):
        pass

    def forward(self, inputs):
        # implementation is ignored
        pass

# You can add the following line to mmrotate/models/necks/__init__.py
from .pafpn import PAFPN
# Or add the following to the config file to avoid modifying the original code.
# Note that the entry is a dotted module path without the .py suffix.
custom_imports = dict(
    imports=['mmrotate.models.necks.pafpn'],
    allow_failed_imports=False)
# Then use the neck in the config file.
neck=dict(
    type='PAFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)
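The link between `register_module()` and the `type` string in a config can be pictured with a toy registry. This is only a sketch of the pattern and not the MMCV implementation; `ToyRegistry`, `TOY_NECKS` and `ToyPAFPN` are hypothetical names.

```python
class ToyRegistry:
    """A toy stand-in for a module registry (not the MMCV implementation)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            # The class name becomes the value used for `type` in configs.
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop('type')]
        return cls(**args)


TOY_NECKS = ToyRegistry('neck')


@TOY_NECKS.register_module()
class ToyPAFPN:
    def __init__(self, in_channels, out_channels, num_outs):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.num_outs = num_outs


neck = TOY_NECKS.build(
    dict(type='ToyPAFPN', in_channels=[256, 512], out_channels=256, num_outs=5))
print(type(neck).__name__, neck.num_outs)
```

Registration only happens when the module that contains the decorated class is imported, which is why the import in `__init__.py` or in `custom_imports` is required before the config can refer to the class by name.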
# First, add a new bbox head in mmrotate/models/roi_heads/bbox_heads/double_bbox_head.py. Double Head R-CNN implements a new bbox head for object detection. To implement a bbox head, we basically need to implement three functions of the new module, as follows.
from mmrotate.models.builder import ROTATED_HEADS
from mmrotate.models.roi_heads.bbox_heads.bbox_head import BBoxHead


@ROTATED_HEADS.register_module()
class DoubleConvFCBBoxHead(BBoxHead):
    r"""Bbox head used in Double-Head R-CNN

                                      /-> cls
                  /-> shared convs ->
                                      \-> reg
    roi features
                                      /-> cls
                  \-> shared fc    ->
                                      \-> reg
    """  # noqa: W605

    def __init__(self,
                 num_convs=0,
                 num_fcs=0,
                 conv_out_channels=1024,
                 fc_out_channels=1024,
                 conv_cfg=None,
                 norm_cfg=dict(type='BN'),
                 **kwargs):
        kwargs.setdefault('with_avg_pool', True)
        super(DoubleConvFCBBoxHead, self).__init__(**kwargs)

    def forward(self, x_cls, x_reg):
        ...  # implementation omitted
# Second, implement a new RoI head if necessary. We plan to inherit the new DoubleHeadRoIHead from StandardRoIHead. StandardRoIHead already implements the following functions (signatures only; bodies are omitted).
import torch
from mmdet.core import bbox2result, bbox2roi, build_assigner, build_sampler
from mmrotate.models.builder import ROTATED_HEADS, build_head, build_roi_extractor
from mmrotate.models.roi_heads.base_roi_head import BaseRoIHead
from mmrotate.models.roi_heads.test_mixins import BBoxTestMixin, MaskTestMixin


@ROTATED_HEADS.register_module()
class StandardRoIHead(BaseRoIHead, BBoxTestMixin, MaskTestMixin):
    """Simplest base roi head including one bbox head and one mask head.
    """

    def init_assigner_sampler(self):
        ...

    def init_bbox_head(self, bbox_roi_extractor, bbox_head):
        ...

    def forward_dummy(self, x, proposals):
        ...

    def forward_train(self,
                      x,
                      img_metas,
                      proposal_list,
                      gt_bboxes,
                      gt_labels,
                      gt_bboxes_ignore=None,
                      gt_masks=None):
        ...

    def _bbox_forward(self, x, rois):
        ...

    def _bbox_forward_train(self, x, sampling_results, gt_bboxes, gt_labels,
                            img_metas):
        ...

    def simple_test(self,
                    x,
                    proposal_list,
                    img_metas,
                    proposals=None,
                    rescale=False):
        """Test without augmentation."""
        ...
# The Double Head modification is mainly in the _bbox_forward logic; everything else is inherited from StandardRoIHead. In mmrotate/models/roi_heads/double_roi_head.py, we implement the new RoI head as follows:
from mmrotate.models.builder import ROTATED_HEADS
from mmrotate.models.roi_heads.standard_roi_head import StandardRoIHead


@ROTATED_HEADS.register_module()
class DoubleHeadRoIHead(StandardRoIHead):
    """RoI head for Double Head RCNN

    https://arxiv.org/abs/1904.06493
    """

    def __init__(self, reg_roi_scale_factor, **kwargs):
        super(DoubleHeadRoIHead, self).__init__(**kwargs)
        self.reg_roi_scale_factor = reg_roi_scale_factor

    def _bbox_forward(self, x, rois):
        # Classification features come from the unscaled RoIs.
        bbox_cls_feats = self.bbox_roi_extractor(
            x[:self.bbox_roi_extractor.num_inputs], rois)
        # Regression features come from RoIs scaled by reg_roi_scale_factor.
        bbox_reg_feats = self.bbox_roi_extractor(
            x[:self.bbox_roi_extractor.num_inputs],
            rois,
            roi_scale_factor=self.reg_roi_scale_factor)
        if self.with_shared_head:
            bbox_cls_feats = self.shared_head(bbox_cls_feats)
            bbox_reg_feats = self.shared_head(bbox_reg_feats)
        cls_score, bbox_pred = self.bbox_head(bbox_cls_feats, bbox_reg_feats)

        bbox_results = dict(
            cls_score=cls_score,
            bbox_pred=bbox_pred,
            bbox_feats=bbox_cls_feats)
        return bbox_results
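Last, users need to add the modules in mmrotate/models/roi_heads/bbox_heads/__init__.py and mmrotate/models/roi_heads/__init__.py so that the corresponding registry can find and load them.
Alternatively, users can add the following to the config file to achieve the same goal.
custom_imports=dict(
    imports=['mmrotate.models.roi_heads.double_roi_head', 'mmrotate.models.roi_heads.bbox_heads.double_bbox_head'])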
Suppose you want to add a new loss, MyLoss, for bounding box regression. To add a new loss function, users need to implement it in mmrotate/models/losses/my_loss.py. The decorator weighted_loss enables the loss to be weighted for each element.
import torch
import torch.nn as nn
from mmrotate.models.builder import ROTATED_LOSSES
from mmdet.models.losses.utils import weighted_loss


@weighted_loss
def my_loss(pred, target):
    # Element-wise L1 loss; weighting and reduction are added by the decorator.
    assert pred.size() == target.size() and target.numel() > 0
    loss = torch.abs(pred - target)
    return loss


@ROTATED_LOSSES.register_module()
class MyLoss(nn.Module):

    def __init__(self, reduction='mean', loss_weight=1.0):
        super(MyLoss, self).__init__()
        self.reduction = reduction
        self.loss_weight = loss_weight

    def forward(self,
                pred,
                target,
                weight=None,
                avg_factor=None,
                reduction_override=None):
        assert reduction_override in (None, 'none', 'mean', 'sum')
        reduction = (
            reduction_override if reduction_override else self.reduction)
        loss_bbox = self.loss_weight * my_loss(
            pred, target, weight, reduction=reduction, avg_factor=avg_factor)
        return loss_bbox
Then users need to add it in mmrotate/models/losses/__init__.py.
from .my_loss import MyLoss, my_loss
Alternatively, you can add the following to the config file to achieve the same goal.
custom_imports=dict(
    imports=['mmrotate.models.losses.my_loss'])
To use it, modify the loss_xxx field. Since MyLoss is for regression, you need to modify the loss_bbox field in the head.
loss_bbox=dict(type='MyLoss', loss_weight=1.0)
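The `weight`, `reduction` and `avg_factor` arguments passed through the decorated loss can be confusing at first. The plain-Python sketch below mimics the behavior on lists instead of tensors; `toy_weight_reduce` and `toy_l1` are hypothetical helpers, and the exact semantics (element-wise weighting first, then reduction, with `avg_factor` replacing the element count for 'mean') are an assumption about how the decorator behaves.

```python
def toy_weight_reduce(losses, weight=None, reduction='mean', avg_factor=None):
    """Element-wise weighting followed by reduction (assumed semantics)."""
    if reduction not in ('none', 'mean', 'sum'):
        raise ValueError(f'unknown reduction: {reduction}')
    if weight is not None:
        losses = [l * w for l, w in zip(losses, weight)]
    if reduction == 'none':
        return losses
    if avg_factor is None:
        total = sum(losses)
        return total / len(losses) if reduction == 'mean' else total
    if reduction == 'mean':
        # A custom normalizer, e.g. the number of positive samples.
        return sum(losses) / avg_factor
    raise ValueError('avg_factor can only be combined with reduction="mean"')


def toy_l1(pred, target):
    """Element-wise L1 loss on plain lists, mirroring my_loss."""
    assert len(pred) == len(target) and len(target) > 0
    return [abs(p - t) for p, t in zip(pred, target)]


elementwise = toy_l1([1.0, 2.0, 4.0], [1.0, 0.0, 0.0])
print(toy_weight_reduce(elementwise))
print(toy_weight_reduce(elementwise, weight=[1, 1, 0], reduction='sum'))
print(toy_weight_reduce(elementwise, weight=[1, 1, 0], avg_factor=2))
```

A zero weight removes an element from the loss, which is how ignored or negative samples are usually excluded from a regression loss.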
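We already support all the optimizers implemented by PyTorch; the only modification needed is to change the optimizer field of the config file. For example, if you want to use Adam (note that the performance could drop a lot), the modification could be as follows.
optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
# To modify the learning rate of the model, users only need to modify lr in the optimizer config. Users can set the arguments directly, following the PyTorch API documentation.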
(1) Define a new optimizer
Suppose you want to add an optimizer named MyOptimizer, which has arguments a, b, and c. You need to create a new directory named mmrotate/core/optimizer, and then implement the new optimizer in a file, e.g., in mmrotate/core/optimizer/my_optimizer.py:
from mmcv.runner.optimizer import OPTIMIZERS
from torch.optim import Optimizer


@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):

    def __init__(self, a, b, c):
        ...  # implementation omitted
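(2) Add the optimizer to the registry
To find the module defined above, it should first be imported into the main namespace. There are two options to achieve this.
Modify mmrotate/core/optimizer/__init__.py to import it. The newly defined module should be imported in mmrotate/core/optimizer/__init__.py so that the registry will find the new module and add it:
from .my_optimizer import MyOptimizer
Use custom_imports in the config to import it manually:
custom_imports = dict(imports=['mmrotate.core.optimizer.my_optimizer'], allow_failed_imports=False)
The module mmrotate.core.optimizer.my_optimizer will be imported at the beginning of the program, and the class MyOptimizer is then registered automatically. Note that only the module containing the class MyOptimizer should be imported; mmrotate.core.optimizer.my_optimizer.MyOptimizer cannot be imported directly.
In fact, users can use a completely different directory structure with this import method, as long as the module root can be located in PYTHONPATH.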
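(3) Specify the optimizer in the config file
Then you can use MyOptimizer in the optimizer field of config files. In the configs, the optimizers are defined by the optimizer field, like the following:
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
To use your own optimizer, the field can be changed to:
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)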
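Some models may have parameter-specific optimization settings, e.g., weight decay for BatchNorm layers. Users can do such fine-grained parameter tuning by customizing the optimizer constructor.
from mmcv.utils import build_from_cfg
from mmcv.runner.optimizer import OPTIMIZER_BUILDERS, OPTIMIZERS
from mmrotate.utils import get_root_logger
from .my_optimizer import MyOptimizer


@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):

    def __init__(self, optimizer_cfg, paramwise_cfg=None):
        ...  # implementation omitted

    def __call__(self, model):
        ...  # build my_optimizer from the model's parameters here
        return my_optimizer

# The default optimizer constructor is implemented in MMCV (DefaultOptimizerConstructor); it can also serve as a template for a new optimizer constructor.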
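As a sketch of the kind of grouping such a constructor might perform, the function below splits parameters into two groups so that normalization layers receive no weight decay. `split_weight_decay`, the name-based keyword matching and the toy parameter names are all hypothetical; the returned list of dicts only follows the per-parameter-group layout that PyTorch optimizers accept.

```python
def split_weight_decay(named_params, base_weight_decay,
                       norm_keywords=('bn', 'norm')):
    """Group parameters so that normalization layers get no weight decay."""
    decay, no_decay = [], []
    for name, param in named_params:
        # Assumption: normalization layers can be recognized by their name.
        is_norm = any(key in name.lower() for key in norm_keywords)
        (no_decay if is_norm else decay).append(param)
    # Same list-of-dicts layout that torch.optim optimizers accept as params.
    return [
        dict(params=decay, weight_decay=base_weight_decay),
        dict(params=no_decay, weight_decay=0.0),
    ]


# Strings stand in for tensors so the sketch runs without PyTorch.
toy_params = [
    ('backbone.conv1.weight', 'w_conv'),
    ('backbone.bn1.weight', 'w_bn'),
    ('neck.norm.bias', 'b_norm'),
]
groups = split_weight_decay(toy_params, base_weight_decay=0.0001)
print(groups[0]['params'], groups[1]['params'])
```

In a real constructor the pairs would come from `model.named_parameters()`, and checking the layer type is usually more robust than matching names.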
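Tricks not implemented by the optimizer should be implemented through the optimizer constructor (e.g., setting parameter-wise learning rates) or through hooks. We list some common settings that can stabilize or accelerate training. Feel free to create a PR to share more settings.
# Gradient clipping via grad_clip
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
If your config inherits a base config that already sets optimizer_config, you may need _delete_=True to override the unnecessary settings. See the config documentation for more details.
# Cyclic learning rate schedule with a matching cyclic momentum schedule
lr_config = dict(
    policy='cyclic',
    target_ratio=(10, 1e-4),
    cyclic_times=1,
    step_ratio_up=0.4,
)
momentum_config = dict(
    policy='cyclic',
    target_ratio=(0.85 / 0.95, 1),
    cyclic_times=1,
    step_ratio_up=0.4,
)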
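To get a feeling for what `target_ratio` and `step_ratio_up` mean, the sketch below evaluates a single cycle in plain Python. It assumes cosine annealing in both the rising and the falling phase; `one_cycle_value`, the base learning rate 0.001, the base momentum 0.95 and the 1000 iterations are hypothetical values chosen for illustration.

```python
import math


def annealing_cos(start, end, factor):
    """Cosine interpolation from start (factor=0) to end (factor=1)."""
    return end + 0.5 * (start - end) * (math.cos(math.pi * factor) + 1)


def one_cycle_value(it, max_iters, base, target_ratio, step_ratio_up):
    """Value of a single up/down cycle at iteration it (assumed schedule)."""
    up_iters = int(step_ratio_up * max_iters)
    if it < up_iters:
        # First phase: move from base to base * target_ratio[0].
        return annealing_cos(base, base * target_ratio[0], it / up_iters)
    # Second phase: move from base * target_ratio[0] to base * target_ratio[1].
    progress = (it - up_iters) / (max_iters - up_iters)
    return annealing_cos(base * target_ratio[0], base * target_ratio[1], progress)


max_iters = 1000
for it in (0, 400, 1000):
    lr = one_cycle_value(it, max_iters, 0.001, (10, 1e-4), 0.4)
    momentum = one_cycle_value(it, max_iters, 0.95, (0.85 / 0.95, 1), 0.4)
    print(it, lr, momentum)
```

Under these assumptions the learning rate peaks at 10 times its base value after 40% of the iterations, while the momentum reaches its minimum of 0.85 at the same point and returns to 0.95 at the end.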
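The workflow is a list of (phase, epochs) pairs that specifies the running order and the number of epochs. By default it is set to
workflow = [('train', 1)]
which means running 1 epoch for training. Sometimes users may want to check some metrics (e.g. loss, accuracy) of the model on the validation set. In that case, we can set the workflow to
# so that 1 epoch of training and 1 epoch of validation are run iteratively.
[('train', 1), ('val', 1)]
Note:
The keyword total_epochs in the config only controls the number of training epochs and does not affect the validation workflow.
The workflows [('train', 1), ('val', 1)] and [('train', 1)] do not change the behavior of EvalHook, because EvalHook is called by after_train_epoch, while the validation workflow only affects hooks that are called through after_val_epoch. Therefore, the only difference between [('train', 1), ('val', 1)] and [('train', 1)] is that the runner also computes losses on the validation set after each training epoch.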
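The interaction between the workflow and the epoch limit can be simulated in a few lines. This is a simplified sketch of an epoch-based loop, written under the assumption that only training epochs count towards the limit; `simulate_workflow` is a hypothetical helper and not part of any library.

```python
def simulate_workflow(workflow, max_epochs):
    """Return the sequence of phases an epoch-based runner would execute."""
    if not any(mode == 'train' for mode, _ in workflow):
        raise ValueError('the workflow needs at least one train phase')
    executed = []
    train_epochs_done = 0
    while train_epochs_done < max_epochs:
        for mode, epochs in workflow:
            for _ in range(epochs):
                if mode == 'train':
                    if train_epochs_done >= max_epochs:
                        break
                    # Only training epochs advance the epoch counter.
                    train_epochs_done += 1
                executed.append(mode)
    return executed


print(simulate_workflow([('train', 1)], max_epochs=2))
print(simulate_workflow([('train', 1), ('val', 1)], max_epochs=2))
```

Both workflows run exactly two training epochs; the second one only interleaves a validation pass after each of them.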
In some cases, users might need to implement a new hook. MMRotate supports customized hooks in training. Users can therefore implement a hook directly in mmrotate or in their own mmdet-based codebase, and use it in training by modifying only the config. Here we give an example of creating a new hook in mmrotate and using it in training.
from mmcv.runner import HOOKS, Hook


@HOOKS.register_module()
class MyHook(Hook):

    def __init__(self, a, b):
        pass

    def before_run(self, runner):
        pass

    def after_run(self, runner):
        pass

    def before_epoch(self, runner):
        pass

    def after_epoch(self, runner):
        pass

    def before_iter(self, runner):
        pass

    def after_iter(self, runner):
        pass
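Note: depending on the functionality of the hook, users need to specify what the hook will do at each stage of training in before_run, after_run, before_epoch, after_epoch, before_iter, and after_iter.
Then we need to import MyHook. Assuming the file is mmrotate/core/utils/my_hook.py, there are two ways to do that:
Modify mmrotate/core/utils/__init__.py to import it. The newly defined module should be imported in mmrotate/core/utils/__init__.py so that the registry will find the new module and add it:
from .my_hook import MyHook
Use custom_imports in the config to import it manually:
custom_imports = dict(imports=['mmrotate.core.utils.my_hook'], allow_failed_imports=False)
Then enable the hook in the config:
custom_hooks = [
    dict(type='MyHook', a=a_value, b=b_value)
]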
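You can also set the priority of the hook by adding the key priority with a value such as 'NORMAL' or 'HIGHEST', as follows.
custom_hooks = [
    dict(type='MyHook', a=a_value, b=b_value, priority='NORMAL')
]
Note: by default, the priority of a hook is set to NORMAL during registration.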
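The effect of the priority key can be sketched with a toy runner that keeps its hooks sorted and calls the same stage method on each of them in turn. `ToyRunner`, `NamedHook` and the three numeric levels are illustrative assumptions and not the MMCV implementation.

```python
# Smaller value means earlier execution (illustrative subset of levels).
PRIORITY = {'HIGHEST': 0, 'NORMAL': 50, 'VERY_LOW': 90}


class ToyRunner:
    def __init__(self):
        self.hooks = []  # list of (priority value, hook), kept sorted
        self.log = []

    def register_hook(self, hook, priority='NORMAL'):
        value = PRIORITY[priority]
        # Insert before the first hook with a strictly larger value so that
        # hooks of equal priority keep their registration order.
        pos = len(self.hooks)
        for i, (other, _) in enumerate(self.hooks):
            if other > value:
                pos = i
                break
        self.hooks.insert(pos, (value, hook))

    def call_hook(self, stage):
        for _, hook in self.hooks:
            getattr(hook, stage)(self)


class NamedHook:
    def __init__(self, name):
        self.name = name

    def before_run(self, runner):
        runner.log.append(self.name)


runner = ToyRunner()
runner.register_hook(NamedHook('logger'), priority='VERY_LOW')
runner.register_hook(NamedHook('my_hook'))  # defaults to NORMAL
runner.register_hook(NamedHook('first'), priority='HIGHEST')
runner.register_hook(NamedHook('other'), priority='NORMAL')
runner.call_hook('before_run')
print(runner.log)
```

Raising the priority of a custom hook is mainly useful when it must run before another hook at the same stage, for example before the logger reads values that the custom hook produces.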
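If the hook is already implemented in MMCV, you can directly modify the config to use it, as shown below.
NumClassCheckHook
We implement a customized hook named NumClassCheckHook to check whether num_classes in the head matches the length of CLASSES in the dataset.
We set it in default_runtime.py.
custom_hooks = [dict(type='NumClassCheckHook')]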
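Some common hooks are not registered through custom_hooks. They are configured through log_config, checkpoint_config, evaluation, lr_config, optimizer_config, and momentum_config.
Among these hooks, only the logger hook has VERY_LOW priority; the others have NORMAL priority. The tutorials above already cover how to modify optimizer_config, momentum_config, and lr_config. Here we show what can be done with log_config, checkpoint_config, and evaluation.
The MMCV runner uses checkpoint_config to initialize CheckpointHook.
Note: users can set max_keep_ckpts to keep only a small number of checkpoints, or use save_optimizer to decide whether to store the state dict of the optimizer. More details of the arguments can be found in the MMCV documentation of CheckpointHook.
log_config wraps multiple logger hooks and allows setting the logging interval. MMCV currently supports WandbLoggerHook, MlflowLoggerHook, and TensorboardLoggerHook. Detailed usage can be found in the MMCV documentation.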
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
The evaluation config is used to initialize EvalHook. Except for the key interval, the other arguments, such as metric, are passed to dataset.evaluate().
evaluation = dict(interval=1, metric='bbox')
tools/analysis_tools/analyze_logs.py plots loss/mAP curves given a training log file. Run pip install seaborn first to install the dependency.
python tools/analysis_tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
Examples:
# Plot the classification loss of a run.
python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
# Plot the classification and regression losses of a run and save the figure to a PDF.
python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
# Compare the bbox mAP of two runs in the same figure.
python tools/analysis_tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --legend run1 run2
# Compute the average training speed.
python tools/analysis_tools/analyze_logs.py cal_train_time log.json [--include-outliers]
The output is expected to look like the following.
-----Analyze train time of work_dirs/some_exp/20190611_192040.log.json-----
slowest epoch 11, average time is 1.2024
fastest epoch 1, average time is 1.1909
time std over epochs is 0.0028
average iter time: 1.1959 s/iter
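For readers who want to post-process a log themselves, here is a minimal sketch of the same kind of statistic. It assumes the log is a JSON-lines file whose training records carry `mode`, `epoch` and `time` keys, and that the first logged value of each epoch is the outlier being excluded by default; `average_iter_time` and the toy records are hypothetical.

```python
import json
from collections import defaultdict
from statistics import mean


def average_iter_time(log_lines, include_outliers=False):
    """Average the logged time value per training epoch."""
    per_epoch = defaultdict(list)
    for line in log_lines:
        record = json.loads(line)
        if record.get('mode') == 'train' and 'time' in record:
            per_epoch[record['epoch']].append(record['time'])
    averages = {}
    for epoch, times in per_epoch.items():
        # Assumption: the first logged value of an epoch is the outlier.
        kept = times if include_outliers or len(times) == 1 else times[1:]
        averages[epoch] = mean(kept)
    return averages


toy_log = [
    '{"env_info": "first line without a mode key"}',
    '{"mode": "train", "epoch": 1, "iter": 50, "time": 3.0}',
    '{"mode": "train", "epoch": 1, "iter": 100, "time": 1.0}',
    '{"mode": "train", "epoch": 1, "iter": 150, "time": 2.0}',
    '{"mode": "val", "epoch": 1, "iter": 10}',
    '{"mode": "train", "epoch": 2, "iter": 50, "time": 3.0}',
    '{"mode": "train", "epoch": 2, "iter": 100, "time": 0.5}',
]
averages = average_iter_time(toy_log)
slowest = max(averages, key=averages.get)
fastest = min(averages, key=averages.get)
print(averages, slowest, fastest)
```

With a real log, the lines would come from iterating over the opened log file instead of the toy list.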
tools/misc/browse_dataset.py helps users browse a detection dataset visually (both images and bounding box annotations), or save the images to a specified directory.
python tools/misc/browse_dataset.py ${CONFIG} [-h] [--skip-type ${SKIP_TYPE[SKIP_TYPE...]}] [--output-dir ${OUTPUT_DIR}] [--not-show] [--show-interval ${SHOW_INTERVAL}]