Faster R-CNN and Mask R-CNN in PyTorch 1.0
This project aims at providing the necessary building blocks for easily
creating detection and segmentation models using PyTorch 1.0.
How to install, 安装教程 (Suppose you have installed PyTorch 1.0)
安装依赖项
sudopip installtorchvision ninja yacs cython matplotlib tqdm
安装pycocotools(COCO数据解析依赖)
cd ~/github
git clone https://github.com/cocodataset/cocoapi.git
cdcocoapi/PythonAPI
sudopython setup.py build_ext install
安装maskrcnn-benchmark
git clone https://github.com/jario-jin/maskrcnn-benchmark.git
cdmaskrcnn-benchmark
sudopython setup.py build develop
Download benchmark models from chinese cloud, 百度网盘下载预训练权重文件
R-50.pkl, R-101.pkl为训练新数据集所需, 其他为COCO数据集训练好的模型, 将下载好的权重放到同一个目录下
修改maskrcnn-benchmark/maskrcnn_benchmark/config/defaults.py中268行
_C.MODEL_DIR = "权重存放的路径"
Label your data, 标注自己的数据
下载标注软件:
设置标注工作空间, 打开图像(如果数据为视频,可通过tools->video2image转换), 选择标注类型,即可开始
方框标注, 用于Faster R-CNN
实例分割标注, 用于Mask R-CNN
标注完成后, 按Ctrl+O, 确定,将结果按COCO形式输出
Start Training, 开始训练
以Faster R-CNN为例
修改maskrcnn-benchmark/configs/e2e_faster_rcnn_R_50_FPN_1x.yaml
DATASETS:
TRAIN: ("BB180913_vis_drone_train",) # 标注工具中save path的文件夹名, 用于训练
TEST: ("BB180913_vis_drone_val",) # 同上, 用于测试
DATA_DIR: "标注工具中, save path的父路径"
OUTPUT_DIR: "训练结果保存的路径"
在终端的maskrcnn-benchmark路径下, 执行
python tools/train_net.py --config-file configs/e2e_faster_rcnn_R_50_FPN_1x.yaml
How to deploy, 那么如何利用摄像头查看模型效果呢
将训练好的权重放到上面自己定义的模型目录中,修改对应的部署配置文件,以Faster R-CNN R-50为例
maskrcnn-benchmark/configs/caffe2/e2e_faster_rcnn_R_50_FPN_1x_caffe2.yaml, 修改第2行
WEIGHT: "catalog://ModelDir/训练好的权重文件名称.pth"
打开maskrcnn-benchmark/demo/webcam.py, 修改第15行, 指向刚才的部署配置, 运行命令
cddemo
python webcam.py --min-image-size 300
如果发现标签不对应, 需要修改maskrcnn-benchmark/demo/predictor.py中的标签名称
Detection DEMO
Highlights
PyTorch 1.0: RPN, Faster R-CNN and Mask R-CNN implementations that matches or exceeds Detectron accuracies
Very fast: up to 2x faster than Detectron and 30% faster than mmdetection during training. See MODEL_ZOO.md for more details.
Memory efficient: uses roughly 500MB less GPU memory than mmdetection during training
Multi-GPU training and inference
Batched inference: can perform inference using multiple images per batch per GPU
CPU support for inference: runs on CPU in inference time. See our webcam demo for an example
Provides pre-trained models for almost all reference Mask R-CNN and Faster R-CNN configurations with 1x schedule.
Webcam and Jupyter notebook demo
We provide a simple webcam demo that illustrates how you can use maskrcnn_benchmark for inference:
cddemo
# by default, it runs on the GPU
# for best results, use min-image-size 800
python webcam.py --min-image-size 800
# can also run it on the CPU
python webcam.py --min-image-size 300 MODEL.DEVICE cpu
# or change the model that you want to use
python webcam.py --config-file ../configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu
# in order to see the probability heatmaps, pass --show-mask-heatmaps
python webcam.py --min-image-size 300 --show-mask-heatmaps MODEL.DEVICE cpu
A notebook with the demo can be found in demo/Mask_R-CNN_demo.ipynb.
Installation
Check INSTALL.md for installation instructions.
Model Zoo and Baselines
Pre-trained models, baselines and comparison with Detectron and mmdetection
can be found in MODEL_ZOO.md
Inference in a few lines
We provide a helper class to simplify writing inference pipelines using pre-trained models.
Here is how we would do it. Run this from the demo folder:
from maskrcnn_benchmark.config import cfg
from predictor import COCODemo
config_file = "../configs/caffe2/e2e_mask_rcnn_R_50_FPN_1x_caffe2.yaml"
# update the config options with the config file
cfg.merge_from_file(config_file)
# manual override some options
cfg.merge_from_list(["MODEL.DEVICE", "cpu"])
coco_demo = COCODemo(
cfg,
min_image_size=800,
confidence_threshold=0.7,
)
# load image and then run prediction
image = ...
predictions = coco_demo.run_on_opencv_image(image)
Perform training on COCO dataset
For the following examples to work, you need to first install maskrcnn_benchmark.
You will also need to download the COCO dataset.
We recommend to symlink the path to the coco dataset to datasets/ as follows
We use minival and valminusminival sets from Detectron
# symlink the coco dataset
cd ~/github/maskrcnn-benchmark
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2014 datasets/coco/train2014
ln -s /path_to_coco_dataset/test2014 datasets/coco/test2014
ln -s /path_to_coco_dataset/val2014 datasets/coco/val2014
You can also configure your own paths to the datasets.
For that, all you need to do is to modify maskrcnn_benchmark/config/paths_catalog.py to
point to the location where your dataset is stored.
You can also create a new paths_catalog.py file which implements the same two classes,
and pass it as a config argument PATHS_CATALOG during training.
Single GPU training
Most of the configuration files that we provide assume that we are running on 8 GPUs.
In order to be able to run it on fewer GPUs, there are a few possibilities:
1. Run the following without modifications
python /path_to_maskrnn_benchmark/tools/train_net.py --config-file "/path/to/config/file.yaml"
This should work out of the box and is very similar to what we should do for multi-GPU training.
But the drawback is that it will use much more GPU memory. The reason is that we set in the
configuration files a global batch size that is divided over the number of GPUs. So if we only
have a single GPU, this means that the batch size for that GPU will be 8x larger, which might lead
to out-of-memory errors.
If you have a lot of memory available, this is the easiest solution.
2. Modify the cfg parameters
If you experience out-of-memory errors, you can reduce the global batch size. But this means that
you'll also need to change the learning rate, the number of iterations and the learning rate schedule.
Here is an example for Mask R-CNN R-50 FPN with the 1x schedule:
python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1
This follows the scheduling rules from Detectron.
Note that we have multiplied the number of iterations by 8x (as well as the learning rate schedules),
and we have divided the learning rate by 8x.
We also changed the batch size during testing, but that is generally not necessary because testing
requires much less memory than training.
Multi-GPU training
We use internally torch.distributed.launch in order to launch
multi-gpu training. This utility function from PyTorch spawns as many
Python processes as the number of GPUs we want to use, and each Python
process will only use a single GPU.
exportNGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml"
Abstractions
For more information on some of the main abstractions in our implementation, see ABSTRACTIONS.md.
Adding your own dataset
This implementation adds support for COCO-style datasets.
But adding support for training on a new dataset can be done as follows:
from maskrcnn_benchmark.structures.bounding_box import BoxList
class MyDataset(object):
def __init__(self, ...):
# as you would do normally
def __getitem__(self, idx):
# load the image as a PIL Image
image = ...
# load the bounding boxes as a list of list of boxes
# in this case, for illustrative purposes, we use
# x1, y1, x2, y2 order.
boxes = [[0, 0, 10, 10], [10, 20, 50, 50]]
# and labels
labels = torch.tensor([10, 20])
# create a BoxList from the boxes
boxlist = BoxList(boxes, image.size, mode="xyxy")
# add the labels to the boxlist
boxlist.add_field("labels", labels)
if self.transforms:
image, boxlist = self.transforms(image, boxlist)
# return the image, the boxlist and the idx in your dataset
return image, boxlist, idx
def get_img_info(self, idx):
# get img_height and img_width. This is used if
# we want to split the batches according to the aspect ratio
# of the image, as it can be more efficient than loading the
# image from disk
return {"height": img_height, "width": img_width}
That's it. You can also add extra fields to the boxlist, such as segmentation masks
(using structures.segmentation_mask.SegmentationMask), or even your own instance type.
For a full example of how the COCODataset is implemented, check maskrcnn_benchmark/data/datasets/coco.py.
Note:
While the aforementioned example should work for training, we leverage the
cocoApi for computing the accuracies during testing. Thus, test datasets
should currently follow the cocoApi for now.
Troubleshooting
If you have issues running or compiling this code, we have compiled a list of common issues in
TROUBLESHOOTING.md. If your issue is not present there, please feel
free to open a new issue.
License
maskrcnn-benchmark is released under the MIT license. See LICENSE for additional details.