Arbitrary-Oriented Scene Text Detection with RRPN, Implemented in PyTorch

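The material in this post comes from GitHub; I edited it and first published it on CSDN. It is shared for technical purposes only, not for commercial use. You are welcome to follow my official account: gbxiao992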

RRPN_pytorch

This is a PyTorch version of RRPN, implemented on top of Facebook's benchmark: https://github.com/facebookresearch/maskrcnn-benchmark .
Its Caffe version is available at: https://github.com/mjq11302010044/RRPN .

Highlights

  • From the original repo: runs on PyTorch 1.0 and is somewhat faster than the original repo in both training and inference.
  • Training and evaluation checked: tested on IC15 with training data from {IC13, IC15, IC17mlt}, reaching an F-score of 83% vs. 81% for the Caffe repo.
  • What's new: RRoI Pooling is replaced with RRoI Align (bilinear interpolation for sampling), the FPN structure is supported, and the backbone is easy to swap for different purposes.
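The RRoI Align idea mentioned above (sampling a rotated region with bilinear interpolation instead of rounding to the nearest cell) can be illustrated with a minimal pure-Python sketch. This is not the repo's CUDA kernel: the function names `bilinear_sample` and `rroi_align`, the use of one sample per output bin, and the angle convention (degrees, rotation about the box centre) are all assumptions made for illustration.

```python
import math


def bilinear_sample(feature, x, y):
    """Bilinearly interpolate a 2-D list `feature[row][col]` at real-valued (x, y)."""
    height, width = len(feature), len(feature[0])
    # Clamp so that the four neighbours always lie inside the map.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def rroi_align(feature, box, out_h, out_w):
    """Pool a rotated box (cx, cy, w, h, theta) into an out_h x out_w grid.

    Assumptions: theta is in degrees, and each bin takes a single sample
    at its centre.
    """
    cx, cy, w, h, theta = box
    cos_t = math.cos(math.radians(theta))
    sin_t = math.sin(math.radians(theta))
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Bin centre in the box's own unrotated frame (origin at box centre).
            u = (j + 0.5) / out_w * w - w / 2.0
            v = (i + 0.5) / out_h * h - h / 2.0
            # Rotate into feature-map coordinates, then sample.
            x = cx + u * cos_t - v * sin_t
            y = cy + u * sin_t + v * cos_t
            row.append(bilinear_sample(feature, x, y))
        pooled.append(row)
    return pooled


# A 4x4 feature map whose value equals its column index.
feature_map = [[float(c) for c in range(4)] for _ in range(4)]
pooled = rroi_align(feature_map, (1.5, 1.5, 2.0, 2.0, 0.0), 2, 2)
print(pooled)  # [[1.0, 2.0], [1.0, 2.0]]
```

Because the sample positions are real-valued, a small change in the box angle moves the samples smoothly. That is the practical difference from a pooling scheme that snaps coordinates to integer cells.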

Installation

Requirements

  • PyTorch 1.0 from a nightly release. Installation instructions can be found at https://pytorch.org/get-started/locally/
  • torchvision from master
  • cocoapi
  • yacs
  • matplotlib
  • GCC >= 4.9
  • (optional) OpenCV for the webcam demo
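Before building, it can help to confirm that the Python-side requirements listed above are importable in the active environment. The sketch below does that with the standard library only. The import names (`torch`, `torchvision`, `pycocotools`, `yacs`, `matplotlib`, `cv2`) are the usual module names for these packages, the helper `missing_modules` is a hypothetical name, and the GCC version is not checked.

```python
import importlib.util

# Usual import names for the packages in the requirements list.
REQUIRED = ["torch", "torchvision", "pycocotools", "yacs", "matplotlib"]
OPTIONAL = ["cv2"]  # OpenCV, only needed for the webcam demo


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing_required = missing_modules(REQUIRED)
missing_optional = missing_modules(OPTIONAL)
print("missing required modules:", missing_required or "none")
print("missing optional modules:", missing_optional or "none")
```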

Option 1: Step-by-step installation

# first, make sure that your conda is set up properly with the right environment
# for that, check that `which conda`, `which pip` and `which python` point to the
# right path. From a clean conda env, this is what you need to do

conda create --name rrpn_pytorch
source activate rrpn_pytorch

# this installs the right pip and dependencies for the fresh python
conda install ipython

# maskrcnn_benchmark and coco api dependencies
pip install ninja yacs cython matplotlib

# follow PyTorch installation in https://pytorch.org/get-started/locally/
# we give the instructions for CUDA 9.0
conda install pytorch-nightly -c pytorch

# install torchvision
cd ~/github
git clone https://github.com/pytorch/vision.git
cd vision
python setup.py install

# install pycocotools
cd ~/github
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install

# install PyTorch Detection
cd ~/github
git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
cd maskrcnn-benchmark
# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop

#-------
python rotation_setup.py install
mv build/lib/rotation/*.so ./rotation
#-------

# or if you are on macOS
# MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build develop

Option 2: Docker image (requires CUDA, Linux only)

  • Build the image with the default parameters (CUDA=9.0, CUDNN=7):
nvidia-docker build -t maskrcnn-benchmark docker/
  • Build the image with other CUDA and CUDNN versions:
nvidia-docker build -t maskrcnn-benchmark --build-arg CUDA=9.2 --build-arg CUDNN=7 docker/

  • Build and run the image with Jupyter notebook (note that a notebook password must be set):
nvidia-docker build -t maskrcnn-benchmark-jupyter docker/docker-jupyter/
nvidia-docker run -td -p 8888:8888 -e PASSWORD=<password> -v <host-dir>:<container-dir> maskrcnn-benchmark-jupyter

Dataset configuration

  • Your dataset paths can be set in $RRPN_ROOT/maskrcnn_benchmark/config/paths_catalog.py. Common interfaces are implemented for {IC13, IC15, IC17mlt, LSVT, ArT} (starting at line 96):
"RRPN_train": {  # including IC13 and IC15
            'dataset_list':{
                # 'IC13': 'Your dataset path',
                ...
            },
            "split": 'train'
        },
  • Want to add your own dataset? You need to build a list of dicts, each shaped as follows:
im_info = {
    'gt_classes': your class_id array,
    'max_classes': your class_id array,
    'image': path to access one image,
    'boxes': rotated boxes in {cx, cy, w, h, θ},
    'flipped': not supported, just False,
    'gt_overlaps': overlaps fill with 1 (gt with gt),
    'seg_areas': H * W for an rbox,
    'height': height of an image,
    'width': width of an image,
    'max_overlaps': overlaps fill with 1 (gt with gt),
    'rotated': just True
}
  • You can find examples in $RRPN_ROOT/maskrcnn_benchmark/data/rotation_series.py. Your data API should be added to the DATASET variable:
DATASET = {
    'IC13':get_ICDAR2013,
    'IC15':get_ICDAR2015_RRC_PICK_TRAIN,
    'IC17mlt':get_ICDAR2017_mlt,
    ...
    'Your Dataset Name': 'Your Dataset API'
}
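The two steps above (building one record per image, then registering a loader under a dataset name) might look roughly like the following sketch. Everything named here is hypothetical: `make_im_info`, `get_my_dataset`, its `(mode, dataset_dir)` signature, the file name `img_1.jpg`, and the sample box are illustrative only. Plain Python lists stand in for whatever array type the repo's loaders actually use, and the shape chosen for the overlap fields is an assumption.

```python
import os


def make_im_info(image_path, width, height, boxes, class_ids):
    """Build one record with the keys listed above.

    `boxes` is a list of (cx, cy, w, h, theta) tuples and `class_ids`
    holds one class id per box.
    """
    return {
        "gt_classes": list(class_ids),
        "max_classes": list(class_ids),
        "image": image_path,
        "boxes": [list(box) for box in boxes],
        "flipped": False,                          # flipping is not supported
        "gt_overlaps": [1.0] * len(boxes),         # assumption: one entry per box
        "seg_areas": [box[2] * box[3] for box in boxes],  # w * h of each rbox
        "height": height,
        "width": width,
        "max_overlaps": [1.0] * len(boxes),
        "rotated": True,
    }


def get_my_dataset(mode, dataset_dir):
    """Hypothetical loader returning a list of im_info dicts.

    A real loader would walk `dataset_dir` and parse the annotation files;
    here a single hard-coded record keeps the sketch self-contained.
    """
    image_path = os.path.join(dataset_dir, "img_1.jpg")  # placeholder file name
    boxes = [(120.0, 80.0, 200.0, 40.0, -15.0)]
    return [make_im_info(image_path, 640, 480, boxes, [1])]


# Stand-in for the DATASET variable; in the real file the new entry
# would be added next to the existing ones.
DATASET = {
    "MyDataset": get_my_dataset,
}

records = DATASET["MyDataset"]("train", "/path/to/my_dataset")
print(records[0]["boxes"], records[0]["seg_areas"])
```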

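Training

# Run from your RRPN root directory:
python tools/train_net.py --config-file=configs/rrpn/e2e_rrpn_R_50_C4_1x_ICDAR13_15_17_trial.yaml
  • Multi-GPU training has not been tested. If your machine has several GPUs, it is recommended to use no more than one.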
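One common way to keep a training run on a single GPU is to set the CUDA environment variable `CUDA_VISIBLE_DEVICES` for the child process. The sketch below wraps the training command from this section that way. The helper names `single_gpu_env`, `train_command`, and `launch` are hypothetical, and the repo itself does not prescribe this approach.

```python
import os
import subprocess
import sys

CONFIG = "configs/rrpn/e2e_rrpn_R_50_C4_1x_ICDAR13_15_17_trial.yaml"


def single_gpu_env(gpu_index=0, base_env=None):
    """Copy the environment and expose only one GPU to the child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def train_command(config=CONFIG):
    """The training command from this section, using the current interpreter."""
    return [sys.executable, "tools/train_net.py", "--config-file=" + config]


def launch(gpu_index=0):
    """Start training on one GPU; call this from the RRPN root directory."""
    return subprocess.run(train_command(), env=single_gpu_env(gpu_index), check=True)


print(" ".join(train_command()[1:]))
```

Calling `launch(0)` from the RRPN root directory would start training with only GPU 0 visible to PyTorch.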

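Testing

  • Use $RRPN_ROOT/demo/RRPN_Demo.py to test your own images. The demo writes the detection results to a txt file.
  • If you want to display the images with the detections drawn on them, set the variable vis to True.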
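When inspecting or drawing detections, a rotated box in (cx, cy, w, h, θ) form usually has to be turned into its four corner points first. The sketch below shows one way to do that. The angle convention (degrees, rotation about the box centre) and the comma-separated output line are assumptions for illustration, not a description of what RRPN_Demo.py actually writes. `rbox_to_corners` and `corners_to_line` are hypothetical names.

```python
import math


def rbox_to_corners(cx, cy, w, h, theta_deg):
    """Return the four corners of a rotated box as a list of (x, y) tuples.

    Assumption: theta is in degrees and rotates the box about its centre.
    In image coordinates (y pointing down) a positive angle looks clockwise.
    """
    cos_t = math.cos(math.radians(theta_deg))
    sin_t = math.sin(math.radians(theta_deg))
    # Corner offsets in the box's own unrotated frame.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + u * cos_t - v * sin_t, cy + u * sin_t + v * cos_t) for u, v in offsets]


def corners_to_line(corners):
    """Format corners as 'x1,y1,x2,y2,x3,y3,x4,y4' with integer coordinates."""
    return ",".join(str(int(round(value))) for point in corners for value in point)


corners = rbox_to_corners(100, 50, 40, 20, 0)
print(corners_to_line(corners))  # 80,40,120,40,120,60,80,60
```

The resulting corner list can be passed to any polygon-drawing routine once converted to that routine's expected point format.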

Citation

@misc{ma2019rrpn,
    author = {Jianqi Ma},
    title = {{RRPN in pytorch}},
    year = {2019},
    howpublished = {\url{https://github.com/mjq11302010044/RRPN_pytorch}},
}
@article{Jianqi17RRPN,
    Author = {Jianqi Ma and Weiyuan Shao and Hao Ye and Li Wang and Hong Wang and Yingbin Zheng and Xiangyang Xue},
    Title = {Arbitrary-Oriented Scene Text Detection via Rotation Proposals},
    journal = {IEEE Transactions on Multimedia},
    volume={20}, 
    number={11}, 
    pages={3111-3122}, 
    year={2018}
}
