# Create a virtual environment
conda create -n abcNet python=3.6.2
# Activate the virtual environment
source activate abcNet
# Install PyTorch 1.8
pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
# Check that the GPU is available
(abcNet):~$ python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information
>>> import torch
>>> torch.cuda.is_available()
True
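The same check can be run non-interactively. The following is a minimal sketch; the function name `cuda_report` is hypothetical, and it relies only on `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`.

```python
def cuda_report():
    """Return a small dict describing the PyTorch/CUDA setup of this environment."""
    try:
        import torch  # imported lazily so the script still runs if torch is missing
    except ImportError:
        return {"torch": None, "cuda_build": None, "cuda_available": False}
    return {
        "torch": torch.__version__,          # e.g. a "+cu101" build for the install above
        "cuda_build": torch.version.cuda,    # CUDA version the wheel was built against
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If `cuda_available` is False even though a GPU is present, the usual suspects are a CPU-only wheel or a driver that is too old for the CUDA build.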
# Install the required libraries
pip install pythran scikit-image
pip install ninja yacs cython matplotlib tqdm opencv-python shapely scipy tensorboardX pyclipper Polygon3 editdistance
# Install detectron2 locally
git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
git checkout -f 9eb4831
cd ..
python -m pip install -e detectron2
If the following error appears:
Attempting uninstall: certifi
Found existing installation: certifi 2016.9.26
ERROR: Cannot uninstall 'certifi'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall
go to /home/XXX/anaconda3/envs/abcNet/lib/python3.6/site-packages, delete the certifi-related files, and then run the installation again.
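Before deleting anything by hand, it can help to list what is actually there. A minimal sketch, assuming the leftover entries all have names starting with `certifi` (the helper name `find_certifi_entries` is hypothetical, and the script must be run with the Python interpreter of the abcNet environment):

```python
import sysconfig
from pathlib import Path


def find_certifi_entries(site_packages):
    """List files and directories in site-packages whose names start with 'certifi'."""
    root = Path(site_packages)
    return sorted(p for p in root.iterdir() if p.name.lower().startswith("certifi"))


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the interpreter running this script.
    for entry in find_certifi_entries(sysconfig.get_paths()["purelib"]):
        print(entry)
```

After reviewing the printed entries, remove them and rerun `python -m pip install -e detectron2`.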
Download the AdelaiDet package from https://github.com/aim-uofa/AdelaiDet.
git clone https://github.com/aim-uofa/AdelaiDet.git
cd AdelaiDet
# Install AdelaiDet
python setup.py build develop
According to the official documentation:
This is a python3 example showing how to build a custom dataset for abcnet training. The example image and annotation are from CTW1500 dataset (https://github.com/Yuliang-Liu/Curve-Text-Detector/tree/master/data)
Step one: Given polygonal annotation, generating bezier curve annotation.
python Bezier_generator2.py
Step two: Given bezier curve annotation, generating coco-like annotation format for training abcnet.
python generate_abcnet_json.py ./ train 0
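We first need to convert the polygon annotation files into Bezier curve files, and then merge the Bezier curve files into a COCO-format json file.
ICDAR 2015 annotations contain only 4 corner points, whereas the Bezier generator expects 10 points (5 on the top edge and 5 on the bottom edge). We therefore use linear interpolation to insert v1, v2, v3 between ICDAR points 1 and 2, and v4, v5, v6 between ICDAR points 3 and 4.
The code is as follows: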
import math
import numpy as np
from shapely.geometry import Polygon

# fin, factor, data, cts and polys are defined by the surrounding official script.
for il, line in enumerate(fin):
    # ICDAR 2015 line format: x1,y1,x2,y2,x3,y3,x4,y4,transcription
    parts = line.rstrip('\n').lstrip('\ufeff').split(',')
    icdar_line = list(map(int, parts[:8]))
    # Top edge: insert three points between corner 1 and corner 2 (t = 1/4, 1/2, 3/4)
    x1 = icdar_line[0] / 4 * 3 + icdar_line[2] / 4
    y1 = icdar_line[1] / 4 * 3 + icdar_line[3] / 4
    x2 = icdar_line[0] / 2 + icdar_line[2] / 2
    y2 = icdar_line[1] / 2 + icdar_line[3] / 2
    x3 = icdar_line[0] / 4 + icdar_line[2] / 4 * 3
    y3 = icdar_line[1] / 4 + icdar_line[3] / 4 * 3
    new_line = [icdar_line[0], icdar_line[1]] + [x1, y1, x2, y2, x3, y3] + [icdar_line[2], icdar_line[3]]
    # Bottom edge: insert three points between corner 3 and corner 4
    x4 = icdar_line[4] / 4 * 3 + icdar_line[6] / 4
    y4 = icdar_line[5] / 4 * 3 + icdar_line[7] / 4
    x5 = icdar_line[4] / 2 + icdar_line[6] / 2
    y5 = icdar_line[5] / 2 + icdar_line[7] / 2
    x6 = icdar_line[4] / 4 + icdar_line[6] / 4 * 3
    y6 = icdar_line[5] / 4 + icdar_line[7] / 4 * 3
    # Re-join the remainder so that commas inside the transcription are kept
    text = ','.join(parts[8:])
    new_line = new_line + [icdar_line[4], icdar_line[5]] + [x4, y4, x5, y5, x6, y6] + [icdar_line[6], icdar_line[7]]
    # Rescale the coordinates by the script's factor and append the transcription
    new_line = [int(v / math.sqrt(factor)) for v in new_line] + [text]
    line = new_line
    if len(line[:-1]) != 20: continue
    ct = line[-1]
    # '###' marks "do not care" regions in ICDAR 2015; skip them
    if ct == '###': continue
    coords = [(float(line[:-1][ix]), float(line[:-1][ix+1])) for ix in range(0, len(line[:-1]), 2)]
    poly = Polygon(coords)
    data.append(np.array([float(x) for x in line[:-1]]))
    cts.append(ct)
    polys.append(poly)
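All twelve hand-written expressions above are the same linear interpolation evaluated at t = 1/4, 1/2 and 3/4. A compact sketch of that idea is shown below; `lerp_points` and `quad_to_20` are hypothetical helper names, and the rescaling by `factor` from the official script is deliberately left out.

```python
def lerp_points(p, q, n_inner=3):
    """Return p, then n_inner evenly spaced points between p and q, then q."""
    steps = n_inner + 1
    return [
        (p[0] + (q[0] - p[0]) * k / steps, p[1] + (q[1] - p[1]) * k / steps)
        for k in range(steps + 1)
    ]


def quad_to_20(coords):
    """Expand 8 ICDAR values (4 corners) into 20 values (5 top + 5 bottom points)."""
    c1, c2, c3, c4 = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    points = lerp_points(c1, c2) + lerp_points(c3, c4)
    return [v for pt in points for v in pt]


if __name__ == "__main__":
    # Axis-aligned 8x4 box: corners in ICDAR order (clockwise from top-left)
    print(quad_to_20([0, 0, 8, 0, 8, 4, 0, 4]))
```

The point order is unchanged: the top edge runs from corner 1 to corner 2 and the bottom edge from corner 3 to corner 4, as in the loop above.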
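Next, call the official Bezier_generator2_txt_totaltext.py to convert the ICDAR 2015 txt files into Bezier txt files. The converted file looks as follows:
[Figure: sample of a converted Bezier txt file]
The visualized result is as follows:
[Figure: visualization of the Bezier annotations]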
Finally, call the official generate_abcnet_json.py to generate the COCO-format dataset used for training.
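Before training, a quick sanity check of the generated json can catch conversion mistakes early. The sketch below relies only on the standard COCO top-level keys `images` and `annotations`; the function names `summarize_coco` and `load_and_summarize` are hypothetical.

```python
import json


def summarize_coco(coco):
    """Count images/annotations and report annotations that point at unknown images."""
    image_ids = {img["id"] for img in coco.get("images", [])}
    annotations = coco.get("annotations", [])
    orphans = [a["id"] for a in annotations if a["image_id"] not in image_ids]
    return {
        "images": len(image_ids),
        "annotations": len(annotations),
        "orphan_annotation_ids": orphans,
    }


def load_and_summarize(json_path):
    """Load a COCO-style json file from disk and summarize it."""
    with open(json_path, encoding="utf-8") as f:
        return summarize_coco(json.load(f))


if __name__ == "__main__":
    # Tiny in-memory example; annotation 11 refers to an image that does not exist.
    demo = {
        "images": [{"id": 1}],
        "annotations": [{"id": 10, "image_id": 1}, {"id": 11, "image_id": 2}],
    }
    print(summarize_coco(demo))
```

For a real file, call `load_and_summarize` with the path of the json written by generate_abcnet_json.py; an empty `orphan_annotation_ids` list and non-zero counts are what one would hope to see.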
The modified dataset conversion code and the ABCNet model trained on Total-Text can be downloaded from https://download.csdn.net/download/lhe159324/85055738.
The ABCNet-related files in AdelaiDet-master are organized as follows:
AdelaiDet-master/
|-- AdelaiDet.egg-info/
|-- adet/
|   |-- config/
|   |   |-- defaults.py            # default configuration
|   |-- data/
|   |   |-- builtin.py             # dataset path configuration
|   |-- modeling/
|   |   |-- roi_heads/
|   |   |   |-- __init__.py
|   |   |   |-- text_head.py       # BezierAlign + recognizer
|   |   |   |-- attn_predictor.py  # recognizer
|   |   |-- one_stage_detector.py  # overall network structure
|   |   |-- poolers.py             # BezierPooler
|   |-- utils/
|   |   |-- visualizer.py          # visualization of predictions
|-- build/                         # build output directory
|-- configs/
|   |-- BAText/
|   |   |-- TotalText/
|   |   |   |-- attn_R_50.yaml     # ABCNet config file
|   |   |   |-- Base-TotalText.yaml
|   |   |   |-- v2_attn_R_50.yaml  # ABCNetV2 config file
|-- datasets/
|-- demo/
|   |-- demo.py                    # model test demo
|   |-- predictor.py               # loads the model and runs prediction
|-- docker/
|-- docs/
|-- onnx/
|-- output/
|   |-- batext/
|   |   |-- totaltext/
|   |   |   |-- attn_R_50/         # model output directory
|-- tools/
|   |-- train_net.py               # training script
|-- setup.py                       # AdelaiDet build script
|-- README.md
Edit AdelaiDet-master/adet/data/builtin.py: add the icdar dataset and set the datasets path to the local directory where the dataset is stored.
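Assuming builtin.py keeps its text datasets in a dictionary that maps a dataset name to an (image folder, annotation json) pair relative to the datasets root, the added entries might look like the sketch below. The folder and file names, the dictionary name `ICDAR_SPLITS` and both helper functions are hypothetical; only the dataset names `icdar_train` and `icdar_test` come from the yaml used later.

```python
import os

# Hypothetical entries; the keys must match DATASETS.TRAIN / DATASETS.TEST in the yaml.
ICDAR_SPLITS = {
    "icdar_train": ("icdar2015/train_images", "icdar2015/annotations/train.json"),
    "icdar_test": ("icdar2015/test_images", "icdar2015/annotations/test.json"),
}


def resolve_splits(splits, root="datasets"):
    """Join each relative (image_dir, json_file) pair with the datasets root."""
    return {
        name: (os.path.join(root, image_dir), os.path.join(root, json_file))
        for name, (image_dir, json_file) in splits.items()
    }


def missing_paths(splits, root="datasets"):
    """Return every resolved path that does not exist on disk."""
    resolved = resolve_splits(splits, root)
    return [p for pair in resolved.values() for p in pair if not os.path.exists(p)]


if __name__ == "__main__":
    for path in missing_paths(ICDAR_SPLITS):
        print("missing:", path)
```

Running the check from the AdelaiDet-master folder before training gives an early warning if the image folders or json files are not where the registration expects them.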
Edit AdelaiDet-master/configs/BAText/TotalText/attn_R_50.yaml and add the icdar dataset.
The attn_R_50.yaml file is annotated below:
_BASE_: "Base-TotalText.yaml"  # base config file
DATASETS:  # datasets
  TRAIN: ("icdar_train",)
  TEST: ("icdar_test",)
INPUT:  # minimum and maximum image sizes for training/testing
  MIN_SIZE_TRAIN: (600,)
  MAX_SIZE_TRAIN: 760
  MIN_SIZE_TEST: 600
  MAX_SIZE_TEST: 760
MODEL:
  WEIGHTS: "AdelaiDet-master/tt_e2e_attn_R_50.pth"  # path of the weights to load
  RESNETS:
    DEPTH: 50
  BATEXT:
    RECOGNIZER: "attn"  # "attn" "rnn"
SOLVER:
  IMS_PER_BATCH: 1  # training batch size
  BASE_LR: 0.001  # initial learning rate
  MAX_ITER: 5000  # number of training iterations
  CHECKPOINT_PERIOD: 1000  # checkpoint saving period (iterations)
TEST:
  EVAL_PERIOD: 1000  # evaluation period (iterations)
OUTPUT_DIR: "output/batext/totaltext/attn_R_50"  # model output directory
Change into the AdelaiDet-master folder and run
OMP_NUM_THREADS=1 python tools/train_net.py --config-file configs/BAText/TotalText/attn_R_50.yaml
to start training.
Problem:
RuntimeError: Default process group has not been initialized,
please make sure to call init_process_group.
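Solution: the ABCNet paper trains on multiple GPUs, and SyncBN synchronizes batch statistics across processes, which requires an initialized distributed process group. With only a single local GPU no process group is created, so "SyncBN" must be changed to "BN" in AdelaiDet-master/configs/BAText/TotalText/v2_attn_R_50.yaml and AdelaiDet-master/adet/config/defaults.py.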
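If the string occurs in several places, the substitution can be scripted rather than done by hand. A minimal sketch, assuming a plain text replacement is sufficient; `syncbn_to_bn` is a hypothetical helper and the `NORM` key in the sample line is only illustrative:

```python
import re


def syncbn_to_bn(text):
    """Replace every standalone 'SyncBN' token with 'BN'; return (new_text, count)."""
    # \b keeps longer identifiers such as 'NaiveSyncBN' untouched.
    return re.subn(r"\bSyncBN\b", "BN", text)


if __name__ == "__main__":
    sample = 'NORM: "SyncBN"\n'
    new_text, count = syncbn_to_bn(sample)
    print(count, new_text, end="")
```

To apply it to a file, read the file's text, pass it through `syncbn_to_bn`, and write the result back only if the returned count is greater than zero.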
An example is shown below: