Start with the author's own introductions:
NanoDet-Plus introduction on Zhihu (in Chinese)
NanoDet introduction on Zhihu (in Chinese)
conda create -n nanodet python=3.8 -y
conda activate nanodet
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c conda-forge
git clone https://github.com/RangiLyu/nanodet.git
cd nanodet
pip install -r requirements.txt
python setup.py develop
This example ultimately uses COCO-format annotation files. A VOC-to-COCO conversion script is provided below.
import json
import os
import xml.etree.ElementTree as ET

from tqdm import tqdm

# Class names of your dataset. COCO category ids follow this order, starting from 1.
class_names = ["cat", "bird", "dog"]


def voc2coco(data_dir, train_path, val_path):
    # Collect the VOC xml files of the training and validation sets.
    train_xmls = [os.path.join(train_path, f) for f in os.listdir(train_path) if f.endswith('.xml')]
    val_xmls = [os.path.join(val_path, f) for f in os.listdir(val_path) if f.endswith('.xml')]
    print('got xmls')
    train_coco = xml2coco(train_xmls)
    val_coco = xml2coco(val_xmls)
    # Write each split to its own json file.
    with open(os.path.join(data_dir, 'coco_train.json'), 'w', encoding='utf-8') as f:
        json.dump(train_coco, f, ensure_ascii=False, indent=2)
    with open(os.path.join(data_dir, 'coco_val.json'), 'w', encoding='utf-8') as f:
        json.dump(val_coco, f, ensure_ascii=False, indent=2)
    print('done')


def xml2coco(xmls):
    coco_anno = {'info': {}, 'images': [], 'licenses': [], 'annotations': [], 'categories': []}
    coco_anno['categories'] = [{'supercategory': j, 'id': i + 1, 'name': j} for i, j in enumerate(class_names)]
    img_id = 0
    anno_id = 0
    for fxml in tqdm(xmls):
        try:
            tree = ET.parse(fxml)
            objects = tree.findall('object')
        except (ET.ParseError, OSError):
            print('err xml file: ', fxml)
            continue
        if len(objects) < 1:
            print('no object in ', fxml)
            continue
        img_id += 1
        size = tree.find('size')
        ih = int(float(size.find('height').text))
        iw = int(float(size.find('width').text))
        # The image is assumed to share the xml file's base name and to be a .jpg.
        img_name = os.path.splitext(os.path.basename(fxml))[0] + '.jpg'
        img_info = {}
        img_info['id'] = img_id
        img_info['file_name'] = img_name
        img_info['height'] = ih
        img_info['width'] = iw
        coco_anno['images'].append(img_info)
        for obj in objects:
            cls_name = obj.find('name').text
            # Skip labels that are not in class_names (the original script skipped "water").
            if cls_name not in class_names:
                print('skip class: ', cls_name)
                continue
            bbox = obj.find('bndbox')
            x1 = float(bbox.find('xmin').text)
            y1 = float(bbox.find('ymin').text)
            x2 = float(bbox.find('xmax').text)
            y2 = float(bbox.find('ymax').text)
            if x2 < x1 or y2 < y1:
                print('bbox not valid: ', fxml)
                continue
            anno_id += 1
            # COCO boxes are [x, y, width, height].
            bb = [x1, y1, x2 - x1, y2 - y1]
            category_id = class_names.index(cls_name) + 1
            area = (x2 - x1) * (y2 - y1)
            anno_info = {}
            anno_info['segmentation'] = []
            anno_info['area'] = area
            anno_info['image_id'] = img_id
            anno_info['bbox'] = bb
            anno_info['iscrowd'] = 0
            anno_info['category_id'] = category_id
            anno_info['id'] = anno_id
            coco_anno['annotations'].append(anno_info)
    return coco_anno


if __name__ == '__main__':
    save_dir = './datasets/annotations'  # where the json files are saved
    train_dir = './datasets/annotations/train/'  # directory of the training-set xml files
    val_dir = './datasets/annotations/val/'  # directory of the validation-set xml files
    voc2coco(save_dir, train_dir, val_dir)
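Before training, it can be worth checking the generated json for obvious problems. The following is only a minimal sketch, not part of NanoDet: `check_coco` is a hypothetical helper that takes a COCO-style dict in the shape produced by the script above and reports duplicate ids, unknown category or image ids, and boxes that are empty or fall outside the image.

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img['id']: img for img in coco['images']}
    cat_ids = {cat['id'] for cat in coco['categories']}
    if len(images) != len(coco['images']):
        problems.append('duplicate image ids')
    seen_anno_ids = set()
    for anno in coco['annotations']:
        if anno['id'] in seen_anno_ids:
            problems.append(f"duplicate annotation id {anno['id']}")
        seen_anno_ids.add(anno['id'])
        if anno['category_id'] not in cat_ids:
            problems.append(f"annotation {anno['id']}: unknown category_id {anno['category_id']}")
        img = images.get(anno['image_id'])
        if img is None:
            problems.append(f"annotation {anno['id']}: unknown image_id {anno['image_id']}")
            continue
        # COCO boxes are [x, y, width, height].
        x, y, w, h = anno['bbox']
        if w <= 0 or h <= 0:
            problems.append(f"annotation {anno['id']}: empty box")
        elif x < 0 or y < 0 or x + w > img['width'] or y + h > img['height']:
            problems.append(f"annotation {anno['id']}: box outside image")
    return problems
```

To use it, load coco_train.json or coco_val.json with `json.load` and pass the result to `check_coco`; an empty list means none of these checks failed.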
The final dataset layout is as follows:
-datasets
|--images
| |--train
| | |--00001.jpg
| | |--00004.jpg
| | |--...
| |--val
| | |--00002.jpg
| | |--00003.jpg
| | |--...
|--annotations
| |--coco_train.json
| |--coco_val.json
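With this layout, every `file_name` in a json file should exist in the matching image folder. The sketch below is one way to confirm that. `find_missing_images` is a hypothetical helper name, and it assumes the images sit directly inside the folder, with no subdirectories.

```python
import json
import os


def find_missing_images(ann_path, img_dir):
    """Return file names listed in a COCO json that are absent from img_dir."""
    with open(ann_path, encoding='utf-8') as f:
        coco = json.load(f)
    present = set(os.listdir(img_dir))
    return [img['file_name'] for img in coco['images'] if img['file_name'] not in present]
```

For example, `find_missing_images('./datasets/annotations/coco_train.json', './datasets/images/train')` should return an empty list when the training split is complete.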
Taking nanodet-m-416.yml as an example, the main parts to change for your own dataset are:
model:
  head:
    num_classes: 3  # number of classes in the dataset
data:
  train:
    img_path: F:/datasets/images/train  # training image directory
    ann_path: F:/datasets/annotations/coco_train.json  # training annotation json
  val:
    img_path: F:/datasets/images/val  # validation image directory
    ann_path: F:/datasets/annotations/coco_val.json  # validation annotation json
device:
  gpu_ids: [0]  # GPU ids
  workers_per_gpu: 8  # number of data-loading workers
  batchsize_per_gpu: 60  # batch size
schedule:
  total_epochs: 280  # total number of epochs
  val_intervals: 10  # evaluate on the validation set every 10 epochs
class_names: ["cat", "bird", "dog"]  # dataset class names
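An easy mistake when editing the config is to change `class_names` but leave `num_classes` at its old value. The sketch below checks that the two agree. It works on an already-loaded config dict and searches for `num_classes` at any depth, so it makes no assumption about where the head sits in the file. `find_key` and `check_num_classes` are hypothetical helper names, not NanoDet functions.

```python
def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested dict/list."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def check_num_classes(cfg):
    """Return True if at least one num_classes exists and all equal len(class_names)."""
    expected = len(cfg['class_names'])
    found = list(find_key(cfg, 'num_classes'))
    return bool(found) and all(n == expected for n in found)
```

Assuming PyYAML is installed, the dict can be obtained with `yaml.safe_load` on the opened yml file and then passed to `check_num_classes`.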
python tools/train.py config/legacy_v0.x_configs/nanodet-m-416.yml
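If training is interrupted and you need to continue from where it stopped, first uncomment the resume and load_model lines in nanodet-m-416.yml and fill in the path to model_last.ckpt (after uncommenting, check that both lines are indented correctly). Then run python tools/train.py config/legacy_v0.x_configs/nanodet-m-416.yml again.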
schedule:
  resume:
  load_model: F:/nanodet/workspace/nanodet_m_416/model_last.ckpt
  optimizer:
    name: SGD
    lr: 0.14
    momentum: 0.9
    weight_decay: 0.0001
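Error (the Chinese part of the message means "The paging file is too small for this operation to complete"):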
OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "F:\Anaconda3\envs\ nanodet\lib\site-packages\torch\lib\shm.dll" or one of its dependencies.
Solution: reduce the number of data-loading workers, workers_per_gpu, in the config file, or set it to 0 to disable parallel data loading.
TensorBoard logs are saved under ./nanodet/workspace/nanodet_m_416. To view them, run:
tensorboard --logdir=./nanodet/workspace/nanodet_m_416
Method 1:
python demo/demo.py image --config config/legacy_v0.x_configs/nanodet-m-416.yml --model nanodet_m_416.ckpt --path test.jpg
Method 2:
Run the notebook demo\demo-inference-with-pytorch.ipynb (in its code, change from demo.demo import Predictor to from demo import Predictor).
1) Create a new folder named assets under F:\nanodet\demo_android_ncnn\app\src\main;
2) Copy nanodet-plus-m_416.bin and nanodet-plus-m_416.param from F:\nanodet\demo_android_ncnn\app\src\main\cpp\ncnn-20211208-android-vulkan to F:\nanodet\demo_android_ncnn\app\src\main\assets, and rename them to nanodet.bin and nanodet.param;
3) (Optional) Download the YOLOv4 and YOLOv5 ncnn models into F:\nanodet\demo_android_ncnn\app\src\main\assets.
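Open the F:\nanodet\demo_android_ncnn folder in Android Studio and select the SDK Platforms that match your Android version. Note that NDK version 21.0.6113669 must be installed. Otherwise you will get an error like "No version of NDK matched the requested version 21.0.6113669. Versions available locally: 21.3.6528147". (For detailed steps, see Section 1.2 of my earlier post, "[On-Device Object Detection 01] Deploying YOLOX to Android with NCNN".)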
Deployment result:
python tools/export_onnx.py --cfg_path config\legacy_v0.x_configs\nanodet-m-416.yml --model_path nanodet_m_416.ckpt
Then convert the exported ONNX model to ncnn with the online converter at https://convertmodel.com/
Place the converted bin and param files in the assets folder. You can either rename them to nanodet.bin and nanodet.param, or change the file names in this line of jni_interface.cpp:
NanoDet::detector = new NanoDet(mgr, "nanodet_self-sim-opt.param", "nanodet_self-sim-opt.bin", useGPU);
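If you choose the renaming route, the copy can be scripted so the names always match what the app expects. This is only a small sketch using the standard library. `install_ncnn_model` is a hypothetical helper, and the source file names and the assets path are whatever you have on your own machine.

```python
import os
import shutil


def install_ncnn_model(param_src, bin_src, assets_dir):
    """Copy converted ncnn files into assets_dir as nanodet.param / nanodet.bin."""
    os.makedirs(assets_dir, exist_ok=True)
    param_dst = os.path.join(assets_dir, 'nanodet.param')
    bin_dst = os.path.join(assets_dir, 'nanodet.bin')
    shutil.copyfile(param_src, param_dst)
    shutil.copyfile(bin_src, bin_dst)
    return param_dst, bin_dst
```

Note that any nanodet.param and nanodet.bin already in the assets folder are overwritten, so keep a backup of the stock model if you still need it.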
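I trained my own model with nanodet-m-416.yml and changed the hyperparameters in nanodet.h as described in the official documentation. Both make project and run app completed without errors, but detection is wrong when the app runs on the phone (the predicted classes are not the classes of my own dataset). I have not yet found the cause.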