Object Detection with Chainer: FasterRCNN

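Contents

  • Preface
  • I. Preparing the dataset
    • 1. Installing the annotation tool
    • 2. Preparing the dataset
    • 3. Annotating the data
    • 4. Contents of the xml file
  • II. Building FasterRCNN object detection with Chainer
    • 1. Importing third-party libraries
    • 2. Data loader
    • 3. Model construction
    • 4. Model code
      • 1. The main FasterRCNN part:
      • 2. The VGG part of FasterRCNN:
      • 3. The region_proposal part of FasterRCNN:
    • 5. Putting the code together
      • 1. Chainer initialization
      • 2. Dataset and model construction
      • 3. Model training
    • 6. Model prediction
  • III. Training and prediction code
  • IV. Results
  • Summary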


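Preface

  Put simply, object detection means finding the objects of interest in an image and marking where each one is. Many downstream applications rely on detection as a first recognition step before further processing, for example object tracking, counting objects, and checking whether something is present.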


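I. Preparing the dataset

  I use the pill images from the HALCON dataset. The first 100 images are annotated and the remaining 300 are used for testing. Of the 100 annotated images, 90 form the training set and 10 form the validation set.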

1. Installing the annotation tool

pip install labelimg

Open cmd and run labelimg. The annotation tool window appears:
[Figure: the labelimg annotation tool]

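2. Preparing the dataset

First create three folders:
[Figure: the three folders]
DataImage: the 100 images to annotate
DataLabel: an empty folder that will hold the annotation files generated by labelimg
test: the remaining 300 images, which need no annotation
The DataImage and test directories are laid out like this (DataImage shown as the example):
[Figure: contents of the DataImage directory]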

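3. Annotating the data

  First set the image directory and the annotation output directory in labelimg:
[Figure: setting the image and label directories in labelimg]
  Then memorize three shortcuts: w starts drawing a box, a goes to the previous image, d goes to the next image. These three are all the tool needs.
  To annotate, press w to enter box-drawing mode, draw a box on the image, and enter its label (the class the box belongs to). That completes the annotation of one object. An image may contain several boxes and several classes, but the annotations must never be ambiguous. If a kind of object is annotated in one image, it must be annotated wherever it appears in the other images too, and an object must always get the same label: do not label object A as A in one image and as B in the next. The finished result looks like this:
[Figure: an annotated image]
When annotation is finished, the annotation files appear in DataLabel in xml format:
[Figure: annotation files in DataLabel]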

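4. Contents of the xml file

[Figure: an example xml annotation file]
The xml annotation file is shown in the figure. Only the object elements are used, so parsing those is enough.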
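As a rough illustration of parsing the object elements, here is a minimal sketch using only the Python standard library. The sample annotation, the file name and the class names in it are made up for the example, and parse_objects is a hypothetical helper, not the loader used later in this article. It assumes the Pascal VOC layout that labelimg writes by default (name plus bndbox with xmin, ymin, xmax, ymax).

```python
import xml.etree.ElementTree as ET

# A tiny Pascal VOC style annotation, invented for illustration only.
SAMPLE_XML = """
<annotation>
  <filename>pill_001.png</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>pill</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>capsule</name>
    <bndbox><xmin>300</xmin><ymin>50</ymin><xmax>400</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""


def parse_objects(xml_text):
    """Return a list of (label, (ymin, xmin, ymax, xmax)) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.iter('object'):
        label = obj.find('name').text
        box = obj.find('bndbox')
        # Read the four corners in (y, x, y, x) order.
        coords = tuple(int(box.find(tag).text)
                       for tag in ('ymin', 'xmin', 'ymax', 'xmax'))
        objects.append((label, coords))
    return objects


objects = parse_objects(SAMPLE_XML)
for label, coords in objects:
    print(label, coords)
```

The (ymin, xmin, ymax, xmax) ordering is chosen here because the RoI code later in the article also works in y-first order; a loader that prefers x-first order only needs to change the tag tuple.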

II. Building FasterRCNN object detection with Chainer

1. Importing third-party libraries

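import numpy as np
import chainer, cv2, os, json, random, time, sys
sys.path.append('.')  # make the project-local data/ and nets/ packages importable
from chainer import optimizers
from chainer.datasets import TransformDataset
from chainer.training import StandardUpdater, extensions
# Project-local modules. Later snippets refer to CreateDataList_Detection and
# Detection_Transform, so the names imported here must match your own modules.
from data.data_loader import CreateDataList,VOCBboxDataset
from nets.faster_rcnn_vgg import FasterRCNNVGG16
from data.data_transform import Transform
from PIL import Image,ImageDraw,ImageFont
from chainercv.extensions import DetectionVOCEvaluator
from chainercv.links.model.faster_rcnn import FasterRCNNTrainChain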

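2. Data loader

  Read the data folders. The goal is to load every image together with its xml file and parse them into a custom format:
[Figure: reading the data folders and their contents]
  Here IMGDir is the directory containing the images, XMLDir is the directory containing the annotation files, and train_split=0.9 splits the data 9:1 into training and validation sets.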
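The 9:1 split described here can be sketched as follows. split_train_val and its seed parameter are hypothetical names for illustration; the article's own loader is only shown as an image and may shuffle or split differently.

```python
import random


def split_train_val(items, train_split=0.9, seed=0):
    """Shuffle a copy of items and split it into (train, val) lists."""
    items = list(items)
    # A private Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(items)
    n_train = int(round(len(items) * train_split))
    return items[:n_train], items[n_train:]


names = ['img_%03d' % i for i in range(100)]
train_list, val_list = split_train_val(names, train_split=0.9)
print(len(train_list), len(val_list))
```

Fixing the seed means the same images land in the validation set on every run, which keeps validation scores comparable between training runs.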


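  The data uses a YOLO-style format. Images and xml files are loaded only when an iteration needs them, which saves memory:
[Figure: the dataset class]
The data_list input is the value returned by the CreateDataList_Detection function above, already split into training and validation sets. The dataset is consumed through an iterator.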
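The idea of loading each sample only when it is requested can be sketched like this. LazyBboxDataset, the (image_path, xml_path) pair layout and the two stand-in loader functions are assumptions made for the example; they are not the article's VOCBboxDataset.

```python
class LazyBboxDataset:
    """Keep only file paths in memory; load a sample when it is indexed."""

    def __init__(self, data_list, load_image, load_annotation):
        # data_list: list of (image_path, xml_path) pairs (assumed layout)
        self.data_list = data_list
        self.load_image = load_image
        self.load_annotation = load_annotation

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, i):
        img_path, xml_path = self.data_list[i]
        img = self.load_image(img_path)                # disk read happens here
        bbox, label = self.load_annotation(xml_path)   # and here
        return img, bbox, label


# Stand-in loaders that record which files were actually read.
loaded = []


def fake_load_image(path):
    loaded.append(path)
    return 'pixels of ' + path


def fake_load_annotation(path):
    return [(0, 0, 10, 10)], [0]


dataset = LazyBboxDataset(
    [('a.png', 'a.xml'), ('b.png', 'b.xml')],
    fake_load_image, fake_load_annotation)
sample = dataset[1]
```

Only the sample that was indexed is read, so memory use is bounded by the batch size rather than by the size of the dataset.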


The following code performs data augmentation:
[Figure: the data augmentation code]
This step is optional during training.
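The augmentation code itself is only available as an image, so as one common example of such a step, here is a sketch of a horizontal flip that keeps the boxes in sync with the image. It assumes a CHW image and boxes in (y_min, x_min, y_max, x_max) order; flip_horizontal and random_flip are hypothetical helpers and may not match what the article's transform does.

```python
import random

import numpy as np


def flip_horizontal(img, bbox):
    """Mirror a CHW image and its (y_min, x_min, y_max, x_max) boxes left-right."""
    width = img.shape[2]
    flipped_img = img[:, :, ::-1]
    flipped_bbox = bbox.copy()
    # The old right edge becomes the new left edge and vice versa.
    flipped_bbox[:, 1] = width - bbox[:, 3]
    flipped_bbox[:, 3] = width - bbox[:, 1]
    return flipped_img, flipped_bbox


def random_flip(img, bbox, p=0.5):
    """Apply flip_horizontal with probability p, otherwise return the inputs."""
    if random.random() < p:
        return flip_horizontal(img, bbox)
    return img, bbox


img = np.arange(3 * 4 * 10, dtype=np.float32).reshape(3, 4, 10)
bbox = np.array([[1, 2, 3, 5]], dtype=np.float32)
flipped_img, flipped_bbox = flip_horizontal(img, bbox)
print(flipped_bbox)
```

Flipping twice returns the original image and boxes, which is a quick sanity check for any geometric augmentation.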

3. Model construction

This article uses the FasterRCNN object detection algorithm. Start with a diagram:
[Figure 1: FasterRCNN]

4. Model code

1. The main FasterRCNN part:

import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L
from nets.faster_rcnn import FasterRCNN
from nets.region_proposal_network import RegionProposalNetwork
from nets.vgg import VGG16

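def _roi_pooling_2d_yx(x, indices_and_rois, outh, outw, spatial_scale):
    # RoIs arrive as (batch_index, y_min, x_min, y_max, x_max);
    # F.roi_pooling_2d expects (batch_index, x_min, y_min, x_max, y_max).
    xy_indices_and_rois = indices_and_rois[:, [0, 2, 1, 4, 3]]
    pool = F.roi_pooling_2d(
        x, xy_indices_and_rois, outh, outw, spatial_scale)
    return pool
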
class VGG16RoIHead(chainer.Chain):
    def __init__(self, n_class,alpha, roi_size, spatial_scale,
                 vgg_initialW=None, loc_initialW=None, score_initialW=None):
        super(VGG16RoIHead, self).__init__()
        self.alpha = alpha
        with self.init_scope():
            self.fc6 = L.Linear(25088//self.alpha, 4096//self.alpha, initialW=vgg_initialW)
            self.fc7 = L.Linear(4096//self.alpha, 4096//self.alpha, initialW=vgg_initialW)
            self.cls_loc = L.Linear(4096//self.alpha, n_class * 4, initialW=loc_initialW)
            self.score = L.Linear(4096//self.alpha, n_class, initialW=score_initialW)

        self.n_class = n_class
        self.roi_size = roi_size
        self.spatial_scale = spatial_scale

    def forward(self, x, rois, roi_indices):
        # Prepend each RoI's batch index so pooling knows its source image
        roi_indices = roi_indices.astype(np.float32)
        indices_and_rois = self.xp.concatenate((roi_indices[:, None], rois), axis=1)
        pool = _roi_pooling_2d_yx(x, indices_and_rois, self.roi_size, self.roi_size, self.spatial_scale)

        # Two fully connected layers, then per-class box offsets and class scores
        fc6 = F.relu(self.fc6(pool))
        fc7 = F.relu(self.fc7(fc6))
        roi_cls_locs = self.cls_loc(fc7)
        roi_scores = self.score(fc7)
        return roi_cls_locs, roi_scores

class FasterRCNNVGG16(FasterRCNN):
    feat_stride = 16

    def __init__(self,
                 n_fg_class=None,
                 pretrained_model=None,alpha=1,
                 min_size=600, max_size=1000,
                 ratios=[0.5, 1, 2], anchor_scales=[8, 16, 32],
                 vgg_initialW=None, rpn_initialW=None,
                 loc_initialW=None, score_initialW=None,
                 proposal_creator_params={}):
        self.alpha=alpha
        if loc_initialW is None:
            loc_initialW = chainer.initializers.Normal(0.001)
        if score_initialW is None:
            score_initialW = chainer.initializers.Normal(0.01)
        if rpn_initialW is None:
            rpn_initialW = chainer.initializers.Normal(0.01)
        if vgg_initialW is None and pretrained_model:
            vgg_initialW = chainer.initializers.Zero()

        extractor = VGG16(alpha=self.alpha)
        rpn = RegionProposalNetwork(
            512//self.alpha, 512//self.alpha,
            ratios=ratios,
            anchor_scales=anchor_scales,
            feat_stride=self.feat_stride,
            initialW=rpn_initialW,
            proposal_creator_params=proposal_creator_params,
        )
        head = VGG16RoIHead(
            n_fg_class + 1,alpha=self.alpha,
            roi_size=7, spatial_scale=1. / self.feat_stride,
            vgg_initialW=vgg_initialW,
            loc_initialW=loc_initialW,
            score_initialW=score_initialW
        )
        head.fc6.copyparams(extractor.fc6)
        head.fc7.copyparams(extractor.fc7)
        extractor.pick = 'conv5_3'
        extractor.remove_unused()

        super(FasterRCNNVGG16, self).__init__(
            extractor,
            rpn,
            head,
            mean=np.array([122.7717, 115.9465, 102.9801], dtype=np.float32)[:, None, None],
            min_size=min_size,
            max_size=max_size
        )
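The alpha argument in the code above divides every layer width by the same factor. The short calculation below, a sketch with a hypothetical helper head_sizes, checks that the hard-coded 25088//alpha input size of fc6 agrees with the flattened RoI-pooled feature (channels × 7 × 7) for a given alpha.

```python
def head_sizes(alpha, roi_size=7, base_channels=512, base_fc=4096):
    """Return (conv5_3 channels, flattened fc6 input size, fc6/fc7 width)."""
    channels = base_channels // alpha          # output channels of conv5_3
    fc6_in = channels * roi_size * roi_size    # RoI-pooled feature, flattened
    fc_width = base_fc // alpha                # width of fc6 and fc7
    return channels, fc6_in, fc_width


for alpha in (1, 2, 4, 3):
    channels, fc6_in, fc_width = head_sizes(alpha)
    declared = 25088 // alpha   # the value passed to L.Linear in VGG16RoIHead
    print(alpha, channels, fc6_in, fc_width, fc6_in == declared)
```

For alpha values that divide 512 evenly (1, 2, 4, ...) the two sizes agree. For a value such as 3 they do not, so under this reading alpha is best kept to a power of two.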

2. The VGG part of FasterRCNN:

import numpy as np
from chainer.functions import dropout, max_pooling_2d, relu, softmax
from chainer.initializers import constant,normal
from chainer.links import Linear
from nets.connection.conv_2d_activ import Conv2DActiv
from nets.connection.pickable_sequential_chain import PickableSequentialChain

def _max_pooling_2d(x):
    return max_pooling_2d(x, ksize=2)

_imagenet_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)[:, np.newaxis, np.newaxis]

class VGG16(PickableSequentialChain):
    def __init__(self,
                 n_class=None,alpha=1, pretrained_model=None, mean=None,
                 initialW=None, initial_bias=None):
        self.mean = _imagenet_mean
        self.alpha=alpha

        if initialW is None:
            initialW = normal.Normal(0.01)
        if pretrained_model:
            initialW = constant.Zero()
        kwargs = {'initialW': initialW, 'initial_bias': initial_bias}

        super(VGG16, self).__init__()
        with self.init_scope():
            self.conv1_1 = Conv2DActiv(None, 64//self.alpha, 3, 1, 1, **kwargs)
            self.conv1_2 = Conv2DActiv(None, 64//self.alpha, 3, 1, 1, **kwargs)
            self.pool1 = _max_pooling_2d
            self.conv2_1 = Conv2DActiv(None, 128//self.alpha, 3, 1, 1, **kwargs)
            self.conv2_2 = Conv2DActiv(None, 128//self.alpha, 3, 1, 1, **kwargs)
            self.pool2 = _max_pooling_2d
            self.conv3_1 = Conv2DActiv(None, 256//self.alpha, 3, 1, 1, **kwargs)
            self.conv3_2 = Conv2DActiv(None, 256//self.alpha, 3, 1, 1, **kwargs)
            self.conv3_3 = Conv2DActiv(None, 256//self.alpha, 3, 1, 1, **kwargs)
            self.pool3 = _max_pooling_2d
            self.conv4_1 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.conv4_2 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.conv4_3 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.pool4 = _max_pooling_2d
            self.conv5_1 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.conv5_2 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.conv5_3 = Conv2DActiv(None, 512//self.alpha, 3, 1, 1, **kwargs)
            self.pool5 = _max_pooling_2d
            self.fc6 = Linear(None, 4096//self.alpha, **kwargs)
            self.fc6_relu = relu
            self.fc6_dropout = dropout
            self.fc7 = Linear(None, 4096//self.alpha, **kwargs)
            self.fc7_relu = relu
            self.fc7_dropout = dropout
            self.fc8 = Linear(None, n_class, **kwargs)
            self.prob = softmax

        # chainer.serializers.load_npz(pretrained_model, self)

3. The region_proposal part of FasterRCNN:

import numpy as np
import chainer
from chainer.backends import cuda
import chainer.functions as F
import chainer.links as L
from nets.utils.generate_anchor_base import generate_anchor_base
from nets.utils.proposal_creator import ProposalCreator

class RegionProposalNetwork(chainer.Chain):
    def __init__(
            self, in_channels=512, mid_channels=512, ratios=[0.5, 1, 2],
            anchor_scales=[8, 16, 32], feat_stride=16,
            initialW=None,
            proposal_creator_params={},
    ):
        self.anchor_base = generate_anchor_base(
            anchor_scales=anchor_scales, ratios=ratios)
        self.feat_stride = feat_stride
        self.proposal_layer = ProposalCreator(**proposal_creator_params)

        n_anchor = self.anchor_base.shape[0]
        super(RegionProposalNetwork, self).__init__()
        with self.init_scope():
            self.conv1 = L.Convolution2D(
                in_channels, mid_channels, 3, 1, 1, initialW=initialW)
            self.score = L.Convolution2D(
                mid_channels, n_anchor * 2, 1, 1, 0, initialW=initialW)
            self.loc = L.Convolution2D(
                mid_channels, n_anchor * 4, 1, 1, 0, initialW=initialW)

    def forward(self, x, img_size, scales=None):
        n, _, hh, ww = x.shape
        if scales is None:
            scales = [1.0] * n
        if not isinstance(scales, chainer.utils.collections_abc.Iterable):
            scales = [scales] * n

        anchor = _enumerate_shifted_anchor(
            self.xp.array(self.anchor_base), self.feat_stride, hh, ww)
        n_anchor = anchor.shape[0] // (hh * ww)
        h = F.relu(self.conv1(x))

        rpn_locs = self.loc(h)
        rpn_locs = rpn_locs.transpose((0, 2, 3, 1)).reshape((n, -1, 4))

        rpn_scores = self.score(h)
        rpn_scores = rpn_scores.transpose((0, 2, 3, 1))
        rpn_fg_scores =\
            rpn_scores.reshape((n, hh, ww, n_anchor, 2))[:, :, :, :, 1]
        rpn_fg_scores = rpn_fg_scores.reshape((n, -1))
        rpn_scores = rpn_scores.reshape((n, -1, 2))

        rois = []
        roi_indices = []
        for i in range(n):
            roi = self.proposal_layer(
                rpn_locs[i].array, rpn_fg_scores[i].array, anchor, img_size,
                scale=scales[i])
            batch_index = i * self.xp.ones((len(roi),), dtype=np.int32)
            rois.append(roi)
            roi_indices.append(batch_index)

        rois = self.xp.concatenate(rois, axis=0)
        roi_indices = self.xp.concatenate(roi_indices, axis=0)
        return rpn_locs, rpn_scores, rois, roi_indices, anchor


def _enumerate_shifted_anchor(anchor_base, feat_stride, height, width):
    xp = cuda.get_array_module(anchor_base)
    shift_y = xp.arange(0, height * feat_stride, feat_stride)
    shift_x = xp.arange(0, width * feat_stride, feat_stride)
    shift_x, shift_y = xp.meshgrid(shift_x, shift_y)
    shift = xp.stack((shift_y.ravel(), shift_x.ravel(),
                      shift_y.ravel(), shift_x.ravel()), axis=1)

    A = anchor_base.shape[0]
    K = shift.shape[0]
    anchor = anchor_base.reshape((1, A, 4)) + \
        shift.reshape((1, K, 4)).transpose((1, 0, 2))
    anchor = anchor.reshape((K * A, 4)).astype(np.float32)
    return anchor
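The code above imports generate_anchor_base from the project's nets.utils package without showing it. The sketch below is one common way to build such a base set of anchors in NumPy, under the assumption that anchors are centred on a base_size × base_size cell and stored as (y_min, x_min, y_max, x_max). sketch_anchor_base is a hypothetical name, and the project's own function may differ in detail.

```python
import numpy as np


def sketch_anchor_base(base_size=16, ratios=(0.5, 1, 2), anchor_scales=(8, 16, 32)):
    """Return a (len(ratios) * len(anchor_scales), 4) array of base anchors."""
    cy = cx = base_size / 2.0
    anchors = np.zeros((len(ratios) * len(anchor_scales), 4), dtype=np.float32)
    for i, ratio in enumerate(ratios):
        for j, scale in enumerate(anchor_scales):
            # ratio is height / width; the area stays (base_size * scale) ** 2.
            h = base_size * scale * np.sqrt(ratio)
            w = base_size * scale * np.sqrt(1.0 / ratio)
            anchors[i * len(anchor_scales) + j] = (
                cy - h / 2.0, cx - w / 2.0, cy + h / 2.0, cx + w / 2.0)
    return anchors


anchor_base = sketch_anchor_base()
hh, ww = 37, 50   # an example feature-map size
n_total = anchor_base.shape[0] * hh * ww   # anchors after shifting to every cell
print(anchor_base.shape, n_total)
```

With three ratios and three scales there are nine anchors per feature-map cell, which is why the RPN's score and loc convolutions output n_anchor * 2 and n_anchor * 4 channels.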


5. Putting the code together

1. Chainer initialization

self.image_size = image_size if (image_size == 300 or image_size == 512) else 300
# USEGPU is a string: '-1' selects the CPU, any other value is a GPU id
if USEGPU == '-1':
    self.gpu_devices = -1
else:
    self.gpu_devices = int(USEGPU)
    chainer.cuda.get_device_from_id(self.gpu_devices).use()

2. Dataset and model construction

train_data_list, self.val_data_list, self.classes_names = CreateDataList_Detection(os.path.join(DataDir,'DataImage'),os.path.join(DataDir,'DataLabel'),train_split)
        
faster_rcnn = FasterRCNNVGG16(n_fg_class=len(self.classes_names),min_size = self.min_size,max_size = self.max_size, alpha=self.alpha)
faster_rcnn.use_preset('evaluate')
self.model = FasterRCNNTrainChain(faster_rcnn)
if self.gpu_devices>=0:
    self.model.to_gpu()
        
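# Detection_Transform is a project-local class. In chainercv, coder and insize are
# SSD attributes that FasterRCNNTrainChain does not define, so adapt these
# arguments to whatever your own transform expects.
train = TransformDataset(VOCBboxDataset(data_list=train_data_list,classes_names=self.classes_names),Detection_Transform(self.model.coder, self.model.insize, self.model.mean))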
self.train_iter = chainer.iterators.SerialIterator(train, self.batch_size)
test = VOCBboxDataset(data_list=self.val_data_list,classes_names=self.classes_names)
self.test_iter = chainer.iterators.SerialIterator(test, self.batch_size, repeat=False, shuffle=False)

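3. Model training

As with the classification network, it helps to first understand how Chainer's training machinery works:
[Figure: structure of the Chainer Trainer]
  The figure shows that we first set up a Trainer, which can be thought of as the overall training module. Next comes an Updater, which ties the training data iterator and the optimizer together and runs the forward and backward passes that update the model parameters. Finally there are Extensions, which act as callbacks invoked during training, for example to evaluate the model, adjust the learning rate, or visualize predictions on the validation set.
  Following this structure step by step is usually enough to get training working. Each step is configured in turn below.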
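To make the division of labour between these three pieces concrete, here is a toy, framework-free sketch. ToyUpdater and ToyTrainer are invented for illustration and are not Chainer's classes. They only mimic the roles described above: the updater consumes batches and performs one update per call, and the trainer runs the updater and calls the extensions at the end of each epoch.

```python
class ToyUpdater:
    """Takes one batch per call and applies one update step to it."""

    def __init__(self, batches, step_fn):
        self.batches = batches
        self.step_fn = step_fn      # stands in for forward + backward + optimizer
        self.iteration = 0

    def update(self):
        batch = self.batches[self.iteration % len(self.batches)]
        self.step_fn(batch)
        self.iteration += 1

    @property
    def epoch(self):
        return self.iteration // len(self.batches)


class ToyTrainer:
    """Runs the updater until stop_epoch and fires extensions once per epoch."""

    def __init__(self, updater, stop_epoch):
        self.updater = updater
        self.stop_epoch = stop_epoch
        self.extensions = []

    def extend(self, fn):
        self.extensions.append(fn)

    def run(self):
        while self.updater.epoch < self.stop_epoch:
            before = self.updater.epoch
            self.updater.update()
            if self.updater.epoch != before:   # an epoch has just finished
                for ext in self.extensions:
                    ext(self)


seen = []   # batches the "optimizer" processed
log = []    # what the extension recorded
updater = ToyUpdater(batches=[[1, 2], [3, 4], [5, 6]], step_fn=seen.append)
trainer = ToyTrainer(updater, stop_epoch=2)
trainer.extend(lambda t: log.append(('evaluate', t.updater.epoch)))
trainer.run()
print(len(seen), log)
```

In the real library the extensions additionally carry their own triggers (for example every N epochs or at a manual schedule), which is what the trigger arguments in the following configuration control.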

Set up the optimizer:

optimizer = optimizers.MomentumSGD(lr=learning_rate, momentum=0.9)
optimizer.setup(self.model)  # self.model is the FasterRCNNTrainChain built above
optimizer.add_hook(chainer.optimizer.WeightDecay(rate=0.0005))

Set up the updater and trainer:

updater = StandardUpdater(self.train_iter, optimizer, device=self.gpu_devices)
trainer = chainer.training.Trainer(updater, (TrainNum, 'epoch'), out=ModelPath)

Configure the extensions:

# Adjust the learning rate
trainer.extend(
    extensions.ExponentialShift('lr', 0.9, init=learning_rate),
    trigger=chainer.training.triggers.ManualScheduleTrigger([50,80,150,200,280,350], 'epoch'))
# Run the validation set once per epoch (the detector inside the train chain is evaluated)
trainer.extend(
    DetectionVOCEvaluator(self.test_iter, self.model.faster_rcnn, use_07_metric=True, label_names=self.classes_names),
    trigger=chainer.training.triggers.ManualScheduleTrigger([each for each in range(1,TrainNum)], 'epoch'))
# Visualize predictions on the validation set
trainer.extend(Detection_VIS(
    self.model, 
    self.val_data_list,
    self.classes_names, image_size=self.image_size,
    trigger=chainer.training.triggers.ManualScheduleTrigger([each for each in range(1,TrainNum)], 'epoch'), 
    device=self.gpu_devices,ModelPath=ModelPath,predict_score=0.5
))
# Save the model whenever validation mAP reaches a new maximum
trainer.extend(
    extensions.snapshot_object(self.model, 'Ctu_best_Model.npz'),
    trigger=chainer.training.triggers.MaxValueTrigger('validation/main/map',trigger=chainer.training.triggers.ManualScheduleTrigger([each for each in range(1,TrainNum)], 'epoch')),
)
# Logging and file output
log_interval = 0.1, 'epoch' 
trainer.extend(chainer.training.extensions.LogReport(filename='ctu_log.json',trigger=log_interval))
trainer.extend(chainer.training.extensions.observe_lr(), trigger=log_interval)
trainer.extend(extensions.dump_graph("main/loss", filename='ctu_net.net'))
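The learning-rate schedule configured above multiplies the rate by 0.9 each time one of the listed epochs is reached. The small calculation below works out the resulting value. lr_at_epoch is a hypothetical helper written for this illustration; it assumes each milestone fires once, at the end of that epoch.

```python
def lr_at_epoch(epoch, base_lr, rate=0.9,
                milestones=(50, 80, 150, 200, 280, 350)):
    """Learning rate in effect once `epoch` epochs have finished."""
    n_shifts = sum(1 for m in milestones if epoch >= m)
    return base_lr * rate ** n_shifts


for epoch in (0, 50, 100, 400):
    print(epoch, lr_at_epoch(epoch, base_lr=0.001))
```

With learning_rate=0.001 the rate never drops below 0.001 × 0.9^6, so this schedule is a gentle decay rather than the step-by-ten drops often used with SGD.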

Once everything is configured, a single line starts training:

trainer.run()

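6. Model prediction

  Prediction takes images in OpenCV format. The preprocessing only has to match what was done when the training data was loaded. The code:
[Figure: the prediction code]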
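Since the prediction code is only available as an image, here is a sketch of one preprocessing step that usually has to match between loading and prediction: converting an OpenCV image (height × width × channel, BGR, uint8) into the channel-first RGB float32 layout that ChainerCV's image readers produce. cv_to_chw_rgb is a hypothetical helper, and whether the article's loader uses exactly this layout is an assumption.

```python
import numpy as np


def cv_to_chw_rgb(img_bgr):
    """Convert an (H, W, 3) BGR uint8 image to a (3, H, W) RGB float32 array."""
    img_rgb = img_bgr[:, :, ::-1]              # reverse channel order: BGR -> RGB
    img_chw = img_rgb.transpose(2, 0, 1)       # HWC -> CHW
    return np.ascontiguousarray(img_chw, dtype=np.float32)


# A fake 2x3 "OpenCV" image with B=1, G=2, R=3 in every pixel.
img_bgr = np.zeros((2, 3, 3), dtype=np.uint8)
img_bgr[:, :, 0] = 1
img_bgr[:, :, 1] = 2
img_bgr[:, :, 2] = 3
img = cv_to_chw_rgb(img_bgr)
print(img.shape, img.dtype)
```

If training read images as RGB but prediction feeds BGR (or the other way round), the model still runs but detection quality drops, so it is worth checking this step first when predictions look worse than the validation results.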

III. Training and prediction code

  Because the code is written as a class, calling it is straightforward:

# ctu = Ctu_FasterRCNN(USEGPU='0',min_size=600,max_size=1000)
# ctu.InitModel(r'/home/ctu/Ctu_Project/DL_Project/DataDir/DataSet_Detection_YaoPian',train_split=0.9, batch_size=1,Pre_Model='./result_Model/Ctu_best_Model.npz',alpha=4)
# ctu.train(TrainNum=500,learning_rate=0.001, ModelPath='result_Model1')

ctu = Ctu_FasterRCNN(USEGPU='0')
ctu.LoadModel('./result_Model')
predictNum=1
predict_cvs = []
cv2.namedWindow("result", 0)
cv2.resizeWindow("result", 640, 480)
for root, dirs, files in os.walk(r'E:\DL_Project\DataSet\DataSet_Detection\DataSet_Halcon_YaoPian/test'):
    for f in files:
        if len(predict_cvs) >=predictNum:
            predict_cvs.clear()
        img_cv = ctu.read_image(os.path.join(root, f))
        if img_cv is None:
            continue
        predict_cvs.append(img_cv)
        if len(predict_cvs) == predictNum:
            result = ctu.predict(predict_cvs,0.0)
            print(result['time'])
            for each_id in range(result['img_num']):
                for each_bbox in result['bboxes_result'][each_id]:
                    print(each_bbox)
                cv2.imshow("result", result['imges_result'][each_id])
                cv2.waitKey()

IV. Results

Visualizations produced during training on the validation set (data that took no part in training). They include the loss, mAP, and other metrics:
[Figure 2]
[Figure 3]

Summary

This article walked through the basic approach and steps for implementing FasterRCNN object detection with Chainer.
