YOLO发展至YOLOv3时,基本上这个系列都达到了一个高潮阶段,很多实际任务中,都会见到YOLOv3的身上,而对于较为简单和场景,比如没有太密集的目标和极端小的目标,多数时候仅用YOLOv2即可。除了YOLO系列,也还有其他很多优秀的工作,比如结构同样简洁的RetinaNet和SSD。后者SSD其实也会常在实际任务中见到,只不过就性能而言,要略差于YOLOv3,当然,这也是因为SSD并没有去做后续的升级,反倒很多新工作如RFB-Net、DSSD等工作都将其作为baseline。论性能,RetinaNet当然是不输于YOLOv3的,只是,相较于YOLOv3,RetinaNet的一个较为致命的问题就是:速度太慢。而这一个问题的主要原因就是RetinaNet使用较大的输出图像尺寸和较重的检测头。
yolov4的创新点
1.输入端的创新点:训练时对输入端的改进,主要包括Mosaic数据增强、cmBN、SAT自对抗训练
2.BackBone主干网络:各种方法技巧结合起来,包括:CSPDarknet53、Mish激活函数、Dropblock
3.Neck:目标检测网络在BackBone和最后的输出层之间往往会插入一些层,比如Yolov4中的SPP模块、FPN+PAN结构
4.Head:输出层的锚框机制和Yolov3相同,主要改进的是训练时的回归框位置损失函数CIOU_Loss,以及预测框筛选的nms变为DIOU_nms
通俗的讲,就是说这个YOLO-v4算法是在原有YOLO目标检测架构的基础上,采用了近些年CNN领域中最优秀的优化策略,从数据处理、主干网络、网络训练、激活函数、损失函数等各个方面都有着不同程度的优化,虽没有理论上的创新,但是会受到许许多多的工程师的欢迎,各种优化算法的尝试。文章如同于目标检测的trick综述,效果达到了实现FPS与Precision平衡的目标检测 new baseline。
yolov4 网络结构的采用的算法,其中保留了yolov3的head部分,修改了主干网络为CSPDarknet53,同时采用了SPP(空间金字塔池化)的思想来扩大感受野,PANet作为neck部分。
yolov4在技术处理的思维导图:
clc;
clear;
close all;
warning off;
addpath(genpath(pwd));
%****************************************************************************
%更多关于matlab和fpga的搜索“fpga和matlab”的CSDN博客:
%matlab/FPGA项目开发合作
%https://blog.csdn.net/ccsss22?type=blog
%****************************************************************************
%% Download Pretrained Network
% Set the modelName from the above ones to download that pretrained model.
modelName = 'YOLOv4-coco';
model = helper.downloadPretrainedYOLOv4(modelName);
net = model.net;
%% Load Data
% Unzip the vehicle images and load the vehicle ground truth data.
unzip vehicleDatasetImages.zip
data = load('vehicleDatasetGroundTruth.mat');
vehicleDataset = data.vehicleDataset;
% Add the full path to the local vehicle data folder.
vehicleDataset.imageFilename = fullfile(pwd, vehicleDataset.imageFilename);
rng('default')
shuffledIndices = randperm(height(vehicleDataset));
idx = floor(0.6 * length(shuffledIndices));
trainingDataTbl = vehicleDataset(shuffledIndices(1:idx), :);
testDataTbl = vehicleDataset(shuffledIndices(idx+1:end), :);
% Create an image datastore for loading the images.
imdsTrain = imageDatastore(trainingDataTbl.imageFilename);
imdsTest = imageDatastore(testDataTbl.imageFilename);
% Create a datastore for the ground truth bounding boxes.
bldsTrain = boxLabelDatastore(trainingDataTbl(:, 2:end));
bldsTest = boxLabelDatastore(testDataTbl(:, 2:end));
% Combine the image and box label datastores.
trainingData = combine(imdsTrain, bldsTrain);
testData = combine(imdsTest, bldsTest);
helper.validateInputData(trainingData);
helper.validateInputData(testData);
%% Data Augmentation
augmentedTrainingData = transform(trainingData, @helper.augmentData);
augmentedData = cell(4,1);
for k = 1:4
data = read(augmentedTrainingData);
augmentedData{k} = insertShape(data{1,1}, 'Rectangle', data{1,2});
reset(augmentedTrainingData);
end
figure
montage(augmentedData, 'BorderSize', 10)
%% Preprocess Training Data
% Specify the network input size.
networkInputSize = net.Layers(1).InputSize;
preprocessedTrainingData = transform(augmentedTrainingData, @(data)helper.preprocessData(data, networkInputSize));
% Read the preprocessed training data.
data = read(preprocessedTrainingData);
% Display the image with the bounding boxes.
I = data{1,1};
bbox = data{1,2};
annotatedImage = insertShape(I, 'Rectangle', bbox);
annotatedImage = imresize(annotatedImage,2);
figure
imshow(annotatedImage)
% Reset the datastore.
reset(preprocessedTrainingData);
%% Modify Pretrained YOLO v4 Network
rng(0)
trainingDataForEstimation = transform(trainingData, @(data)helper.preprocessData(data, networkInputSize));
numAnchors = 9;
[anchorBoxes, meanIoU] = estimateAnchorBoxes(trainingDataForEstimation, numAnchors);
% Specify the classNames to be used in the training.
classNames = {'vehicle'};
[lgraph, networkOutputs, anchorBoxes, anchorBoxMasks] = configureYOLOv4(net, classNames, anchorBoxes, modelName);
%% Specify Training Options
numEpochs = 90;
miniBatchSize = 4;
learningRate = 0.001;
warmupPeriod = 1000;
l2Regularization = 0.001;
penaltyThreshold = 0.5;
velocity = [];
%% Train Model
if canUseParallelPool
dispatchInBackground = true;
else
dispatchInBackground = false;
end
mbqTrain = minibatchqueue(preprocessedTrainingData, 2,...
"MiniBatchSize", miniBatchSize,...
"MiniBatchFcn", @(images, boxes, labels) helper.createBatchData(images, boxes, labels, classNames), ...
"MiniBatchFormat", ["SSCB", ""],...
"DispatchInBackground", dispatchInBackground,...
"OutputCast", ["", "double"]);
% Convert layer graph to dlnetwork.
net = dlnetwork(lgraph);
% Create subplots for the learning rate and mini-batch loss.
fig = figure;
[lossPlotter, learningRatePlotter] = helper.configureTrainingProgressPlotter(fig);
iteration = 0;
% Custom training loop.
for epoch = 1:numEpochs
reset(mbqTrain);
shuffle(mbqTrain);
while(hasdata(mbqTrain))
iteration = iteration + 1;
[XTrain, YTrain] = next(mbqTrain);
% Evaluate the model gradients and loss using dlfeval and the
% modelGradients function.
[gradients, state, lossInfo] = dlfeval(@modelGradients, net, XTrain, YTrain, anchorBoxes, anchorBoxMasks, penaltyThreshold, networkOutputs);
% Apply L2 regularization.
gradients = dlupdate(@(g,w) g + l2Regularization*w, gradients, net.Learnables);
% Determine the current learning rate value.
currentLR = helper.piecewiseLearningRateWithWarmup(iteration, epoch, learningRate, warmupPeriod, numEpochs);
% Update the network learnable parameters using the SGDM optimizer.
[net, velocity] = sgdmupdate(net, gradients, velocity, currentLR);
% Update the state parameters of dlnetwork.
net.State = state;
% Display progress.
if mod(iteration,10)==1
helper.displayLossInfo(epoch, iteration, currentLR, lossInfo);
end
% Update training plot with new points.
helper.updatePlots(lossPlotter, learningRatePlotter, iteration, currentLR, lossInfo.totalLoss);
end
end
% Save the trained model with the anchors.
anchors.anchorBoxes = anchorBoxes;
anchors.anchorBoxMasks = anchorBoxMasks;
save('yolov4_trained', 'net', 'anchors');
%% Evaluate Model
confidenceThreshold = 0.5;
overlapThreshold = 0.5;
% Create a table to hold the bounding boxes, scores, and labels returned by
% the detector.
numImages = size(testDataTbl, 1);
results = table('Size', [0 3], ...
'VariableTypes', {'cell','cell','cell'}, ...
'VariableNames', {'Boxes','Scores','Labels'});
% Run detector on images in the test set and collect results.
reset(testData)
while hasdata(testData)
% Read the datastore and get the image.
data = read(testData);
image = data{1};
% Run the detector.
executionEnvironment = 'auto';
[bboxes, scores, labels] = detectYOLOv4(net, image, anchors, classNames, executionEnvironment);
% Collect the results.
tbl = table({bboxes}, {scores}, {labels}, 'VariableNames', {'Boxes','Scores','Labels'});
results = [results; tbl];
end
% Evaluate the object detector using Average Precision metric.
[ap, recall, precision] = evaluateDetectionPrecision(results, testData);
% The precision-recall (PR) curve shows how precise a detector is at varying
% levels of recall. Ideally, the precision is 1 at all recall levels.
% Plot precision-recall curve.
figure
plot(recall, precision)
xlabel('Recall')
ylabel('Precision')
grid on
title(sprintf('Average Precision = %.2f', ap))
%% Detect Objects Using Trained YOLO v4
reset(testData)
data = read(testData);
% Get the image.
I = data{1};
% Run the detector.
executionEnvironment = 'auto';
[bboxes, scores, labels] = detectYOLOv4(net, I, anchors, classNames, executionEnvironment);
% Display the detections on image.
if ~isempty(scores)
I = insertObjectAnnotation(I, 'rectangle', bboxes, scores);
end
figure
imshow(I)
资源