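This example shows how to train a Very Deep Super-Resolution (VDSR) neural network, provides a pretrained VDSR network, and uses the VDSR network to estimate a high-resolution image from a single low-resolution image. Original example: Single Image Super-Resolution Using Deep Learning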
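Super-resolution is the process of creating high-resolution images from low-resolution images. This example considers single image super-resolution (SISR), where the goal is to recover one high-resolution image from one low-resolution image. SISR is an ill-posed problem, because a single low-resolution image is consistent with several possible high-resolution images.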
This example explores one deep learning algorithm for SISR, called Very Deep Super-Resolution (VDSR).
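VDSR is a convolutional neural network architecture designed to perform single image super-resolution. The VDSR network learns the mapping between low-resolution and high-resolution images. This mapping is possible because low-resolution and high-resolution images have similar image content and differ primarily in high-frequency details.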
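VDSR employs a residual learning strategy, meaning that the network learns to estimate a residual image. In the context of super-resolution, a residual image is the difference between a high-resolution reference image and a low-resolution image that has been upscaled with bicubic interpolation to match the size of the reference image.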
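The VDSR network estimates the residual image from the luminance of a color image. The luminance channel of an image represents the brightness of each pixel through a linear combination of the red, green, and blue pixel values. In contrast, the two chrominance channels of an image (Cb and Cr) are different linear combinations of the red, green, and blue pixel values that represent color-difference information. VDSR is trained using only the luminance channel because human perception is more sensitive to changes in brightness than to changes in color.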
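As a rough illustration of these linear combinations, the sketch below computes the luminance channel by hand and compares it with rgb2ycbcr. The weights are the ITU-R BT.601 coefficients that rgb2ycbcr is understood to apply to floating-point input; the image peppers.png is only a sample that ships with MATLAB and is an assumption, not part of the original example.

```matlab
% Sample RGB image scaled to [0,1]; peppers.png is an arbitrary stand-in.
I = im2double(imread('peppers.png'));
R = I(:,:,1);
G = I(:,:,2);
B = I(:,:,3);

% Luminance as a weighted sum of R, G, and B (BT.601 weights, offset 16/255).
Ymanual = (16 + 65.481*R + 128.553*G + 24.966*B) / 255;

% Reference conversion from Image Processing Toolbox.
Iycbcr = rgb2ycbcr(I);
Y = Iycbcr(:,:,1);

% Largest per-pixel disagreement between the two luminance images.
maxDifference = max(abs(Ymanual(:) - Y(:)));
disp(maxDifference)
```

If the assumption about the weights holds, the printed difference should be at the level of floating-point round-off.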
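If $Y_{\text{highres}}$ is the luminance of the high-resolution image and $Y_{\text{lowres}}$ is the luminance of the low-resolution image after upscaling with bicubic interpolation, then the input to the VDSR network is $Y_{\text{lowres}}$, and the network learns from the training data to predict $Y_{\text{residual}} = Y_{\text{highres}} - Y_{\text{lowres}}$.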
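The residual definition can be sketched in a few lines. This is a minimal illustration, not the createVDSRTrainingSet helper used later; the sample image peppers.png and the scale factor of 2 are assumptions chosen for the demonstration.

```matlab
% Luminance of a sample high-resolution image (stand-in for a training image).
I = im2double(imread('peppers.png'));
Iycbcr = rgb2ycbcr(I);
Yhighres = Iycbcr(:,:,1);

% Simulate a low-resolution image, then upscale it back with bicubic interpolation.
scaleFactor = 2;   % assumed value for illustration
Ysmall = imresize(Yhighres, 1/scaleFactor, 'bicubic');
Ylowres = imresize(Ysmall, size(Yhighres), 'bicubic');

% The residual is what the network is trained to predict from Ylowres.
Yresidual = Yhighres - Ylowres;
imshow(Yresidual, [])
```

Adding Yresidual back to Ylowres recovers Yhighres, which is the relationship the reconstruction step relies on.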
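After the VDSR network learns to estimate the residual image, you can reconstruct a high-resolution image by adding the estimated residual image to the upsampled low-resolution image and then converting the result back to the RGB color space.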
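A scale factor relates the size of the reference image to the size of the low-resolution image. As the scale factor increases, SISR reconstruction degrades because the low-resolution image loses more information about the high-frequency image content. VDSR addresses this problem by using a large receptive field. This example trains a VDSR network with multiple scale factors using scale augmentation. Scale augmentation improves the results at larger scale factors because the network can take advantage of the image context from smaller scale factors.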
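To get a feel for how the scale factor affects the amount of lost detail, the following sketch measures the mean absolute luminance residual for the three scale factors used in this example. The sample image peppers.png and the choice of mean absolute difference as a summary are assumptions made for illustration only.

```matlab
% Luminance of a sample image (stand-in for a training image).
I = im2double(imread('peppers.png'));
Iycbcr = rgb2ycbcr(I);
Yhighres = Iycbcr(:,:,1);

scaleFactors = [2 3 4];
meanAbsResidual = zeros(size(scaleFactors));
for k = 1:numel(scaleFactors)
    % Downscale by the scale factor, then upscale back to the original size.
    Ysmall = imresize(Yhighres, 1/scaleFactors(k), 'bicubic');
    Ylowres = imresize(Ysmall, size(Yhighres), 'bicubic');
    % Summarize how much detail bicubic interpolation failed to recover.
    meanAbsResidual(k) = mean(abs(Yhighres(:) - Ylowres(:)));
end
disp(table(scaleFactors', meanAbsResidual', ...
    'VariableNames', {'ScaleFactor','MeanAbsResidual'}))
```

A larger value indicates that more high-frequency content is missing from the upscaled image and has to be supplied by the network.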
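Download the IAPR TC-12 Benchmark, which consists of 20,000 still natural images [2]. The data set includes photos of people, animals, cities, and more. The size of the data file is approximately 1.8 GB. If you do not want to download the training data set, you can load the pretrained VDSR network by typing the command below at the command line, and then go directly to the part of this example that performs single image super-resolution using the VDSR network.
The command is: load('trainedVDSR-Epoch-100-ScaleFactors-234.mat')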
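% Download and extract the data set into the system temporary folder.
% downloadIAPRTC12Data is a helper function supplied with the example.
imagesDir = tempdir;
url = 'http://www-i6.informatik.rwth-aachen.de/imageclef/resources/iaprtc12.tgz';
downloadIAPRTC12Data(url,imagesDir);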
% Use the images in subfolder '02' as the training set.
trainImagesDir = fullfile(imagesDir,'iaprtc12','images','02');
exts = {'.jpg','.bmp','.png'};
pristineImages = imageDatastore(trainImagesDir,'FileExtensions',exts);
% Display the number of training images.
numel(pristineImages.Files)
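% Output folders for the network inputs (upsampled luminance) and targets (residuals).
upsampledDirName = [trainImagesDir filesep 'upsampledImages'];
residualDirName = [trainImagesDir filesep 'residualImages'];
% Generate training pairs for scale factors 2, 3, and 4 (scale augmentation).
% createVDSRTrainingSet and matRead are helper functions supplied with the example.
scaleFactors = [2 3 4];
createVDSRTrainingSet(pristineImages,scaleFactors,upsampledDirName,residualDirName);
% Read the generated MAT files back as image datastores.
upsampledImages = imageDatastore(upsampledDirName,'FileExtensions','.mat','ReadFcn',@matRead);
residualImages = imageDatastore(residualDirName,'FileExtensions','.mat','ReadFcn',@matRead);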
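% Augment the training data with random rotations of 0 or 90 degrees
% and random left-right reflections.
augmenter = imageDataAugmenter( ...
    'RandRotation',@()randi([0,1],1)*90, ...
    'RandXReflection',true);
% Extract 64 random 41-by-41 patches from each image pair.
patchSize = [41 41];
patchesPerImage = 64;
dsTrain = randomPatchExtractionDatastore(upsampledImages,residualImages,patchSize, ...
    "DataAugmentation",augmenter,"PatchesPerImage",patchesPerImage);
% Preview the first few input/response pairs.
inputBatch = preview(dsTrain);
disp(inputBatch)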
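Conceptually, each training pair is a window cropped at the same random location from an upsampled image and its residual image. The sketch below illustrates that idea only; it is not the implementation of randomPatchExtractionDatastore, and the placeholder arrays and variable names are assumptions.

```matlab
rng(0);                        % fixed seed so the sketch is repeatable
upsampled = rand(100,120);     % placeholder for an upsampled luminance image
residual = rand(100,120);      % placeholder for the matching residual image
patchSize = [41 41];

% Pick a random top-left corner such that the patch stays inside the image.
r = randi(size(upsampled,1) - patchSize(1) + 1);
c = randi(size(upsampled,2) - patchSize(2) + 1);
rows = r:(r + patchSize(1) - 1);
cols = c:(c + patchSize(2) - 1);

% Crop the same window from both images to form one input/response pair.
inputPatch = upsampled(rows, cols);
responsePatch = residual(rows, cols);
```

Using identical row and column ranges for both crops keeps each input patch aligned with its target residual.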
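This example defines the VDSR network using 41 individual layers from Deep Learning Toolbox: an image input layer, 2-D convolution layers, ReLU layers, and a regression layer.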
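The first layer, imageInputLayer, operates on image patches. The patch size is based on the network receptive field, which is the spatial image region that affects the response of the topmost layer in the network. Ideally, the network receptive field is the same as the image size so that the field can see all the high-level features in the image. In this case, for a network with D convolutional layers, the receptive field is (2D+1)-by-(2D+1). With the 20 convolutional layers used here, that is 41-by-41, which is why the training patches are 41-by-41.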
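The (2D+1) rule follows from stacking 3-by-3 convolutions with stride 1: each layer widens the region seen by one output pixel by two pixels. A small sketch of that arithmetic, using the depth and filter size that appear in the network definition:

```matlab
networkDepth = 20;   % number of convolutional layers, D
filterSize = 3;      % each convolution uses 3-by-3 filters

% Start from a single pixel; each stride-1 convolution grows the field by filterSize-1.
receptiveField = 1;
for layer = 1:networkDepth
    receptiveField = receptiveField + (filterSize - 1);
end
fprintf('Receptive field: %d-by-%d\n', receptiveField, receptiveField);
```

The loop ends at 2*networkDepth + 1, which matches the 41-by-41 input size of the image input layer.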
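networkDepth = 20;   % number of convolutional layers
% Input layer: 41-by-41 single-channel (luminance) patches, no normalization.
firstLayer = imageInputLayer([41 41 1],'Name','InputLayer','Normalization','none');
% First convolution/ReLU pair: 64 filters of size 3-by-3, padded to preserve the spatial size.
convLayer = convolution2dLayer(3,64,'Padding',1, ...
    'WeightsInitializer','he','BiasInitializer','zeros','Name','Conv1');
relLayer = reluLayer('Name','ReLU1');
middleLayers = [convLayer relLayer];
% Append the remaining 18 identical convolution/ReLU pairs.
for layerNumber = 2:networkDepth-1
    convLayer = convolution2dLayer(3,64,'Padding',[1 1], ...
        'WeightsInitializer','he','BiasInitializer','zeros', ...
        'Name',['Conv' num2str(layerNumber)]);
    relLayer = reluLayer('Name',['ReLU' num2str(layerNumber)]);
    middleLayers = [middleLayers convLayer relLayer];
end
% Final convolution reduces the 64 feature maps to a single-channel residual image.
convLayer = convolution2dLayer(3,1,'Padding',[1 1], ...
    'WeightsInitializer','he','BiasInitializer','zeros', ...
    'NumChannels',64,'Name',['Conv' num2str(networkDepth)]);
% The regression layer computes the half-mean-squared-error loss against the target residual.
finalLayers = [convLayer regressionLayer('Name','FinalRegressionLayer')];
layers = [firstLayer middleLayers finalLayers];
% Alternatively, create the same layers with the vdsrLayers helper function
% supplied with the example. This call replaces the layers defined above.
layers = vdsrLayers;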
% Training hyperparameters.
maxEpochs = 100;
epochIntervals = 1;
initLearningRate = 0.1;
learningRateFactor = 0.1;
l2reg = 0.0001;
miniBatchSize = 64;
% Stochastic gradient descent with momentum. The learning rate starts at 0.1
% and is multiplied by 0.1 every 10 epochs. Gradients are clipped by L2 norm.
options = trainingOptions('sgdm', ...
    'Momentum',0.9, ...
    'InitialLearnRate',initLearningRate, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',10, ...
    'LearnRateDropFactor',learningRateFactor, ...
    'L2Regularization',l2reg, ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'GradientThresholdMethod','l2norm', ...
    'GradientThreshold',0.01, ...
    'Plots','training-progress', ...
    'Verbose',false);
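The effect of the piecewise schedule and of L2-norm gradient clipping can be sketched with plain arithmetic. This is an illustration under stated assumptions, not the internal code of trainNetwork: it assumes the drop is applied after each full period of 10 epochs, and the gradient vector grad is a made-up example.

```matlab
initLearningRate = 0.1;
learningRateFactor = 0.1;
dropPeriod = 10;
maxEpochs = 100;

% Piecewise schedule: multiply the rate by the drop factor after every dropPeriod epochs.
epochs = 1:maxEpochs;
learningRate = initLearningRate * learningRateFactor .^ floor((epochs - 1) / dropPeriod);
fprintf('Epoch 1: %g, epoch 11: %g, epoch 100: %g\n', learningRate([1 11 100]));

% L2-norm gradient clipping: rescale a gradient whose norm exceeds the threshold.
gradientThreshold = 0.01;
grad = [0.3; -0.4];            % hypothetical gradient with L2 norm 0.5
gradNorm = norm(grad);
if gradNorm > gradientThreshold
    grad = grad * (gradientThreshold / gradNorm);
end
```

The high initial learning rate speeds up training, and clipping keeps individual updates bounded while the rate is still large.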
% Set doTraining to true to train from scratch; otherwise load the pretrained network.
doTraining = false;
if doTraining
    modelDateTime = datestr(now,'dd-mmm-yyyy-HH-MM-SS');
    net = trainNetwork(dsTrain,layers,options);
    % Save the trained network and options with a timestamped file name.
    save(['trainedVDSR-' modelDateTime '-Epoch-' num2str(maxEpochs*epochIntervals) '-ScaleFactors-' num2str(234) '.mat'],'net','options');
else
    load('trainedVDSR-Epoch-100-ScaleFactors-234.mat');
end
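To perform single image super-resolution (SISR) using the VDSR network, follow the remaining steps of this example. The remainder of the example creates a sample low-resolution image from a high-resolution reference image, upscales it with bicubic interpolation and with the VDSR network, and compares the two reconstructions visually and with image quality metrics.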
% Build a datastore of test images that ship with Image Processing Toolbox.
exts = {'.jpg','.png'};
fileNames = {'sherlock.jpg','car2.jpg','fabric.png','greens.jpg','hands1.jpg','kobi.png', ...
    'lighthouse.png','micromarket.jpg','office_4.jpg','onion.png','pears.png','yellowlily.jpg', ...
    'indiancorn.jpg','flamingos.jpg','sevilla.jpg','llama.jpg','parkavenue.jpg', ...
    'peacock.jpg','car1.jpg','strawberries.jpg','wagon.jpg'};
filePath = [fullfile(matlabroot,'toolbox','images','imdata') filesep];
filePathNames = strcat(filePath,fileNames);
testImages = imageDatastore(filePathNames,'FileExtensions',exts);
% Display all test images as a montage.
montage(testImages)
indx = 1; % Index of image to read from the test image datastore
Ireference = readimage(testImages,indx);
Ireference = im2double(Ireference);
imshow(Ireference)
title('High-Resolution Reference Image')
% Create the low-resolution image by downscaling the reference by a factor of 4.
scaleFactor = 0.25;
Ilowres = imresize(Ireference,scaleFactor,'bicubic');
imshow(Ilowres)
title('Low-Resolution Image')
% Baseline: upscale back to the reference size with bicubic interpolation.
[nrows,ncols,np] = size(Ireference);
Ibicubic = imresize(Ilowres,[nrows ncols],'bicubic');
imshow(Ibicubic)
title('High-Resolution Image Obtained Using Bicubic Interpolation')
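Recall that VDSR is trained using only the luminance channel of an image, because human perception is more sensitive to changes in brightness than to changes in color.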
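% Convert the low-resolution image to YCbCr and split the channels.
Iycbcr = rgb2ycbcr(Ilowres);
Iy = Iycbcr(:,:,1);
Icb = Iycbcr(:,:,2);
Icr = Iycbcr(:,:,3);
% Upscale each channel to the reference size with bicubic interpolation.
Iy_bicubic = imresize(Iy,[nrows ncols],'bicubic');
Icb_bicubic = imresize(Icb,[nrows ncols],'bicubic');
Icr_bicubic = imresize(Icr,[nrows ncols],'bicubic');
% Pass the upscaled luminance through the network and read the output of
% the final layer (layer 41), which is the estimated residual image.
Iresidual = activations(net,Iy_bicubic,41);
Iresidual = double(Iresidual);
imshow(Iresidual,[])
title('Residual Image from VDSR')
% Add the residual to the upscaled luminance, then recombine with the
% upscaled chrominance channels and convert back to RGB.
Isr = Iy_bicubic + Iresidual;
Ivdsr = ycbcr2rgb(cat(3,Isr,Icb_bicubic,Icr_bicubic));
imshow(Ivdsr)
title('High-Resolution Image Obtained Using VDSR')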
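% Compare a small region of interest [x y width height] side by side.
roi = [320 30 480 400];
montage({imcrop(Ibicubic,roi),imcrop(Ivdsr,roi)})
title('High-Resolution Results Using Bicubic Interpolation (Left) vs. VDSR (Right)');
% Full-reference metrics: higher PSNR and SSIM mean closer agreement with the reference.
bicubicPSNR = psnr(Ibicubic,Ireference)
vdsrPSNR = psnr(Ivdsr,Ireference)
bicubicSSIM = ssim(Ibicubic,Ireference)
vdsrSSIM = ssim(Ivdsr,Ireference)
% No-reference metric: lower NIQE scores indicate better perceptual quality.
bicubicNIQE = niqe(Ibicubic)
vdsrNIQE = niqe(Ivdsr)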
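For readers unfamiliar with PSNR, the sketch below computes it by hand for a bicubic reconstruction and compares the result with the psnr function. It assumes double-precision images in the range [0,1], so the peak value is 1; the sample image peppers.png is a stand-in and not one of the test images above.

```matlab
ref = im2double(imread('peppers.png'));       % stand-in reference image
scaleFactor = 0.25;
small = imresize(ref, scaleFactor, 'bicubic');
approx = imresize(small, [size(ref,1) size(ref,2)], 'bicubic');

% PSNR = 10*log10(peak^2 / MSE); the peak value is 1 for double images in [0,1].
meanSquaredError = mean((approx(:) - ref(:)).^2);
manualPSNR = 10 * log10(1 / meanSquaredError);

% Toolbox result for comparison.
toolboxPSNR = psnr(approx, ref);
fprintf('Manual PSNR: %.4f dB, psnr: %.4f dB\n', manualPSNR, toolboxPSNR);
```

Because PSNR depends only on the mean squared error, it rewards pixel-wise accuracy, which is why the example also reports SSIM and NIQE.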
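% Compute the average PSNR and SSIM over all test images for each scale factor.
% superResolutionMetrics is a helper function supplied with the example.
scaleFactors = [2 3 4];
superResolutionMetrics(net,testImages,scaleFactors);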
Results for Scale factor 2
Average PSNR for Bicubic = 31.809683
Average PSNR for VDSR = 31.921784
Average SSIM for Bicubic = 0.938194
Average SSIM for VDSR = 0.949404
Results for Scale factor 3
Average PSNR for Bicubic = 28.170441
Average PSNR for VDSR = 28.563952
Average SSIM for Bicubic = 0.884381
Average SSIM for VDSR = 0.895830
Results for Scale factor 4
Average PSNR for Bicubic = 27.010839
Average PSNR for VDSR = 27.837260
Average SSIM for Bicubic = 0.861604
Average SSIM for VDSR = 0.877132
VDSR achieves better metric scores than bicubic interpolation at every scale factor.