A Walkthrough of the PSPNet Test Code

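The PSPNet test code is published in the GitHub repository of the authors of "Pyramid Scene Parsing Network": https://github.com/hszhao/PSPNet. After downloading and unpacking it, open the evaluation folder; the six .m files (MATLAB code) in it are the scripts used for testing, as the figure below shows:

[Figure 1: contents of the evaluation folder]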

1. Code walkthrough

run.sh in the figure above is the launcher script. It contains a single command:

matlab -nodisplay -r "eval_all;exit" 2>&1 | tee matlab.log

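Here -nodisplay and -r are MATLAB startup options. The former starts the JVM but not the MATLAB desktop, so nothing display-related is launched; the effect looks like this:

[Figure 2: MATLAB started with -nodisplay]

The latter runs the MATLAB statements that follow it; multiple statements are separated by ';', as in "eval_all;exit" above.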

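'2>&1' in the command redirects standard error to standard output (file descriptor 1 is standard output, 2 is standard error).

'tee matlab.log' copies everything it receives on standard input both to the terminal and to the file matlab.log, so all output produced during the run, error messages included, is shown on screen and also written to this log.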
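
The effect of '2>&1' can be reproduced in a few lines. The following is only an illustrative Python sketch, not part of the PSPNet code: the child command is a made-up stand-in for MATLAB that writes one line to each stream, and run_merged is a hypothetical helper name.

```python
import subprocess
import sys

# Hypothetical child process: one line goes to stdout, one to stderr.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]

def run_merged(cmd):
    """Run cmd with stderr redirected into stdout, like '2>&1' in the shell."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return result.stdout

merged = run_merged(child)
print(merged, end="")  # both lines arrive on the single merged stream
```

Because both streams end up in one pipe, whatever reads that pipe (tee in run.sh) also receives the error messages.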

As the command shows, testing starts in eval_all.m, so we begin the analysis with that file.

The source code of eval_all.m is:

%{
Variables need to be modified: data_root, eval_list;
and the default GPUs used for evaluation are with ID [0:3],
modify variable 'gpu_id_array' if needed.
%}

close all; clc; clear;
addpath('../matlab'); %add matcaffe path
addpath('visualizationCode');
data_name = 'ADE20K'; %set to 'VOC2012' or 'cityscapes' for relevant datasets

switch data_name
    case 'ADE20K'
        isVal = true; %evaluation on valset
        step = 500; %equals to number of images divide num of GPUs in testing e.g. 500=2000/4
        data_root = '/data2/hszhao/dataset/ADEChallengeData2016'; %root path of dataset
        eval_list = 'list/ADE20K_val.txt'; %evaluation list, refer to lists in folder 'samplelist'
        save_root = 'mc_result/ADE20K/val/pspnet50_473/'; %root path to store the result image
        model_weights = 'model/pspnet50_ADE20K.caffemodel';
        model_deploy = 'prototxt/pspnet50_ADE20K_473.prototxt';
        fea_cha = 150; %number of classes
        base_size = 512; %based size for scaling
        crop_size = 473; %crop size fed into network
        data_class = 'objectName150.mat'; %class name
        data_colormap = 'color150.mat'; %color map
    case 'VOC2012'
        isVal = false; %evaluation on testset
        step = 364; %364=1456/4
        data_root = '/data2/hszhao/dataset/VOC2012';
        eval_list = 'list/VOC2012_test.txt';
        save_root = 'mc_result/VOC2012/test/pspnet101_473/';
        model_weights = 'model/pspnet101_VOC2012.caffemodel';
        model_deploy = 'prototxt/pspnet101_VOC2012_473.prototxt';
        fea_cha = 21;
        base_size = 512;
        crop_size = 473;
        data_class = 'objectName21.mat';
        data_colormap = 'colormapvoc.mat';
    case 'cityscapes'
        isVal = true;
        step = 125; %125=500/4
        data_root = '/data2/hszhao/dataset/cityscapes';
        eval_list = 'list/cityscapes_val.txt';
        save_root = 'mc_result/cityscapes/val/pspnet101_713/';
        model_weights = 'model/pspnet101_cityscapes.caffemodel';
        model_deploy = 'prototxt/pspnet101_cityscapes_713.prototxt';
        fea_cha = 19;
        base_size = 2048;
        crop_size = 713;
        data_class = 'objectName19.mat';
        data_colormap = 'colormapcs.mat';
end
skipsize = 0; %skip several images in the list

is_save_feat = false; %set to true if final feature map is needed (not suggested for storage consuming)
save_gray_folder = [save_root 'gray/']; %path for predicted gray image
save_color_folder = [save_root 'color/']; %path for predicted color image
save_feat_folder = [save_root 'feat/']; %path for predicted feature map
scale_array = [1]; %set to [0.5 0.75 1 1.25 1.5 1.75] for multi-scale testing
mean_r = 123.68; %means to be subtracted and the given values are used in our training stage
mean_g = 116.779;
mean_b = 103.939;

acc = double.empty;
iou = double.empty;
gpu_id_array = [0:3]; %multi-GPUs for parfor testing, if number of GPUs is changed, remember to change the variable 'step'
runID = 1;
gpu_num = size(gpu_id_array,2);
index_array = [(runID-1)*gpu_num+1:runID*gpu_num];

parfor i = 1:gpu_num %change 'parfor' to 'for' if single GPU testing is used
  eval_sub(data_name,data_root,eval_list,model_weights,model_deploy,fea_cha,base_size,crop_size,data_class,data_colormap, ...
           is_save_feat,save_gray_folder,save_color_folder,save_feat_folder,gpu_id_array(i),index_array(i),step,skipsize,scale_array,mean_r,mean_g,mean_b);
end
if(isVal)
   eval_acc(data_name,data_root,eval_list,save_gray_folder,data_class,fea_cha);
end

The annotated version follows:

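%{
Before running this code, a few variables need to be adjusted: data_root (root directory of the dataset),
eval_list (path of the test/validation list file) and gpu_id_array (IDs of the GPUs to use; set it to match
your own GPU count and update 'step' accordingly). Adjust the remaining paths as needed.
Original note:
Variables need to be modified: data_root, eval_list;
and the default GPUs used for evaluation are with ID [0:3],
modify variable 'gpu_id_array' if needed.
%}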

close all; clc; clear;
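addpath('/home/b622/PSPNet-master/matlab'); %path to matcaffe (change this to your own path, preferably an absolute one)
addpath('visualizationCode');   %path to the visualization folder (needed here mainly for the per-class color data)
data_name = 'VOC2012'; %dataset to test (I use the VOC2012 test set, which requires registration to download)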

switch data_name
    case 'ADE20K'
        isVal = true; %evaluation on valset
        step = 500; %equals to number of images divide num of GPUs in testing e.g. 500=2000/4
        data_root = '/data2/hszhao/dataset/ADEChallengeData2016'; %root path of dataset
        eval_list = 'list/ADE20K_val.txt'; %evaluation list, refer to lists in folder 'samplelist'
        save_root = 'mc_result/ADE20K/val/pspnet50_473/'; %root path to store the result image
        model_weights = 'model/pspnet50_ADE20K.caffemodel';
        model_deploy = 'prototxt/pspnet50_ADE20K_473.prototxt';
        fea_cha = 150; %number of classes
        base_size = 512; %based size for scaling
        crop_size = 473; %crop size fed into network
        data_class = 'objectName150.mat'; %class name
        data_colormap = 'color150.mat'; %color map
    case 'VOC2012'
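        isVal = false; %false because the VOC2012 test set comes without annotations: the segmentation can be visualized, but accuracy cannot be computed
        step = 728; %728=1456/2; I only have two GPUs, so the test set is split in half and each GPU processes one half
        %Fill in the paths exactly as in the following example:
        %data_root is the root directory of the dataset and must not end with '/', because each entry of the test list
        %(for example VOC2012_test.txt under PSPNet-master/evaluation/samplelist) has the form '/JPEGImages/2008_000006.jpg'.
        %data_root must therefore contain the folder JPEGImages and have no trailing '/'.
        %eval_list is the test list file; it has to be placed under data_root
        data_root = '/media/b622/My Passport/VOC2012'; %change to the root directory of your own data
        eval_list = '/VOC2012_test.txt';
        save_root = '/media/b622/My Passport/VOC2012test/'; %choose freely, but it must end with '/'
        model_weights = '/media/b622/My Passport/SPSNet/pspnet101_VOC2012.caffemodel'; %path of the model weights (can be found under PSPNet-master/evaluation/model)
        model_deploy = 'prototxt/pspnet101_VOC2012_473.prototxt'; %path of the model's deploy prototxt (can be found under PSPNet-master/evaluation/prototxt)
        fea_cha = 21; %VOC2012 has 21 classes
        base_size = 512; %base image size (multi-scale testing rescales relative to this value)
        crop_size = 473; %crop size (the network takes 473x473 inputs, so images are padded or cropped to this size; see the analysis of the other files below)
        data_class = 'objectName21.mat';  %objectName21.mat stores the name of each class
        data_colormap = 'colormapvoc.mat'; %colormapvoc.mat is the palette holding the RGB color of each class (normalized to 0-1 in MATLAB)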
    case 'cityscapes'
        isVal = true;
        step = 125; %125=500/4
        data_root = '/data2/hszhao/dataset/cityscapes';
        eval_list = 'list/cityscapes_val.txt';
        save_root = 'mc_result/cityscapes/val/pspnet101_713/';
        model_weights = '/media/b622/My Passport/SPSNet/pspnet101_cityscapes.caffemodel';
        model_deploy = 'prototxt/pspnet101_cityscapes_713.prototxt';
        fea_cha = 19;
        base_size = 2048;
        crop_size = 713;
        data_class = 'objectName19.mat';
        data_colormap = 'colormapcs.mat';
end
skipsize = 0; %skip several images in the list (0 means no test image is skipped)

is_save_feat = false; %set to true if final feature map is needed (not suggested for storage consuming); controls whether the feature data is saved
save_gray_folder = [save_root 'gray/']; %path for predicted gray image
save_color_folder = [save_root 'color/']; %path for predicted color image
save_feat_folder = [save_root 'feat/']; %path for predicted feature map (saved as raw data, not as an image)
scale_array = [1]; %set to [0.5 0.75 1 1.25 1.5 1.75] for multi-scale testing; [1] tests at the base scale only
%per-channel RGB means of the training set; the same values are subtracted at test time to match the training preprocessing
mean_r = 123.68; %means to be subtracted and the given values are used in our training stage
mean_g = 116.779;
mean_b = 103.939;

acc = double.empty;
iou = double.empty;
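gpu_id_array = [0:1]; %only two GPUs here, so the IDs are 0-1; after changing this, remember to update 'step'
runID = 1;
gpu_num = size(gpu_id_array,2);
index_array = [(runID-1)*gpu_num+1:runID*gpu_num]; %1-based chunk index handed to each worker (MATLAB indices start at 1)

%parfor runs the loop iterations in parallel on a pool of workers; with a single GPU, change parfor to a serial for
parfor i = 1:gpu_num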
  eval_sub(data_name,data_root,eval_list,model_weights,model_deploy,fea_cha,base_size,crop_size,data_class,data_colormap, ...
           is_save_feat,save_gray_folder,save_color_folder,save_feat_folder,gpu_id_array(i),index_array(i),step,skipsize,scale_array,mean_r,mean_g,mean_b);
end
if(isVal)
   eval_acc(data_name,data_root,eval_list,save_gray_folder,data_class,fea_cha); %call eval_acc to compute accuracy and mean IoU on the validation set
end
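
The way 'step' and the chunk index split the image list across GPUs can be checked with a short sketch. This is an illustrative Python translation of the range used by the loop in eval_sub (skipsize+(index-1)*step+1 up to skipsize+index*step); step_for is a hypothetical helper that does not exist in the PSPNet code, where 'step' is set by hand.

```python
import math

def chunk_range(index, step, skipsize=0):
    """1-based inclusive range of list entries processed by worker `index`."""
    first = skipsize + (index - 1) * step + 1
    last = skipsize + index * step
    return first, last

def step_for(num_images, gpu_num):
    """Hypothetical helper: smallest step that covers all images."""
    return math.ceil(num_images / gpu_num)

num_images, gpu_num = 1456, 2  # VOC2012 test list, two GPUs
step = step_for(num_images, gpu_num)
ranges = [chunk_range(i, step) for i in range(1, gpu_num + 1)]
print(step, ranges)
```

With 1456 images and two GPUs this gives step 728 and the ranges 1-728 and 729-1456. Note that the range itself is not clamped to the list length, so as far as the code shown here goes, step times the number of GPUs should not exceed the number of list entries.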

The last lines of eval_all.m call eval_sub(), which is defined in eval_sub.m. The annotated function is:

function eval_sub(data_name,data_root,eval_list,model_weights,model_deploy,fea_cha,base_size,crop_size,data_class,data_colormap, ...
		  is_save_feat,save_gray_folder,save_color_folder,save_feat_folder,gpu_id,index,step,skipsize,scale_array,mean_r,mean_g,mean_b)
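list = importdata(fullfile(data_root,eval_list));  %fullfile joins path components with the file separator
load(data_class); %load the class names (e.g. airplane)
load(data_colormap);  %load the palette; the variable stored in the .mat file appears in the workspace under its own name
if(~isdir(save_gray_folder))  %create the folder if it does not exist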
    mkdir(save_gray_folder);
end
if(~isdir(save_color_folder))
    mkdir(save_color_folder);
end
if(~isdir(save_feat_folder) && is_save_feat)
    mkdir(save_feat_folder);
end

phase = 'test'; %run with phase test (so that dropout isn't applied)
if ~exist(model_weights, 'file')
  error('Model missing!');
end
caffe.reset_all();
caffe.set_mode_gpu();
caffe.set_device(gpu_id);  %select the GPU given by gpu_id
net = caffe.Net(model_deploy, model_weights, phase);

for i = skipsize+(index-1)*step+1:skipsize+index*step
    fprintf(1, 'processing %d (%d)...\n', i, numel(list));
    str = strsplit(list{i});
    img = imread(fullfile(data_root,str{1}));
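    if(size(img,3) < 3) %for gray image: replicate the single channel into three identical channels
        im_r = img;
        im_g = img;
        im_b = img;
        img = cat(3,im_r,im_g,im_b); %cat concatenates arrays along the given dimension (here the 3rd)
    end
    ori_rows = size(img,1);  %original height (number of rows)
    ori_cols = size(img,2);  %original width (number of columns)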
    data_all = zeros(ori_rows,ori_cols,fea_cha,'single');
    for j = 1:size(scale_array,2)
        long_size = base_size*scale_array(j) + 1;
        new_rows = long_size;
        new_cols = long_size;
        %scale the image so that its longer side equals long_size while keeping the aspect ratio
        if ori_rows > ori_cols
            new_cols = round(long_size/single(ori_rows)*ori_cols);
        else
            new_rows = round(long_size/single(ori_cols)*ori_rows);
        end
        img_scale = imresize(img,[new_rows new_cols],'bilinear'); %resize with bilinear interpolation
        data_all = data_all + scale_process(net,img_scale,fea_cha,crop_size,ori_rows,ori_cols,mean_r,mean_g,mean_b);
    end
    
    data_all = data_all/size(scale_array,2);
    data = data_all; %already exp process

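    img_fn = strsplit(str{1},'/');  %split the image path on '/' (the path includes the file name)
    img_fn = img_fn{end}; %take the last piece, i.e. the file name with its extension
    img_fn = img_fn(1:end-4); %strip the 4-character extension, e.g. /JPEGImages/2008_000006.jpg becomes 2008_000006

    %max(data,[],3) returns, for every pixel, the maximum over the fea_cha channels and its index. For VOC2012,
    %data holds 21 score maps of size [ori_rows,ori_cols]; the index of the largest score is the class assigned to the pixel.
    [~,imPred] = max(data,[],3);  %imPred stores the predicted class index of every pixel
    imPred = uint8(imPred);  %convert to 8-bit unsigned integers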
    
    switch data_name
        case 'ADE20K'
            rgbPred = colorEncode(imPred, colors);
            imwrite(imPred,[save_gray_folder img_fn '.png']);
            imwrite(rgbPred,[save_color_folder img_fn '.png']);
        case 'VOC2012'
            imPred = imPred - 1; %VOC2012 labels run from 0 to 20 (21 classes), while the indices in imPred start at 1, so subtract 1 everywhere
            imwrite(imPred,[save_gray_folder img_fn '.png']);
            imwrite(imPred,colormapvoc,[save_color_folder img_fn '.png']); %color the labels with the palette colormapvoc, then save
        case 'cityscapes'
            imPred = imPred - 1;
            imwrite(imPred,[save_gray_folder img_fn '.png']);
            imwrite(imPred,colormapcs,[save_color_folder img_fn '.png']);
    end

    if(is_save_feat)
        save([save_feat_folder img_fn],'data');
    end
end
caffe.reset_all();
end
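
The rescaling step inside the loop over scale_array is easy to get wrong by one pixel, so here is a small worked example. It is a Python sketch of the same arithmetic, under two stated assumptions: base_size*scale is integer-valued (true for the scales listed in eval_all.m), and single-precision effects are ignored. matlab_round and scaled_size are illustrative names, not functions from the repository.

```python
import math

def matlab_round(x):
    """Round half away from zero for positive x, as MATLAB's round does."""
    return int(math.floor(x + 0.5))

def scaled_size(ori_rows, ori_cols, base_size, scale):
    """Size fed to imresize: the longer side becomes base_size*scale + 1."""
    long_size = int(base_size * scale) + 1
    new_rows = new_cols = long_size
    if ori_rows > ori_cols:
        new_cols = matlab_round(long_size / ori_rows * ori_cols)
    else:
        new_rows = matlab_round(long_size / ori_cols * ori_rows)
    return new_rows, new_cols

landscape = scaled_size(375, 500, 512, 1)  # 375 rows x 500 columns
portrait = scaled_size(500, 375, 512, 1)
print(landscape, portrait)
```

For a 375x500 image with base_size 512 and scale 1 the longer side becomes 513 and the shorter side 385, which is larger than crop_size 473 on the long side, so such an image goes through the sliding-window branch of scale_process().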

scale_process() is defined in scale_process.m. The annotated function is:

function data_output = scale_process(net,img_scale,fea_cha,crop_size,ori_rows,ori_cols,mean_r,mean_g,mean_b)
    data_output = zeros(ori_rows,ori_cols,fea_cha,'single'); %preallocate the array that holds the result
    new_rows = size(img_scale,1); 
    new_cols = size(img_scale,2);
    long_size = new_rows;
    short_size = new_cols;
    if(new_cols > long_size)
       long_size = new_cols;
       short_size = new_rows;
    end
    if(long_size <= crop_size)
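        %pre_img() pads the image to crop_size, subtracts the mean, and converts it to Caffe's blob layout
        input_data = pre_img(img_scale,crop_size,mean_r,mean_g,mean_b);
        score = caffe_process(input_data,net); %forward pass; returns fea_cha score maps (21 for VOC2012)
        score = score(1:new_rows,1:new_cols,:); %pre_img pads with 'post' (after the last row/column), so the valid region is [1:new_rows,1:new_cols,:]
    else %the scaled image is larger than the crop (crop_size*crop_size): test it in overlapping crops and merge the results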
        stride_rate = 2/3;
        stride = ceil(crop_size*stride_rate); %裁剪步长
        img_pad = img_scale;
        if(short_size < crop_size) %the long side exceeds crop_size but the short side is below it: pad the short side
          if(new_rows < crop_size) %height is the short side: pad the height (same method as in pre_img)
              im_r = padarray(img_pad(:,:,1),[crop_size-new_rows,0],mean_r,'post');
              im_g = padarray(img_pad(:,:,2),[crop_size-new_rows,0],mean_g,'post');
              im_b = padarray(img_pad(:,:,3),[crop_size-new_rows,0],mean_b,'post');
              img_pad = cat(3,im_r,im_g,im_b);
          end
          if(new_cols < crop_size) %width is the short side: pad the width (same method as in pre_img)
              im_r = padarray(img_pad(:,:,1),[0,crop_size-new_cols],mean_r,'post');
              im_g = padarray(img_pad(:,:,2),[0,crop_size-new_cols],mean_g,'post');
              im_b = padarray(img_pad(:,:,3),[0,crop_size-new_cols],mean_b,'post');
              img_pad = cat(3,im_r,im_g,im_b);
          end
        end
        pad_rows = size(img_pad,1);
        pad_cols = size(img_pad,2);
        h_grid = ceil(single(pad_rows-crop_size)/stride) + 1;
        w_grid = ceil(single(pad_cols-crop_size)/stride) + 1;
        data_scale = zeros(pad_rows,pad_cols,fea_cha,'single');
        count_scale = zeros(pad_rows,pad_cols,fea_cha,'single');
        %slide a crop_size window over the image with the given stride and run a forward pass on each crop
        for grid_yidx=1:h_grid
            for grid_xidx=1:w_grid
                s_x = (grid_xidx-1) * stride + 1; %x coordinate where the crop starts (start_x)
                s_y = (grid_yidx-1) * stride + 1; %y coordinate where the crop starts (start_y)
                e_x = min(s_x + crop_size - 1, pad_cols); %x coordinate where the crop ends (end_x), clamped to the image
                e_y = min(s_y + crop_size - 1, pad_rows); %y coordinate where the crop ends (end_y), clamped to the image
                s_x = e_x - crop_size + 1; %recompute the start so that the crop is always crop_size*crop_size
                s_y = e_y - crop_size + 1;
                img_sub = img_pad(s_y:e_y,s_x:e_x,:); %extract the crop
                count_scale(s_y:e_y,s_x:e_x,:) = count_scale(s_y:e_y,s_x:e_x,:) + 1; %crops overlap, so count how many times each pixel is predicted
                input_data = pre_img(img_sub,crop_size,mean_r,mean_g,mean_b); %preprocess the crop
                data_scale(s_y:e_y,s_x:e_x,:) = data_scale(s_y:e_y,s_x:e_x,:) + caffe_process(input_data,net); %accumulate the prediction
            end
        end
        score = data_scale./count_scale; %average prediction per pixel
        score = score(1:new_rows,1:new_cols,:); %the padding was added with 'post', so the valid region is [1:new_rows,1:new_cols,:]
    end

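    data_output = imresize(score,[ori_rows ori_cols],'bilinear');  %resize back to the original image size, again with bilinear interpolation
    data_output = bsxfun(@rdivide, data_output, sum(data_output, 3)); %renormalize so that the fea_cha scores of every pixel sum to 1 (mainly needed when the long side exceeds crop_size)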
end
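
The window positions produced by the two nested loops can be computed along one axis at a time. The sketch below is a Python rendering of that arithmetic (stride, grid count, clamped end, shifted-back start); window_starts is an illustrative name, and the function assumes the padded length is at least crop_size, which the padding step guarantees.

```python
import math

def window_starts(pad_len, crop_size, stride_rate=2 / 3):
    """1-based inclusive (start, end) of every crop along one axis."""
    stride = math.ceil(crop_size * stride_rate)
    grid = math.ceil((pad_len - crop_size) / stride) + 1
    windows = []
    for idx in range(1, grid + 1):
        start = (idx - 1) * stride + 1
        end = min(start + crop_size - 1, pad_len)
        start = end - crop_size + 1  # shift back so the crop keeps its full size
        windows.append((start, end))
    return windows

print(window_starts(513, 473))
print(window_starts(1000, 473))
```

For a side of 513 pixels and crop_size 473 the stride is 316 and two windows result, 1-473 and 41-513; a side of 1000 pixels gives three windows, the last one shifted back to 528-1000.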

pre_img() and caffe_process(), which scale_process() calls, are annotated below:

(1) pre_img()

function im_pad = pre_img(im,crop_size,mean_r,mean_g,mean_b)
    row = size(im,1);
    col = size(im,2);
    im_pad = single(im); %convert to single precision
    if(size(im_pad,3) < 3) %gray image: replicate into three identical channels
        im_r = im_pad;
        im_g = im_pad;
        im_b = im_pad;
        im_pad = cat(3,im_r,im_g,im_b);
    end
    if(row < crop_size)
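        %padarray is MATLAB's padding function; 'post' pads after the last element of each dimension
        %(here crop_size-row rows are appended after the last row and 0 columns after the last column; the fill value is the channel mean mean_r/g/b)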
        im_r = padarray(im_pad(:,:,1),[crop_size-row,0],mean_r,'post'); 
        im_g = padarray(im_pad(:,:,2),[crop_size-row,0],mean_g,'post');
        im_b = padarray(im_pad(:,:,3),[crop_size-row,0],mean_b,'post');
        im_pad = cat(3,im_r,im_g,im_b);
    end
    if(col < crop_size)
        im_r = padarray(im_pad(:,:,1),[0,crop_size-col],mean_r,'post');
        im_g = padarray(im_pad(:,:,2),[0,crop_size-col],mean_g,'post');
        im_b = padarray(im_pad(:,:,3),[0,crop_size-col],mean_b,'post');
        im_pad = cat(3,im_r,im_g,im_b);
    end
    im_mean = zeros(crop_size,crop_size,3,'single');
    im_mean(:,:,1) = mean_r;
    im_mean(:,:,2) = mean_g;
    im_mean(:,:,3) = mean_b;
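    im_pad = single(im_pad) - im_mean; %subtract the mean
    im_pad = im_pad(:,:,[3 2 1]);  %reorder the channels from RGB to BGR (the order Caffe expects)
    im_pad = permute(im_pad,[2 1 3]); %swap the height and width dimensions (also to match Caffe's storage layout)
    %note: blobs are passed to Caffe in (Width,Height,Channel,Number) order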
end
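
Padding with the channel mean and then subtracting that mean means the padded pixels enter the network as zeros. The toy Python sketch below shows this for a single 2x2 channel padded to 3x3; pad_post and subtract_mean are illustrative helpers, and the BGR reordering and the height/width swap of pre_img() are left out on purpose.

```python
def pad_post(channel, crop_size, fill):
    """Pad a 2-D list at the bottom and right up to crop_size x crop_size."""
    rows, cols = len(channel), len(channel[0])
    width = max(cols, crop_size)
    padded = [row + [fill] * (width - cols) for row in channel]
    padded += [[fill] * width for _ in range(max(0, crop_size - rows))]
    return padded

def subtract_mean(channel, mean):
    """Subtract the channel mean from every pixel."""
    return [[value - mean for value in row] for row in channel]

mean_r = 123.68
red = [[200.0, 100.0],
       [50.0, 0.0]]  # toy 2x2 red channel
out = subtract_mean(pad_post(red, 3, mean_r), mean_r)
for row in out:
    print(row)
```

The last row and the last column of the result are exactly 0.0, while the original pixels keep their mean-subtracted values.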

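(2) caffe_process()

function score = ms_caffe_process(input_data,net) %assuming the file is named caffe_process.m, MATLAB resolves calls by the file name, so the differing name declared here does not matter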
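    score = net.forward({input_data});  %forward pass to compute the predictions
    score = score{1};
    score_flip = net.forward({input_data(end:-1:1,:,:)}); %end:-1:1 reverses the first dimension (the width in Caffe's layout), i.e. the image is mirrored horizontally
    score_flip = score_flip{1};
    score = score + score_flip(end:-1:1,:,:); %mirror the second prediction back and add it to the first

    score = permute(score, [2 1 3]); %back to the original Height x Width layout
    %element-wise exponential: the provided network definitions (e.g. pspnet101_VOC2012_473.prototxt) have no softmax layer,
    %so the softmax is computed by hand to obtain the classification scores
    %softmax: p(x_i) = exp(x_i)/sum_j exp(x_j), with i, j = 1,...,21 for VOC2012
    score = exp(score);
    score = bsxfun(@rdivide, score, sum(score, 3)); %bsxfun applies rdivide (element-wise right division), expanding the per-pixel sum along the 3rd dimension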
end
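
The two ideas in this function, adding the prediction of the mirrored image and applying a softmax by hand, can be tried on a toy example. The Python sketch below works on one image row with two pixels and three classes; the score values are made up, fuse_with_flip and softmax are illustrative names, and subtracting the maximum before exp is an added numerical-stability step that the MATLAB code does not perform.

```python
import math

def softmax(scores):
    """Softmax over the class scores of one pixel."""
    peak = max(scores)  # added for numerical stability, not in the MATLAB code
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_with_flip(row_scores, flipped_row_scores):
    """Mirror the flipped prediction back and add it to the direct one."""
    restored = flipped_row_scores[::-1]
    return [[a + b for a, b in zip(p, q)]
            for p, q in zip(row_scores, restored)]

direct = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]   # made-up scores, pixel order left to right
flipped = [[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]]  # made-up scores for the mirrored row
fused = fuse_with_flip(direct, flipped)
probs = [softmax(pixel) for pixel in fused]
labels = [p.index(max(p)) for p in probs]
print(fused, labels)
```

After fusing, each pixel's scores are the sum of both passes, and the softmax turns them into values that sum to 1 per pixel without changing which class is largest.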

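The cropping strategy used above works as follows:

PSPNet is trained on VOC2012 with a fixed 473*473 input, whereas the images in VOC2012 vary in size. Images smaller than 473*473 are therefore padded and larger ones are cropped. The padding value is the per-channel mean, which is removed again by the mean subtraction, so in effect it acts as zero padding. At test time the procedure is: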

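(1) When neither side of the test image exceeds 473, the image is padded after its last row and last column, as shown below (red is the original image, white is the padded area):

[Figure 3: padding an image smaller than 473*473]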

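(2) When one side is larger than 473 and the other is smaller than 473, the shorter side is padded in the same way (see the figure below); if the other side is also larger than 473, the image is left unchanged.

[Figure 4: padding the short side when the long side exceeds 473]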

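After padding, the images in case (2) are cropped so that every piece has the standard 473*473 size. A 473*473 window slides over the image and extracts each part in turn. The step is set by the variable 'stride' in the code (ceil(473*2/3) = 316 pixels), and a window that would run past the image border is shifted back until it covers a full 473*473 area. The figure below shows an example with 4 crops:

[Figure 5: sliding-window cropping into 4 overlapping regions]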

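Each crop is predicted separately, and in overlapping areas the summed predictions are divided by the number of overlaps (the central region in the figure above is covered 4 times, so its summed prediction is divided by 4).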
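
The overlap counts can be verified numerically. The Python sketch below assumes a padded image of 513*513 pixels, for which the window formula in scale_process() yields the two ranges 1-473 and 41-513 along each axis; coverage and count_at are illustrative names, and the summed score of 3.2 is a made-up value.

```python
def coverage(pos, windows):
    """Number of windows (1-based inclusive ranges) that contain pos."""
    return sum(1 for start, end in windows if start <= pos <= end)

# Windows along each axis for a 513-pixel side, crop_size 473, stride 316.
windows = [(1, 473), (41, 513)]

def count_at(y, x):
    """How many crops cover pixel (y, x); this is what count_scale stores."""
    return coverage(y, windows) * coverage(x, windows)

corner = count_at(1, 1)
center = count_at(250, 250)
edge = count_at(250, 500)
summed_score = 3.2  # made-up sum of 4 predictions at the center pixel
print(corner, edge, center, summed_score / center)
```

A corner pixel is covered once, a pixel near the middle of an edge twice, and a central pixel four times, so the accumulated score at the center is divided by 4.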

2. Running the test and results

Open a terminal, change to the directory that contains run.sh, and start the test with:

./run.sh

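[Figure 6: running ./run.sh in the terminal]

The run looks like this (partial screenshot):

[Figure 7: console output during testing]

The segmentation results look like this:

[Figure 8: segmentation results]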
