Preface:
This exercise practices convolution and pooling: it deepens the understanding of how to convolve a large image with each learned feature to obtain that feature's response map, and how to then apply pooling to those responses so that the resulting features gain properties such as translation invariance. The exercise follows the Stanford tutorial page Exercise: Convolution and Pooling. You can also refer to the earlier post Deep learning: 17 (Linear Decoders, Convolution and Pooling); this exercise builds on the feature-extraction network learned in the earlier post Deep learning: 22 (linear decoder exercise).
Background:
First, an overview of the whole training and testing pipeline. Note that during training, whitening is applied to small patches. Because the input here consists of large images, every convolution must incorporate the whitening transform together with the network weights; in this way each learned hidden-unit feature yields a slightly smaller feature map for every image, and mean pooling is then applied to each feature map (before this, the script runs some code to verify that the convolution and pooling implementations are correct). With these pooled feature values and the labels, a softmax multi-class classifier can be trained.
During testing, the large images are convolved in the same way: every convolved image block must likewise be preprocessed with the whitening parameters obtained during training, and features are extracted by convolution followed by pooling, exactly as in training. The trained softmax classifier then makes the predictions.
The text above is adapted from: http://www.cnblogs.com/tornadomeet/archive/2013/04/09/3009830.html
Experiment pipeline:
Stage 1: feature learning with a linear decoder
Stage 2: feature extraction by convolution
Stage 3: feature extraction by pooling
Pool the 57*57 convolved feature maps; each pooled feature map is 3*3, since floor(57/19) = 3. This reduces the dimensionality of the data.
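As a quick sanity check of this dimension bookkeeping, here is a minimal sketch using the parameter values from Step 0 below:
imageDim = 64; patchDim = 8; poolDim = 19;
convDim = imageDim - patchDim + 1; % 57: size of each convolved feature map ('valid' convolution)
pooledDim = floor(convDim / poolDim); % 3: size of each pooled feature map
fprintf('convolved: %dx%d, pooled: %dx%d\n', convDim, convDim, pooledDim, pooledDim);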
Stage 4: softmax training on the final pooled features
Convolution and pooling network exercise
Main script walkthrough (the code listing below also carries some detailed comments):
Step 0: parameters
imageDim = 64; % dimension of the sample images
imageChannels = 3; % number of color channels (RGB)
patchDim = 8; % dimension of the small patches (used for feature learning; also the convolution kernel size)
numPatches = 50000; % number of patches for feature learning
visibleSize = patchDim * patchDim * imageChannels; % number of input units
outputSize = visibleSize; % number of output units, used for feature learning
hiddenSize = 400; % number of hidden units
epsilon = 0.1; % epsilon for ZCA whitening
poolDim = 19; % dimension of the pooling region
Step 1: train a network with a linear decoder to learn features
See: http://blog.csdn.net/whiteinblue/article/details/21939087
Step 2: convolve and pool, and verify both
Step 2a:
convolvedFeatures = cnnConvolve(...); see the detailed analysis of cnnConvolve below.
Step 2b, verification:
Since convolution is in essence one feedforward pass of the network, the activations computed by the feedforward function feedForwardAutoencoder should equal the convolution results, so the two can be checked against each other by taking their difference.
Step 2c: pooling
pooledFeatures = cnnPool(poolDim, convolvedFeatures); see the function analysis further below.
Step 2d: verify pooling
Initialize a matrix holding 1 to 64 in order, compute the mean of each 4*4 block directly, and compare with the values returned by cnnPool.
Step 3: convolve and pool the original images to extract features
Because the data set is large, the samples are processed in batches: each batch uses 50 features to extract features from the sample images.
Step 4: train a softmax classifier on the pooled features
Step 5: test the accuracy
cnnConvolve.m: function description
convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
Parameters:
Inputs:
patchDim: dimension of the small patches, i.e. the size of the convolution kernel.
numFeatures: number of features, equal to the number of hidden units; each hidden unit learns one feature.
images: the data to be convolved, here a matrix of images:
images(r, c, channel, image number)
W, b: the network parameters
ZCAWhite: the whitening transform matrix used in data preprocessing
meanPatch: the per-dimension mean of the data, used for zero-mean normalization
Output: convolvedFeatures, the convolution result, a large 4-dimensional matrix of size
numFeatures * numImages * (imageDim - patchDim + 1) * (imageDim - patchDim + 1)
size(convolvedFeatures) = 400 8 57 57
Every feature is convolved with every sample image: there are 400 features in total, so each sample is convolved with all 400 features; each convolved map is 57*57 (57 = 64 - 8 + 1); and there are 8 sample images.
size(convolvedFeatures, 1): dimension 1 is the number of features, 400 in total.
size(convolvedFeatures, 2): dimension 2 is the number of sample images, 8 in total.
size(convolvedFeatures, 3): dimension 3 is the number of rows of the convolved maps.
size(convolvedFeatures, 4): dimension 4 is the number of columns of the convolved maps.
convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
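For example, to pull a single feature map out of this layout (a minimal sketch, assuming convolvedFeatures has been computed as above):
featMap = squeeze(convolvedFeatures(1, 1, :, :)); % 57*57 response of feature 1 on image 1
imagesc(featMap); colormap gray; % visualize the response map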
Step 0: parameter initialization
numImages = size(images, 4); % size of dimension 4, i.e. the number of sample images, here 8
imageDim = size(images, 1); % size of dimension 1, i.e. the number of image rows
imageChannels = size(images, 3); % size of dimension 3, i.e. the number of channels
% size(trainImages) = 64 64 3 2000
% images = convImages = trainImages(:, :, :, 1:8);
The training data trainImages is a 4-dimensional array containing 2000 samples, each a 64*64 3-channel image.
Step 1: preprocess the weight matrix
WT = W * ZCAWhite; % the equivalent network weights
b_mean = b - WT * meanPatch; % this term is needed because the input data has not been mean-subtracted
The matrix W was learned by sparse coding on x', the whitened version of the original data x, so W is the feature weight matrix for x'. The input here, however, is the raw data x, which has been neither zero-meaned nor whitened; W is therefore converted into WT, which can be applied to the raw data directly.
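The identity being used is sigmoid(W * ZCAWhite * (x - meanPatch) + b) = sigmoid(WT * x + b_mean). Here is a minimal numerical sketch of this equivalence, with random stand-in data and hypothetical sizes:
n = 12; h = 5; % hypothetical input and hidden sizes
x = rand(n, 1); meanPatch = rand(n, 1);
ZCAWhite = rand(n, n); W = rand(h, n); b = rand(h, 1);
a1 = 1 ./ (1 + exp(-(W * ZCAWhite * (x - meanPatch) + b))); % whiten first, then feed forward
WT = W * ZCAWhite; b_mean = b - WT * meanPatch; % fold the preprocessing into the weights
a2 = 1 ./ (1 + exp(-(WT * x + b_mean))); % apply directly to the raw input
fprintf('max difference: %g\n', max(abs(a1 - a2))); % on the order of 1e-16, i.e. equal up to round-off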
Step 3: the convolution computation
Because the output has size(convolvedFeatures) = 400 8 57 57, nested loops are required; this implementation uses a 3-level loop structure.
1. Loop structure
for imageNum = 1:numImages % loop over samples, 8 in total
for featureNum = 1:numFeatures % loop over features, 400 in total
% convolve each of the three color channels (RGB) separately
convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
for channel = 1:imageChannels
2. Extract the convolution kernel
offset = (channel-1)*patchSize;
feature = reshape(WT(featureNum, offset+1:offset+patchSize), patchDim, patchDim); % take one weight image block out
feature = flipud(fliplr(squeeze(feature))); % squeeze has no effect on a 2-D matrix, and feature is an 8*8 square matrix, so the squeeze here is a no-op; the flips are what matter: conv2 performs true convolution and flips the kernel internally, so pre-flipping makes conv2 compute the cross-correlation that the feedforward pass requires
im = squeeze(images(:, :, channel, imageNum)); % extract one channel of one sample
3. Perform the convolution
convolvedoneChannel = conv2(im, feature, 'valid');
convolvedImage = convolvedImage + convolvedoneChannel;
% simply add the three channels together; rationale: the 3 channels act like 3 feature maps, analogous to the input of layer 2 and beyond in a CNN
4. Apply the sigmoid function
convolvedImage = sigmoid(convolvedImage + b_mean(featureNum));
5. Store the convolved map in the output
convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
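To see why the kernel is flipped in part 2 above, here is a minimal sketch with hypothetical values: conv2 rotates the kernel by 180 degrees before sliding it over the image, so pre-flipping recovers the plain dot product (cross-correlation) of the unflipped kernel with each image patch.
im = magic(5); k = rand(3, 3); % hypothetical image and kernel
c = conv2(im, flipud(fliplr(k)), 'valid'); % convolve with the pre-flipped kernel
patchResp = sum(sum(im(1:3, 1:3) .* k)); % direct dot product with the top-left patch
fprintf('difference: %g\n', abs(c(1, 1) - patchResp)); % ~0: the two agree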
Related function notes:
squeeze:
B = squeeze(A) returns a matrix B with the same elements as A but with all singleton dimensions removed; a singleton dimension is one for which size(A, dim) = 1. The squeeze command has no effect on a 2-D array.
fliplr: flips a matrix X left-right about its vertical axis.
flipud: flips a matrix up-down.
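A quick illustration of these three helpers (a minimal sketch):
A = reshape(1:4, [1 2 1 2]); % a 1*2*1*2 array
size(squeeze(A)) % [2 2]: both singleton dimensions removed
X = [1 2; 3 4];
fliplr(X) % [2 1; 4 3]: columns reversed
flipud(X) % [3 4; 1 2]: rows reversed
flipud(fliplr(X)) % [4 3; 2 1]: a 180-degree rotation, as used in cnnConvolve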
cnnPool: function description
pooledFeatures = cnnPool(poolDim, convolvedFeatures)
1. Parameters
Inputs: poolDim, the dimension of each pooling region;
convolvedFeatures, the convolved features.
Output: pooledFeatures = zeros(numFeatures, numImages, resultDim, resultDim);
A 4-dimensional matrix: number of features, number of sample images, and the pooled dimension resultDim = floor(convolvedDim / poolDim); % floor rounds down to the nearest integer, e.g. floor(3.7) = 3
2. The pooling loop
for imageNum = 1:numImages % loop over samples
for featureNum = 1:numFeatures % loop over features
for poolRow = 1:resultDim % rows of the pooled matrix
offsetRow = 1 + (poolRow-1)*poolDim;
for poolCol = 1:resultDim % columns of the pooled matrix
offsetCol = 1 + (poolCol-1)*poolDim;
patch = convolvedFeatures(featureNum, imageNum, offsetRow:offsetRow+poolDim-1, ...
offsetCol:offsetCol+poolDim-1);
% extract one patch from the convolved feature map
pooledFeatures(featureNum, imageNum, poolRow, poolCol) = mean(patch(:)); % mean pooling
end
end
end
end
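The two inner loops can also be collapsed; here is a minimal alternative sketch of the same mean pooling, using conv2 with an averaging kernel and a stride of poolDim (not part of the exercise code; it assumes the non-overlapping window grid fits as it does here, with convolvedDim = 57 and poolDim = 19):
avgKernel = ones(poolDim) / poolDim^2; % averaging kernel
for imageNum = 1:numImages
for featureNum = 1:numFeatures
fm = squeeze(convolvedFeatures(featureNum, imageNum, :, :));
pooled = conv2(fm, avgKernel, 'valid'); % mean of every poolDim*poolDim window
% keep only the non-overlapping windows, i.e. stride = poolDim
pooledFeatures(featureNum, imageNum, :, :) = pooled(1:poolDim:end, 1:poolDim:end);
end
end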
Complete pipeline code (MATLAB, partially commented):
%% CS294A/CS294W Convolutional Neural Networks Exercise
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% convolutional neural networks exercise. In this exercise, you will only
% need to modify cnnConvolve.m and cnnPool.m. You will not need to modify
% this file.
addpath '../library/'
%%======================================================================
%% STEP 0: Initialization
% Here we initialize some parameters used for the exercise.
imageDim = 64; % image dimension
imageChannels = 3; % number of channels (rgb, so 3)
patchDim = 8; % patch dimension
numPatches = 50000; % number of patches
visibleSize = patchDim * patchDim * imageChannels; % number of input units
outputSize = visibleSize; % number of output units
hiddenSize = 400; % number of hidden units
epsilon = 0.1; % epsilon for ZCA whitening
poolDim = 19; % dimension of pooling region
%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn
% features from color patches. If you have completed the linear decoder
% exercise, use the features that you have obtained from that exercise,
% loading them into optTheta. Recall that we have to keep around the
% parameters used in whitening (i.e., the ZCA whitening matrix and the
% meanPatch)
% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with
% the optimal parameters:
load '../linear_decoder_exercise/STL10Features.mat'
% --------------------------------------------------------------------
% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
% displayColorNetwork( (W*ZCAWhite)');
%%======================================================================
%% STEP 2: Implement and test convolution and pooling
% In this step, you will implement convolution and pooling, and test them
% on a small part of the data set to ensure that you have implemented
% these two functions correctly. In the next step, you will actually
% convolve and pool the features with the STL10 images.
%% STEP 2a: Implement convolution
% Implement convolution in the function cnnConvolve in cnnConvolve.m
% Note that we have to preprocess the images in the exact same way
% we preprocessed the patches before we can obtain the feature activations.
load '../data/stl10_matlab/stlTrainSubset.mat' % loads numTrainImages, trainImages, trainLabels
%% Use only the first 8 images for testing
convImages = trainImages(:, :, :, 1:8);
% NOTE: Implement cnnConvolve in cnnConvolve.m first!
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);
%% STEP 2b: Checking your convolution
% To ensure that you have convolved the features correctly, we have
% provided some code to compare the results of your convolution with
% activations from the sparse autoencoder
% For 1000 random points
for i = 1:1000
featureNum = randi([1, hiddenSize]);
imageNum = randi([1, 8]);
imageRow = randi([1, imageDim - patchDim + 1]);
imageCol = randi([1, imageDim - patchDim + 1]);
patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
patch = patch(:);
patch = patch - meanPatch;
patch = ZCAWhite * patch;
features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch);
if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
fprintf('Convolved feature does not match activation from autoencoder\n');
fprintf('Feature Number : %d\n', featureNum);
fprintf('Image Number : %d\n', imageNum);
fprintf('Image Row : %d\n', imageRow);
fprintf('Image Column : %d\n', imageCol);
fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));
error('Convolved feature does not match activation from autoencoder');
end
end
disp('Congratulations! Your convolution code passed the test.');
%% STEP 2c: Implement pooling
% Implement pooling in the function cnnPool in cnnPool.m
% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);
%% STEP 2d: Checking your pooling
% To ensure that you have implemented pooling, we will use your pooling
% function to pool over a test matrix and check the results.
testMatrix = reshape(1:64, 8, 8);
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8))); ];
testMatrix = reshape(testMatrix, 1, 1, 8, 8);% mimic the 4-D convolvedFeatures array
pooledFeatures = squeeze(cnnPool(4, testMatrix));
if ~isequal(pooledFeatures, expectedMatrix)
disp('Pooling incorrect');
disp('Expected');
disp(expectedMatrix);
disp('Got');
disp(pooledFeatures);
error('Pooling incorrect');
else
disp('Congratulations! Your pooling code passed the test.');
end
%%======================================================================
%% STEP 3: Convolve and pool with the dataset
% In this step, you will convolve each of the features you learned with
% the full large images to obtain the convolved features. You will then
% pool the convolved features to obtain the pooled features for
% classification.
%
% Because the convolved features matrix is very large, we will do the
% convolution and pooling 50 features at a time to avoid running out of
% memory. Reduce this number if necessary
stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');
load '../data/stl10_matlab/stlTrainSubset.mat' % loads numTrainImages, trainImages, trainLabels
load '../data/stl10_matlab/stlTestSubset.mat' % loads numTestImages, testImages, testLabels
pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
floor((imageDim - patchDim + 1) / poolDim), ...
floor((imageDim - patchDim + 1) / poolDim) );% 4-D matrix: number of features (hidden units), number of images, pooled rows, pooled columns
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
floor((imageDim - patchDim + 1) / poolDim), ...
floor((imageDim - patchDim + 1) / poolDim) );% same 4-D layout as above; floor((imageDim - patchDim + 1) / poolDim) = floor(57/19) = 3
tic();
for convPart = 1:(hiddenSize / stepSize)
featureStart = (convPart - 1) * stepSize + 1;
featureEnd = convPart * stepSize;
fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
Wt = W(featureStart:featureEnd, :);% extract the weights of this batch of stepSize features (hidden units)
bt = b(featureStart:featureEnd);
fprintf('Convolving and pooling train images\n');
convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
trainImages, Wt, bt, ZCAWhite, meanPatch);
toc();
pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;% pooled feature matrix: (numFeatures, numImages, resultDim, resultDim)
toc();
clear convolvedFeaturesThis pooledFeaturesThis;
fprintf('Convolving and pooling test images\n');
convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
testImages, Wt, bt, ZCAWhite, meanPatch);
toc();
pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
toc();
clear convolvedFeaturesThis pooledFeaturesThis;
end
% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();
%%======================================================================
%% STEP 4: Use pooled features for classification
% Now, you will use your pooled features to train a softmax classifier,
% using softmaxTrain from the softmax exercise.
% Training the softmax classifier for 1000 iterations should take less than
% 10 minutes.
% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/
% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
%permute reorders the dimensions of A in the order given by the vector; softmaxX is now (numFeatures, row, col, numImages)
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
numTrainImages);
%numel(pooledFeaturesTrain) / numTrainImages equals numFeatures * resultDim * resultDim = 400*3*3; each sample is represented by 400 pooled feature maps
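% A tiny sanity sketch (hypothetical sizes) of what the permute + reshape achieves:
%   A = reshape(1:2*3*3*4, [2 3 3 4]);  % (features, rows, cols, images), as after permute
%   B = reshape(A, [], 4);              % 18x4: one column of stacked features per image
% Moving the image index to the last dimension first ensures that each column of
% the reshaped matrix holds all pooled features of exactly one image.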
softmaxY = trainLabels;
options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
numClasses, softmaxLambda, softmaxX, softmaxY, options);
%%======================================================================
%% STEP 5: Test classifier
% Now you will test your trained classifier against the test images
softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;
[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);
% You should expect to get an accuracy of around 80% on the test images.