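Tutorial: http://deeplearning.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B
Exercise: http://deeplearning.stanford.edu/wiki/index.php/Exercise:Convolution_and_Pooling
Key ideas: local connectivity, weight sharing, pooling, and multiple layers.
For the structure of a convolutional neural network, see the tutorial or
http://blog.csdn.net/zouxy09/article/details/8781543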
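(1) Local connectivity and weight sharing:
The 28*28 MNIST images used earlier are small enough that a fully connected network is not too slow. For a natural image of 96*96 pixels, however, you would need about 10^4 (roughly 100*100) input units; if you want to learn 100 features, that is on the order of 10^6 parameters, and both forward and backward propagation become very slow.
So each hidden unit is allowed to connect to only a subset of the input units. For example, each hidden unit connects only to a small contiguous region of the input image. This is local connectivity.
Local connectivity and weight sharing are not only about reducing computation; the design is inspired by the human visual system. There are two reasons behind it:
1. Local information in natural images is highly correlated and forms local motifs (I am not sure how best to translate this word; it is easier to grasp than to put into words);
2. The local statistics of an image are invariant to position: if a motif appears in one part of the image, it can appear anywhere else as well. Weight sharing therefore lets the network detect the same pattern everywhere in the image (each feature map detects one pattern).
Green is the original 5*5 image, yellow is a 3*3 convolution kernel, and the red 3*3 grid is the resulting feature map, i.e. the result of convolving the image with this kernel (in other words, the features extracted from the image).
Note that the feature map size is 3*3 = (5-3+1)*(5-3+1), where 5 is the side length of the green image and 3 is the side length of the yellow kernel.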
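The size relation above can be checked with a small sketch. The image values and the averaging kernel below are made up for illustration and are not the ones from the tutorial's figure.

```matlab
% Toy 5x5 "image" and a 3x3 kernel (both hypothetical).
im = magic(5);
kernel = ones(3, 3) / 9;          % simple averaging kernel

% 'valid' keeps only the positions where the kernel fits entirely inside im.
featureMap = conv2(im, kernel, 'valid');

disp(size(featureMap));           % (5-3+1) x (5-3+1)
```

Each entry of featureMap summarizes one 3*3 neighbourhood of the image, which is why the output shrinks by patch size minus one along each side.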
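(2) Pooling:
Although the steps above already cut the number of parameters considerably, the amount of computation is still large and the model over-fits easily, so pooling is needed.
The usual choices are max pooling and mean pooling.
Pooling is the key to translation invariance: even after the image undergoes a small translation or deformation, it still produces the same (pooled) features.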
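To make the invariance claim concrete, here is a minimal sketch of max pooling over non-overlapping 2*2 regions. The helper maxPool and the toy activation maps are assumptions made for illustration; they are not part of the exercise code.

```matlab
poolDim = 2;

% Max pooling over non-overlapping poolDim x poolDim regions of a square map.
% reshape splits rows and columns into (within-region, region) index pairs;
% the two max calls then reduce over the within-region indices.
maxPool = @(x) squeeze(max(max( ...
    reshape(x, poolDim, size(x, 1) / poolDim, poolDim, size(x, 2) / poolDim), ...
    [], 1), [], 3));

a = zeros(4); a(1, 1) = 1;   % a single activation
b = zeros(4); b(2, 2) = 1;   % shifted by one pixel, still in the same region
c = zeros(4); c(3, 3) = 1;   % shifted further, into a different region

pooledA = maxPool(a);
pooledB = maxPool(b);
pooledC = maxPool(c);

disp(isequal(pooledA, pooledB));   % the small shift leaves the pooled map unchanged
disp(isequal(pooledA, pooledC));   % a shift across a region boundary does not
```

The invariance therefore holds only for shifts that stay inside one pooling region, which is what "small translation" means here.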
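(3) Network architecture:
Input layer: 64*64 RGB images (3 channels each), 2000 training examples in total.
Convolution layer: 8*8 kernels, with W (400*192) and b (400*1) obtained from the previous exercise. Here 192 = 8*8*3 is the number of weights covering every element of an 8*8 patch across 3 channels, and 400 is the number of distinct kernels, so 400 feature maps are produced, each of size 57*57 (57 = 64-8+1).
Pooling layer: the 400 feature maps of size 57*57 are pooled down to 400 maps of size 3*3.
Output layer: the softmax layer takes 400*3*3 inputs and outputs 4 classes (airplane, car, cat, dog).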
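The layer sizes can be tallied in a few lines. Note that poolDim = 19 is an assumption here: it is not stated above, but it is the pooling size that reduces 57*57 to 3*3.

```matlab
imageDim     = 64;    % input images are 64x64
imageChannels = 3;    % RGB
patchDim     = 8;     % 8x8 kernels
numFeatures  = 400;   % number of kernels / feature maps
poolDim      = 19;    % assumed pooling region size (57 / 19 = 3)

visibleSize  = patchDim * patchDim * imageChannels;   % columns of W
convolvedDim = imageDim - patchDim + 1;               % side of each feature map
pooledDim    = floor(convolvedDim / poolDim);         % side after pooling
softmaxInputSize = numFeatures * pooledDim^2;         % inputs per image to softmax

fprintf('W is %d x %d\n', numFeatures, visibleSize);
fprintf('feature maps: %d x %d, pooled: %d x %d\n', ...
        convolvedDim, convolvedDim, pooledDim, pooledDim);
fprintf('softmax input size: %d\n', softmaxInputSize);
```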
(1) Taking the preprocessing steps into account, the feature activations that you should compute is $\sigma(W(T(x - \bar{x})) + b)$, where $T$ is the whitening matrix and $\bar{x}$ is the mean patch. Expanding this, you obtain $\sigma(WTx - WT\bar{x} + b)$, which suggests that you should convolve the images with $WT$ rather than $W$ as earlier, and you should add $(b - WT\bar{x})$, rather than just $b$, to convolvedFeatures before finally applying the sigmoid function.
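The folding of whitening and mean subtraction into the weights and bias can be checked numerically. The sketch below uses small random matrices as stand-ins; T plays the role of ZCAWhite and xbar the role of meanPatch, and the sizes are arbitrary.

```matlab
hiddenSize = 5;
visibleSize = 12;

W    = randn(hiddenSize, visibleSize);
b    = randn(hiddenSize, 1);
T    = randn(visibleSize, visibleSize);   % stand-in for the whitening matrix
xbar = randn(visibleSize, 1);             % stand-in for the mean patch
x    = randn(visibleSize, 1);             % one raw patch

sigmoid = @(z) 1 ./ (1 + exp(-z));

% Direct form: preprocess the patch, then apply the autoencoder layer.
direct = sigmoid(W * (T * (x - xbar)) + b);

% Folded form: precompute WT and the corrected bias once, then use raw patches.
WT = W * T;
B  = b - WT * xbar;
folded = sigmoid(WT * x + B);

maxDiff = max(abs(direct - folded));
fprintf('max difference: %g\n', maxDiff);
```

Because WT and B do not depend on the patch, they can be computed once before the convolution loops, which is what cnnConvolve.m does.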
(2)im = squeeze(images(:, :, channel, imageNum));
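This line shows that squeeze removes singleton dimensions (dimensions of length 1): here the channel and imageNum dimensions both have length 1 and are dropped, leaving a 2-D matrix.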
(3)conv2(im,feature,'valid');
When the third argument of conv2 (shape) is 'valid', the convolution does not zero-pad the borders, so the output matrix has size (imageDim - patchDim + 1, imageDim - patchDim + 1).
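cnnConvolve.m flips each feature with flipud(fliplr(...)) before calling conv2. A minimal sketch of why, using a made-up image and feature: conv2 itself flips its kernel, so flipping in advance makes each output entry the plain element-wise product of the feature with the image patch underneath it.

```matlab
im = magic(6);                 % toy image (hypothetical values)
feature = [1 2; 3 4];          % toy 2x2 feature (hypothetical values)

flipped = flipud(fliplr(feature));
out = conv2(im, flipped, 'valid');

% Pick one output position and compare with the direct weighted sum.
r = 2; c = 3;
patch = im(r:r+1, c:c+1);
expected = sum(sum(patch .* feature));

fprintf('conv2 value: %g, direct weighted sum: %g\n', out(r, c), expected);
```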
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps
WT = W * ZCAWhite;        % (400x192)*(192x192); 400 is hiddenSize, i.e. 400 different feature maps
B = b - WT * meanPatch;   % (400x1); both steps follow the advice at the end of Step 2a of the exercise
% --------------------------------------------------------

for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:imageChannels

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      temp = patchDim * patchDim;
      feature = reshape(WT(featureNum, 1 + (channel-1)*temp : channel*temp), patchDim, patchDim);
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));   % squeeze removes singleton dimensions

      % Convolve "feature" with "im", adding the result to convolvedImage
      % (the results for all channels are summed to give convolvedImage)
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      convolvedChannel = conv2(im, feature, 'valid');   % 'valid': no zero-padding at the borders
      convolvedImage = convolvedImage + convolvedChannel;
      % ------------------------

    end

    % Add the bias unit (which already includes the correction for the mean subtraction)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = sigmoid(convolvedImage + B(featureNum));   % per the advice at the end of Step 2a
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
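Running the provided test code prints: Congratulations! Your convolution code passed the test. The convolution code is correct.
The pooling code is simpler; a for loop similar to the one used for convolution is enough.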
% Use the same loop order as in cnnConvolve.m
pooledDim = floor(convolvedDim / poolDim);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures
    for poolRow = 1:pooledDim
      for poolCol = 1:pooledDim
        poolMatrix = convolvedFeatures(featureNum, imageNum, ...
            (poolRow-1)*poolDim+1 : poolRow*poolDim, ...
            (poolCol-1)*poolDim+1 : poolCol*poolDim);
        pooledFeatures(featureNum, imageNum, poolRow, poolCol) = mean(poolMatrix(:));
      end
    end
  end
end
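Test result: Congratulations! Your pooling code passed the test.
(1) The rest of the code is already provided by Ng. It sets stepSize = 50 to avoid running out of memory: the 400 features are processed in 400/50 = 8 batches of 50 feature maps each. If your machine struggles, lower this value to, say, 10 or 20.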
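The batching idea can be sketched as follows. This is an illustration of the index arithmetic only, not the starter code itself; the variable covered is a hypothetical bookkeeping array, and the actual convolution and pooling calls are left as a comment.

```matlab
hiddenSize = 400;   % number of features
stepSize   = 50;    % features processed per batch

% stepSize should divide hiddenSize evenly so that every feature is covered.
assert(mod(hiddenSize, stepSize) == 0);
numParts = hiddenSize / stepSize;

covered = false(hiddenSize, 1);   % hypothetical: tracks which features were handled
for convPart = 1:numParts
    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd   = convPart * stepSize;

    % Here one would convolve and pool only rows featureStart:featureEnd of W
    % and b, then store the pooled result and discard the large convolved array.
    covered(featureStart:featureEnd) = true;
end

fprintf('%d batches, all features covered: %d\n', numParts, all(covered));
```

Smaller values such as 10 or 20 also divide 400 evenly; they lower peak memory at the cost of more passes over the images.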
That is, the dimensions are reordered, so the subscripts used when reading the data change.
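One way such a reordering can look is sketched below, assuming it is done with permute followed by reshape so that each column holds all pooled values of one image. The array sizes and contents are toy values chosen for illustration.

```matlab
numFeatures = 2; numImages = 3; pooledDim = 2;

% Toy pooled features in the layout (featureNum, imageNum, poolRow, poolCol).
pooled = reshape(1:(numFeatures * numImages * pooledDim^2), ...
                 numFeatures, numImages, pooledDim, pooledDim);

% Move the image index last: (featureNum, poolRow, poolCol, imageNum).
reordered = permute(pooled, [1 3 4 2]);

% Flatten everything except the image index: one column per image.
X = reshape(reordered, numel(pooled) / numImages, numImages);

% Column 2 of X should contain exactly the pooled values of image 2.
image2 = squeeze(pooled(:, 2, :, :));
disp(isequal(X(:, 2), image2(:)));
```

After the permute, an element that was read as pooled(f, i, r, c) is read as reordered(f, r, c, i).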
(3) Remember to copy over the files you need. The program takes quite a while to run, so save the intermediate variables:
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
(4) Results: