Getting Started with Theano: Convolutional Neural Networks

1. Introduction to Convolutional Neural Networks

See links [1] and [3].


2. Theano Implementation

(1) The conv2d function
2D convolution.
(2) The dimshuffle(*pattern) function
Permutes the dimensions of a tensor. Example patterns (a short sketch follows the list):
('x') -> turn a 0-d scalar into a 1-d vector
(0, 1) -> identity for a 2-d array
(1, 0) -> swap the first and second dimensions
('x', 0) -> turn a 1-d vector of length N into a 1*N row
(0, 'x') -> turn a 1-d vector of length N into an N*1 column
(2, 0, 1) -> A*B*C becomes C*A*B
(0, 'x', 1) -> A*B becomes A*1*B
(1, 'x', 0) -> A*B becomes B*1*A
(1,) -> drop dimension 0 (it must be broadcastable), turning 1*A into A
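A minimal sketch of dimshuffle on a concrete shared variable (the names are only illustrative; the last pattern is exactly what the model code below does with the bias b_c1):

import numpy as np
import theano

# a length-4 vector, playing the role of the bias b_c1 of the first conv layer
b = theano.shared(np.arange(4, dtype=theano.config.floatX))

# ('x', 0, 'x', 'x') turns shape (4,) into (1, 4, 1, 1), so the bias can be
# broadcast over a (batches, filters, height, width) convolution output
b4 = b.dimshuffle('x', 0, 'x', 'x')

print(b.get_value().shape)   # (4,)
print(b4.eval().shape)       # (1, 4, 1, 1)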
(3) The max_pool_2d function
2D downsampling (max pooling).
(4) A quick view of CNNs
What follows is based on the main content of Machine Vision; see link [3].
Link [3] gives a clear high-level picture of CNNs, roughly as follows:
The convolution most people have met is the Gaussian kernel, used to smooth an image or to work at different scales. The convolutions here are of the same kind: the size of each kernel is fixed in advance, and the kernel's entries are parameters of the neural network. Learning the network's parameters therefore amounts to learning the parameters of many kernels, and each kernel can be seen as a feature extractor applied to the image. A CNN can thus be viewed as several layers that extract image features, followed by a final softmax layer that classifies those features. All CNN features are learned automatically (in contrast to hand-crafted features such as SIFT or SURF).
(5) Convolution
Meaning of the dimensions in output = conv2d(input, W) (a quick shape check is sketched after the list):
input -- (batches, feature, I_h, I_w): batch size (number of samples per batch), number of feature maps, image height, image width.
W -- (filters, feature, f_h, f_w): number of filters, number of feature maps, filter height, filter width.
output -- (batches, filters, I_h-f_h+1, I_w-f_w+1).
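A minimal shape check, using the same old-style conv2d import as the full script below and random arrays in place of real data:

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet.conv import conv2d

x = T.tensor4()   # (batches, feature, I_h, I_w)
w = T.tensor4()   # (filters, feature, f_h, f_w)
f = theano.function([x, w], conv2d(x, w))

img = np.random.randn(50, 1, 32, 32).astype(theano.config.floatX)
flt = np.random.randn(4, 1, 3, 3).astype(theano.config.floatX)
print(f(img, flt).shape)   # (50, 4, 30, 30) = (batches, filters, 32-3+1, 32-3+1)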
(6) Pooling
max_pool_2d(input, (p_h, p_w)) takes the maximum over each pooling region as the output feature value; p_h and p_w are the region height and width. A small usage sketch follows.
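A minimal sketch on a tiny 4*4 input, chosen so the maxima are easy to verify by eye:

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.signal.downsample import max_pool_2d

x = T.tensor4()
pool = theano.function([x], max_pool_2d(x, (2, 2)))

img = np.arange(16, dtype=theano.config.floatX).reshape(1, 1, 4, 4)
print(pool(img)[0, 0])
# each non-overlapping 2x2 region is reduced to its maximum:
# [[  5.   7.]
#  [ 13.  15.]]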
(7) Known variable settings
x has shape (number of samples, 1, 32, 32)
w_c1 has shape (4, 1, 3, 3)
b_c1 is a row vector of length 4
w_c2 has shape (8, 4, 3, 3)
b_c2 is a row vector of length 8
w_h3 has shape (8*4*4, 100)
b_h3 is a row vector of length 100
w_o has shape (100, 10)
b_o is a row vector of length 10
batch_size (mini-batch size) is 50
A quick parameter count is sketched after the list.
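A rough, back-of-the-envelope parameter count implied by these shapes (plain numpy; it only multiplies out the sizes listed above):

import numpy as np

shapes = [
    ('w_c1', (4, 1, 3, 3)), ('b_c1', (4,)),
    ('w_c2', (8, 4, 3, 3)), ('b_c2', (8,)),
    ('w_h3', (8 * 4 * 4, 100)), ('b_h3', (100,)),
    ('w_o', (100, 10)), ('b_o', (10,)),
]
total = sum(int(np.prod(s)) for _, s in shapes)
print(total)   # 14246 -- the fully connected weights w_h3 hold most of the parameters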
(8) Concrete CNN architecture
Machine Vision walks through the MNIST dataset; here the same analysis is done for CIFAR-10 to deepen the understanding (a sketch that verifies the intermediate shapes follows the list, and the full training script comes after it).
1) Input layer: the input batch has shape (50, 1, 32, 32).
2) Convolutional layer 1: takes a mini-batch from the input layer; the output passes through convolution and pooling.
Convolution input: batches=50, feature=1, I_h=32, I_w=32, filters=4, f_h=3, f_w=3
Convolution output: (50, 4, 32-3+1=30, 32-3+1=30)
Pooling input: p_h=3, p_w=3
Pooling output: (50, 4, 30/3=10, 30/3=10)
3) Convolutional layer 2: takes the output of convolutional layer 1; the output passes through convolution and pooling.
Convolution input: batches=50, feature=4, I_h=10, I_w=10, filters=8, f_h=3, f_w=3
Convolution output: (50, 8, 10-3+1=8, 10-3+1=8)
Pooling input: p_h=2, p_w=2
Pooling output: (50, 8, 8/2=4, 8/2=4)
4) Fully connected layer: takes the output of convolutional layer 2.
Input: a 1-d vector of size 8*4*4 = 128
Output: 100, the dimension chosen for b_h3
5) Softmax layer: takes the output of the fully connected layer.
Input: 100
Output: 10, the number of possible class labels
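To confirm the walk-through above, here is a small sketch that builds only the convolution/pooling part of the graph and prints the intermediate shapes (random arrays stand in for the weights and data):

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet.conv import conv2d
from theano.tensor.signal.downsample import max_pool_2d

x = T.tensor4()
w1 = T.tensor4()
w2 = T.tensor4()

p1 = max_pool_2d(conv2d(x, w1), (3, 3))   # conv layer 1 + 3x3 pooling
p2 = max_pool_2d(conv2d(p1, w2), (2, 2))  # conv layer 2 + 2x2 pooling
shapes = theano.function([x, w1, w2], [p1.shape, p2.shape])

img = np.random.randn(50, 1, 32, 32).astype(theano.config.floatX)
f1 = np.random.randn(4, 1, 3, 3).astype(theano.config.floatX)
f2 = np.random.randn(8, 4, 3, 3).astype(theano.config.floatX)
print(shapes(img, f1, f2))   # [array([50, 4, 10, 10]), array([50, 8, 4, 4])]

The complete training script is listed below.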

import theano
import theano.tensor as T
import numpy as np
import matplotlib.pyplot as plt
plt.ion()

import load_mnist
import load_cifar

from theano.tensor.nnet.conv import conv2d
from theano.tensor.signal.downsample import max_pool_2d


# load data
x_train, t_train, x_test, t_test = load_cifar.cifar10(dtype=theano.config.floatX)
# x_train, x_test, t_train, t_test = load_mnist.mnist(onehot=True)
labels_test = np.argmax(t_test, axis=1)


# reshape data
x_train = x_train.reshape((x_train.shape[0], 1, 32, 32))
x_test = x_test.reshape((x_test.shape[0], 1, 32, 32))


# define symbolic Theano variables
x = T.tensor4()
t = T.matrix()


# define model: neural network
def floatX(x):
    return np.asarray(x, dtype=theano.config.floatX)

def init_weights(shape):
    return theano.shared(floatX(np.random.randn(*shape) * 0.1))

def momentum(cost, params, learning_rate, momentum):
    # SGD with classical momentum: keep one velocity (mparam_i) per parameter,
    # update the velocity from the gradient, then move the parameter by it
    grads = theano.grad(cost, params)
    updates = []
    
    for p, g in zip(params, grads):
        mparam_i = theano.shared(np.zeros(p.get_value().shape, dtype=theano.config.floatX))
        v = momentum * mparam_i - learning_rate * g
        updates.append((mparam_i, v))
        updates.append((p, p + v))

    return updates

def model(x, w_c1, b_c1, w_c2, b_c2, w_h3, b_h3, w_o, b_o):
    # conv layer 1: ReLU(conv + broadcast bias), then 3x3 max pooling
    c1 = T.maximum(0, conv2d(x, w_c1) + b_c1.dimshuffle('x', 0, 'x', 'x'))
    p1 = max_pool_2d(c1, (3, 3))

    # conv layer 2: ReLU(conv + broadcast bias), then 2x2 max pooling
    c2 = T.maximum(0, conv2d(p1, w_c2) + b_c2.dimshuffle('x', 0, 'x', 'x'))
    p2 = max_pool_2d(c2, (2, 2))

    # flatten to (batches, 8*4*4), fully connected ReLU layer, softmax output
    p2_flat = p2.flatten(2)
    h3 = T.maximum(0, T.dot(p2_flat, w_h3) + b_h3)
    p_y_given_x = T.nnet.softmax(T.dot(h3, w_o) + b_o)
    return p_y_given_x

w_c1 = init_weights((4, 1, 3, 3))
b_c1 = init_weights((4,))
w_c2 = init_weights((8, 4, 3, 3))
b_c2 = init_weights((8,))
w_h3 = init_weights((8 * 4 * 4, 100))
b_h3 = init_weights((100,))
w_o = init_weights((100, 10))
b_o = init_weights((10,))

params = [w_c1, b_c1, w_c2, b_c2, w_h3, b_h3, w_o, b_o]

p_y_given_x = model(x, *params)
y = T.argmax(p_y_given_x, axis=1)

cost = T.mean(T.nnet.categorical_crossentropy(p_y_given_x, t))

updates = momentum(cost, params, learning_rate=0.01, momentum=0.9)


# compile theano functions
train = theano.function([x, t], cost, updates=updates, allow_input_downcast=True)
predict = theano.function([x], y, allow_input_downcast=True)


# train model
batch_size = 50

for i in range(50):
    for start in range(0, len(x_train), batch_size):
        x_batch = x_train[start:start + batch_size]
        t_batch = t_train[start:start + batch_size]
        cost = train(x_batch, t_batch)

    predictions_test = predict(x_test)
    accuracy = np.mean(predictions_test == labels_test)
    print "epoch %d - accuracy: %.4f" % (i+1, accuracy)

(9) Concrete CNN architecture (color images)
The input batch now has shape (50, 3, 32, 32), and feature for the convolution input of convolutional layer 1 changes from 1 to 3. The lines that change relative to the grayscale script are sketched below.
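A minimal sketch of the changes, assuming the data loader already returns 3-channel CIFAR-10 images; everything else stays the same:

# color images: 3 feature maps per sample instead of 1
x_train = x_train.reshape((x_train.shape[0], 3, 32, 32))
x_test = x_test.reshape((x_test.shape[0], 3, 32, 32))

# the first convolution now has feature=3 input maps
w_c1 = init_weights((4, 3, 3, 3))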

(10) Dropout
See link [5].
1) Meaning
Dropout drops hidden units at random with probability p.
Rationale: it prevents co-adaptation between hidden units during training, so a hidden unit can no longer rely on other hidden units and every unit is pushed to learn a feature that is useful on its own. It also makes it possible to train large networks in a reasonable amount of time.
2) Training
Methods:
SGD
Mini-batches
A cross-entropy objective.
Modified penalty term: put an upper bound on the L2 norm of each hidden unit's incoming weight vector, and renormalize whenever the constraint is violated. This keeps the weights from growing too large, allows a high learning rate early in training that is then gradually decayed, and lets training search the weight space more thoroughly.
3) Testing
Mean network: all hidden units are kept, but each unit's output is scaled by its retention probability (1-p).
Fine-tuning a model with a very small learning rate and dropout works better than fine-tuning with standard backpropagation.
4) Here
Dropout is applied to the input of the model's fully connected layer; a hedged sketch is given below.
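The listing above does not show the dropout code itself; the following is a minimal sketch of how it could be added in Theano, using the "inverted" variant that rescales at training time instead of at test time. The helper name dropout and the retain_prob value are hypothetical choices, not part of the original script:

import theano
from theano.sandbox.rng_mrg import MRG_RandomStreams

srng = MRG_RandomStreams(seed=42)

def dropout(h, retain_prob=0.5):
    # zero out each unit with probability 1 - retain_prob, then rescale the
    # survivors so the expected activation matches the test-time network
    mask = srng.binomial(n=1, p=retain_prob, size=h.shape,
                         dtype=theano.config.floatX)
    return h * mask / retain_prob

# inside model(), before the fully connected layer:
# p2_flat = dropout(p2.flatten(2))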
Dropout is known to work well, but not always. Link [6] contains this remark:
In vision tasks, input features are commonly dense, while in our task input features are sparse and labels are noisy. In the dense setting, dropout serves to separate effects from strongly correlated features, resulting in a more robust classifier. But in our sparse, noisy setting adding in dropout appears to simply reduce the amount of data available for learning.
5) Questions
In link [6], lamthep asks why so many papers that use dropout apply it only to the fully connected layers, and suspects the reason is that the features produced by the convolutional layers are already sparse.
Zygmunt Z.'s reply quotes the following authors:
a. In favor of dropout in the convolutional layers
Nitish Srivastava: "The additional gain in performance obtained by adding dropout in the convolutional layers besides doing dropout in the fully connected layers suggests that the utility of dropout is not limited to densely connected neural networks but can be more generally applied to other specialized architectures."
b. Against dropout in the convolutional layers
Matt Zeiler: "A drawback to dropout is that it does not seem to have the same benefits for convolutional layers."
c. Other
zhaoyangyang: "The dropout with make the training time much longer, if applied at each layer, it might be too long to train."
In short: convolutional layers do not need regularization the way fully connected layers do, and regularizing them may cost more than it gains, since their features are already sparse and dropout there mainly adds computation.


3. Experimental Results

After 50 epochs, accuracy is roughly 53% on grayscale images and 55% on color images; with dropout, the grayscale model reaches roughly 50%, and occasionally 57%.

4. References

[1]http://blog.csdn.net/stdcoutzyx/article/details/41596663
[2]http://deeplearning.net/software/theano/library/tensor/basic.html
[3]http://www.cnblogs.com/cvision/p/CNN.html
[4]https://github.com/benanne/theano-tutorial/blob/master/6_convnet.py
[5]http://blog.csdn.net/chlele0105/article/details/20863245
[6]http://fastml.com/regularizing-neural-networks-with-dropout-and-with-dropconnect/
[7]https://ift6266h13.wordpress.com/2013/03/10/dropout-in-convolutional-layers-and-relu-vs-tanh/
