《Keras快速上手》 (Keras Quick Start) has ten chapters in total; readers who only want the deep learning material can jump straight to Chapter 6.
For installation, the book recommends cloning the source from the Microsoft-hosted repository: install Git, add it to the environment variables, and then clone the repository.
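For reference, the clone-and-install steps look roughly like the following (a sketch assuming the mainline GitHub repository; the book may point at a Microsoft-maintained fork for CNTK compatibility):

git clone https://github.com/keras-team/keras.git
cd keras
python setup.py install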
For configuration, Keras uses one of CNTK, TensorFlow, or Theano as its computational backend. The book recommends CNTK on the grounds that it was the fastest in a number of benchmarks. The backend is set in the keras.json file inside the .keras directory.
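For example, selecting CNTK as the backend means keras.json (in the .keras directory under the user's home directory) contains something like the following, with the fields other than backend shown at their usual defaults:

{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "cntk"
}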
Chapter 6 covers CNNs, introducing them directly through the MNIST example.
The Sequential model is a subclass of the general Model class that covers the common case of layers connected one after another; hidden layers are simply .add()-ed to the model in order, which is very intuitive. Dense is the ordinary fully connected layer. The MNIST data has a single channel, so the 2D convolution and pooling layers are used. (A functional-API equivalent of Sequential stacking is sketched after the imports below.)
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
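Since Sequential is only a convenience wrapper over the general Model class, the same layer stacking can be expressed with the functional API, where each layer is called on the previous tensor. A minimal sketch for illustration (the names inputs/x/outputs are arbitrary):

from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(784,))                   # placeholder for a flattened image
x = Dense(128, activation='relu')(inputs)      # each layer is applied to the previous tensor
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)  # same stack, explicit wiring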
After loading the data, reshape it into a four-dimensional tensor of shape (samples, height, width, channels), then normalize the pixel values into [0, 1]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1).astype('float32')
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1).astype('float32')
x_train /= 255
x_test /= 255
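A quick shape check confirms the four-dimensional layout; the sample counts below are those of the standard MNIST split:

print(x_train.shape)  # (60000, 28, 28, 1)
print(x_test.shape)   # (10000, 28, 28, 1)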
The labels are the digits 0-9; to match the ten-way softmax output, convert each label into a ten-dimensional one-hot vector:
def tran_y(y):
    y_ohe = np.zeros(10)
    y_ohe[y] = 1
    return y_ohe

y_train_ohe = np.array([tran_y(y_train[i]) for i in range(len(y_train))])
y_test_ohe = np.array([tran_y(y_test[i]) for i in range(len(y_test))])
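The same conversion is also available as a built-in utility (keras.utils.to_categorical in Keras 2.x), so the hand-rolled loop above can be replaced with:

from keras.utils import to_categorical
y_train_ohe = to_categorical(y_train, num_classes=10)  # shape (60000, 10)
y_test_ohe = to_categorical(y_test, num_classes=10)    # shape (10000, 10)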
Now build the network. Add three convolutional layers, each using 3×3 filters with stride 1, same padding, and ReLU as the activation function, each followed by 2×2 max pooling. Dropout layers are inserted between adjacent convolutional blocks to reduce overfitting:
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=(3, 3), strides=(1, 1), padding='same', input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
After the convolutional layers, Flatten() unrolls the output into a vector for the fully connected layers. Note the dimensionality: with same padding and three rounds of 2×2 pooling the feature map shrinks 28 → 14 → 7 → 3, so with 256 channels the flattened vector has 3 × 3 × 256 = 2304 dimensions (not 28 × 28 = 784). Three fully connected layers are followed by an output layer with softmax activation:
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))
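The dimension arithmetic above can be double-checked with model.summary(), which prints every layer's output shape; the shapes annotated below are derived from the layer arithmetic rather than a captured log:

model.summary()
# Conv2D (same padding): (None, 28, 28, 64)  -> MaxPooling2D: (None, 14, 14, 64)
# Conv2D:                (None, 14, 14, 128) -> MaxPooling2D: (None, 7, 7, 128)
# Conv2D:                (None, 7, 7, 256)   -> MaxPooling2D: (None, 3, 3, 256)
# Flatten:               (None, 2304)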
Finally, train the model. The loss function is the standard categorical cross-entropy; for optimization the book uses Adagrad; the mini-batch size is usually chosen as a power of two, and the book uses 128:
model.compile(loss='categorical_crossentropy', optimizer='adagrad', metrics=['accuracy'])
model.fit(x_train, y_train_ohe, validation_data=(x_test, y_test_ohe), epochs=20, batch_size=128)
The training output:
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 134s 2ms/step - loss: 0.4763 - acc: 0.8359 - val_loss: 0.0722 - val_acc: 0.9762
Epoch 2/20
60000/60000 [==============================] - 134s 2ms/step - loss: 0.1153 - acc: 0.9645 - val_loss: 0.0427 - val_acc: 0.9875
Epoch 3/20
60000/60000 [==============================] - 135s 2ms/step - loss: 0.0881 - acc: 0.9722 - val_loss: 0.0369 - val_acc: 0.9883
Epoch 4/20
60000/60000 [==============================] - 140s 2ms/step - loss: 0.0772 - acc: 0.9763 - val_loss: 0.0308 - val_acc: 0.9903
Epoch 5/20
60000/60000 [==============================] - 147s 2ms/step - loss: 0.0659 - acc: 0.9795 - val_loss: 0.0274 - val_acc: 0.9902
Epoch 6/20
60000/60000 [==============================] - 138s 2ms/step - loss: 0.0615 - acc: 0.9807 - val_loss: 0.0269 - val_acc: 0.9905
Epoch 7/20
60000/60000 [==============================] - 139s 2ms/step - loss: 0.0572 - acc: 0.9818 - val_loss: 0.0253 - val_acc: 0.9901
Epoch 8/20
60000/60000 [==============================] - 138s 2ms/step - loss: 0.0532 - acc: 0.9826 - val_loss: 0.0249 - val_acc: 0.9915
Epoch 9/20
60000/60000 [==============================] - 138s 2ms/step - loss: 0.0474 - acc: 0.9854 - val_loss: 0.0225 - val_acc: 0.9919
Epoch 10/20
60000/60000 [==============================] - 140s 2ms/step - loss: 0.0459 - acc: 0.9854 - val_loss: 0.0217 - val_acc: 0.9925
Epoch 11/20
60000/60000 [==============================] - 140s 2ms/step - loss: 0.0440 - acc: 0.9864 - val_loss: 0.0208 - val_acc: 0.9929
Epoch 12/20
60000/60000 [==============================] - 141s 2ms/step - loss: 0.0438 - acc: 0.9860 - val_loss: 0.0223 - val_acc: 0.9921
Epoch 13/20
60000/60000 [==============================] - 137s 2ms/step - loss: 0.0395 - acc: 0.9874 - val_loss: 0.0201 - val_acc: 0.9930
Epoch 14/20
60000/60000 [==============================] - 136s 2ms/step - loss: 0.0399 - acc: 0.9871 - val_loss: 0.0194 - val_acc: 0.9935
Epoch 15/20
60000/60000 [==============================] - 142s 2ms/step - loss: 0.0390 - acc: 0.9876 - val_loss: 0.0195 - val_acc: 0.9931
Epoch 16/20
60000/60000 [==============================] - 136s 2ms/step - loss: 0.0367 - acc: 0.9881 - val_loss: 0.0182 - val_acc: 0.9936
Epoch 17/20
60000/60000 [==============================] - 137s 2ms/step - loss: 0.0364 - acc: 0.9886 - val_loss: 0.0188 - val_acc: 0.9935
Epoch 18/20
60000/60000 [==============================] - 138s 2ms/step - loss: 0.0362 - acc: 0.9882 - val_loss: 0.0177 - val_acc: 0.9933
Epoch 19/20
60000/60000 [==============================] - 136s 2ms/step - loss: 0.0325 - acc: 0.9898 - val_loss: 0.0175 - val_acc: 0.9935
Epoch 20/20
60000/60000 [==============================] - 135s 2ms/step - loss: 0.0329 - acc: 0.9895 - val_loss: 0.0174 - val_acc: 0.9939
Validation accuracy reaches 99% after just four epochs and approaches 99.4% after 20 epochs.
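Once training is done, the model can be scored on the test set or used to classify individual digits; a minimal sketch using the standard evaluate and predict calls (the metric ordering follows the compile arguments):

loss, acc = model.evaluate(x_test, y_test_ohe, verbose=0)
print('test loss: %.4f, test accuracy: %.4f' % (loss, acc))

pred = model.predict(x_test[:1])  # class probabilities for the first test image
print(np.argmax(pred, axis=1))    # predicted digit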