Convolution computation process
Receptive field
Zero padding (padding)
Describing a convolutional layer in TF
Batch normalization (BN)
Pooling
Dropout
Convolutional neural networks
The Cifar10 dataset
Example of building a convolutional neural network
Implementing five classic convolutional networks: LeNet, AlexNet, VGGNet, InceptionNet and ResNet
The depth (number of channels) of the input feature map determines the depth of the convolution kernel.
For example, a grayscale image has 1 channel, so a 5*5*1 or 3*3*1 kernel can be used;
a color image has 3 channels, so a 5*5*3 or 3*3*3 kernel can be used.
The number of kernels in the current layer determines the depth of that layer's output feature map.
Every parameter of a kernel (w, b) is trainable.
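As a quick shape check (a minimal sketch with assumed sizes, not part of the original notes): a 3-channel input convolved with 6 kernels produces a 6-channel output, and the kernel tensor's depth equals the input's channel count.
import tensorflow as tf
x = tf.random.normal([1, 32, 32, 3])  # batch, height, width, channels (hypothetical sizes)
conv = tf.keras.layers.Conv2D(filters=6, kernel_size=5)
y = conv(x)
print(y.shape)            # (1, 28, 28, 6): output depth = number of kernels
print(conv.kernel.shape)  # (5, 5, 3, 6): kernel depth = input channels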
Receptive field: the size of the region on the original input image that each pixel of an output feature map maps back to.
Two stacked 3*3 kernels are commonly used in place of one 5*5 kernel: the receptive field is the same (5), but fewer parameters are needed.
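A rough comparison (a sketch assuming stride 1, no dilation, C input and output channels, bias ignored) shows why the stacked 3*3 kernels are attractive:
def receptive_field(kernel_sizes):
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each stride-1 layer widens the field by k-1
    return rf

C = 64  # hypothetical channel count
print(receptive_field([3, 3]), receptive_field([5]))  # 5 5: same receptive field
print(2 * 3 * 3 * C * C, 5 * 5 * C * C)               # 18*C*C vs 25*C*C parameters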
If the original image is 5*5, convolving it with a 3*3 kernel produces a 3*3 output; to keep the output at 5*5, zero padding is needed: pad the original 5*5 image with zeros around its border so that the result of the convolution is still 5*5.
With zero padding: output side length = input side length / stride (rounded up if not divisible).
Without zero padding: output side length = (input side length - kernel side length + 1) / stride (rounded up if not divisible).
In TensorFlow, zero padding is enabled with padding='same' and disabled with padding='valid'.
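A small sanity check of the two formulas (a sketch with assumed sizes: 32*32 input, 5*5 kernel, stride 1):
import tensorflow as tf
x = tf.random.normal([1, 32, 32, 3])
print(tf.keras.layers.Conv2D(6, 5, padding='same')(x).shape)   # (1, 32, 32, 6): ceil(32/1) = 32
print(tf.keras.layers.Conv2D(6, 5, padding='valid')(x).shape)  # (1, 28, 28, 6): ceil((32-5+1)/1) = 28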
tf.keras.layers.Conv2D(
filters=number of kernels,
kernel_size=kernel size, # an integer for a square kernel, or (kernel height h, kernel width w)
strides=stride, # an integer if identical in both directions, or (vertical stride h, horizontal stride w); defaults to 1
padding='same' or 'valid', # 'same' for zero padding, 'valid' for none; defaults to 'valid'
activation='relu' or 'sigmoid' or 'tanh' or 'softmax', etc. # omit activation if a BN layer follows
input_shape=(height, width, channels) # dimensions of the input feature map; may be omitted
)
For example:
model=tf.keras.models.Sequential([
Conv2D(6,5,padding='valid',activation='sigmoid'),
MaxPool2D(2,2), # max pooling: take the maximum of each 2*2 window as the result
Conv2D(6,(5,5),padding='valid',activation='sigmoid'),
MaxPool2D(2,(2,2)),
Conv2D(filters=6,kernel_size=(5,5),padding='valid',activation='sigmoid'),
MaxPool2D(pool_size=(2,2),strides=2),
Flatten(),
Dense(10,activation='softmax')
])
Standardization: transform the data so it follows a distribution with mean 0 and standard deviation 1.
Batch normalization: standardization applied over one batch, commonly used between the convolution operation and the activation operation.
The BN operation pulls feature data that has drifted back toward a zero-mean distribution.
BN introduces two parameters, γ and β, which preserve the network's nonlinear expressive power after BN; γ and β are trained and optimized together with the other trainable parameters.
The BN layer sits after the convolution layer and before the activation layer:
convolution layer -> batch normalization layer -> activation layer
Describing batch normalization in TF: tf.keras.layers.BatchNormalization()
model=tf.keras.models.Sequential([
Conv2D(filters=6,kernel_size=(5,5),padding='valid'), # convolution layer
BatchNormalization(), # BN layer
Activation('relu'), # activation layer
MaxPool2D(pool_size=(2,2),strides=2), # pooling layer
Dropout(0.2), # dropout layer
])
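A minimal numpy sketch of the computation BN performs on one batch (illustrative values; the real layer also keeps moving statistics for inference):
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-3):
    mean = x.mean(axis=0)  # per-feature batch mean
    var = x.var(axis=0)    # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardize to mean 0, std 1
    return gamma * x_hat + beta  # learnable scale and shift preserve expressive power

x = np.random.randn(32, 6) * 4 + 10  # a drifted batch: mean ~10, std ~4
print(batch_norm(x).mean(), batch_norm(x).std())  # ~0 and ~1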
tf.keras.layers.MaxPool2D(
pool_size=pooling window size, # an integer for a square window, or (window height h, window width w)
strides=pooling stride, # an integer, or (vertical stride, horizontal stride); defaults to pool_size
padding='valid' or 'same' # without or with zero padding
)
tf.keras.layers.AveragePooling2D(
pool_size=pooling window size, # an integer for a square window, or (window height h, window width w)
strides=pooling stride, # an integer, or (vertical stride, horizontal stride); defaults to pool_size
padding='valid' or 'same' # without or with zero padding
)
For example:
MaxPool2D(pool_size=(2,2),strides=2,padding='same')
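A toy example of the two pooling layers (made-up 4*4 input): max pooling keeps the largest value of each 2*2 window, average pooling keeps the window mean.
import tensorflow as tf
x = tf.reshape(tf.range(1., 17.), [1, 4, 4, 1])  # the values 1..16 laid out as a 4*4 map
print(tf.keras.layers.MaxPool2D(2, 2)(x)[0, :, :, 0])         # [[6 8] [14 16]]
print(tf.keras.layers.AveragePooling2D(2, 2)(x)[0, :, :, 0])  # [[3.5 5.5] [11.5 13.5]]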
To mitigate overfitting, during NN training part of the hidden-layer neurons are temporarily dropped from the network at a given rate; when the network is used for inference, all neurons are restored.
Describing dropout in TF: tf.keras.layers.Dropout(drop rate)
For example:
Dropout(0.2), # drop rate of 0.2
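A quick sketch of the "temporarily drop during training, restore at inference" behavior: the same Dropout layer zeroes units only when training=True (scaling the rest by 1/(1-rate)); with training=False it is an identity mapping.
import tensorflow as tf
x = tf.ones([1, 10])
drop = tf.keras.layers.Dropout(0.2)
print(drop(x, training=True))   # roughly 20% of entries zeroed, the rest scaled to 1.25
print(drop(x, training=False))  # unchanged: all ones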
A convolutional neural network uses convolution kernels to extract features from the input, then feeds the extracted features into a fully connected network for recognition and prediction.
Convolution is simply CBAPD (Convolution, BN, Activation, Pooling, Dropout; see the figure above).
1. import the required modules
2. load the dataset and split it into train and test
3. define the network structure, class MyModel(Model):
4. model.compile(…)
5. checkpoint-based resuming, model saving, model.fit(…)
6. save the trainable parameters to a txt file
7. visualize acc/loss
(Note: checkpoint-based resuming, model saving, and saving the trainable parameters to a txt file can all be omitted.)
The cifar10 dataset is used for the demonstration below.
(cifar10 contains 50,000 32*32-pixel color images for training
and 10,000 32*32-pixel color images for testing.)
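An optional standalone check of the dataset shapes (a small sketch, not part of the original script):
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, y_train.shape)  # (50000, 32, 32, 3) (50000, 1)
print(x_test.shape, y_test.shape)    # (10000, 32, 32, 3) (10000, 1)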
import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model
np.set_printoptions(threshold=np.inf)
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
class Baseline(Model):
def __init__(self):
super(Baseline, self).__init__()
self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same') # convolution layer
self.b1 = BatchNormalization() # BN layer
self.a1 = Activation('relu') # activation layer
self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same') # pooling layer
self.d1 = Dropout(0.2) # dropout layer
self.flatten = Flatten()
self.f1 = Dense(128, activation='relu')
self.d2 = Dropout(0.2)
self.f2 = Dense(10, activation='softmax')
def call(self, x):
x = self.c1(x)
x = self.b1(x)
x = self.a1(x)
x = self.p1(x)
x = self.d1(x)
x = self.flatten(x)
x = self.f1(x)
x = self.d2(x)
y = self.f2(x)
return y
model = Baseline()
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
metrics=['sparse_categorical_accuracy'])
# checkpoint-based resuming and model saving
checkpoint_save_path = "./checkpoint/Baseline.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
print('-------------load the model-----------------')
model.load_weights(checkpoint_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
save_weights_only=True,
save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
callbacks=[cp_callback])
model.summary()
# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
file.write(str(v.name) + '\n')
file.write(str(v.shape) + '\n')
file.write(str(v.numpy()) + '\n')
file.close()
############################################### show ###############################################
# plot the training and validation acc and loss curves
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
Running the script twice shows that, because checkpoint-based resuming is used, the second run continues training on top of the first.
LeNet network structure:
Just replace the self-defined network class of the CNN above with the following:
class LeNet5(Model):
def __init__(self):
super(LeNet5, self).__init__()
self.c1 = Conv2D(filters=6, kernel_size=(5, 5),
activation='sigmoid')
self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
self.c2 = Conv2D(filters=16, kernel_size=(5, 5),
activation='sigmoid')
self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
self.flatten = Flatten()
self.f1 = Dense(120, activation='sigmoid')
self.f2 = Dense(84, activation='sigmoid')
self.f3 = Dense(10, activation='softmax')
def call(self, x):
x = self.c1(x)
x = self.p1(x)
x = self.c2(x)
x = self.p2(x)
x = self.flatten(x)
x = self.f1(x)
x = self.f2(x)
y = self.f3(x)
return y
model = LeNet5()
Everything else stays the same.
class AlexNet8(Model):
def __init__(self):
super(AlexNet8, self).__init__()
self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
self.b1 = BatchNormalization()
self.a1 = Activation('relu')
self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)
self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
self.b2 = BatchNormalization()
self.a2 = Activation('relu')
self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)
self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
activation='relu')
self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
activation='relu')
self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same',
activation='relu')
self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)
self.flatten = Flatten()
self.f1 = Dense(2048, activation='relu')
self.d1 = Dropout(0.5)
self.f2 = Dense(2048, activation='relu')
self.d2 = Dropout(0.5)
self.f3 = Dense(10, activation='softmax')
def call(self, x):
x = self.c1(x)
x = self.b1(x)
x = self.a1(x)
x = self.p1(x)
x = self.c2(x)
x = self.b2(x)
x = self.a2(x)
x = self.p2(x)
x = self.c3(x)
x = self.c4(x)
x = self.c5(x)
x = self.p3(x)
x = self.flatten(x)
x = self.f1(x)
x = self.d1(x)
x = self.f2(x)
x = self.d2(x)
y = self.f3(x)
return y
model = AlexNet8()
class VGG16(Model):
def __init__(self):
super(VGG16, self).__init__()
self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same') # convolution layer
self.b1 = BatchNormalization() # BN layer
self.a1 = Activation('relu') # activation layer
self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
self.b2 = BatchNormalization() # BN layer
self.a2 = Activation('relu') # activation layer
self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
self.d1 = Dropout(0.2) # dropout layer
self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
self.b3 = BatchNormalization() # BN layer
self.a3 = Activation('relu') # activation layer
self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
self.b4 = BatchNormalization() # BN layer
self.a4 = Activation('relu') # activation layer
self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
self.d2 = Dropout(0.2) # dropout layer
self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
self.b5 = BatchNormalization() # BN layer
self.a5 = Activation('relu') # activation layer
self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
self.b6 = BatchNormalization() # BN layer
self.a6 = Activation('relu') # activation layer
self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
self.b7 = BatchNormalization()
self.a7 = Activation('relu')
self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
self.d3 = Dropout(0.2)
self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b8 = BatchNormalization() # BN layer
self.a8 = Activation('relu') # activation layer
self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b9 = BatchNormalization() # BN layer
self.a9 = Activation('relu') # activation layer
self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b10 = BatchNormalization()
self.a10 = Activation('relu')
self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
self.d4 = Dropout(0.2)
self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b11 = BatchNormalization() # BN layer
self.a11 = Activation('relu') # activation layer
self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b12 = BatchNormalization() # BN layer
self.a12 = Activation('relu') # activation layer
self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
self.b13 = BatchNormalization()
self.a13 = Activation('relu')
self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
self.d5 = Dropout(0.2)
self.flatten = Flatten()
self.f1 = Dense(512, activation='relu')
self.d6 = Dropout(0.2)
self.f2 = Dense(512, activation='relu')
self.d7 = Dropout(0.2)
self.f3 = Dense(10, activation='softmax')
def call(self, x):
x = self.c1(x)
x = self.b1(x)
x = self.a1(x)
x = self.c2(x)
x = self.b2(x)
x = self.a2(x)
x = self.p1(x)
x = self.d1(x)
x = self.c3(x)
x = self.b3(x)
x = self.a3(x)
x = self.c4(x)
x = self.b4(x)
x = self.a4(x)
x = self.p2(x)
x = self.d2(x)
x = self.c5(x)
x = self.b5(x)
x = self.a5(x)
x = self.c6(x)
x = self.b6(x)
x = self.a6(x)
x = self.c7(x)
x = self.b7(x)
x = self.a7(x)
x = self.p3(x)
x = self.d3(x)
x = self.c8(x)
x = self.b8(x)
x = self.a8(x)
x = self.c9(x)
x = self.b9(x)
x = self.a9(x)
x = self.c10(x)
x = self.b10(x)
x = self.a10(x)
x = self.p4(x)
x = self.d4(x)
x = self.c11(x)
x = self.b11(x)
x = self.a11(x)
x = self.c12(x)
x = self.b12(x)
x = self.a12(x)
x = self.c13(x)
x = self.b13(x)
x = self.a13(x)
x = self.p5(x)
x = self.d5(x)
x = self.flatten(x)
x = self.f1(x)
x = self.d6(x)
x = self.f2(x)
x = self.d7(x)
y = self.f3(x)
return y
model = VGG16()
As shown in the figure above, InceptionNet consists of one convolution layer + four Inception blocks + a pooling layer that averages over all channels + a Dense layer.
The four Inception blocks are grouped into two blocks of two: the upper two form one block and the lower two form the other.
Within each block, the first Inception block uses convolution stride 2,
and the second Inception block uses stride 1.
Structure of an Inception block:
The input passes through four parallel paths into the filter concatenation.
Every convolution layer inside the Inception block in the figure performs the same CBA operations (convolution, BN, relu activation),
so the CBA operations can be wrapped in a single class ConvBNRelu:
class ConvBNRelu(Model):
def __init__(self, ch, kernelsz=3, strides=1, padding='same'): # ch: number of kernels, kernelsz: kernel size, strides: stride
super(ConvBNRelu, self).__init__()
# wrap Conv2D, BatchNormalization and Activation into one Sequential model
self.model = tf.keras.models.Sequential([
Conv2D(ch, kernelsz, strides=strides, padding=padding),
BatchNormalization(),
Activation('relu')
])
def call(self, x):
x = self.model(x, training=False) # with training=False, BN normalizes with the mean and variance accumulated over training (moving statistics); with training=True it uses the current batch's mean and variance. training=False works better at inference time
return x
With ConvBNRelu in place, the Inception block is easy to build:
class InceptionBlk(Model):
def __init__(self, ch, strides=1):
super(InceptionBlk, self).__init__()
self.ch = ch
self.strides = strides
# c1 is the first branch of the Inception block: a single ConvBNRelu
self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
# c2_1 and c2_2 form the second branch: two ConvBNRelu layers
self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
# the fourth branch pools first, then convolves
self.p4_1 = MaxPool2D(3, strides=1, padding='same')
self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)
def call(self, x):
x1 = self.c1(x)
x2_1 = self.c2_1(x)
x2_2 = self.c2_2(x2_1)
x3_1 = self.c3_1(x)
x3_2 = self.c3_2(x3_1)
x4_1 = self.p4_1(x)
x4_2 = self.c4_2(x4_1)
# x1, x2_2, x3_2, x4_2 are the outputs of the four branches; tf.concat stacks them along the depth axis
x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
return x
With the Inception block defined, InceptionNet itself can be assembled:
class Inception10(Model):
def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
super(Inception10, self).__init__(**kwargs)
self.in_channels = init_ch
self.out_channels = init_ch
self.num_blocks = num_blocks
self.init_ch = init_ch
# the first layer is a plain CBA convolution layer, so ConvBNRelu is used directly
self.c1 = ConvBNRelu(init_ch)
self.blocks = tf.keras.models.Sequential()
# the outer loop iterates over the two block groups
for block_id in range(num_blocks):
# the inner loop iterates over the two Inception blocks inside a group: the first uses convolution stride 2, the second stride 1
for layer_id in range(2):
if layer_id == 0:
block = InceptionBlk(self.out_channels, strides=2)
else:
block = InceptionBlk(self.out_channels, strides=1)
self.blocks.add(block)
# enlarge out_channels per block
# the stride-2 first block halves the feature-map size, so the output depth is doubled to keep the amount of information carried through feature extraction roughly constant
self.out_channels *= 2
self.p1 = GlobalAveragePooling2D()
self.f1 = Dense(num_classes, activation='softmax')
def call(self, x):
x = self.c1(x)
x = self.blocks(x)
x = self.p1(x)
y = self.f1(x)
return y
model = Inception10(num_blocks=2, num_classes=10) # instantiate: 2 block groups, and this is a 10-class problem
Putting the three pieces of code above together gives the complete InceptionNet network structure.
If the GPU is powerful enough, batch_size can be raised from 32 to 128, 512, 1024, etc., feeding more data into the network per step and speeding up training.
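For example, only the batch_size argument of model.fit changes (128 is an assumed value chosen for a mid-range GPU):
history = model.fit(x_train, y_train, batch_size=128, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])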
For the four CNNs above, results improve as the network gets deeper, but past a certain depth adding more layers makes the model degrade, because later features lose the original form of the earlier features.
ResNet therefore takes the earlier feature x, skips it over two convolution layers, and adds it to their output F(x) to obtain H(x) = F(x) + x. This skip connection alleviates the degradation caused by stacking layers and lets us add many more layers.
(Note: the '+' in ResNet is an element-wise addition of feature maps, adding corresponding matrix entries,
whereas the '+' in InceptionNet stacks feature maps along the depth axis, increasing the number of channels.)
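A toy illustration of the two kinds of '+' (assumed shapes): element-wise addition keeps the shape, while depth-wise concatenation adds up the channel counts.
import tensorflow as tf
a = tf.random.normal([1, 4, 4, 16])
b = tf.random.normal([1, 4, 4, 16])
print((a + b).shape)                    # (1, 4, 4, 16): ResNet-style add, depth unchanged
print(tf.concat([a, b], axis=3).shape)  # (1, 4, 4, 32): Inception-style concat, depths stack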
A ResNet block comes in two forms:
in one, shown with a solid line in the figure below, the two convolutions leave the feature-map dimensions unchanged, so H(x) = F(x) + x can be formed directly;
in the other, shown with a dashed line, the two convolutions change the feature-map dimensions, so a 1*1 convolution is applied to x to adjust its dimensions (giving W(x)) so that it matches F(x).
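A small sketch of the dashed-line case (assumed shapes): when F(x) halves the spatial size and changes the depth, a 1*1 convolution with the same stride reshapes x so the two can be added.
import tensorflow as tf
x = tf.random.normal([1, 8, 8, 64])
fx = tf.keras.layers.Conv2D(128, 3, strides=2, padding='same')(x)  # F(x): (1, 4, 4, 128)
wx = tf.keras.layers.Conv2D(128, 1, strides=2, padding='same')(x)  # W(x): (1, 4, 4, 128)
print((fx + wx).shape)  # (1, 4, 4, 128)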
The figure below shows the ResNet structure:
ResNet consists of one convolution layer + 8 ResNet blocks + a pooling layer that averages over all channels + a Dense layer,
for a total of 18 layers.
Each ResNet block is one of the two forms above: solid line or dashed line.
Following the structure in the figure above, the ResNet block can be wrapped in a class ResnetBlock:
class ResnetBlock(Model):
def __init__(self, filters, strides=1, residual_path=False):
super(ResnetBlock, self).__init__()
self.filters = filters
self.strides = strides
self.residual_path = residual_path
self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
self.b1 = BatchNormalization()
self.a1 = Activation('relu')
self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
self.b2 = BatchNormalization()
# when residual_path is True, the input is downsampled with a 1x1 convolution so that x matches the dimensions of F(x) and the two can be added
# the following if block is skipped for the solid-line form and executed for the dashed-line form, making W(x) and F(x) dimensionally consistent
if residual_path:
self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
self.down_b1 = BatchNormalization()
self.a2 = Activation('relu')
def call(self, inputs):
residual = inputs # residual is the input itself, i.e. residual = x
# pass the input through convolution, BN and activation layers to compute F(x)
x = self.c1(inputs)
x = self.b1(x)
x = self.a1(x)
x = self.c2(x)
y = self.b2(x)
# the following if block is skipped for the solid-line form and executed for the dashed-line form, making W(x) and F(x) dimensionally consistent
if self.residual_path:
residual = self.down_c1(inputs)
residual = self.down_b1(residual)
out = self.a2(y + residual) # the output is the sum of the two parts, F(x) + x or F(x) + Wx, passed through the activation function
return out
Building on this, the ResNet network structure can be designed:
class ResNet18(Model):
def __init__(self, block_list, initial_filters=64): # block_list gives the number of ResNet blocks in each block group
super(ResNet18, self).__init__()
self.num_blocks = len(block_list) # number of block groups
self.block_list = block_list
self.out_filters = initial_filters
# the first convolution layer: CBA
self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
self.b1 = BatchNormalization()
self.a1 = Activation('relu')
self.blocks = tf.keras.models.Sequential()
# build the ResNet structure
# the outer loop runs once per entry of block_list, e.g. model = ResNet18([2, 2, 2, 2]) loops 4 times: 4 block groups (the orange blocks in the figure), i.e. 8 ResNet blocks in total
for block_id in range(len(block_list)): # which block group (orange block)
for layer_id in range(block_list[block_id]): # which ResNet block within the group
if block_id != 0 and layer_id == 0: # the first ResNet block of every group except the first is the dashed-line form: residual_path=True
block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
else: # all other ResNet blocks are the solid-line form: residual_path=False
block = ResnetBlock(self.out_filters, residual_path=False)
self.blocks.add(block) # add the finished block to the ResNet
self.out_filters *= 2 # the next block group uses twice as many kernels as the previous one
# global average pooling
self.p1 = tf.keras.layers.GlobalAveragePooling2D()
# Dense layer
self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())
def call(self, inputs):
x = self.c1(inputs)
x = self.b1(x)
x = self.a1(x)
x = self.blocks(x)
x = self.p1(x)
y = self.f1(x)
return y
model = ResNet18([2, 2, 2, 2])
To speed up convergence, batch_size can be raised to 128 for training.
LeNet: the pioneering work of convolutional networks; sharing convolution kernels reduces the number of network parameters.
AlexNet: uses the relu activation function to speed up training and Dropout to alleviate overfitting.
VGGNet: small convolution kernels reduce parameters; the regular network structure is well suited to hardware parallel acceleration.
InceptionNet: uses convolution kernels of different sizes within a single layer to improve perception, and BN to alleviate gradient vanishing.
ResNet: residual skip connections between layers bring forward earlier information, alleviating model degradation and making much deeper neural networks possible.