When learning deep learning, compute resources are often limited and training sets small, yet we still want good, stable results. Pretrained models such as AlexNet, GoogLeNet, VGGNet, and ResNet can help a great deal here. So how do we fine-tune one of these pretrained models to improve accuracy? In this post we use a pretrained ResNet50 model, trained on the ImageNet dataset, where it classifies 1000 object categories.
The steps are as follows:
1. Download the ResNet50 weights without the fully connected top layers (resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5);
2. Define the ResNet50 network architecture;
3. Load the pretrained weights into the network we defined;
4. Replace the fully connected top with one suited to our own classification task;
5. Optionally, unfreeze the last few blocks and continue training with a very low learning rate (see the sketch after this list). We train only the last block because our training set is small while ResNet50 is very deep; training all layers would almost certainly overfit. Moreover, since fine-tuning starts from an already trained model, weight updates should stay small so as not to destroy the pretrained features.
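To make step 5 concrete, here is a minimal Keras sketch of the two-phase recipe. It assumes `base_model` (the convolutional base) and `model` (base plus the new softmax head) as assembled later in this post; the epoch counts and learning rates are illustrative choices, not values from the original post:

from keras.optimizers import Adam

# Phase 1: freeze the entire pretrained base and train only the new head.
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer=Adam(lr=1e-3), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=5, batch_size=32, validation_data=(X_val, Y_val))

# Phase 2: unfreeze only stage 5 (layers named 'res5*'/'bn5*') and fine-tune
# with a much lower learning rate; recompile so the change takes effect.
for layer in base_model.layers:
    layer.trainable = layer.name.startswith(('res5', 'bn5'))
model.compile(optimizer=Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=5, batch_size=32, validation_data=(X_val, Y_val))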
Step 1: Download the pretrained weights
Weights file: resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5, available from the fchollet/deep-learning-models releases on GitHub.
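If you prefer to fetch the file from code, Keras's `get_file` utility downloads and caches it. The URL below points at the fchollet/deep-learning-models v0.2 release, where this weights file is hosted; verify it before relying on it:

from keras.utils.data_utils import get_file

WEIGHTS_PATH_NO_TOP = ('https://github.com/fchollet/deep-learning-models/'
                       'releases/download/v0.2/'
                       'resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5')

# Downloads once and caches the file under ~/.keras/models/
weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
                        WEIGHTS_PATH_NO_TOP,
                        cache_subdir='models')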
Step 2: Define the ResNet50 network architecture
# Keras imports used throughout
from keras.layers import (Input, Add, Dense, Activation, ZeroPadding2D,
                          BatchNormalization, Flatten, Conv2D, MaxPooling2D,
                          AveragePooling2D)
from keras.models import Model
from keras.callbacks import EarlyStopping


def identity_block(X, f, filters, stage, block):
    # Name bases match the layer names of the official Keras ResNet50,
    # so the pretrained weights map onto these layers
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve filters
    F1, F2, F3 = filters

    # Save the input value; it is added back to the main path later
    X_shortcut = X

    # First component of main path
    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component of main path
    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    # Final step: add the shortcut to the main path and apply ReLU
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)

    return X
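The identity block leaves both the spatial size and the channel count unchanged, which is why the input can be added back without any projection. A quick shape check (a hypothetical smoke test, not from the original post):

X_in = Input((56, 56, 256))
X_out = identity_block(X_in, f=3, filters=[64, 64, 256], stage=2, block='x')
print(Model(X_in, X_out).output_shape)  # (None, 56, 56, 256): shape preserved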
def convolutional_block(X, f, filters, stage, block, s=2):
    # Defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve filters
    F1, F2, F3 = filters

    # Save the input value
    X_shortcut = X

    ##### MAIN PATH #####
    # First component of main path (stride s performs the downsampling)
    X = Conv2D(F1, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '2a')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component of main path
    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(F3, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c')(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    ##### SHORTCUT PATH #####
    # 1x1 convolution so the shortcut matches the main path's output shape
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '1')(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    # Final step: add the shortcut to the main path and apply ReLU
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)

    return X
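Unlike the identity block, the convolutional block changes the channel count (and, when s > 1, the spatial size), so the shortcut needs its own 1x1 convolution to project the input to a matching shape before the addition. A hypothetical shape check:

X_in = Input((56, 56, 256))
X_out = convolutional_block(X_in, f=3, filters=[128, 128, 512], stage=3, block='y', s=2)
print(Model(X_in, X_out).output_shape)  # (None, 28, 28, 512): downsampled and widened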
def ResNet50(input_shape=(64, 64, 3), classes=30):
    # `classes` is unused here: the classifier head is added separately
    # after loading the pretrained (notop) weights

    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Zero-padding
    X = ZeroPadding2D((3, 3))(X_input)

    # Stage 1
    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(X)
    X = BatchNormalization(axis=3, name='bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')

    # Stage 3
    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='b')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='c')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='d')

    # Stage 4
    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f')

    # Stage 5
    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b')
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c')

    # Average pooling. Note: with a 224x224 input the feature map here is
    # 7x7, so the official Keras ResNet50 pools with (7, 7); (2, 2) also
    # works since pooling has no weights to load.
    X = AveragePooling2D((2, 2), strides=(2, 2))(X)

    # Output: flatten the pooled features; the classifier head comes later
    X = Flatten()(X)

    model = Model(inputs=X_input, outputs=X, name='ResNet50')

    return model
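Before loading weights it is worth sanity-checking the architecture; a quick, hypothetical check:

m = ResNet50(input_shape=(224, 224, 3))
m.summary()  # inspect layer names and shapes; names must match the weight file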
Step 3: Load the pretrained weights, replace the top, and train

base_model = ResNet50(input_shape=(224, 224, 3), classes=30)
# by_name=True matches layers by name, which is safer when the top differs
base_model.load_weights('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5', by_name=True)

# New classifier head for our 30-class task
X = base_model.output
predictions = Dense(30, activation='softmax')(X)
model = Model(inputs=base_model.input, outputs=predictions)

model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Stop early once the validation loss stops improving
es = EarlyStopping(monitor='val_loss', patience=1)
model.fit(x=X_train, y=Y_train, epochs=20, batch_size=32, validation_data=(X_val, Y_val), callbacks=[es])
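X_train/Y_train and X_val/Y_val are assumed to be preprocessed arrays; the post does not show how they are built. One plausible preparation using Keras utilities, where `image_paths` and `labels` (integers 0..29) are hypothetical names:

import numpy as np
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.resnet50 import preprocess_input
from keras.utils import to_categorical

# Load each image at the model's input size and stack into one array
X_train = np.stack([img_to_array(load_img(p, target_size=(224, 224)))
                    for p in image_paths])
X_train = preprocess_input(X_train)  # same preprocessing as ImageNet training
Y_train = to_categorical(labels, num_classes=30)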