Original notebook:
https://github.com/Microsoft/AutonomousDrivingCookbook/blob/master/AirSimE2EDeepLearning/TrainModel.ipynb
Now that we have a feel for the data we are working with, we can start designing our model. In this notebook, we define the network architecture and train the model. We also discuss some transformations applied to the data in response to the observations made in the data exploration part of this tutorial.
Let's begin by importing some libraries and defining a few paths.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Lambda, Input, concatenate
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import ELU
from keras.optimizers import Adam, SGD, Adamax, Nadam
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, CSVLogger, EarlyStopping
import keras.backend as K
from keras.preprocessing import image
from keras_tqdm import TQDMNotebookCallback
import json
import os
import numpy as np
import pandas as pd
from Generator import DriveDataGenerator
from Cooking import checkAndCreateDir
import h5py
from PIL import Image, ImageDraw
import math
import matplotlib.pyplot as plt
# << The directory containing the cooked data from the previous step >>
COOKED_DATA_DIR = 'data_cooked/'
# << The directory in which the model output will be placed >>
MODEL_OUTPUT_DIR = 'model'
Using TensorFlow backend.
Let's read in the datasets produced during the exploration phase. If they don't exist, run the snippets from the previous notebook to generate them.
train_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'train.h5'), 'r')
eval_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'eval.h5'), 'r')
test_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'test.h5'), 'r')
num_train_examples = train_dataset['image'].shape[0]
num_eval_examples = eval_dataset['image'].shape[0]
num_test_examples = test_dataset['image'].shape[0]
batch_size=32
For image data, loading the entire dataset into memory would be far too expensive. Fortunately, Keras has the concept of a DataGenerator. A DataGenerator is nothing more than an iterator that reads data from disk in chunks. This keeps the CPU and GPU busy in parallel, increasing throughput.
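To make that idea concrete, here is a minimal sketch of such a generator (purely illustrative and written by us for this translation; the DriveDataGenerator used below does considerably more, such as augmentation and label handling):

# Minimal illustrative sketch of a batch generator over an h5py dataset.
# This is NOT the DriveDataGenerator from Generator.py -- just the core idea:
# read a chunk from disk, normalize it, and hand it to the training loop.
def simple_h5_batch_generator(dataset, batch_size=32):
    num_examples = dataset['image'].shape[0]
    while True:  # Keras expects generators to loop indefinitely
        for start in range(0, num_examples - batch_size + 1, batch_size):
            images = dataset['image'][start:start + batch_size]  # reads only this slice from disk
            labels = dataset['label'][start:start + batch_size]
            yield images / 255.0, labels  # scale pixels to [0, 1] on the fly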
We made a few observations during the exploration phase. Now, let's come up with a strategy to incorporate them into our training algorithm.
While Keras has some standard built-in image transformations, they are not sufficient for our purposes. For example, when using horizontal_flip = True in the standard ImageDataGenerator, the sign of the label is not inverted. Fortunately, we can simply extend the ImageDataGenerator class and implement our own transformation logic. That is done in Generator.py — it is straightforward, but too long to include in this notebook. The core idea is sketched below.
Here, we initialize the generator with the following parameters:
Thought Exercise 1.1: Try playing around with these parameters to see if you can get better results.
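The label-aware flipping logic looks roughly like this (an illustrative sketch under our own assumptions, not the actual code from Generator.py):

import numpy as np

# Illustrative sketch of the key transformation in Generator.py: when the
# image is mirrored horizontally, the steering label must be negated as well.
def random_horizontal_flip(image, label, flip_probability=0.5):
    if np.random.random() < flip_probability:
        image = image[:, ::-1, :]  # flip along the width axis
        label = -label             # a left turn becomes a right turn
    return image, label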
data_generator = DriveDataGenerator(rescale=1./255., horizontal_flip=True, brighten_range=0.4)
train_generator = data_generator.flow\
(train_dataset['image'], train_dataset['previous_state'], train_dataset['label'], batch_size=batch_size, zero_drop_percentage=0.95, roi=[76,135,0,255])
eval_generator = data_generator.flow\
(eval_dataset['image'], eval_dataset['previous_state'], eval_dataset['label'], batch_size=batch_size, zero_drop_percentage=0.95, roi=[76,135,0,255])
Let's look at a sample batch. The steering angle is represented by the red line in each image:
def draw_image_with_label(img, label, prediction=None):
    theta = label * 0.69 # Steering range for the car is +- 40 degrees -> 0.69 radians
    line_length = 50
    line_thickness = 3
    label_line_color = (255, 0, 0)
    prediction_line_color = (0, 0, 255)
    pil_image = image.array_to_img(img, K.image_data_format(), scale=True)
    print('Actual Steering Angle = {0}'.format(label))
    draw_image = pil_image.copy()
    image_draw = ImageDraw.Draw(draw_image)
    first_point = (int(img.shape[1]/2), img.shape[0])
    second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))), int(img.shape[0] - (line_length * math.cos(theta))))
    image_draw.line([first_point, second_point], fill=label_line_color, width=line_thickness)

    if (prediction is not None):
        print('Predicted Steering Angle = {0}'.format(prediction))
        print('L1 Error: {0}'.format(abs(prediction - label)))
        theta = prediction * 0.69
        second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))), int(img.shape[0] - (line_length * math.cos(theta))))
        image_draw.line([first_point, second_point], fill=prediction_line_color, width=line_thickness)

    del image_draw
    plt.imshow(draw_image)
    plt.show()
[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i])
Actual Steering Angle = [-0.28374567]
Actual Steering Angle = [-0.03775833]
Actual Steering Angle = [ 0.12664133]
Next, let's define the network architecture. We will use a standard combination of convolutional / max pooling layers to process the images (we can't go into detail here about what each of these layers does, but if you don't understand what's going on, definitely check out the book mentioned in the README). Then, we inject the last known state of the vehicle into the dense layers as an additional feature. The layer sizes and optimization parameters were determined experimentally — try tweaking them and see what happens!
image_input_shape = sample_batch_train_data[0].shape[1:]
state_input_shape = sample_batch_train_data[1].shape[1:]
activation = 'relu'
#Create the convolutional stacks
pic_input = Input(shape=image_input_shape)
img_stack = Conv2D(16, (3, 3), name="convolution0", padding='same', activation=activation)(pic_input)
img_stack = MaxPooling2D(pool_size=(2,2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution1')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution2')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Flatten()(img_stack)
img_stack = Dropout(0.2)(img_stack)
#Inject the state input
state_input = Input(shape=state_input_shape)
merged = concatenate([img_stack, state_input])
# Add a few dense layers to finish the model
merged = Dense(64, activation=activation, name='dense0')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(10, activation=activation, name='dense2')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(1, name='output')(merged)
adam = Nadam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model = Model(inputs=[pic_input, state_input], outputs=merged)
model.compile(optimizer=adam, loss='mse')
Let's take a look at a summary of the model:
model.summary()
Layer (type) | Output Shape | Param # | Connected to |
---|---|---|---|
input_1 (InputLayer) | (None, 59, 255, 3) | 0 | |
convolution0 (Conv2D) | (None, 59, 255, 16) | 488 | input_1[0][0] |
max_pooling2d_1 (MaxPooling2D) | (None, 29, 127, 16) | 0 | convolution0[0][0] |
convolution1 (Conv2D) | (None, 29, 127, 32) | 4640 | max_pooling2d_1[0][0] |
max_pooling2d_2 (MaxPooling2D) | (None, 14, 63, 32) | 0 | convolution1[0][0] |
convolution2 (Conv2D) | (None, 14, 63, 32) | 9248 | max_pooling2d_2[0][0] |
max_pooling2d_3 (MaxPooling2D) | (None, 7, 31, 32) | 0 | convolution2[0][0] |
flatten_1 (Flatten) | (None, 6944) | 0 | max_pooling2d_3[0][0] |
dropout_1 (Dropout) | (None, 6944) | 0 | flatten_1[0][0] |
input_2 (InputLayer) | (None, 4) | 0 | |
concatenate_1 (Concatenate) | (None, 6948) | 0 | dropout_1[0][0] input_2[0][0] |
dense0 (Dense) | (None, 64) | 444736 | concatenate_1[0][0] |
dropout_2 (Dropout) | (None, 64) | 0 | dense0[0][0] |
dense2 (Dense) | (None, 10) | 650 | dropout_2[0][0] |
dropout_3 (Dropout) | (None, 10) | 0 | dense2[0][0] |
output (Dense) | (None, 1) | 11 | dropout_3[0][0] |
Total params: 459,733
Trainable params: 459,733
Non-trainable params: 0
That's a lot of parameters! Fortunately, we have our data augmentation strategy, so the network has a chance of converging. Try adding or removing some layers, or changing their widths, and see what effect this has on the number of trainable parameters in the network.
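As a quick illustration of such an experiment (a hypothetical variant of ours, not part of the tutorial), you can build an alternative stack and compare parameter counts:

# Hypothetical variant: a single, wider convolutional layer. Comparing
# count_params() shows how architecture changes affect model size.
variant_stack = Conv2D(32, (3, 3), padding='same', activation=activation)(pic_input)
variant_stack = MaxPooling2D(pool_size=(2, 2))(variant_stack)
variant_stack = Flatten()(variant_stack)
variant_output = Dense(1)(variant_stack)
variant_model = Model(inputs=[pic_input], outputs=variant_output)
print('Original model parameters: {0}'.format(model.count_params()))
print('Variant model parameters:  {0}'.format(variant_model.count_params()))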
One of Keras' great features is its support for callbacks. These are functions that get executed after each training epoch. Let's define a few:
plateau_callback = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=0.0001, verbose=1)
checkpoint_filepath = os.path.join(MODEL_OUTPUT_DIR, 'models', '{0}_model.{1}-{2}.h5'.format('model', '{epoch:02d}', '{val_loss:.7f}'))
checkAndCreateDir(checkpoint_filepath)
checkpoint_callback = ModelCheckpoint(checkpoint_filepath, save_best_only=True, verbose=1)
csv_callback = CSVLogger(os.path.join(MODEL_OUTPUT_DIR, 'training_log.csv'))
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=10, verbose=1)
callbacks=[plateau_callback, csv_callback, checkpoint_callback, early_stopping_callback, TQDMNotebookCallback()]
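You can also write your own callback by subclassing keras.callbacks.Callback — a minimal illustrative sketch of ours, not used in this tutorial:

from keras.callbacks import Callback

# Minimal custom callback: Keras calls on_epoch_end after every epoch with
# the epoch index and a dict of the metrics computed so far.
class PrintValLossCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print('Epoch {0}: validation loss = {1}'.format(epoch, logs.get('val_loss')))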
Now it's time to train the model! With the default settings, training takes roughly 45 minutes on an NVidia GTX970 GPU. Note: the model will sometimes get stuck with a constant validation loss for up to 7 epochs at a time. If left to run, it should terminate with a validation loss of approximately 0.0003.
history = model.fit_generator(train_generator, steps_per_epoch=num_train_examples//batch_size, epochs=500, callbacks=callbacks,\
validation_data=eval_generator, validation_steps=num_eval_examples//batch_size, verbose=2)
Now let's do a quick sanity check. We'll load a few training images and compare the labels against the predictions. If our model has learned properly, these values should be very close to each other.
[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
predictions = model.predict([sample_batch_train_data[0], sample_batch_train_data[1]])
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i], predictions[i])
Actual Steering Angle = [-0.035522]
Predicted Steering Angle = [-0.0003692]
L1 Error: [ 0.0351528]
Actual Steering Angle = [ 0.12993667]
Predicted Steering Angle = [-0.0003692]
L1 Error: [ 0.13030587]
Actual Steering Angle = [-0.09872733]
Predicted Steering Angle = [-0.0003692]
L1 Error: [ 0.09835813]
Looks good! In the next notebook, we'll run this model to drive the car in AirSim.
GitHub project:
https://github.com/Microsoft/AutonomousDrivingCookbook