Parking Lot Space Recognition with OpenCV (Part 2): Model Training with the Keras Framework and VGGNet

Parking-lot space recognition: training the network model on the Keras framework.

The theory and source code for this project come from Tang Yudi's OpenCV hands-on project course.

In the previous part we applied a series of image-processing operations to parking-lot frames captured from the video and generated a dataset of space images in their different states. Next we use this dataset to train a deep-learning model whose goal is, given an image of the parking lot, to identify how many spaces are currently empty and where those spaces are.

The code first:
train.py

import os
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping
from keras.layers import Flatten, Dense

files_train = 0
files_validation = 0

cwd = os.getcwd()
folder = 'train_data/train'
for sub_folder in os.listdir(folder):
    path, dirs, files = next(os.walk(os.path.join(folder,sub_folder)))
    files_train += len(files)


folder = 'train_data/test'
for sub_folder in os.listdir(folder):
    path, dirs, files = next(os.walk(os.path.join(folder,sub_folder)))
    files_validation += len(files)
# Count the training and validation samples on disk
print(files_train, files_validation)
img_width, img_height = 48, 48
train_data_dir = "train_data/train"
validation_data_dir = "train_data/test"

nb_train_samples = files_train
nb_validation_samples = files_validation
batch_size = 32
epochs = 15
num_classes = 2
model = applications.VGG16(weights='imagenet', include_top=False, input_shape = (img_width, img_height, 3))

# freeze the first 10 layers of VGG16 (input through block3_conv3)
for layer in model.layers[:10]:
    layer.trainable = False


# new classification head: flatten the VGG features, then a 2-way softmax
x = model.output
x = Flatten()(x)
predictions = Dense(num_classes, activation="softmax")(x)


model_final = Model(inputs=model.input, outputs=predictions)

# Compile: define the loss function, optimizer, and metrics
model_final.compile(loss="categorical_crossentropy",
                    optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
                    metrics=["accuracy"]) 


train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)

# Note: the validation data is augmented the same way here;
# typically only the rescale step is needed for validation.
test_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")
# Callbacks: checkpoint the best model so far and stop early if val_acc stalls
checkpoint = ModelCheckpoint("car1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=10, verbose=1, mode='auto')


history_object = model_final.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,        # batches per epoch
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,  # batches per validation pass
    callbacks=[checkpoint, early])

Data preprocessing is required before training a neural network; it was the main content of the previous part, so here we only revisit the image-preprocessing step within the classification training pipeline.

1. Image preprocessing

The dataset we obtained consists of image files. Preprocessing here means converting those images into arrays the network can read and feeding them into the neural network to train the model.
See the official Keras image preprocessing documentation.
Taking the preprocessing of the training data in the program as an example:
ImageDataGenerator generates batches of tensor image data with real-time data augmentation; the data is looped over in batches.

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")

rescale is a rescaling factor (default None, i.e., no rescaling); here every pixel value is multiplied by 1/255, and rescaling is applied before any other transformation.
horizontal_flip (bool) randomly flips inputs horizontally.
fill_mode sets how points outside the image boundary are filled (default 'nearest'); the available modes, with a small numpy demonstration after the list:

  • ‘constant’: kkkkkkkk|abcd|kkkkkkkk (cval=k)
  • ‘nearest’: aaaaaaaa|abcd|dddddddd
  • ‘reflect’: abcddcba|abcd|dcbaabcd
  • ‘wrap’: abcdabcd|abcd|abcdabcd
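
These patterns can be reproduced with numpy's padding modes; note the name mismatch (Keras 'nearest' corresponds to np.pad 'edge', and Keras 'reflect' to np.pad 'symmetric'):

# Reproduce the fill patterns above with numpy padding.
import numpy as np

a = np.array([1, 2, 3, 4])                               # "abcd"
print(np.pad(a, 4, mode='constant', constant_values=9))  # kkkk|abcd|kkkk
print(np.pad(a, 4, mode='edge'))                         # aaaa|abcd|dddd
print(np.pad(a, 4, mode='symmetric'))                    # dcba|abcd|dcba
print(np.pad(a, 4, mode='wrap'))                         # abcd|abcd|abcd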

zoom_range is the range for random zoom; with a float, as here, [lower, upper] = [1 - 0.1, 1 + 0.1].
width_shift_range is the horizontal shift range; a float, as here, is a fraction of the total width.
height_shift_range is the vertical shift range, a fraction of the total height.
rotation_range is the degree range for random rotations.
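
To see the combined effect of these settings, here is a minimal sketch (assuming the train_data/train layout from train.py) that pulls one augmented batch from the generator and displays it:

# Visualize one augmented batch produced by the generator.
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)

gen = datagen.flow_from_directory(
    'train_data/train', target_size=(48, 48),
    batch_size=9, class_mode="categorical")

images, labels = next(gen)      # one batch of augmented images
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(images[i])       # pixels already rescaled to [0, 1]
    plt.axis('off')
plt.show()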

2. Training the network model

batch_size = 32
epochs = 15
num_classes = 2
model = applications.VGG16(weights='imagenet', include_top=False, input_shape = (img_width, img_height, 3))

Each batch feeds in 32 images, training runs for 15 epochs, and the new final layer has two nodes, i.e., two classes, which in this project are the two parking-space states (empty and occupied).
The base model is VGG16 with pretrained ImageNet weights; include_top=False drops its original classifier head, and the loop over model.layers[:10] freezes the first 10 layers so only the later blocks and the new head are fine-tuned.
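
To confirm exactly which layers that freeze loop touched, a quick sketch that prints each layer's trainable flag:

# List frozen vs. trainable layers after the freeze loop in train.py.
from keras import applications

model = applications.VGG16(weights='imagenet', include_top=False,
                           input_shape=(48, 48, 3))
for layer in model.layers[:10]:
    layer.trainable = False

for i, layer in enumerate(model.layers):
    print(i, layer.name, 'trainable =', layer.trainable)
# Indices 0-9 (the input layer through block3_conv3) print False;
# block3_pool onward stays trainable and is fine-tuned.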

Then compile (loss function, optimizer, metric):

model_final.compile(loss="categorical_crossentropy",
                    optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
                    metrics=["accuracy"]) 

Then train:

history_object = model_final.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[checkpoint, early])

The ModelCheckpoint callback saves the best model to car1.h5 under the working directory, and with that the training task is complete.
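
Loading the saved model and classifying a single space crop might look like the sketch below; 'spot.jpg' is a hypothetical 48x48 crop of one space, and the class order (empty = 0) follows flow_from_directory's alphabetical folder ordering:

# Classify a single parking-space crop with the saved model.
import numpy as np
from keras.models import load_model
from keras.preprocessing import image

model = load_model('car1.h5')
img = image.load_img('spot.jpg', target_size=(48, 48))  # hypothetical crop
x = image.img_to_array(img) / 255.0   # same rescaling as training
x = np.expand_dims(x, axis=0)         # shape (1, 48, 48, 3)
probs = model.predict(x)[0]           # [p_empty, p_occupied]
print({0: 'empty', 1: 'occupied'}[int(np.argmax(probs))], probs)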

3. Testing

park_test.py

from __future__ import division
import matplotlib.pyplot as plt
import cv2
import os, glob
import numpy as np
from PIL import Image
from keras.applications.imagenet_utils import preprocess_input
from keras.models import load_model
from keras.preprocessing import image
from Parking import Parking
import pickle
cwd = os.getcwd()


def img_process(test_images, park):
    # Run the part-1 pipeline: color filtering, grayscale, edge detection,
    # ROI selection, Hough line detection, then block and spot extraction.
    white_yellow_images = list(map(park.select_rgb_white_yellow, test_images))
    park.show_images(white_yellow_images)
    
    gray_images = list(map(park.convert_gray_scale, white_yellow_images))
    park.show_images(gray_images)
    
    edge_images = list(map(lambda image: park.detect_edges(image), gray_images))
    park.show_images(edge_images)
    
    roi_images = list(map(park.select_region, edge_images))
    park.show_images(roi_images)
    
    list_of_lines = list(map(park.hough_lines, roi_images))
    
    line_images = []
    for image, lines in zip(test_images, list_of_lines):
        line_images.append(park.draw_lines(image, lines)) 
    park.show_images(line_images)
    
    rect_images = []
    rect_coords = []
    for image, lines in zip(test_images, list_of_lines):
        new_image, rects = park.identify_blocks(image, lines)
        rect_images.append(new_image)
        rect_coords.append(rects)
        
    park.show_images(rect_images)
    
    delineated = []
    spot_pos = []
    for image, rects in zip(test_images, rect_coords):
        new_image, spot_dict = park.draw_parking(image, rects)
        delineated.append(new_image)
        spot_pos.append(spot_dict)
        
    park.show_images(delineated)
    final_spot_dict = spot_pos[1]   # keep the spot map built from the second test image
    print(len(final_spot_dict))

    with open('spot_dict.pickle', 'wb') as handle:
        pickle.dump(final_spot_dict, handle, protocol=pickle.HIGHEST_PROTOCOL)
    
    park.save_images_for_cnn(test_images[0],final_spot_dict)
    
    return final_spot_dict


def keras_model(weights_path):
    # load the full model (architecture + weights) saved by ModelCheckpoint
    model = load_model(weights_path)
    return model


def img_test(test_images, final_spot_dict, model, class_dictionary):
    # classify every mapped spot in each still image and draw the result
    for i in range(len(test_images)):
        predicted_images = park.predict_on_image(test_images[i], final_spot_dict, model, class_dictionary)


def video_test(video_name, final_spot_dict, model, class_dictionary):
    # run spot-by-spot prediction frame by frame on the video
    park.predict_on_video(video_name, final_spot_dict, model, class_dictionary, ret=True)
    

if __name__ == '__main__':
    test_images = [plt.imread(path) for path in glob.glob('test_images/*.jpg')]
    weights_path = 'car1.h5'
    video_name = 'parking_video.mp4'
    class_dictionary = {0: 'empty', 1: 'occupied'}
    park = Parking()
    park.show_images(test_images)
    final_spot_dict = img_process(test_images,park)
    model = keras_model(weights_path)
    img_test(test_images,final_spot_dict,model,class_dictionary)
    video_test(video_name,final_spot_dict,model,class_dictionary)
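
Parking here is the author's helper class from Parking.py, which is not shown. For orientation only, a hypothetical sketch of what its per-spot prediction step might look like, assuming final_spot_dict maps (x1, y1, x2, y2) rectangles to spot ids:

# Hypothetical sketch of a per-spot prediction step (the real logic lives
# in the author's Parking.predict_on_image).
import cv2
import numpy as np

def predict_spots_sketch(img, spot_dict, model, class_dictionary):
    overlay = img.copy()
    empty_count = 0
    for (x1, y1, x2, y2) in spot_dict.keys():
        # crop one space, resize to the network's 48x48 input, rescale
        spot = img[int(y1):int(y2), int(x1):int(x2)]
        spot = cv2.resize(spot, (48, 48)) / 255.0
        probs = model.predict(np.expand_dims(spot, axis=0))[0]
        if class_dictionary[int(np.argmax(probs))] == 'empty':
            empty_count += 1
            cv2.rectangle(overlay, (int(x1), int(y1)), (int(x2), int(y2)),
                          (0, 255, 0), 2)   # outline empty spots in green
    print('empty spots:', empty_count, 'of', len(spot_dict))
    return overlay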

Test results:

4. Keras's built-in pretrained network: VGG

A pretrained network is a saved set of parameters from a network previously trained on a large dataset, usually for an image-classification task, which we reuse for our small-dataset task.
If the original dataset is large and general enough, the spatial hierarchy of features learned during pretraining transfers effectively to our small model, which makes these features useful for many computer-vision problems, even ones involving completely different classes from the original task.
The VGG network structure (an excerpt from the Keras applications source):

    # Block 1
    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv1')(img_input)
    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv1')(x)
    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv1')(x)
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv2')(x)
    x = layers.Conv2D(256, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block3_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv1')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv2')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block4_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv1')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv2')(x)
    x = layers.Conv2D(512, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block5_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = layers.Flatten(name='flatten')(x)
        x = layers.Dense(4096, activation='relu', name='fc1')(x)
        x = layers.Dense(4096, activation='relu', name='fc2')(x)
        x = layers.Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = layers.GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = layers.GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = keras_utils.get_source_inputs(input_tensor)
    else:
        inputs = img_input
    # Create model.
    model = models.Model(inputs, x, name='vgg16')
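
The layer table below (for the full network, include_top=True) can be reproduced with model.summary():

# Print the layer-by-layer summary (weights=None skips the download).
from keras import applications

vgg = applications.VGG16(weights=None, include_top=True,
                         input_shape=(224, 224, 3))
vgg.summary()
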
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
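
As a sanity check, each Conv2D count in the table equals kernel_h * kernel_w * in_channels * out_channels + out_channels, and each Dense count equals inputs * outputs + outputs:

# Verify a few parameter counts from the table above.
print(3 * 3 * 3 * 64 + 64)        # block1_conv1: 1792
print(3 * 3 * 64 * 64 + 64)       # block1_conv2: 36928
print(7 * 7 * 512 * 4096 + 4096)  # fc1: 102764544
print(4096 * 1000 + 1000)         # predictions: 4097000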

This completes the parking-space recognition task.
Thanks to the blogs that helped me along the way; your hard work benefited me greatly.

https://blog.csdn.net/weixin_44510615/article/details/89237472

