Udacity Self-Driving Car Engineer: P2 Traffic Sign Recognition

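This is project P2, traffic sign recognition. It uses a simple convolutional network: 2 convolutional layers followed by 4 fully connected layers. The images in the dataset are only 32x32, so a very complex network is not needed.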

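The data can be downloaded here: https://s3-us-west-1.amazonaws.com/udacity-selfdrivingcar/traffic-signs-data.zip

You can paste the link straight into Xunlei (迅雷) to download it.

After unzipping you get 3 files: test.p, train.p, and valid.p.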

I reached 93% accuracy this time. Someone online reached 98%; his GitHub repository is worth a look: https://github.com/kenshiro-o/CarND-Traffic-Sign-Classifier-Project

Now for my code.

First, load the data.

# Load pickled data
import pickle

# TODO: Fill this in based on where you saved the training and testing data

training_file = 'train.p'
validation_file='valid.p'
testing_file = 'test.p'

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(validation_file, mode='rb') as f:
    valid = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']
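
Before analyzing the data, it can help to confirm what a pickle file actually holds. The sketch below is only an illustration: `summarize_pickle` is a hypothetical helper (not part of the project template), and it runs on a tiny stand-in file rather than the real dataset. It assumes the pickle contains a dict whose values all support `len()`, as the `'features'` and `'labels'` entries used above do.

```python
import os
import pickle
import tempfile

def summarize_pickle(path):
    """Return {key: number of entries} for a pickled dict (hypothetical helper)."""
    with open(path, mode='rb') as f:
        data = pickle.load(f)
    return {key: len(value) for key, value in data.items()}

# Tiny stand-in file so the sketch runs without the real dataset.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.p')
with open(demo_path, mode='wb') as f:
    pickle.dump({'features': [[0, 1], [2, 3], [4, 5]], 'labels': [7, 7, 9]}, f)

summary = summarize_pickle(demo_path)
print(summary)
```

Calling `summarize_pickle('train.p')` on the real file should list `'features'` and `'labels'` with matching lengths, along with any other keys the file may contain.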

Next, analyze the data.

### Replace each question mark with the appropriate value. 
### Use python, pandas or numpy methods rather than hard coding the results

# TODO: Number of training examples
n_train = len(X_train)

# TODO: Number of validation examples
n_validation = len(X_valid)

# TODO: Number of testing examples.
n_test = len(X_test)

# TODO: What's the shape of an traffic sign image?
image_shape = X_train[0].shape

# TODO: How many unique classes/labels there are in the dataset.
n_classes = len(set(y_train))

print("Number of training examples =", n_train)
print("Number of validation examples =", n_validation)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)

As you can see, the training set has more than 30,000 images, each of size 32x32x3, in 43 classes.

Next, show a random image and plot how many images each class has in train, valid, and test.

### Data exploration visualization code goes here.
### Feel free to use as many code cells as needed.
import matplotlib.pyplot as plt
from collections import Counter
import numpy as np
# Visualizations will be shown in the notebook.
%matplotlib inline
index=np.random.randint(n_train)
plt.imshow(X_train[index],cmap='gray')
print(y_train[index])

sign_count_train = Counter(y_train)

range_x = np.array(range(n_classes))
range_y = [sign_count_train[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the train data distribution")
plt.show()

sign_count_valid = Counter(y_valid)

range_x = np.array(range(n_classes))
range_y = [sign_count_valid[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the valid data distribution")
plt.show()

sign_count_test = Counter(y_test)

range_x = np.array(range(n_classes))
range_y = [sign_count_test[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the test data distribution")
plt.show()

[Image 1 of the original post]
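
The three plotting cells above differ only in the label array and the title, so the counting step could be factored into one function. The following is a minimal sketch under that assumption; `class_distribution` is a hypothetical name, and the labels are made up for illustration rather than taken from the dataset.

```python
from collections import Counter

def class_distribution(labels, n_classes):
    """Count samples per class id, including classes with zero samples."""
    counts = Counter(int(label) for label in labels)
    return [counts[i] for i in range(n_classes)]

# Made-up labels for illustration only; not taken from the real dataset.
demo_labels = [0, 0, 0, 1, 2, 2]
demo_counts = class_distribution(demo_labels, n_classes=4)

# Ratio between the largest class and the smallest non-empty class.
imbalance = max(demo_counts) / min(c for c in demo_counts if c > 0)
print(demo_counts, imbalance)
```

The returned list can be passed straight to `plt.bar(range(n_classes), counts)`, and the ratio gives a quick number for how uneven the classes are.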

Next comes some image preprocessing: normalization, grayscale conversion, and so on. These are all very simple functions.

### Preprocess the data here. It is required to normalize the data. Other preprocessing steps could include 
### converting to grayscale, etc.
### Feel free to use as many code cells as needed.
import cv2
def normalized(images):
    xmin = np.min(images)
    xmax = np.max(images)
    image = (images - xmin)/(xmax-xmin)
    return image

def grayscale(image):
    gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)
    return np.expand_dims(gray,axis = 2)

def preprocess(images):
    list1 = []
    for image in images:
        list1.append(grayscale(image))
    list1 = np.array(list1)
    return normalized(list1)

def one_hot(labels):
    # Build an (N, n_classes) matrix with a 1 in each row at that sample's class id
    encoded = np.zeros((len(labels),n_classes))
    for i,label in enumerate(labels):
        encoded[i][label] = 1
    return encoded
import tensorflow as tf
from sklearn.utils import shuffle
X_train,y_train = shuffle(X_train,y_train)
X_train = preprocess(X_train)
y_train = one_hot(y_train)
X_valid = preprocess(X_valid)
y_valid = one_hot(y_valid)
X_test = preprocess(X_test)
y_test = one_hot(y_test)
print(X_train.shape,y_train.shape)
The output here is: (34799, 32, 32, 1) (34799, 43)

As you can see, preprocessing turns the original 3-channel images into grayscale, and y_train, y_valid, and y_test are converted to one-hot format for later use.
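
For readers without OpenCV, the same three steps can be sketched in NumPy alone. This is an illustration, not the code used in this post: the function names are hypothetical, the grayscale step uses the 0.299/0.587/0.114 luma weights (the same weighting OpenCV's RGB-to-gray conversion is based on, though rounding may differ), and the normalization assumes the batch is not constant, since that would divide by zero.

```python
import numpy as np

def to_gray(images):
    """RGB (N, H, W, 3) -> grayscale (N, H, W, 1) using 0.299/0.587/0.114 weights."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = images.astype(np.float32) @ weights
    return gray[..., np.newaxis]

def min_max_normalize(images):
    """Scale the whole batch to [0, 1] using its global min and max."""
    lo, hi = images.min(), images.max()
    return (images - lo) / (hi - lo)

def to_one_hot(labels, n_classes):
    """Pick rows of the identity matrix to one-hot encode integer labels."""
    return np.eye(n_classes, dtype=np.float32)[labels]

# Random stand-in images and labels, not real traffic signs.
demo_images = np.random.RandomState(0).randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
demo_x = min_max_normalize(to_gray(demo_images))
demo_y = to_one_hot(np.array([0, 2, 1, 2]), n_classes=3)
print(demo_x.shape, demo_y.shape)
```

Indexing `np.eye` replaces the explicit loop in `one_hot`, and the global min/max mirrors what `normalized` does above.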

Now we design the layers of the network.

### Define your architecture here.
### Feel free to use as many code cells as needed.
inputs = tf.placeholder(tf.float32,shape = [None,32,32,1],name = 'inputs')
labels = tf.placeholder(tf.int32,shape = [None,n_classes],name = 'labels')
from tensorflow.contrib.layers import flatten

def model(images,add_dropout = True):
    mu = 0
    sigma = 0.1
    dropout = 0.5  # passed to tf.nn.dropout as keep_prob (probability of keeping a unit)
    # 0.5 and 0.75 give about the same accuracy
    
    conv1_W = tf.Variable(tf.truncated_normal(shape = (5,5,1,12),mean = mu,stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(12))
    conv1 = tf.nn.conv2d(images,conv1_W,strides=[1,1,1,1],padding='VALID') + conv1_b
    conv1 = tf.nn.relu(conv1)
    conv1 = tf.nn.max_pool(conv1,ksize=[1,2,2,1],strides=[1,2,2,1],padding='VALID',name = 'conv1')
    #14x14x12
    
    conv2_W = tf.Variable(tf.truncated_normal(shape = (5,5,12,24),mean = mu,stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(24))
    conv2 = tf.nn.conv2d(conv1,conv2_W,strides = [1,1,1,1],padding='VALID',name = 'conv2') + conv2_b
    conv2 = tf.nn.relu(conv2)
    conv2 = tf.nn.max_pool(conv2,ksize=[1,2,2,1],strides=[1,2,2,1],padding='VALID')
    #5,5,24
    
    fc0 = flatten(conv2)
    #600
    
    fc1_W = tf.Variable(tf.truncated_normal(shape = (600,400),mean = mu,stddev = sigma))
    fc1_b = tf.Variable(tf.zeros(400))
    fc1 = tf.matmul(fc0,fc1_W) + fc1_b
    fc1 = tf.nn.relu(fc1)
    
    if add_dropout:
        fc1 = tf.nn.dropout(fc1,dropout)
        
    fc2_W  = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
    fc2_b  = tf.Variable(tf.zeros(120))
    fc2    = tf.matmul(fc1, fc2_W) + fc2_b
    fc2 = tf.nn.relu(fc2)
    
    if add_dropout:
        fc2 = tf.nn.dropout(fc2,dropout)
    
    fc3_W = tf.Variable(tf.truncated_normal(shape = (120,84),mean = mu,stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(84))
    fc3 = tf.matmul(fc2,fc3_W) +fc3_b
    fc3 = tf.nn.relu(fc3)
    
    if add_dropout:
        fc3 = tf.nn.dropout(fc3,dropout)
    
    fc4_w = tf.Variable(tf.truncated_normal(shape = (84,43),mean=mu,stddev=sigma))
    fc4_b = tf.Variable(tf.zeros(43))
    logits = tf.matmul(fc3,fc4_w) + fc4_b
    
    return logits

The input is 32x32x1. The first layer is a 5x5 convolution with 12 filters, which outputs 28x28x12. It is followed by max pooling, which reduces this to 14x14x12.

The second layer is a 5x5 convolution with 24 filters, which outputs 10x10x24. It is also followed by max pooling, which reduces this to 5x5x24.

Next come the fully connected layers. Since 5x5x24 = 600, my fully connected layers are (600,400), (400,120), (120,84), (84,43).
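
The sizes quoted above can be checked with a few lines of arithmetic. The sketch below assumes 'VALID' padding throughout, as in the model code; `valid_out` is a hypothetical helper name.

```python
def valid_out(size, kernel, stride=1):
    """Output width/height of a 'VALID' convolution or pooling window."""
    return (size - kernel) // stride + 1

size = 32
size = valid_out(size, kernel=5)            # conv1: 32 -> 28
size = valid_out(size, kernel=2, stride=2)  # pool1: 28 -> 14
size = valid_out(size, kernel=5)            # conv2: 14 -> 10
size = valid_out(size, kernel=2, stride=2)  # pool2: 10 -> 5

flat = size * size * 24  # 24 feature maps after the second conv layer
print(size, flat)
```

The flattened length is what fixes the first dimension of `fc1_W`.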

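Dropout of 0.5 is applied after each of the first three fully connected layers. I tried both 0.5 and 0.75, and the final accuracy was about the same. Note that the graph is built with model(inputs), which leaves add_dropout at True with a fixed rate, so as written dropout also stays active when validation and test accuracy are computed.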
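
To make the meaning of the 0.5 concrete: in the TensorFlow 1.x API used here, the second argument of `tf.nn.dropout` is the probability of keeping a unit, and kept units are scaled by 1/keep_prob. The NumPy sketch below illustrates that behaviour under those assumptions. It is not TensorFlow's implementation, and the `training` flag is a hypothetical addition showing how dropout is usually switched off for evaluation.

```python
import numpy as np

def dropout(activations, keep_prob, rng, training=True):
    """Inverted dropout: zero units with prob 1-keep_prob, scale the rest by 1/keep_prob."""
    if not training:
        return activations  # evaluation: pass activations through unchanged
    mask = rng.random_sample(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.RandomState(0)
x = np.ones((1000, 100))
train_out = dropout(x, keep_prob=0.5, rng=rng)
eval_out = dropout(x, keep_prob=0.5, rng=rng, training=False)
print(train_out.mean(), eval_out.mean())
```

Because of the 1/keep_prob scaling, the mean activation stays close to its original value during training, so no rescaling is needed when dropout is off.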

Now train the network.

def get_batches(X,y,batch_size=128):
    # Step through the data in batch_size chunks; slicing clips the last batch,
    # and no empty batch is produced when len(X) is a multiple of batch_size.
    for start in range(0,len(X),batch_size):
        yield X[start:start+batch_size], y[start:start+batch_size]
### Train your model here.
### Calculate and report the accuracy on the training and validation set.
### Once a final model architecture is selected, 
### the accuracy on the test set should be calculated and reported as well.
### Feel free to use as many code cells as needed.
EPOCHS = 40
max_acc = 0
save_model_path = 'Traffice_sign_classifier'
logits = model(inputs)
logits = tf.identity(logits,name = 'logits')
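cost = tf.reduce_mean(tf.losses.softmax_cross_entropy(labels,logits),name = 'cost')
# tf.losses.softmax_cross_entropy builds the softmax cross-entropy loss from one-hot labels and logits
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
# minimize() adds the ops that compute the gradients of cost and apply the Adam update
# to the trainable variables; running `optimizer` performs one training step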
correct_pred = tf.equal(tf.argmax(labels,1),tf.argmax(logits,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred,tf.float32),name = 'accuracy')

with tf.Session()  as sess:
    sess.run(tf.global_variables_initializer())
    # Create the Saver once; max_to_keep=1 keeps only the most recent checkpoint on disk
    saver = tf.train.Saver(max_to_keep=1)
    for epoch in range(EPOCHS):
        for i,(X,y) in enumerate(get_batches(X_train,y_train)):
            sess.run(optimizer,feed_dict = {inputs:X,labels:y})
            if i%50==0:
                valid_acc=sess.run(accuracy, feed_dict={inputs:X_valid, labels:y_valid})
                train_acc=sess.run(accuracy, feed_dict={inputs:X, labels:y})
        print('epoch : ',epoch+1,' training accuracy is : ',train_acc,' valid accuracy is :',valid_acc)
        if valid_acc > max_acc:
            max_acc = valid_acc
            # Save only when validation accuracy improves, so the checkpoint is the best model so far
            save_path = saver.save(sess,save_model_path)
[Image 2 of the original post]
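
The `cost` and `accuracy` tensors in the training code can be mirrored in NumPy to see what they measure. This is a sketch for illustration only: it assumes one-hot labels and raw logits, as in the code above, and averages the loss over the batch, which to my understanding matches the default reduction of `tf.losses.softmax_cross_entropy` with unit weights. The demo values are made up.

```python
import numpy as np

def softmax_cross_entropy(one_hot_labels, logits):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(one_hot_labels * log_probs).sum(axis=1).mean())

def accuracy(one_hot_labels, logits):
    """Fraction of rows where the predicted class equals the labelled class."""
    return float((one_hot_labels.argmax(axis=1) == logits.argmax(axis=1)).mean())

# Made-up batch of two samples over three classes.
demo_labels = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32)
demo_logits = np.array([[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]])

demo_loss = softmax_cross_entropy(demo_labels, demo_logits)
demo_acc = accuracy(demo_labels, demo_logits)
print(demo_loss, demo_acc)
```

The second sample is confidently wrong, so it dominates the loss even though accuracy only counts it as one miss out of two.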
Finally, evaluate on the test set.
loaded_graph = tf.Graph()
save_model_path = './Traffice_sign_classifier'
with tf.Session(graph=loaded_graph) as sess:
    # Rebuild the saved graph from the .meta file, then restore the trained weights
    loader = tf.train.import_meta_graph(save_model_path + '.meta')
    loader.restore(sess, save_model_path)
    # Look up the tensors by the names given when the graph was built
    loaded_inputs=loaded_graph.get_tensor_by_name('inputs:0')
    loaded_labels=loaded_graph.get_tensor_by_name('labels:0')
    loaded_logits = loaded_graph.get_tensor_by_name('logits:0')
    loaded_acc=loaded_graph.get_tensor_by_name('accuracy:0')
    
    test_acc=sess.run(loaded_acc,feed_dict={loaded_inputs:X_test,loaded_labels:y_test})
    print('The test accuracy is:',test_acc)

That's the end. Thanks for reading this far.

