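This is project P2, traffic sign recognition. It uses a simple convolutional network: 2 convolutional layers followed by 4 fully connected layers. The images in the dataset are only 32x32, so a very complex network isn't needed.
The data is here: https://s3-us-west-1.amazonaws.com/udacity-selfdrivingcar/traffic-signs-data.zip
Just paste the link into Xunlei to download it.
After unzipping you get 3 files: test.p, train.p and valid.p.
I reached 93% accuracy this time; an expert online got to 98%. His GitHub is https://github.com/kenshiro-o/CarND-Traffic-Sign-Classifier-Project if you want to take a look.
Here is my code.
First, load the data.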
# Load pickled data
import pickle
# TODO: Fill this in based on where you saved the training and testing data
training_file = 'train.p'
validation_file='valid.p'
testing_file = 'test.p'
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(validation_file, mode='rb') as f:
    valid = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']
Next, analyze the data.
### Replace each question mark with the appropriate value.
### Use python, pandas or numpy methods rather than hard coding the results
# TODO: Number of training examples
n_train = len(X_train)
# TODO: Number of validation examples
n_validation = len(X_valid)
# TODO: Number of testing examples.
n_test = len(X_test)
# TODO: What's the shape of a traffic sign image?
image_shape = X_train[0].shape
# TODO: How many unique classes/labels there are in the dataset.
n_classes = len(set(y_train))
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
The training set has over 30,000 images of size 32x32x3, spread across 43 classes.
Next, show one random image and count how many images each class has in train, valid and test.
### Data exploration visualization code goes here.
### Feel free to use as many code cells as needed.
import matplotlib.pyplot as plt
from collections import Counter
import numpy as np
# Visualizations will be shown in the notebook.
%matplotlib inline
index=np.random.randint(n_train)
plt.imshow(X_train[index],cmap='gray')
print(y_train[index])
sign_count_train = Counter(y_train)
range_x = np.array(range(n_classes))
range_y = [sign_count_train[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the train data distribution")
plt.show()
sign_count_valid = Counter(y_valid)
range_x = np.array(range(n_classes))
range_y = [sign_count_valid[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the valid data distribution")
plt.show()
sign_count_test = Counter(y_test)
range_x = np.array(range(n_classes))
range_y = [sign_count_test[i] for i in range_x]
plt.figure(figsize=(9,5))
plt.bar(range_x,range_y)
plt.xticks(list(range(n_classes)))
plt.xlabel("class")
plt.ylabel("numbers")
plt.title("the test data distribution")
plt.show()
Next comes some image preprocessing: normalization, grayscale conversion and so on. These are all very simple functions.
### Preprocess the data here. It is required to normalize the data. Other preprocessing steps could include
### converting to grayscale, etc.
### Feel free to use as many code cells as needed.
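import cv2
def normalized(images):
    # Min-max scale the whole array to the range [0, 1]
    xmin = np.min(images)
    xmax = np.max(images)
    image = (images - xmin)/(xmax-xmin)
    return image
def grayscale(image):
    # Convert RGB to a single channel and keep a trailing channel axis: (32, 32, 1)
    gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)
    return np.expand_dims(gray,axis = 2)
def preprocess(images):
    # Grayscale every image, then normalize the whole batch
    list1 = []
    for image in images:
        list1.append(grayscale(image))
    list1 = np.array(list1)
    return normalized(list1)
def one_hot(labels):
    # Turn integer class labels into one-hot rows of length n_classes
    list1 = np.zeros((len(labels),n_classes))
    for i,label in enumerate(labels):
        list1[i][label] = 1
    return list1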
import tensorflow as tf
from sklearn.utils import shuffle
X_train,y_train = shuffle(X_train,y_train)
X_train = preprocess(X_train)
y_train = one_hot(y_train)
X_valid = preprocess(X_valid)
y_valid = one_hot(y_valid)
X_test = preprocess(X_test)
y_test = one_hot(y_test)
print(X_train.shape,y_train.shape)
The output here is:
(34799, 32, 32, 1) (34799, 43)
So preprocessing turns the original 3-channel images into grayscale images, and converts y_train, y_valid and y_test to one-hot format, ready for the later steps.
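The same two steps can also be written without explicit Python loops. The following is only a sketch with made-up helper names (`one_hot_vectorized`, `min_max_normalize`) and toy inputs, not the code used for the results in this post; it assumes labels are integers in the range 0 to n_classes - 1.

```python
import numpy as np

def one_hot_vectorized(labels, n_classes):
    """Return a (len(labels), n_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(n_classes, dtype=np.float32)[labels]

def min_max_normalize(images):
    """Scale an array to [0, 1] using its global min and max."""
    images = np.asarray(images, dtype=np.float32)
    lo, hi = images.min(), images.max()
    if hi == lo:
        # A constant array has no range to scale; return zeros instead of dividing by 0.
        return np.zeros_like(images)
    return (images - lo) / (hi - lo)

# Toy data (hypothetical, for illustration only)
demo_labels = [0, 2, 1]
demo_one_hot = one_hot_vectorized(demo_labels, n_classes=3)
demo_pixels = min_max_normalize(np.array([0, 51, 255], dtype=np.uint8))
print(demo_one_hot)
print(demo_pixels)
```

Indexing `np.eye` with the label array avoids the per-label loop, and the constant-array guard covers a case the loop version would turn into a division by zero.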
Now we design the layers of the neural network.
### Define your architecture here.
### Feel free to use as many code cells as needed.
inputs = tf.placeholder(tf.float32,shape = [None,32,32,1],name = 'inputs')
labels = tf.placeholder(tf.int32,shape = [None,n_classes],name = 'labels')
from tensorflow.contrib.layers import flatten
def model(images,add_dropout = True):
    mu = 0
    sigma = 0.1
    dropout = 0.5
    # Keep probability for tf.nn.dropout; 0.5 and 0.75 gave about the same accuracy
    # Note: with add_dropout=True the dropout ops stay active whenever this graph
    # is run, including when accuracy is evaluated
    # Conv layer 1: 32x32x1 -> 28x28x12
    conv1_W = tf.Variable(tf.truncated_normal(shape = (5,5,1,12),mean = mu,stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(12))
    conv1 = tf.nn.conv2d(images,conv1_W,strides=[1,1,1,1],padding='VALID') + conv1_b
    conv1 = tf.nn.relu(conv1)
    conv1 = tf.nn.max_pool(conv1,ksize=[1,2,2,1],strides=[1,2,2,1],padding='VALID',name = 'conv1')
    # 14x14x12
    # Conv layer 2: 14x14x12 -> 10x10x24
    conv2_W = tf.Variable(tf.truncated_normal(shape = (5,5,12,24),mean = mu,stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(24))
    conv2 = tf.nn.conv2d(conv1,conv2_W,strides = [1,1,1,1],padding='VALID',name = 'conv2') + conv2_b
    conv2 = tf.nn.relu(conv2)
    conv2 = tf.nn.max_pool(conv2,ksize=[1,2,2,1],strides=[1,2,2,1],padding='VALID')
    # 5x5x24
    fc0 = flatten(conv2)
    # 600
    # Fully connected layer 1: 600 -> 400
    fc1_W = tf.Variable(tf.truncated_normal(shape = (600,400),mean = mu,stddev = sigma))
    fc1_b = tf.Variable(tf.zeros(400))
    fc1 = tf.matmul(fc0,fc1_W) + fc1_b
    fc1 = tf.nn.relu(fc1)
    if add_dropout:
        fc1 = tf.nn.dropout(fc1,dropout)
    # Fully connected layer 2: 400 -> 120
    fc2_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
    fc2_b = tf.Variable(tf.zeros(120))
    fc2 = tf.matmul(fc1, fc2_W) + fc2_b
    fc2 = tf.nn.relu(fc2)
    if add_dropout:
        fc2 = tf.nn.dropout(fc2,dropout)
    # Fully connected layer 3: 120 -> 84
    fc3_W = tf.Variable(tf.truncated_normal(shape = (120,84),mean = mu,stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(84))
    fc3 = tf.matmul(fc2,fc3_W) +fc3_b
    fc3 = tf.nn.relu(fc3)
    if add_dropout:
        fc3 = tf.nn.dropout(fc3,dropout)
    # Output layer: 84 -> 43 class logits
    fc4_w = tf.Variable(tf.truncated_normal(shape = (84,43),mean=mu,stddev=sigma))
    fc4_b = tf.Variable(tf.zeros(43))
    logits = tf.matmul(fc3,fc4_w) + fc4_b
    return logits
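Here the input is 32x32x1. The first layer is a 5x5 convolution with 12 filters, which outputs 28x28x12, followed by a max pooling that reduces it to 14x14x12.
The second layer is a 5x5 convolution with 24 filters, which outputs 10x10x24, again followed by a max pooling that reduces it to 5x5x24.
Then come the fully connected layers. Since 5x5x24=600, my fully connected layers are (600,400), (400,120), (120,84) and (84,43).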
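The sizes above can be checked with a little arithmetic. This is a small sketch, assuming 'VALID' padding (no padding) as in the model code; `valid_output_size` is a helper name made up for this illustration.

```python
def valid_output_size(size, kernel, stride=1):
    """Spatial size after a VALID (no padding) convolution or pooling."""
    return (size - kernel) // stride + 1

size = 32
conv1 = valid_output_size(size, kernel=5)              # 5x5 conv, stride 1
pool1 = valid_output_size(conv1, kernel=2, stride=2)   # 2x2 max pool, stride 2
conv2 = valid_output_size(pool1, kernel=5)             # 5x5 conv, stride 1
pool2 = valid_output_size(conv2, kernel=2, stride=2)   # 2x2 max pool, stride 2
flat = pool2 * pool2 * 24                              # 24 feature maps flattened
print(conv1, pool1, conv2, pool2, flat)
```

The final value is the input width of the first fully connected layer, which is why its weight matrix has 600 rows.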
On top of that I added dropout of 0.5. I tried both 0.5 and 0.75 and the final accuracy was about the same.
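In the TensorFlow 1.x API used here, the second argument of tf.nn.dropout is the keep probability, so 0.5 means roughly half of the activations are kept. What dropout does to a layer's output can be sketched in plain NumPy. This is an illustration of the general "inverted dropout" idea under that assumption, not TensorFlow's actual implementation; the `dropout` function and its `training` flag are hypothetical names.

```python
import numpy as np

def dropout(x, keep_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability 1 - keep_prob and
    scale the survivors by 1 / keep_prob so the expected value is unchanged."""
    if not training:
        # At evaluation time the activations pass through untouched.
        return x
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))

dropped = dropout(activations, keep_prob=0.5, rng=rng)
kept_fraction = (dropped != 0).mean()
eval_out = dropout(activations, keep_prob=0.5, rng=rng, training=False)

print("fraction of units kept:", kept_fraction)
print("mean activation after dropout:", dropped.mean())
```

The `training` switch is the design point: dropout is normally turned off when measuring validation or test accuracy, which in a TensorFlow 1.x graph is usually done by feeding the keep probability through a placeholder.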
Now we train the network.
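def get_batches(X,y,batch_size=128):
    # Yield consecutive (X, y) slices; the last batch may be smaller.
    # Stepping by batch_size avoids an empty final batch when len(X)
    # is an exact multiple of batch_size.
    length=len(X)
    for start in range(0,length,batch_size):
        end = min(length,start+batch_size)
        yield X[start:end], y[start:end]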
### Train your model here.
### Calculate and report the accuracy on the training and validation set.
### Once a final model architecture is selected,
### the accuracy on the test set should be calculated and reported as well.
### Feel free to use as many code cells as needed.
EPOCHS = 40
max_acc = 0
save_model_path = 'Traffice_sign_classifier'
logits = model(inputs)
logits = tf.identity(logits,name = 'logits')
cost = tf.reduce_mean(tf.losses.softmax_cross_entropy(labels,logits),name = 'cost')
# tf.losses.softmax_cross_entropy builds the cross-entropy loss from one-hot labels and logits
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
# minimize(cost) returns the op that computes the gradients of cost and applies one Adam update
correct_pred = tf.equal(tf.argmax(labels,1),tf.argmax(logits,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred,tf.float32),name = 'accuracy')
# To save a model, first create a Saver object, e.g. saver=tf.train.Saver()
# To keep only the most recent checkpoint, set max_to_keep to 1
# Created once here rather than inside the loop, so the graph does not grow on every save
saver = tf.train.Saver(max_to_keep=1)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(EPOCHS):
        for i,(X,y) in enumerate(get_batches(X_train,y_train)):
            sess.run(optimizer,feed_dict = {inputs:X,labels:y})
            if i%50==0:
                valid_acc=sess.run(accuracy, feed_dict={inputs:X_valid, labels:y_valid})
                train_acc=sess.run(accuracy, feed_dict={inputs:X, labels:y})
                print('epoch : ',epoch+1,' training accuracy is : ',train_acc,' valid accuracy is :',valid_acc)
                # Save only when validation accuracy improves
                if valid_acc > max_acc:
                    max_acc = valid_acc
                    save_path = saver.save(sess,save_model_path)
import pandas as pd
loaded_graph = tf.Graph()
save_model_path = './Traffice_sign_classifier'
with tf.Session(graph=loaded_graph) as sess:
    # Rebuild the saved graph and restore the trained weights
    loader = tf.train.import_meta_graph(save_model_path + '.meta')
    loader.restore(sess, save_model_path)
    # Look up the tensors by the names given when the graph was built
    loaded_inputs=loaded_graph.get_tensor_by_name('inputs:0')
    loaded_labels=loaded_graph.get_tensor_by_name('labels:0')
    loaded_logits = loaded_graph.get_tensor_by_name('logits:0')
    loaded_acc=loaded_graph.get_tensor_by_name('accuracy:0')
    test_acc=sess.run(loaded_acc,feed_dict={loaded_inputs:X_test,loaded_labels:y_test})
    print('The test accuracy is:',test_acc)
That's the end of it. Thanks for reading all the way through.