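Contents:
Understanding the dataset.
Step 0: Importing libraries and the dataset.
Step 1: Data preprocessing.
Step 2: Data visualization.
The intuition behind ConvNets.
Step 3: Training the model.
Step 4: Model evaluation.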
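Motivation: Driverless cars are becoming increasingly popular, driven by the work of companies such as Tesla on automating electric vehicles. To reach Level 5 autonomy, these cars must correctly recognize traffic signs and obey traffic rules. After recognizing a sign, the car should also be able to make the appropriate decision.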
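Understanding the dataset:
The German Traffic Sign Recognition Benchmark (GTSRB) is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) in 2011. The dataset can be downloaded here:
https://www.kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign
The dataset has the following properties:
Single-image, multi-class classification problem
More than 40 classes
More than 50,000 images in total
Large, lifelike database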
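Step 0: Importing libraries and the dataset:
In this first step, we import all the standard libraries and load the dataset, which is stored as data and labels. TensorFlow is imported in order to use Keras, cv2 for computer-vision tasks, and PIL to handle different image file formats.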
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import tensorflow as tf
from PIL import Image
import os
data = []
labels = []
classes = 43
cur_path = os.getcwd()
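for i in range(classes):
    # Each class has its own sub-folder: train/0, train/1, ..., train/42
    path = os.path.join(cur_path, 'train', str(i))
    images = os.listdir(path)
    for a in images:
        try:
            # os.path.join keeps the path portable across operating systems
            image = Image.open(os.path.join(path, a))
            image = image.resize((30, 30))
            image = np.array(image)
            data.append(image)
            labels.append(i)
        except OSError:
            print("Error loading image:", a)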
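Step 1: Data preprocessing:
To process the data, we first convert it to NumPy arrays and check the dataset dimensions with the shape attribute. Next, the train_test_split function splits the dataset into training and test data at an 80:20 ratio. Y_train and Y_test hold the 43 classes as integers, a form the model cannot use directly, so the to_categorical function converts them to one-hot (binary) vectors.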
# Converting to array
data = np.array(data)
labels = np.array(labels)
# Dataset dimensions - (number of images, height, width, color channels)
print("Dataset dimensions : ",data.shape)
output:
Dataset dimensions : (39209, 30, 30, 3)
# Splitting the dataset into train and test
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(data, labels, test_size = 0.2, random_state = 42)
# Checking dimensions - (number of images, height, width, color channels)
print("X_train shape : ", X_train.shape)
print("X_test shape : ", X_test.shape)
print("Y_train shape : ", Y_train.shape)
print("Y_test shape : ", Y_test.shape)
output:
X_train shape : (31367, 30, 30, 3)
X_test shape : (7842, 30, 30, 3)
Y_train shape : (31367,)
Y_test shape : (7842,)
# Converting integer class labels to one-hot vectors
from keras.utils import to_categorical
Y_train_categorical = to_categorical(Y_train, 43)
Y_test_categorical = to_categorical(Y_test, 43)
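To make the conversion concrete, here is a minimal NumPy sketch of what one-hot encoding does to integer labels. The helper name `one_hot` is hypothetical and only illustrates the idea; the tutorial itself relies on Keras's `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Put a 1 in the column given by each label; every other entry stays 0.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Three example labels out of the 43 traffic-sign classes.
sample = one_hot([0, 2, 42], num_classes=43)
print(sample.shape)                    # (3, 43)
print(sample.argmax(axis=1).tolist())  # [0, 2, 42]
```

Taking the argmax of each row recovers the original integer label, which is also how predicted probabilities are turned back into class IDs.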
Step 2: Data visualization:
We use the imshow function to display a specific image from the dataset. Each image is 30 px high and 30 px wide, with 3 color channels.
# Visualizing Dataset Images
i = 100
plt.imshow(X_train[i])
print("Sign category :",Y_train[i])
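The intuition behind ConvNets
Convolutional neural networks are very popular in computer vision applications because they can detect and recognize a variety of objects in images.
In layman's terms, a CNN is essentially a fully connected neural network with convolution operations at the front. These convolution operations detect defining patterns in the image, loosely analogous to the neurons in the occipital lobe of the human brain. A ConvNet is built from three types of layers, which are stacked to form the complete architecture:
1. Convolutional layer.
2. Pooling layer.
3. Fully connected layer.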
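Convolutional layer: The convolutional layer is the core of a ConvNet and performs most of the computationally heavy work. A kernel (or filter) encoding a specific pattern is slid across the whole image to detect a particular type of feature. The output of this pass is a two-dimensional array called a feature map. Each value in the feature map is then passed through the ReLU function to introduce non-linearity.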
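As a rough illustration of the sliding-kernel idea, the sketch below applies one hand-made kernel to a tiny single-channel image and then applies ReLU. The function names, the image, and the kernel values are all made up for this example; like most deep learning libraries, it computes a cross-correlation (the kernel is not flipped), with stride 1 and no padding.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Multiply the kernel with the patch under it and sum the result.
            feature_map[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return feature_map

def relu(x):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(x, 0)

# A 5x5 image: dark columns (0) on the left, bright columns (1) on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 0, 1]] * 3, dtype=float)

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map.shape)     # (3, 3)
print(feature_map.tolist())  # [[3.0, 3.0, 0.0], [3.0, 3.0, 0.0], [3.0, 3.0, 0.0]]
```

The feature map is large where the edge sits under the kernel and zero in the uniform region. In a trained ConvNet the kernel values are learned rather than written by hand.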
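Pooling layer: This layer reduces the size of the data, which in turn reduces the amount of computation and processing time. There are two types of pooling: average pooling and max pooling. As the names suggest, max pooling returns the maximum value and average pooling returns the mean value of the portion of the image covered by the kernel.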
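The difference between the two pooling types can be seen on a small array. This is a minimal sketch with a hypothetical helper `pool2d`, assuming non-overlapping windows (stride equal to the window size) and dropping rows or columns that do not fill a complete window.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride = size)."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete window.
    h, w = h - h % size, w - w % size
    # Axes 1 and 3 index the positions inside each window.
    windows = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [5, 7, 4, 6],
               [9, 8, 1, 1],
               [2, 0, 3, 5]], dtype=float)

print(pool2d(fm, mode="max").tolist())   # [[7.0, 6.0], [9.0, 5.0]]
print(pool2d(fm, mode="mean").tolist())  # [[4.0, 3.0], [4.75, 2.5]]
```

Either way, a 4x4 feature map shrinks to 2x2, so later layers have a quarter as many values to process.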
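Fully connected layer: The output array from the previous step is flattened into a column vector. This vector is fed into a multi-layer neural network which, over a series of epochs, learns to classify the images using the Softmax function.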
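Flattening and Softmax are both simple operations, sketched below in NumPy. The array shape and the three class scores are illustrative values chosen for this example, and `softmax` is a hand-written stand-in for the activation that Keras applies internally.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Flattening: 64 feature maps of size 3x3 become one vector of 3*3*64 values.
pooled = np.zeros((3, 3, 64))
flat = pooled.reshape(-1)
print(flat.shape)  # (576,)

# Softmax on three hypothetical class scores.
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs.round(3).tolist())  # [0.659, 0.242, 0.099]
print(int(probs.argmax()))      # 0
```

The class with the highest probability is taken as the prediction. With 43 traffic-sign classes, the final layer produces 43 such probabilities per image.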
Step 3: Training the model
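Before defining the network, it can help to work out by hand what each layer does to a 30x30x3 input. The sketch below does this for the architecture used in this step (two 5x5 convolutions with 32 filters, 2x2 max pooling, two 3x3 convolutions with 64 filters, 2x2 max pooling, then dense layers of 256 and 43 units). It assumes the Keras defaults of stride 1 and no padding for `Conv2D`, and a stride equal to the window for `MaxPool2D`; the helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output side length of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, window):
    """Output side length of non-overlapping pooling (incomplete windows dropped)."""
    return size // window

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

side = 30                   # input images are 30x30x3
side = conv_out(side, 5)    # 26
side = conv_out(side, 5)    # 22
side = pool_out(side, 2)    # 11
side = conv_out(side, 3)    # 9
side = conv_out(side, 3)    # 7
side = pool_out(side, 2)    # 3
flat = side * side * 64     # 576 values after flattening

total = (conv_params(5, 3, 32) + conv_params(5, 32, 32)
         + conv_params(3, 32, 64) + conv_params(3, 64, 64)
         + dense_params(flat, 256) + dense_params(256, 43))
print(side, flat, total)  # 3 576 242251
```

These numbers should agree with the output shapes and the total parameter count reported by `model.summary()` in this step.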
# Importing Keras Libraries
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
# Creating Neural network Architecture
# Initialize neural network
model = Sequential()
# Add 2 convolutional layers with 32 filters, a 5x5 window, and ReLU activation function
model.add(Conv2D(filters = 32, kernel_size = (5, 5), activation = 'relu', input_shape = X_train.shape[1:]))
model.add(Conv2D(filters = 32, kernel_size = (5, 5), activation = 'relu'))
# Add max pooling layer with a 2x2 window
model.add(MaxPool2D(pool_size = (2, 2)))
# Add dropout layer
model.add(Dropout(rate = 0.25))
# Add 2 convolutional layers with 64 filters, a 3x3 window, and ReLU activation function
model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
# Add max pooling layer with a 2x2 window
model.add(MaxPool2D(pool_size = (2, 2)))
# Add dropout layer
model.add(Dropout(rate = 0.25))
# Add layer to flatten input
model.add(Flatten())
# Add fully connected layer of 256 units with a ReLU activation function
model.add(Dense(256, activation = 'relu'))
# Add dropout layer
model.add(Dropout(rate = 0.5))
# Add fully connected output layer of 43 units (one per class) with a Softmax activation function
model.add(Dense(43, activation = 'softmax'))
# Summarizing the model architecture
model.summary()
output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        2432
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 22, 22, 32)        25632
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 11, 11, 32)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 11, 11, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 9, 9, 64)          18496
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 7, 7, 64)          36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 3, 3, 64)          0
_________________________________________________________________
dropout_2 (Dropout)          (None, 3, 3, 64)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0
_________________________________________________________________
dense_1 (Dense)              (None, 256)               147712
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0
_________________________________________________________________
dense_2 (Dense)              (None, 43)                11051
=================================================================
Total params: 242,251
Trainable params: 242,251
Non-trainable params: 0
_________________________________________________________________
# Compile neural network
model.compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])
# Train neural network
history = model.fit(X_train, Y_train_categorical, batch_size = 32, epochs = 15, validation_data = (X_test, Y_test_categorical))
Output after 15 epochs:
Epoch 15/15
31367/31367 [==============================] - 98s 3ms/step - loss: 0.2169 - acc: 0.9485 - val_loss: 0.0835 - val_acc: 0.9787
Step 4: Model evaluation:
# Plotting graph - Epoch vs Accuracy
# Note: newer Keras versions store these under 'accuracy' and 'val_accuracy'
plt.plot(history.history['acc'], label='training accuracy')
plt.plot(history.history['val_acc'], label='val accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.grid()
plt.legend()
plt.show()
Accuracy vs. epochs
# Plotting graph - Epoch vs Loss
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.grid()
plt.legend()
plt.show()
Loss vs. epochs
# Calculating Accuracy Score
from sklearn.metrics import accuracy_score
# Test.csv lists the file path and class ID of every test image
y_test = pd.read_csv('Test.csv')
labels = y_test["ClassId"].values
imgs = y_test["Path"].values
data = []
for img in imgs:
    image = Image.open(img)
    image = image.resize((30,30))
    data.append(np.array(image))
X_test = np.array(data)
# predict() returns class probabilities; argmax picks the most likely class
# (predict_classes was removed from recent TensorFlow/Keras releases)
pred = np.argmax(model.predict(X_test), axis=1)
print("Accuracy Score : ",accuracy_score(labels, pred))
Output:
Accuracy Score : 0.9499604117181314
Thank you for reading. If you have a different view, feel free to share it in the comments so we can learn and improve together.