准备写个系列博客介绍机器学习实战中的部分公开项目。首先从初级项目开始。
本文为初级项目第二篇:利用MNIST数据集训练手写数字分类。
项目原网址为:Deep Learning Project – Handwritten Digit Recognition using Python。
第一篇为:机器学习实战 | emojify 使用Python创建自己的表情符号(深度学习初级)
项目构想:
MNIST数字分类项目,使机器能够识别手写数字。该Python项目对于计算机视觉可能非常有用。在这里,我们将使用MNIST数据集使用卷积神经网络训练模型。
经过训练后,在GUI页面(gui.py程序)显示效果如下:左边是手写数字,通过鼠标手写键入;右边点击recognise
会提示训练结果以及识别置信度。
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
除了常规包外,同样需要提前配置Keras
和TensorFlow
,安装命令为:
pip install keras==2.10.0
pip install TensorFlow==2.10.0
这里需要注意MNIST手写数据集导入方法,直接从Keras中加载:keras.datasets.mnist
通过mnist.load
获取训练数据和测试数据,训练数据集维度为: 60000 × 28 × 28 60000 \times 28\times28 60000×28×28,测试数据集维度为: 10000 × 28 × 28 10000 \times 28\times28 10000×28×28.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
# convert class vectors to binary class matrices
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
one-shot
特征,用二进制表示分类类别,比如数字0用0000表示,数字1用0001表示,数字2用0010表示。0~1
范围内。batch_size = 128
num_classes = 10
epochs = 50
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy'])
该项目设计了卷积神经网络(CNN)模型,包括两层卷积层、池化层、全连接层等。
函数分析:
网络结构介绍:
hist = model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))
print("The model has successfully trained")
model.save('mnist.h5')
print("Saving the model as mnist.h5")
训练成功后,会在源目录下保存mnist.h5
文件,即为权重文件。
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
train.py : 完整训练代码。
gui.py: GUI窗口,输出可互动的界面。
"""
Handwrittern digit recognition
"""
"""
1. Import the libraries and load the dataset
"""
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
"""
2. Preprocess the data
"""
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
# convert class vectors to binary class matrices
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
"""
3. Create the model
"""
batch_size = 128
num_classes = 10
epochs = 50
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy'])
"""
4. Train the model
"""
hist = model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))
print("The model has successfully trained")
model.save('mnist.h5')
print("Saving the model as mnist.h5")
"""
5. Evaluate the model
"""
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
训练结果会保存在源目录下,生成文件名为:mnist.h5
。
from keras.models import load_model
from tkinter import *
import tkinter as tk
import win32gui
from PIL import ImageGrab, Image
import numpy as np
model = load_model('mnist.h5')
def predict_digit(img):
#resize image to 28x28 pixels
img = img.resize((28, 28))
#convert rgb to grayscale
img = img.convert('L')
img = np.array(img)
#reshaping to support our model input and normalizing
img = img.reshape(1, 28, 28, 1)
img = img/255.0
#predicting the class
res = model.predict([img])[0]
return np.argmax(res), max(res)
class App(tk.Tk):
def __init__(self):
tk.Tk.__init__(self)
self.x = self.y = 0
# Creating elements
self.canvas = tk.Canvas(self, width=300, height=300, bg = "white", cursor="cross")
self.label = tk.Label(self, text="Thinking..", font=("Helvetica", 48))
self.classify_btn = tk.Button(self, text = "Recognise", command =self.classify_handwriting)
self.button_clear = tk.Button(self, text = "Clear", command = self.clear_all)
# Grid structure
self.canvas.grid(row=0, column=0, pady=2, sticky=W, )
self.label.grid(row=0, column=1,pady=2, padx=2)
self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
self.button_clear.grid(row=1, column=0, pady=2)
#self.canvas.bind("", self.start_pos)
self.canvas.bind("" , self.draw_lines)
def clear_all(self):
self.canvas.delete("all")
def classify_handwriting(self):
HWND = self.canvas.winfo_id() # get the handle of the canvas
rect = win32gui.GetWindowRect(HWND) # get the coordinate of the canvas
im = ImageGrab.grab(rect)
digit, acc = predict_digit(im)
self.label.configure(text= str(digit)+', '+ str(int(acc*100))+'%')
def draw_lines(self, event):
self.x = event.x
self.y = event.y
r=8
self.canvas.create_oval(self.x-r, self.y-r, self.x + r, self.y + r, fill='black')
app = App()
mainloop()
gui.py
程序中用了tkinter
包来呈现GUI页面,具体语句这里就不再分析解释了,需要学习的话可以参考以下链接:Python GUI编程(Tkinter)
gui.py
运行后,输出页面为:
通过键盘在左侧手写字符,点击recognise
输出识别结果。
如有问题,欢迎指出和讨论。