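TensorFlow Basics 4: An MNIST Handwritten-Digit Neural Network with Model Saving and Loading (one-hot encoding, softmax, cross-entropy)

Previous: TensorFlow Basics 3: Multivariate Linear Regression (matrix dot product, cross product, and transpose with NumPy; practice dataset on Baidu Cloud) https://blog.csdn.net/qq_36187544/article/details/89477715

Next: TensorFlow Basics 5: Convolutional Neural Networks (resuming training from a checkpoint, basic concepts, tf convolution functions, CIFAR dataset on Baidu Cloud and a tf implementation) https://blog.csdn.net/qq_36187544/article/details/89736184


MNIST is a well-known dataset of handwritten digits, and many tutorials use it as the introductory example for deep learning. This post covers the concepts of one-hot encoding, softmax, and cross-entropy, then a single-neuron TensorFlow implementation, a neural network implementation, and saving and loading the model.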

 

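Contents

One-hot encoding

softmax

Cross-entropy loss function

Single-neuron TensorFlow implementation

Neural network implementation with model saving

Loading the model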


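Some recommended free courses:

A course on NetEase Cloud Classroom, Deep Learning Application Development: TensorFlow Practice: https://mooc.study.163.com/course/2001396000

As of April 2019 the course is free to join. Apologies if that has changed by the time you read this.

The Deep Learning Engineer micro-specialization on NetEase Cloud Classroom is also free at the moment: https://mooc.study.163.com/smartSpec/detail/1001319001.htm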

[Figure 1: the course page]

 


 


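One-hot encoding

A one-hot vector is the feature vector for a single categorical attribute: only one position is active (non-zero) at a time, and every other entry is 0. The vectors are very sparse, and these sparse vectors are combined to form training samples with multiple features.

For example, there are ten digits from 0 to 9, so each digit can be represented by a vector of length 10: 0 becomes {1,0,0,0,0,0,0,0,0,0} and 1 becomes {0,1,0,0,0,0,0,0,0,0}.

The point is that, when Euclidean distances are computed, one-hot encoding gives every category the same weight: any two distinct codes are equally far apart.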
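
The encoding and the equal-distance property described above can be sketched in a few lines. This is a minimal plain-Python illustration, not the TensorFlow loader's implementation; the helper names `one_hot`, `decode`, and `squared_distance` are made up for this example.

```python
def one_hot(label, num_classes=10):
    """Return a list of length num_classes with 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def decode(vec):
    """Return the index of the largest entry (the argmax)."""
    return max(range(len(vec)), key=lambda i: vec[i])


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


print(one_hot(0))
print(one_hot(1))
print(decode(one_hot(7)))
# Digits 1 and 2 are exactly as far apart as digits 1 and 9.
print(squared_distance(one_hot(1), one_hot(2)),
      squared_distance(one_hot(1), one_hot(9)))
```

With plain integer labels, 1 would be "closer" to 2 than to 9; with one-hot codes, every pair of different digits has the same squared distance of 2.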


softmax

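Softmax, the normalized exponential function, generalizes the logistic function and is commonly used in the output layer of neural networks for multi-class classification. For logistic regression, see: https://blog.csdn.net/qq_36187544/article/details/87893025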

[Figure 2: softmax]
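
For a vector of scores z, softmax is usually written as softmax(z)_i = exp(z_i) / sum_j exp(z_j). The sketch below is a plain-Python version for illustration only (it is not how tf.nn.softmax is implemented); subtracting the largest score before exponentiating leaves the result unchanged and avoids overflow.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)        # the largest score gets the largest probability
print(sum(probs))   # approximately 1.0

# Large scores do not overflow thanks to the shift.
print(softmax([1000.0, 1000.0]))

# With two classes and scores [z, 0], softmax reduces to the logistic function.
z = 0.7
print(softmax([z, 0.0])[0], 1.0 / (1.0 + math.exp(-z)))
```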


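Cross-entropy loss function

Cross-entropy measures how close the actual output is to the expected output.

Cross-entropy comes from information theory, where it measures how similar two probability distributions are. The cross-entropy loss is the quantity that backpropagation (BP) minimizes iteratively when training the network.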

[Figure 3: cross-entropy loss]
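
For a one-hot label y and predicted probabilities p, the cross-entropy is H(y, p) = -sum_i y_i * log(p_i), which reduces to -log of the probability assigned to the true class. The sketch below is a plain-Python illustration with made-up function names. It also shows a variant computed directly from raw scores with the log-sum-exp trick, which is one common way to avoid taking log(0); it is offered as an assumption about how such a loss can be stabilized, not as TensorFlow's exact implementation.

```python
import math


def cross_entropy(y_true, probs):
    """-sum(y_i * log(p_i)); terms with y_i == 0 contribute nothing."""
    return -sum(y * math.log(p) for y, p in zip(y_true, probs) if y > 0)


def cross_entropy_from_logits(y_true, logits):
    """Same loss from raw scores, using log-softmax = z - logsumexp(z)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for y, z in zip(y_true, logits) if y > 0)


y = [0.0, 1.0, 0.0]            # the true class is index 1
good = [0.05, 0.90, 0.05]      # confident and correct
bad = [0.70, 0.10, 0.20]       # confident and wrong

print(cross_entropy(y, good))  # small loss
print(cross_entropy(y, bad))   # much larger loss

# Extreme scores: the true class would get probability 0 after softmax,
# so log(p) would fail, but the logits version stays finite.
print(cross_entropy_from_logits(y, [1000.0, 0.0, 0.0]))
```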


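Single-neuron TensorFlow implementation

The code below was run in Jupyter and uses the TensorFlow 1.x API (tf.placeholder, tf.Session). In any other IDE, put all the code together and run it as a single script. Cell 1:

# MNIST handwritten digit recognition with a single neuron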

from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Display an image
def plot_image(image):
    plt.imshow(image.reshape(28,28),cmap='binary')
    plt.show()

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
plot_image(mnist.train.images[0])

x = tf.placeholder(tf.float32, [None, 784],name='X')
y = tf.placeholder(tf.float32, [None, 10],name='Y')

# Define the variables
W = tf.Variable(tf.random_normal([784,10]),name='W')
b = tf.Variable(tf.zeros([10]),name='b')

forward = tf.matmul(x,W)+b  # forward computation
pred = tf.nn.softmax(forward)  # softmax classification

# Hyperparameters
train_epochs = 50  # number of training epochs
batch_size = 100  # number of samples per batch
total_batch = int(mnist.train.num_examples/batch_size)  # number of batches per epoch
display_step = 1  # print every display_step epochs
learning_rate = 0.01

# Cross-entropy loss function
loss_function = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred),reduction_indices = 1))

# Choose the optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)

# Define accuracy
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))  # does the predicted class match the true class?
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for epoch in range(train_epochs):
    for batch in range(total_batch):
        xs,ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer,feed_dict={x:xs,y:ys})
    # after all total_batch batches of the epoch, compute loss and accuracy on the validation set
    loss,acc = sess.run([loss_function,accuracy],feed_dict={x:mnist.validation.images,y:mnist.validation.labels})
    if (epoch+1) % display_step == 0:
        print("Train Epoch:","%02d"%(epoch+1),"Loss=","{:.9f}".format(loss),"Accuracy=","{:.4f}".format(acc))
print("Train Finished!")

Result:

[Figures 4 and 5: training output]

# Accuracy on the test set
accu_test = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
print("Test Accuracy:",accu_test)
# Convert the one-hot style output to digits 0-9
prediction_result = sess.run(tf.argmax(pred,1),feed_dict={x:mnist.test.images})

# Inspect the results
prediction_result[0:10]

Result:

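def plot_images_labels_prediction(images,  # list of images
                                  labels,  # list of one-hot labels
                                  prediction,  # list of predicted digits
                                  index,  # start displaying from this index
                                  num=10):  # show 10 images by default (at most 25)
    fig = plt.gcf()  # get the current figure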
    fig.set_size_inches(10,12)
    if num>25:
        num = 25
    for i in range(num):
        ax = plt.subplot(5,5,i+1)
        ax.imshow(np.reshape(images[index],(28,28)),cmap='binary')
        title = "label = "+str(np.argmax(labels[index]))
        if len(prediction) > 0:
            title += ",predict= " +str(prediction[index])
        ax.set_title(title,fontsize=10)  # show the title
        ax.set_xticks([])
        ax.set_yticks([])
        index +=1
    plt.show()
plot_images_labels_prediction(mnist.test.images,mnist.test.labels,prediction_result,10,10)

Result:

[Figure 6: test images with their labels and predictions]
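
The accuracy op in the code above takes the argmax of each prediction row and of each one-hot label row, compares them, and averages the matches. A minimal plain-Python sketch of the same idea follows; the sample predictions and labels are invented for illustration and are not MNIST results.

```python
def argmax(vec):
    """Index of the largest entry in a list."""
    return max(range(len(vec)), key=lambda i: vec[i])


def accuracy(pred_rows, label_rows):
    """Fraction of rows whose predicted class equals the labelled class."""
    matches = [argmax(p) == argmax(y) for p, y in zip(pred_rows, label_rows)]
    return sum(matches) / len(matches)


# Four made-up samples over three classes.
preds = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(accuracy(preds, labels))  # 0.5: the first two rows match, the last two do not
```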


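Neural network implementation with model saving

'''
MNIST handwritten digit recognition with a neural network of two hidden layers
Saves the model
'''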
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from time import time
import os

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784],name='X')
y = tf.placeholder(tf.float32, [None, 10],name='Y')

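# Build the model
H1_NN = 256  # 256 neurons in the first hidden layer
H2_NN = 64  # 64 neurons in the second hidden layer

# Parameters of each layer
W1 = tf.Variable(tf.truncated_normal([784,H1_NN],stddev = 0.1))  # truncated normal with stddev 0.1; values beyond two standard deviations are redrawn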
b1 = tf.Variable(tf.zeros([H1_NN]))
W2 = tf.Variable(tf.truncated_normal([H1_NN,H2_NN],stddev = 0.1))
b2 = tf.Variable(tf.zeros([H2_NN]))
W3 = tf.Variable(tf.truncated_normal([H2_NN,10],stddev = 0.1))
b3 = tf.Variable(tf.zeros([10]))

Y1 = tf.nn.relu(tf.matmul(x,W1)+b1)
Y2 = tf.nn.relu(tf.matmul(Y1,W2)+b2)

forward = tf.matmul(Y2,W3)+b3
pred = tf.nn.softmax(forward)

# Cross-entropy loss function
loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=forward,labels=y))
# This form avoids the numerical instability caused by log(0), so it replaces loss_function = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred),reduction_indices = 1))

# Hyperparameters
train_epochs = 50  # number of training epochs
batch_size = 100  # number of samples per batch
total_batch = int(mnist.train.num_examples/batch_size)  # number of batches per epoch
display_step = 1  # print every display_step epochs
learning_rate = 0.01

# Choose the optimizer
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)

# Define accuracy
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))  # does the predicted class match the true class?
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

# Saving the model
save_step = 10  # save a checkpoint every save_step epochs
ckpt_dir = "./ckpt_dir/"
if not os.path.exists(ckpt_dir):  # create the directory if it does not exist
    os.makedirs(ckpt_dir)
saver = tf.train.Saver()

startTime = time()
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(train_epochs):
    for batch in range(total_batch):
        xs,ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer,feed_dict={x:xs,y:ys})
    # after all total_batch batches of the epoch, compute loss and accuracy on the validation set
    loss, acc = sess.run([loss_function, accuracy],
                         feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
    if (epoch + 1) % display_step == 0:
        print("Train Epoch:", "%02d" % (epoch + 1), "Loss=", "{:.9f}".format(loss), "Accuracy=", "{:.4f}".format(acc))

    # Save the model:
    if (epoch+1)%save_step == 0:
        saver.save(sess, os.path.join(ckpt_dir,"mnist_h256_model_{}.ckpt".format(epoch+1)))
        print("mnist_h256_model_{}.ckpt saved".format(epoch+1))
saver.save(sess,os.path.join(ckpt_dir,"mnist_h256_model.ckpt"))
print("Train Finished!")

# Accuracy on the test set
accu_test = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
print("Test Accuracy:",accu_test)


# Elapsed time in seconds
duration = time() - startTime
print("Train Finished:","{:.2f}".format(duration))

Result (run in PyCharm):

[Figure 7: run output]

The model is saved under the path set in the code, and each saved model is stored as three files:

[Figure 8: the saved checkpoint files]
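
To see which files belong to which saved model, the directory listing can be grouped by checkpoint prefix. The sketch below assumes the usual TensorFlow 1.x naming, where a prefix such as mnist_h256_model.ckpt is written as .index, .meta, and .data-00000-of-00001 files next to a plain-text file named checkpoint; the helper `group_checkpoint_files` is hypothetical, and the demo runs on empty stand-in files rather than a real checkpoint.

```python
import os
import tempfile
from collections import defaultdict

# Suffix markers assumed for TensorFlow 1.x checkpoint files.
SUFFIX_MARKERS = (".index", ".meta", ".data-")


def group_checkpoint_files(ckpt_dir):
    """Map each checkpoint prefix to the sorted list of its files."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(ckpt_dir)):
        for marker in SUFFIX_MARKERS:
            pos = name.rfind(marker)
            if pos > 0:
                groups[name[:pos]].append(name)
                break  # a file belongs to at most one group
    return dict(groups)


# Demo on a temporary directory with empty stand-in files.
with tempfile.TemporaryDirectory() as demo_dir:
    for suffix in (".data-00000-of-00001", ".index", ".meta"):
        open(os.path.join(demo_dir, "mnist_h256_model.ckpt" + suffix), "w").close()
    open(os.path.join(demo_dir, "checkpoint"), "w").close()
    demo_groups = group_checkpoint_files(demo_dir)

for prefix, files in demo_groups.items():
    print(prefix, "->", files)
```

The prefix (without any of the three suffixes) is what gets passed to saver.save and saver.restore in the code of this post.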


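Loading the model

Loading is straightforward: take the neural network code above and replace the training part with a restore step:

'''
Load the model
'''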
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from time import time
import os

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784],name='X')
y = tf.placeholder(tf.float32, [None, 10],name='Y')
H1_NN = 256  # 256 neurons in the first hidden layer
H2_NN = 64  # 64 neurons in the second hidden layer
W1 = tf.Variable(tf.truncated_normal([784,H1_NN],stddev = 0.1))
b1 = tf.Variable(tf.zeros([H1_NN]))
W2 = tf.Variable(tf.truncated_normal([H1_NN,H2_NN],stddev = 0.1))
b2 = tf.Variable(tf.zeros([H2_NN]))
W3 = tf.Variable(tf.truncated_normal([H2_NN,10],stddev = 0.1))
b3 = tf.Variable(tf.zeros([10]))
Y1 = tf.nn.relu(tf.matmul(x,W1)+b1)
Y2 = tf.nn.relu(tf.matmul(Y1,W2)+b2)
forward = tf.matmul(Y2,W3)+b3
pred = tf.nn.softmax(forward)
loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=forward,labels=y))
train_epochs = 50  # number of training epochs
batch_size = 100  # number of samples per batch
total_batch = int(mnist.train.num_examples/batch_size)  # number of batches per epoch
display_step = 1  # print every display_step epochs
learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
save_step = 10  # save a checkpoint every save_step epochs
ckpt_dir = "./ckpt_dir/"
if not os.path.exists(ckpt_dir):  # create the directory if it does not exist
    os.makedirs(ckpt_dir)
saver = tf.train.Saver()
startTime = time()
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Training part omitted
# for epoch in range(train_epochs):
#     for batch in range(total_batch):
#         xs,ys = mnist.train.next_batch(batch_size)
#         sess.run(optimizer,feed_dict={x:xs,y:ys})
#     # after all total_batch batches of the epoch, compute loss and accuracy on the validation set
#     loss, acc = sess.run([loss_function, accuracy],
#                          feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
#     if (epoch + 1) % display_step == 0:
#         print("Train Epoch:", "%02d" % (epoch + 1), "Loss=", "{:.9f}".format(loss), "Accuracy=", "{:.4f}".format(acc))
#
#     # Save the model:
#     if (epoch+1)%save_step == 0:
#         saver.save(sess, os.path.join(ckpt_dir,"mnist_h256_model_{}.ckpt".format(epoch+1)))
#         print("mnist_h256_model_{}.ckpt saved".format(epoch+1))
# saver.save(sess,os.path.join(ckpt_dir,"mnist_h256_model.ckpt"))
# print("Train Finished!")

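# Restore step added here
# Create the saver
saver = tf.train.Saver()
ckpt_dir = "./ckpt_dir/"
ckpt = tf.train.get_checkpoint_state(ckpt_dir)
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess,ckpt.model_checkpoint_path)
else:
    # without a checkpoint the test below would run on randomly initialized weights
    print("No checkpoint found in", ckpt_dir)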


accu_test = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
print("Test Accuracy:",accu_test)
duration = time() - startTime
print("Train Finished:","{:.2f}".format(duration))

Restoring the model and evaluating it takes about an order of magnitude less time than training.
