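Previous post: TensorFlow Basics 3 - Multivariate Linear Regression (matrix dot product, cross product, transpose, NumPy; with a Baidu Cloud link to the practice dataset) https://blog.csdn.net/qq_36187544/article/details/89477715
Next post: TensorFlow Basics 5 - Convolutional Neural Networks (resuming training from a checkpoint, basic concepts, tf convolution functions, a Baidu Cloud link to the CIFAR dataset and a tf implementation) https://blog.csdn.net/qq_36187544/article/details/89736184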
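MNIST is a very well-known dataset of handwritten digits, and many resources use it as the introductory example for deep learning. This post is split into several parts: the concepts of one-hot encoding, softmax and cross-entropy; a single-neuron implementation in TensorFlow; a neural network implementation; and saving and loading the model.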
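Contents
one-hot encoding
softmax
Cross-entropy loss function
Single-neuron implementation in TensorFlow
Neural network implementation and saving the model
Loading the model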
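Some free courses I recommend:
A course on NetEase Cloud Classroom, "Deep Learning Application Development: TensorFlow in Practice": https://mooc.study.163.com/course/2001396000
As of April 2019 the course is free to join. If the offer has expired by the time you read this, please don't hold it against me.
The Deep Learning Engineer micro-specialization on NetEase Cloud Classroom, also free at the time of writing: https://mooc.study.163.com/smartSpec/detail/1001319001.htm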
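One-hot encoding represents a categorical attribute as a feature vector with exactly one active (non-zero) position at a time; every other position is 0. The vectors are therefore very sparse, and stacking them gives the sparse matrix that makes up a multi-feature training set.
For example, the ten digits 0-9 can each be represented by a vector of length 10: 0 becomes {1,0,0,0,0,0,0,0,0,0} and 1 becomes {0,1,0,0,0,0,0,0,0,0}.
The purpose is to give every category the same weight when Euclidean distances are computed: any two distinct one-hot vectors are equally far apart, whereas the raw integer codes would make 0 and 9 look much farther apart than 0 and 1.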
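To make the encoding and the distance argument concrete, here is a minimal pure-Python sketch. The helper names `one_hot`, `decode` and `euclidean` are made up for this illustration and are not part of TensorFlow or the MNIST loader.

```python
import math

def one_hot(label, num_classes=10):
    """Return a list of length num_classes with a single 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def decode(vec):
    """Recover the class index: the position of the largest entry (argmax)."""
    return max(range(len(vec)), key=lambda i: vec[i])

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

zero, one, nine = one_hot(0), one_hot(1), one_hot(9)
print(zero)          # [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(nine))  # 9

# As raw integers, 0 and 9 are 9 apart while 0 and 1 are only 1 apart.
# As one-hot vectors, every pair of distinct digits is the same distance apart.
print(euclidean(zero, one), euclidean(zero, nine))
```

The last line prints the same value twice (the square root of 2), which is the "equal weight" property described above.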
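Softmax, the normalized exponential function, is a generalization of the logistic function and is commonly used in the output layer of neural networks for multi-class classification. For logistic regression, see: https://blog.csdn.net/qq_36187544/article/details/87893025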
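Written out, softmax maps scores z_1..z_n to exp(z_i) / sum_j exp(z_j), so the outputs are positive and sum to 1. The sketch below is a plain-Python illustration, not the TensorFlow implementation. Subtracting the maximum score before exponentiating is a common stabilization step that I have added here as an assumption; it does not change the result.

```python
import math

def softmax(logits):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    m = max(logits)  # shifting by the max avoids overflow in exp() and leaves the result unchanged
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]
print(sum(probs))                    # 1 up to floating-point rounding
```

The largest score gets the largest probability, and the gaps between scores are exaggerated by the exponential.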
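Cross-entropy measures how close the actual output is to the expected output.
It comes from information theory, where it quantifies how similar two probability distributions are. Used as the loss function, it is the quantity the network minimizes, iteration by iteration, through the backpropagation (BP) algorithm.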
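As a small worked example, the sketch below computes the cross-entropy H(target, pred) = -sum_i target_i * log(pred_i) in plain Python. The `eps` floor is an assumption added for this illustration so that `log()` never receives 0, and the probability values are made up.

```python
import math

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted distribution.

    eps is a small floor (an assumption of this sketch) that keeps log() away from 0.
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

target = [0.0, 0.0, 1.0]   # one-hot label: the true class is index 2
good = [0.05, 0.05, 0.90]  # confident and correct
bad = [0.70, 0.20, 0.10]   # confident and wrong

print(cross_entropy(target, good))  # equals -log(0.90), a small loss
print(cross_entropy(target, bad))   # equals -log(0.10), a much larger loss
```

With a one-hot target, only the term for the true class survives, so the loss is simply the negative log of the probability assigned to the correct class.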
The implementation below uses Jupyter; in any other IDE, put all of the code in one file and run it. Cell 1:
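# MNIST handwritten digit recognition, single neuron (one softmax layer, no hidden layer)
# Note: tensorflow.examples.tutorials, tf.placeholder and tf.Session are TensorFlow 1.x APIs
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Display one image
def plot_image(image):
    plt.imshow(image.reshape(28, 28), cmap='binary')
    plt.show()

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
plot_image(mnist.train.images[0])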
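# Placeholders: each image is a flattened 28x28 = 784 vector, each label a one-hot vector of length 10
x = tf.placeholder(tf.float32, [None, 784], name='X')
y = tf.placeholder(tf.float32, [None, 10], name='Y')

# Define the variables
W = tf.Variable(tf.random_normal([784, 10]), name='W')
b = tf.Variable(tf.zeros([10]), name='b')

forward = tf.matmul(x, W) + b  # forward pass
pred = tf.nn.softmax(forward)  # softmax classification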
# Hyperparameters
train_epochs = 50    # number of training epochs
batch_size = 100     # number of samples per batch
total_batch = int(mnist.train.num_examples / batch_size)  # number of batches per epoch
display_step = 1     # print progress every display_step epochs
learning_rate = 0.01

# Cross-entropy loss function
loss_function = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)

# Accuracy
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # does the predicted class match the true class?
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for epoch in range(train_epochs):
    for batch in range(total_batch):
        xs, ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: xs, y: ys})
    # After all total_batch batches of this epoch, compute loss and accuracy on the validation set
    loss, acc = sess.run([loss_function, accuracy], feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
    if (epoch + 1) % display_step == 0:
        print("Train Epoch:", "%02d" % (epoch + 1), "Loss=", "{:.9f}".format(loss), "Accuracy=", "{:.4f}".format(acc))
print("Train Finished!")
Run result: [figure: training log]
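The training cell above leaves differentiation to `GradientDescentOptimizer`. For the softmax plus cross-entropy pair, the gradient with respect to the pre-softmax scores has a well-known closed form: prediction minus one-hot label. The sketch below checks that closed form against finite differences in plain Python. It illustrates the math only and is not how TensorFlow computes gradients internally; the scores and function names are made up for the example.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss(logits, target):
    """Cross-entropy of softmax(logits) against a one-hot target."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))

def analytic_grad(logits, target):
    """Closed form: d loss / d logits = softmax(logits) - target."""
    return [p - t for p, t in zip(softmax(logits), target)]

def numeric_grad(logits, target, h=1e-6):
    """Central finite differences, used here only to check the closed form."""
    grad = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += h
        down[i] -= h
        grad.append((loss(up, target) - loss(down, target)) / (2 * h))
    return grad

logits = [1.5, -0.3, 0.8]   # made-up scores for 3 classes
target = [0.0, 1.0, 0.0]    # the true class is index 1
print(analytic_grad(logits, target))
print(numeric_grad(logits, target))

# One plain gradient-descent step on the scores lowers the loss.
learning_rate = 0.01  # same value as in the script above
stepped = [z - learning_rate * g for z, g in zip(logits, analytic_grad(logits, target))]
print(loss(logits, target), "->", loss(stepped, target))
```

The two printed gradients agree to several decimal places, and the loss after the step is slightly smaller than before it.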
# Test accuracy
accu_test = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
print("Test Accuracy:", accu_test)

# Convert the 10-way output vectors into digits 0-9 with argmax
prediction_result = sess.run(tf.argmax(pred, 1), feed_dict={x: mnist.test.images})

# Inspect the first ten predictions
prediction_result[0:10]
Run result: [figure: test accuracy and the first ten predictions]
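The accuracy node and the `tf.argmax` call above both rely on the same idea: the predicted digit is the index of the largest output. A minimal pure-Python sketch of that logic follows. The three samples are hypothetical values chosen for illustration, not real MNIST output.

```python
def argmax(vec):
    """Index of the largest entry in vec."""
    return max(range(len(vec)), key=lambda i: vec[i])

def accuracy(preds, labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    hits = [argmax(p) == argmax(y) for p, y in zip(preds, labels)]
    return sum(hits) / len(hits)

# Three made-up samples over 3 classes (hypothetical values).
preds = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]

print([argmax(p) for p in preds])  # [1, 0, 2]
print(accuracy(preds, labels))     # 2 of the 3 predictions are correct
```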
def plot_images_labels_prediction(images,      # list of images
                                  labels,      # list of one-hot labels
                                  prediction,  # list of predicted digits
                                  index,       # start displaying from this index
                                  num=10):     # number of images to show, 10 by default
    fig = plt.gcf()  # get the current figure
    fig.set_size_inches(10, 12)
    if num > 25:
        num = 25  # the 5x5 grid holds at most 25 images
    for i in range(num):
        ax = plt.subplot(5, 5, i + 1)
        ax.imshow(np.reshape(images[index], (28, 28)), cmap='binary')
        title = "label = " + str(np.argmax(labels[index]))
        if len(prediction) > 0:
            title += ",predict= " + str(prediction[index])
        ax.set_title(title, fontsize=10)  # show the title
        ax.set_xticks([])
        ax.set_yticks([])
        index += 1
    plt.show()

plot_images_labels_prediction(mnist.test.images, mnist.test.labels, prediction_result, 10, 10)
Run result: [figure: grid of test images with their labels and predictions]
'''
MNIST handwritten digit recognition, neural network with two hidden layers
Saves the model
'''
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from time import time
import os

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784], name='X')
y = tf.placeholder(tf.float32, [None, 10], name='Y')
# Build the model
H1_NN = 256  # 256 neurons in the first hidden layer
H2_NN = 64   # 64 neurons in the second hidden layer

# Parameters of each layer
# truncated_normal: normal distribution with stddev 0.1; values more than 2 standard deviations from the mean are re-drawn
W1 = tf.Variable(tf.truncated_normal([784, H1_NN], stddev=0.1))
b1 = tf.Variable(tf.zeros([H1_NN]))
W2 = tf.Variable(tf.truncated_normal([H1_NN, H2_NN], stddev=0.1))
b2 = tf.Variable(tf.zeros([H2_NN]))
W3 = tf.Variable(tf.truncated_normal([H2_NN, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))

Y1 = tf.nn.relu(tf.matmul(x, W1) + b1)
Y2 = tf.nn.relu(tf.matmul(Y1, W2) + b2)
forward = tf.matmul(Y2, W3) + b3
pred = tf.nn.softmax(forward)

# Cross-entropy loss function
loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y))
# The expression above avoids the numerical instability caused by log(0), so it replaces the hand-written
# loss_function = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred),reduction_indices = 1))
# Hyperparameters
train_epochs = 50    # number of training epochs
batch_size = 100     # number of samples per batch
total_batch = int(mnist.train.num_examples / batch_size)  # number of batches per epoch
display_step = 1     # print progress every display_step epochs
learning_rate = 0.01

# Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)

# Accuracy
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # does the predicted class match the true class?
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Saving the model
save_step = 10  # checkpoint interval: save the model every save_step epochs
ckpt_dir = "./ckpt_dir/"
if not os.path.exists(ckpt_dir):  # create the directory if it does not exist
    os.makedirs(ckpt_dir)
saver = tf.train.Saver()
startTime = time()
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(train_epochs):
    for batch in range(total_batch):
        xs, ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: xs, y: ys})
    # After all total_batch batches of this epoch, compute loss and accuracy on the validation set
    loss, acc = sess.run([loss_function, accuracy],
                         feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
    if (epoch + 1) % display_step == 0:
        print("Train Epoch:", "%02d" % (epoch + 1), "Loss=", "{:.9f}".format(loss), "Accuracy=", "{:.4f}".format(acc))
    # Save a checkpoint
    if (epoch + 1) % save_step == 0:
        saver.save(sess, os.path.join(ckpt_dir, "mnist_h256_model_{}.ckpt".format(epoch + 1)))
        print("mnist_h256_model_{}.ckpt saved".format(epoch + 1))

# Save the final model
saver.save(sess, os.path.join(ckpt_dir, "mnist_h256_model.ckpt"))
print("Train Finished!")

# Test accuracy
accu_test = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
print("Test Accuracy:", accu_test)

# Elapsed time in seconds (training plus the test evaluation)
duration = time() - startTime
print("Elapsed time (s):", "{:.2f}".format(duration))
Run result (run in PyCharm): [figure: training log and test accuracy]
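The two-layer script above computes its loss with `tf.nn.softmax_cross_entropy_with_logits` instead of the hand-written `-sum(y * log(pred))`, precisely to avoid `log(0)`. The plain-Python sketch below shows why working from the raw scores helps. It uses the standard log-sum-exp identity, log softmax(z)_i = z_i - logsumexp(z). I am not claiming this is TensorFlow's exact implementation, and the function names and scores are made up for the illustration.

```python
import math

def naive_loss(logits, target):
    """Softmax first, then -sum(target * log(prob)); fails when a probability underflows to 0."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(target, probs))

def stable_loss(logits, target):
    """Same quantity computed directly from the scores via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum_exp) for t, z in zip(target, logits))

logits = [0.0, -1000.0]  # the true class (index 1) gets an extremely low score
target = [0.0, 1.0]

try:
    print(naive_loss(logits, target))
except ValueError as err:
    print("naive version failed:", err)  # the probability underflowed to 0 and log(0) is undefined

print(stable_loss(logits, target))  # 1000.0
```

In plain Python the naive version raises an error; in a tensor library the same situation typically surfaces as `inf` or `nan` in the loss, which then poisons the gradients.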
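Saved model: the location depends on the path set in the code, and each model is stored as three files (.data, .index and .meta): [figure: contents of the checkpoint directory]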
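Loading the model is fairly simple: take the neural network script above and replace its training section with code that restores the checkpoint: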
'''
Load the model
'''
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from time import time
import os

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784], name='X')
y = tf.placeholder(tf.float32, [None, 10], name='Y')

# The graph must be rebuilt exactly as it was when the checkpoint was saved
H1_NN = 256  # 256 neurons in the first hidden layer
H2_NN = 64   # 64 neurons in the second hidden layer
W1 = tf.Variable(tf.truncated_normal([784, H1_NN], stddev=0.1))
b1 = tf.Variable(tf.zeros([H1_NN]))
W2 = tf.Variable(tf.truncated_normal([H1_NN, H2_NN], stddev=0.1))
b2 = tf.Variable(tf.zeros([H2_NN]))
W3 = tf.Variable(tf.truncated_normal([H2_NN, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
Y1 = tf.nn.relu(tf.matmul(x, W1) + b1)
Y2 = tf.nn.relu(tf.matmul(Y1, W2) + b2)
forward = tf.matmul(Y2, W3) + b3
pred = tf.nn.softmax(forward)
loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y))

train_epochs = 50    # number of training epochs
batch_size = 100     # number of samples per batch
total_batch = int(mnist.train.num_examples / batch_size)  # number of batches per epoch
display_step = 1     # print progress every display_step epochs
learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

save_step = 10  # checkpoint interval: save the model every save_step epochs
ckpt_dir = "./ckpt_dir/"
if not os.path.exists(ckpt_dir):  # create the directory if it does not exist
    os.makedirs(ckpt_dir)
saver = tf.train.Saver()
startTime = time()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Training section omitted
# for epoch in range(train_epochs):
#     for batch in range(total_batch):
#         xs, ys = mnist.train.next_batch(batch_size)
#         sess.run(optimizer, feed_dict={x: xs, y: ys})
#     # After all total_batch batches of this epoch, compute loss and accuracy on the validation set
#     loss, acc = sess.run([loss_function, accuracy],
#                          feed_dict={x: mnist.validation.images, y: mnist.validation.labels})
#     if (epoch + 1) % display_step == 0:
#         print("Train Epoch:", "%02d" % (epoch + 1), "Loss=", "{:.9f}".format(loss), "Accuracy=", "{:.4f}".format(acc))
#
#     # Save a checkpoint
#     if (epoch + 1) % save_step == 0:
#         saver.save(sess, os.path.join(ckpt_dir, "mnist_h256_model_{}.ckpt".format(epoch + 1)))
#         print("mnist_h256_model_{}.ckpt saved".format(epoch + 1))
# saver.save(sess, os.path.join(ckpt_dir, "mnist_h256_model.ckpt"))
# print("Train Finished!")
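# Restore section added in place of training
# Create the saver
saver = tf.train.Saver()
ckpt_dir = "./ckpt_dir/"
ckpt = tf.train.get_checkpoint_state(ckpt_dir)
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)  # load the most recently saved checkpoint
else:
    print("No checkpoint found in", ckpt_dir)  # the weights stay randomly initialized in this case

accu_test = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
print("Test Accuracy:", accu_test)

# Elapsed time in seconds (graph setup, restore and evaluation; no training happens here)
duration = time() - startTime
print("Elapsed time (s):", "{:.2f}".format(duration))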
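Because no training takes place, this run takes about an order of magnitude less time than the training run: [figure: run output]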