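The whole process has 7 steps:
1. Import the MNIST dataset
2. Analyze the characteristics of the MNIST samples and define variables
3. Build the model
4. Train the model and print intermediate state
5. Test the model
6. Save the model
7. Load the model
MNIST contains images of handwritten digits together with a label for each image, and is a standard introductory dataset for machine learning. You can find an .npz package online and download it; an .npz file is a zip archive of NumPy .npy arrays.
Opening the package with an archive tool shows that it contains 4 files.
The following code displays some of the samples: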
import pylab
import numpy as np
path="mnist.npz"
data=np.load(path)
x_train=data["x_train"]
y_train=data["y_train"]
for i in range(15):
    print(y_train[i])
    pylab.imshow(x_train[i])
    pylab.show()
print(x_train.shape) prints the shape of x_train, which shows that it holds 60000 images of 28*28 pixels each.
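The contents of the archive can also be listed from Python instead of an archive tool. The sketch below is only an illustration: it builds a tiny stand-in archive in memory (the dummy shapes are assumptions) and uses the four array names that the later listings in this post read from mnist.npz.

```python
import io
import numpy as np

# Build a tiny stand-in archive in memory. The array names follow the ones
# this post reads later; the small dummy shapes are assumptions for the demo.
buffer = io.BytesIO()
np.savez_compressed(
    buffer,
    x_train=np.zeros((6, 28, 28), dtype=np.uint8),
    y_train=np.zeros(6, dtype=np.uint8),
    x_test=np.zeros((2, 28, 28), dtype=np.uint8),
    y_test=np.zeros(2, dtype=np.uint8),
)
buffer.seek(0)

archive = np.load(buffer)
# .files lists the names of the arrays stored in the archive.
array_names = sorted(archive.files)
shapes = {name: archive[name].shape for name in array_names}
for name in array_names:
    print(name, shapes[name], archive[name].dtype)
```

With the real file, replacing the in-memory buffer with the path "mnist.npz" should print the same four names with the full dataset shapes.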
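First flatten each 28*28 image into a 784-dimensional vector, so that the image data becomes a rank-2 tensor of shape (number of images, 784).
Then one-hot encode y. The labels are the digits 0 to 9, so there are 10 classes and depth=10. (From here on the arrays are named X_train_image, y_train_label, X_test_image and y_test_label, as in the full listing further down.)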
# Reshape
X_train_image = X_train_image.reshape(X_train_image.shape[0],X_train_image.shape[1]*X_train_image.shape[2])
X_test_image =X_test_image.reshape(X_test_image.shape[0],X_test_image.shape[1]*X_test_image.shape[2])
y_train_label = tf.one_hot(y_train_label,10)
y_test_label= tf.one_hot(y_test_label,10)
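To see what these two transformations produce, here is a minimal NumPy sketch on made-up data. It is not the post's code: it replaces tf.one_hot with indexing into an identity matrix, and the three sample labels are invented for illustration.

```python
import numpy as np

# Three fake 28x28 "images" and their (made-up) digit labels.
images = np.arange(3 * 28 * 28, dtype=np.float32).reshape(3, 28, 28)
labels = np.array([5, 0, 9])

# Flatten every image to a 784-element row; -1 lets NumPy infer 28*28.
flat = images.reshape(images.shape[0], -1)

# Row k of the 10x10 identity matrix is the one-hot vector for digit k.
one_hot = np.eye(10, dtype=np.float32)[labels]

print(flat.shape)
print(one_hot[0])
```

Encoding the labels in NumPy like this would also give ndarrays directly, so no tensor would need to be evaluated before feeding it to the graph.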
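# Create the placeholders
Next define two placeholders. The second one has 10 columns because there are only ten distinct digits. None means that any number of images can be fed in.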
tf.reset_default_graph()
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
Define the learnable parameters, i.e. the weights and the bias.
W =tf.Variable(tf.random_normal(([784,10])))
b = tf.Variable(tf.zeros(10))
Build the model (the forward-propagation structure):
pred = tf.nn.softmax(tf.matmul(x,W)+b)
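The forward pass above can be sketched in plain NumPy to make the shapes concrete. This is an illustration under assumptions, not the TensorFlow implementation: the softmax helper is hand-written, subtracting the row maximum is a common stability trick that the post does not mention, and the inputs are random.

```python
import numpy as np

def softmax(logits):
    # Subtracting the row maximum does not change the result but avoids
    # overflow in exp for large logits.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x_batch = rng.random((4, 784))      # 4 fake flattened images
W = rng.normal(size=(784, 10))      # weights, one column per digit class
b = np.zeros(10)                    # bias, one entry per digit class

pred = softmax(x_batch @ W + b)     # shape (4, 10), each row sums to 1
print(pred.shape, pred.sum(axis=1))
```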
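Set the learning rate and the loss function, and define the backward-propagation structure.
The cross-entropy between the predictions and the true labels is computed and then averaged; the result serves as the error of the forward pass.
The clip_by_value function limits the range of the values; otherwise log(0) yields negative infinity and the computation breaks.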
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(tf.clip_by_value(pred,1e-10,1.0)),reduction_indices=1))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
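The effect of the clipping can be checked with a small NumPy version of the same loss. This is a sketch on invented numbers, with np.clip standing in for tf.clip_by_value; the second sample deliberately assigns probability 0 to its true class.

```python
import numpy as np

def cross_entropy(y_true, pred, eps=1e-10):
    # Clip predictions into [eps, 1] so that log never sees an exact 0.
    clipped = np.clip(pred, eps, 1.0)
    return np.mean(-np.sum(y_true * np.log(clipped), axis=1))

y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
pred = np.array([[0.2, 0.7, 0.1],
                 [0.0, 1.0, 0.0]])   # true class of sample 2 gets probability 0

loss = cross_entropy(y_true, pred)

# The same formula without clipping: log(0) is -inf and 0 * -inf is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    unclipped = np.mean(-np.sum(y_true * np.log(pred), axis=1))

print(loss, unclipped)
```

With clipping, a completely wrong prediction contributes at most -log(1e-10), roughly 23, to the sum instead of making the whole batch loss non-finite.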
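Train the model for 25 epochs; each epoch goes through the whole training set.
Within an epoch the data is consumed 100 samples at a time. The average loss is printed after every epoch (display_step = 1).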
training_epochs = 25
batch_size =100
display_step =1
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Convert the tensor to an ndarray; a tensor cannot be passed in feed_dict
    y_train_label=y_train_label.eval(session=sess)
    for epoch in range(training_epochs):
        avg_cost =0.
        total_batch = int(X_train_image.shape[0]/batch_size)
        for i in range(total_batch):
            batch_xs = X_train_image[i*batch_size:(i+1)*batch_size]
            batch_ys = y_train_label[i*batch_size:(i+1)*batch_size]
            _,c = sess.run([optimizer,cost],feed_dict={x:batch_xs, y:batch_ys})
            avg_cost +=c/total_batch
        if (epoch+1) % display_step==0:
            print("Epoch:",'%04d'%(epoch+1),"cost=","{:.9f}".format(avg_cost))
    print("Finished!")
Training output:
Epoch: 0001 cost= 18.489341068
Epoch: 0002 cost= 15.906391881
Epoch: 0003 cost= 14.711791927
Epoch: 0004 cost= 12.701335023
Epoch: 0005 cost= 11.691185404
Epoch: 0006 cost= 11.192687522
Epoch: 0007 cost= 11.009756290
Epoch: 0008 cost= 10.756890025
Epoch: 0009 cost= 10.591532230
Epoch: 0010 cost= 10.575708136
Epoch: 0011 cost= 10.485140076
Epoch: 0012 cost= 10.465657041
Epoch: 0013 cost= 10.429923476
Epoch: 0014 cost= 10.307853111
Epoch: 0015 cost= 10.254472695
Epoch: 0016 cost= 10.196132116
Epoch: 0017 cost= 10.148237507
Epoch: 0018 cost= 10.154881346
Epoch: 0019 cost= 10.155736175
Epoch: 0020 cost= 10.051871705
Epoch: 0021 cost= 9.980237563
Epoch: 0022 cost= 9.993254958
Epoch: 0023 cost= 10.002847915
Epoch: 0024 cost= 9.956430147
Epoch: 0025 cost= 9.944062105
Finished!
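The batching arithmetic used by the training loop can be checked in isolation. The helper below is hypothetical (the name batch_bounds is not from the post); it only reproduces the slice boundaries i*batch_size:(i+1)*batch_size.

```python
def batch_bounds(num_samples, batch_size):
    """Return the (start, stop) slice bounds of every full batch."""
    total_batch = num_samples // batch_size
    return [(i * batch_size, (i + 1) * batch_size) for i in range(total_batch)]

bounds = batch_bounds(60000, 100)
print(len(bounds))             # number of batches per epoch
print(bounds[0], bounds[-1])   # first and last slice

# When the sample count is not a multiple of the batch size,
# the leftover samples are never visited.
partial = batch_bounds(250, 100)
print(partial)
```

With 60000 samples and a batch size of 100 this gives 600 updates per epoch. Note that the loop always visits the batches in the same order, since the post does not shuffle the data between epochs.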
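Compute the accuracy (these lines run inside the session block, after y_test_label has been converted to an ndarray as in the full listing below):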
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
acc = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
print("Accuracy:",acc.eval({x:X_test_image,y:y_test_label}))
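The accuracy computation compares the index of the largest predicted probability with the index of the 1 in the one-hot label. A NumPy sketch on four invented samples with three classes:

```python
import numpy as np

pred = np.array([[0.1, 0.8, 0.1],
                 [0.6, 0.3, 0.1],
                 [0.2, 0.2, 0.6],
                 [0.5, 0.4, 0.1]])
y_true = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 1],
                   [1, 0, 0]])

# argmax over axis 1 gives the predicted class and the true class per row.
correct = np.argmax(pred, axis=1) == np.argmax(y_true, axis=1)
# Casting booleans to floats and averaging gives the fraction that is correct.
accuracy = correct.astype(np.float32).mean()
print(accuracy)  # 0.75: three of the four rows match
```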
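On the test data the accuracy is about 0.6. It fluctuates from run to run, but never exceeded 0.7. The book reaches an accuracy of 0.83, which is puzzling; the difference may be due to a different dataset.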
Accuracy: 0.6029
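One possible contributor to the gap, which this post does not test, is input scaling. If the .npz file stores raw pixel intensities in the range 0 to 255 (an assumption about this particular file), the logits computed from randomly initialized weights are very large, which pushes softmax toward saturated outputs. Scaling the pixels to [0, 1] is a common preprocessing step; a minimal sketch on random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for raw images; assumes uint8 pixels in 0..255.
raw = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

flat = raw.reshape(raw.shape[0], -1)
scaled = flat.astype(np.float32) / 255.0   # now in [0, 1]

# Compare logit magnitudes with the same random weights.
W = rng.normal(size=(784, 10)).astype(np.float32)
raw_logits = flat.astype(np.float32) @ W
scaled_logits = scaled @ W
print(np.abs(raw_logits).max(), np.abs(scaled_logits).max())
```

Whether this actually changes the accuracy reported here would have to be verified by rerunning the training with scaled inputs.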
Complete code for training and saving the model:
import numpy as np
import tensorflow as tf
tf.reset_default_graph()
path="mnist.npz"
data=np.load(path)
X_train_image,y_train_label = data['x_train'],data['y_train']
X_test_image,y_test_label = data['x_test'],data['y_test']
# Reshape
X_train_image = X_train_image.reshape(X_train_image.shape[0],X_train_image.shape[1]*X_train_image.shape[2])
X_test_image =X_test_image.reshape(X_test_image.shape[0],X_test_image.shape[1]*X_test_image.shape[2])
y_train_label = tf.one_hot(y_train_label, 10,1,0)
y_test_label = tf.one_hot(y_test_label, 10,1,0)
# Create the placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
W =tf.Variable(tf.random_normal(([784,10])))
b = tf.Variable(tf.zeros([10]))
z =tf.matmul(x,W)+b
# Build the model (forward-propagation structure)
pred = tf.nn.softmax(z)
# Set the learning rate and loss function, define the backward-propagation structure
# Cross-entropy between predictions and true labels, averaged, is the forward-pass error
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(tf.clip_by_value(pred,1e-10,1.0)),reduction_indices=1))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
acc = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
training_epochs = 25
batch_size =100
display_step =1
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Convert the tensors to ndarrays; a tensor cannot be passed in feed_dict
    y_train_label=y_train_label.eval(session=sess)
    y_test_label = y_test_label.eval(session=sess)
    for epoch in range(training_epochs):
        avg_cost =0.
        total_batch = int(X_train_image.shape[0]/batch_size)
        for i in range(total_batch):
            batch_xs = X_train_image[i*batch_size:(i+1)*batch_size]
            batch_ys = y_train_label[i*batch_size:(i+1)*batch_size]
            _,c ,ACC= sess.run([optimizer,cost,acc],feed_dict={x:batch_xs, y:batch_ys})
            avg_cost +=c/total_batch
        if (epoch+1) % display_step==0:
            # ACC is the accuracy on the last batch of this epoch
            print("Epoch:",'%04d'%(epoch+1),"cost=","{:.9f}".format(avg_cost),"ACC=",ACC)
    saver.save(sess, "a/myModel")
    print("Accuracy:",acc.eval({x:X_test_image,y:y_test_label}))
    print("Finished!")
Complete code for restoring the session from the saved model and running prediction:
import numpy as np
import tensorflow as tf
path="mnist.npz"
data=np.load(path)
X_test_image,y_test_label = data['x_test'],data['y_test']
# Reshape
X_test_image =X_test_image.reshape(X_test_image.shape[0],X_test_image.shape[1]*X_test_image.shape[2])
y_test_label = tf.one_hot(y_test_label, 10,1,0)
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
W =tf.Variable(tf.random_normal(([784,10])))
b = tf.Variable(tf.zeros([10]))
z =tf.matmul(x,W)+b
# Build the model (forward-propagation structure)
pred = tf.nn.softmax(z)
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # restore overwrites the freshly initialized W and b with the saved values
    saver.restore(sess,"a/myModel")
    y_test_label = y_test_label.eval(session=sess)
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", acc.eval({x: X_test_image, y: y_test_label}))
The accuracy is 0.66, the same as when the model was saved. So the model only needs to be trained once and can then be reused directly.
Accuracy: 0.6646