TensorFlow Study Notes (9): Fully Connected Neural Networks

1. Loading the MNIST Dataset

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
mnist = input_data.read_data_sets("D:/公用程序部分/tensor/MNIST_data",one_hot = True)
# Arguments as used here: input_data.read_data_sets(train_dir, fake_data=False, one_hot=True, dtype=dtypes.float32, reshape=True, validation_size=5000)

The loaded dataset is shown below:
[Image 1: the loaded dataset]

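Each split of the object returned by read_data_sets() (for example mnist.train) provides a next_batch() function, which reads a small portion of the data as one batch.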
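
To make the batching idea concrete, here is a minimal pure-Python sketch of what a next_batch-style helper does: shuffle the examples, then hand out consecutive slices and reshuffle when an epoch runs out. The class name BatchFeeder and its internals are hypothetical and written only for illustration; this is not the implementation inside input_data.

```python
import random


class BatchFeeder:
    """Hypothetical stand-in for a dataset split that serves mini-batches."""

    def __init__(self, images, labels, seed=0):
        self._images = list(images)
        self._labels = list(labels)
        assert len(self._images) == len(self._labels)
        self._rng = random.Random(seed)
        self._order = list(range(len(self._images)))
        self._rng.shuffle(self._order)
        self._cursor = 0

    def next_batch(self, batch_size):
        """Return the next batch_size examples, reshuffling at epoch end."""
        if self._cursor + batch_size > len(self._order):
            # Not enough examples left in this epoch: start a new one.
            self._rng.shuffle(self._order)
            self._cursor = 0
        picked = self._order[self._cursor:self._cursor + batch_size]
        self._cursor += batch_size
        xs = [self._images[i] for i in picked]
        ys = [self._labels[i] for i in picked]
        return xs, ys


# Ten toy "images" (plain integers here) whose label is image % 2.
feeder = BatchFeeder(images=range(10), labels=[i % 2 for i in range(10)])
xs, ys = feeder.next_batch(4)
print(len(xs), len(ys))
```

Each call returns a different slice of the shuffled data, which is why the training loop can simply call next_batch once per step.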

2. Fully Connected Neural Network

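The network is trained on the dataset to learn to predict digits; the predictions are then compared with the labels and the accuracy is printed.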

import tensorflow as tf
import tensorflow.compat.v1 as tf1
from tensorflow.examples.tutorials.mnist import  input_data
tf.compat.v1.disable_eager_execution()
mnist = input_data.read_data_sets("D:/公用程序部分/tensor/MNIST_data",one_hot = True)
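batch_sizes = 100
# Batch size for each training step
learning_rate = 0.8
# Initial learning rate
learning_rate_deacay = 0.999
# Learning-rate decay factor
max_steps = 30000
# Maximum number of training steps
training_step = tf.Variable(0,trainable=False)
# Non-trainable variable that stores the current training step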
def hidden_layer(input_tensor,weights1,biases1,weights2,biases2,layer_name):
    layer1 = tf.nn.relu(tf.matmul(input_tensor,weights1)+biases1)
    return tf.matmul(layer1,weights2)+biases2
# Forward pass: a ReLU hidden layer followed by a linear output layer (layer_name is unused)
x = tf1.placeholder(tf.float32,[None,784],name="x-input")
y_ = tf1.placeholder(tf.float32,[None,10],name="y-output")
# Placeholders for x and y_: each x has 28×28 = 784 values, and y_ is a one-hot vector over the ten digits 0~9
weights1 = tf.Variable(tf1.truncated_normal([784,500],stddev=0.1))
biases1 = tf.Variable(tf.constant(0.1,shape = [500]))
weights2 = tf.Variable(tf1.truncated_normal([500,10],stddev=0.1))
biases2 = tf.Variable(tf.constant(0.1,shape = [10]))
y = hidden_layer(x, weights1 ,biases1 ,weights2 ,biases2 ,'y' )
# Define the weight and bias variables and compute the output y

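average_class = tf.train.ExponentialMovingAverage(0.99, training_step)
# Moving-average object that maintains a shadow copy of each variable
average_op = average_class.apply(tf1.trainable_variables())
# When trainable=True is passed, the Variable() constructor automatically adds the new variable to the graph collection GraphKeys.TRAINABLE_VARIABLES; trainable_variables() returns the contents of that collection.
# apply() creates the shadow variables and returns the op that updates them
average_y = hidden_layer(x, average_class.average(weights1), average_class.average(biases1), average_class.average(weights2), average_class.average(biases2), 'average_y')
# Output computed from the moving-average (shadow) values of the parameters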
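cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits= y,labels= tf.argmax (y_,1))
# Cross-entropy loss of the predictions; argmax converts the one-hot labels to class indices
regularizer = tf.keras.regularizers.l2(0.0001)
# Define the L2 regularizer
regularization = (regularizer(weights1)+regularizer(weights2))/2
# Apply L2 regularization to both weight matrices
loss = tf.reduce_mean(cross_entropy)+regularization
# Total loss: mean cross-entropy plus the regularization term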
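decayed_learning_rate = tf1.train.exponential_decay(learning_rate ,training_step ,mnist.train.num_examples/batch_sizes,learning_rate_deacay )
# Exponentially decay the learning rate so that it changes as training progresses
train_step = tf1.train.GradientDescentOptimizer(decayed_learning_rate).minimize(loss,global_step = training_step )
# Minimize the cross-entropy loss plus the regularization loss; each run also increments training_step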
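with tf.control_dependencies([train_step ,average_op ]):
    # Group the parameter update and the moving-average update
    train_op = tf.no_op (name="train")
    # tf.no_op() does nothing itself; running train_op first runs train_step and average_op
correct_prediction = tf.equal (tf.argmax (average_y ,1),tf.argmax (y_,1))
# True where the predicted digit equals the label, False otherwise
accuracy = tf.reduce_mean (tf.cast(correct_prediction ,tf.float32 ))
# Accuracy is the mean of the matches cast to 0/1; both ops are defined outside the with block so that evaluating accuracy does not trigger a training step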

with tf1.Session() as sess:
    tf1.global_variables_initializer ().run()
    # Variables are not initialized when they are created, so initialize them all first
    validate_feed = {x:mnist.validation.images,y_:mnist.validation.labels}
    # Validation data
    test_feed = {x:mnist.test.images, y_:mnist.test.labels}
    # Test data
    for i in range(max_steps):
        # Loop max_steps (30000) times
        if i%1000 == 0:
            # Report once every 1000 steps
            validate_accuracy = sess.run(accuracy,feed_dict=validate_feed)
            # Accuracy on the validation set
            print("After %d training step(s), validation accuracy using average model is %g%%" %(i,validate_accuracy *100))
            # Print the validation accuracy
        xs,ys = mnist.train.next_batch(batch_size=batch_sizes)
        # Read a mini-batch of batch_sizes (100) training examples into xs, ys
        sess.run(train_op,feed_dict = {x:xs,y_:ys})
        # One training step per iteration; training ends after max_steps iterations
    test_accuracy = sess.run(accuracy,feed_dict=test_feed)
    # Accuracy on the test set
    print("After %d training step(s), test accuracy using average model is %g%%" %(max_steps ,test_accuracy *100))
    # Print the test accuracy

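The output recorded for the two-layer program is shown below. In the logged run the next_batch and sess.run(train_op) calls sat inside the "if i%1000 == 0" branch, so a mini-batch update ran only once every 1000 iterations, which is why the accuracy climbs so slowly: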

......
......
......
Skipping registering GPU devices...
2020-07-29 10:43:33.648698: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-29 10:43:33.655346: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x24d6b13eb80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 10:43:33.655597: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-29 10:43:33.655828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-29 10:43:33.656018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      
After 0 training step(s), validation accuracyusing average model is 7.48%
After 1000 training step(s), validation accuracyusing average model is 25.26%
After 2000 training step(s), validation accuracyusing average model is 32.86%
After 3000 training step(s), validation accuracyusing average model is 57%
After 4000 training step(s), validation accuracyusing average model is 74.34%
After 5000 training step(s), validation accuracyusing average model is 79.48%
After 6000 training step(s), validation accuracyusing average model is 82.48%
After 7000 training step(s), validation accuracyusing average model is 84.5%
After 8000 training step(s), validation accuracyusing average model is 86.16%
After 9000 training step(s), validation accuracyusing average model is 87.6%
After 10000 training step(s), validation accuracyusing average model is 88.42%
After 11000 training step(s), validation accuracyusing average model is 89.08%
After 12000 training step(s), validation accuracyusing average model is 89.6%
After 13000 training step(s), validation accuracyusing average model is 89.8%
After 14000 training step(s), validation accuracyusing average model is 90.34%
After 15000 training step(s), validation accuracyusing average model is 89.68%
After 16000 training step(s), validation accuracyusing average model is 89.96%
After 17000 training step(s), validation accuracyusing average model is 90.62%
After 18000 training step(s), validation accuracyusing average model is 91.14%
After 19000 training step(s), validation accuracyusing average model is 91.76%
After 20000 training step(s), validation accuracyusing average model is 91.96%
After 21000 training step(s), validation accuracyusing average model is 92.12%
After 22000 training step(s), validation accuracyusing average model is 92.42%
After 23000 training step(s), validation accuracyusing average model is 92.4%
After 24000 training step(s), validation accuracyusing average model is 92.58%
After 25000 training step(s), validation accuracyusing average model is 92.9%
After 26000 training step(s), validation accuracyusing average model is 93.1%
After 27000 training step(s), validation accuracyusing average model is 93.28%
After 28000 training step(s), validation accuracyusing average model is 93.46%
After 29000 training step(s), validation accuracyusing average model is 93.72%
After 30000 trainging step(s), test accuracy using averagemodek is 91.84%

Process finished with exit code 0
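
The two refinements used in the program above, exponential learning-rate decay and the moving average of the parameters, both reduce to short formulas. The pure-Python sketch below evaluates them by hand. It assumes the default non-staircase form of tf1.train.exponential_decay and the num_updates behaviour of ExponentialMovingAverage; the figure of 55000 training examples is an assumption (60000 MNIST training images minus validation_size=5000), and the sample variable values are made up.

```python
def exponential_decay(initial_rate, step, decay_steps, decay_rate):
    """Non-staircase decay: initial_rate * decay_rate ** (step / decay_steps)."""
    return initial_rate * decay_rate ** (step / decay_steps)


def ema_update(shadow, value, decay, num_updates=None):
    """One moving-average step: shadow <- d * shadow + (1 - d) * value."""
    if num_updates is not None:
        # Early in training a smaller effective decay lets the shadow
        # value catch up with the variable faster.
        decay = min(decay, (1 + num_updates) / (10 + num_updates))
    return decay * shadow + (1 - decay) * value


# Learning-rate schedule with the settings from the program above.
decay_steps = 55000 / 100  # assumed examples per epoch / batch size
for step in (0, 550, 15000, 30000):
    print(step, exponential_decay(0.8, step, decay_steps, 0.999))
rate_end = exponential_decay(0.8, 30000, decay_steps, 0.999)

# Moving average of a single scalar "variable" (made-up values).
shadow = 0.0
# Step 0: effective decay is min(0.99, 1/10) = 0.1
shadow = ema_update(shadow, 5.0, decay=0.99, num_updates=0)
print(shadow)
# Step 10000: effective decay is min(0.99, 10001/10010) = 0.99
shadow = ema_update(shadow, 10.0, decay=0.99, num_updates=10000)
print(shadow)
```

With these settings the learning rate shrinks by a factor of 0.999 per epoch, so it falls only from 0.8 to about 0.76 over 30000 steps. The moving average follows the raw variable closely at first and smooths it more strongly once the effective decay reaches 0.99.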

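The program below drops the moving average and the learning-rate decay (it also drops the hidden layer and the L2 regularization, leaving a single linear layer). In the logged run its test accuracy, 90.08%, is slightly lower than the 91.84% above: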

import tensorflow as tf
import tensorflow.compat.v1 as tf1
from tensorflow.examples.tutorials.mnist import input_data
tf.compat.v1.disable_eager_execution()
mnist = input_data.read_data_sets("D:/公用程序部分/tensor/MNIST_data",one_hot = True)
x = tf1.placeholder(tf.float32, [None, 784], name="x-input")
y_ = tf1.placeholder(tf.float32, [None, 10], name="y-output")
weight = tf.Variable(tf.zeros([784, 10]))
bias = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, weight) + bias
# Single linear layer: no hidden layer and no regularization
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
train_op = tf1.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# cross_entropy holds one loss per example; minimize() on a non-scalar tensor descends on the sum of its elements
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
with tf1.Session() as sess:
    tf1.global_variables_initializer().run()
    validate_feed = {x: mnist.validation.images, y_: mnist.validation.labels}
    test_feed = {x: mnist.test.images, y_: mnist.test.labels}
    for i in range(30000):
        if i % 1000 == 0:
            validate_accuracy = sess.run(accuracy, feed_dict=validate_feed)
            print("After %d training step(s), validation accuracy is %g%%" % (i, validate_accuracy * 100))
            # Report the validation accuracy once every 1000 steps
        xs, ys = mnist.train.next_batch(100)
        sess.run(train_op, feed_dict={x: xs, y_: ys})
        # One mini-batch update per iteration
    test_accuracy = sess.run(accuracy, feed_dict=test_feed)
    print("After 30000 training step(s), test accuracy is %g%%" % (test_accuracy * 100))

The results are as follows:

......
......
......
Skipping registering GPU devices...
2020-07-30 09:30:08.093330: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-30 09:30:08.104758: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x240e1d389f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-30 09:30:08.105195: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-30 09:30:08.105563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-30 09:30:08.105884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      
After 0 training step(s), validation accuracyusing average model is 9.58%
After 1000 training step(s), validation accuracyusing average model is 86.04%
After 2000 training step(s), validation accuracyusing average model is 88.6%
After 3000 training step(s), validation accuracyusing average model is 90.6%
After 4000 training step(s), validation accuracyusing average model is 90.86%
After 5000 training step(s), validation accuracyusing average model is 90.08%
After 6000 training step(s), validation accuracyusing average model is 91.2%
After 7000 training step(s), validation accuracyusing average model is 89.56%
After 8000 training step(s), validation accuracyusing average model is 90.3%
After 9000 training step(s), validation accuracyusing average model is 88.36%
After 10000 training step(s), validation accuracyusing average model is 89.76%
After 11000 training step(s), validation accuracyusing average model is 86.82%
After 12000 training step(s), validation accuracyusing average model is 88.78%
After 13000 training step(s), validation accuracyusing average model is 89.32%
After 14000 training step(s), validation accuracyusing average model is 90.52%
After 15000 training step(s), validation accuracyusing average model is 88.42%
After 16000 training step(s), validation accuracyusing average model is 89.34%
After 17000 training step(s), validation accuracyusing average model is 84.76%
After 18000 training step(s), validation accuracyusing average model is 88.72%
After 19000 training step(s), validation accuracyusing average model is 90.64%
After 20000 training step(s), validation accuracyusing average model is 84.16%
After 21000 training step(s), validation accuracyusing average model is 83.08%
After 22000 training step(s), validation accuracyusing average model is 87.38%
After 23000 training step(s), validation accuracyusing average model is 90.96%
After 24000 training step(s), validation accuracyusing average model is 90.24%
After 25000 training step(s), validation accuracyusing average model is 90%
After 26000 training step(s), validation accuracyusing average model is 90.9%
After 27000 training step(s), validation accuracyusing average model is 90.2%
After 28000 training step(s), validation accuracyusing average model is 89.68%
After 29000 training step(s), validation accuracyusing average model is 90.26%
After 30000 trainging step(s), test accuracy using averagemodel is 90.08%
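
Both programs compute the loss with sparse_softmax_cross_entropy_with_logits and the accuracy with argmax, equal and reduce_mean. As a rough illustration of what those ops calculate, here is a pure-Python sketch for a tiny batch. The helper names and the three-class sample values are hypothetical; this mirrors the arithmetic, not TensorFlow's implementation.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against one integer class label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def argmax(values):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(batch_logits, batch_one_hot):
    """Fraction of rows whose predicted class matches the one-hot label."""
    hits = [argmax(z) == argmax(t) for z, t in zip(batch_logits, batch_one_hot)]
    return sum(hits) / len(hits)


# Made-up logits and one-hot labels for two examples with three classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
one_hot = [[1, 0, 0], [0, 1, 0]]

# argmax turns each one-hot row into the class index the loss expects.
losses = [sparse_softmax_cross_entropy(z, argmax(t)) for z, t in zip(logits, one_hot)]
print(losses)
print(accuracy(logits, one_hot))
```

The first example is classified correctly and gets a small loss; the second is misclassified and gets a large one, so the accuracy of this batch is one half.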

3. Hyperparameters and the Validation Set

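Parameters that are set by hand, rather than optimized automatically through learning, are called hyperparameters.
Reasons for setting them by hand:
① They are very hard to optimize.
② They can only exist as hyperparameters (a person has to choose a reasonable value).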
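
The difference between a learned parameter and a hyperparameter can be seen in a toy problem. The sketch below fits a single slope w by gradient descent; w is learned, while the learning rate has to be supplied by hand. The function name fit_slope and the data are hypothetical and chosen only for illustration.

```python
def fit_slope(xs, ys, learning_rate, steps):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # learned parameter, updated by the optimizer
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # learning_rate is the hyperparameter
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated with a true slope of 2

good = fit_slope(xs, ys, learning_rate=0.05, steps=200)
bad = fit_slope(xs, ys, learning_rate=0.5, steps=200)
print(good)  # close to the true slope
print(bad)   # far from it: this learning rate makes the updates diverge
```

Gradient descent adjusts w but never adjusts the learning rate itself, so a poor hand-picked value is not corrected by training.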

The test set must never take part in training; to the network it should be completely "unknown".
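
One common way to respect this rule is to choose hyperparameters on a separate validation set and touch the test set only once at the end. The following is a minimal sketch under that assumption; split_dataset, pick_hyperparameter and the scores in made_up_scores are hypothetical, and in practice each score would come from training on the training part and evaluating on the validation part.

```python
import random


def split_dataset(examples, validation_size, test_size, seed=0):
    """Shuffle once, then carve out disjoint train/validation/test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:test_size]
    validation = shuffled[test_size:test_size + validation_size]
    train = shuffled[test_size + validation_size:]
    return train, validation, test


def pick_hyperparameter(candidates, score_on_validation):
    """Return the candidate with the best validation score."""
    return max(candidates, key=score_on_validation)


train, validation, test = split_dataset(range(100), validation_size=10, test_size=20)

# Made-up validation accuracies for three candidate learning rates.
made_up_scores = {0.01: 0.90, 0.1: 0.95, 0.8: 0.93}
best = pick_hyperparameter(made_up_scores, made_up_scores.get)
print(best, len(train), len(validation), len(test))
```

Only after best has been fixed would the model be evaluated on test, so the reported test accuracy is not influenced by the selection.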
