Convolutional neural networks (CNNs / ConvNets) are very similar to ordinary neural networks: both consist of neurons with learnable weights and biases. Each neuron receives some inputs and computes a dot product, and the final output is a score for each class, so many of the computational tricks used for fully connected networks (DNNs) still apply here.
A CNN works by applying one filter after another to extract features, going from local features to increasingly global ones, which is what makes tasks such as image recognition possible.
A convolutional neural network is made up of many layers. Their inputs are three-dimensional and their outputs are three-dimensional; some layers have parameters and some do not.
What distinguishes a CNN from other neural networks is that it assumes its input is an image. This lets us encode image-specific properties into the network structure, which makes the forward function more efficient and greatly reduces the number of parameters.
A CNN is built from the following kinds of layers:
Neurons arranged in 3D volumes: a CNN exploits the fact that the input is an image and arranges its neurons along three dimensions: width, height, and depth (note that this depth is not the depth of the network; it describes the neuron volume). For example, if the input image is 7 × 7 × 3 (RGB), the input volume also has dimensions 7 × 7 × 3.
The size of a convolutional layer's output volume (which consists of several feature maps) is controlled by three quantities: depth, stride, and zero-padding.
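As a concrete illustration (the formula is the standard one, not quoted from the original text): for an input of width W, kernel size F, zero-padding P and stride S, the output width is (W − F + 2P)/S + 1, and the output depth equals the number of filters. A tiny Python helper:

# Illustrative helper (not part of the original article): spatial output size of a conv layer.
# W: input width/height, F: kernel size, P: zero-padding, S: stride.
def conv_output_size(W, F, P, S):
    assert (W - F + 2 * P) % S == 0, 'hyperparameters do not tile the input evenly'
    return (W - F + 2 * P) // S + 1

print(conv_output_size(7, 3, 1, 1))     # 7: a 3x3 kernel with padding 1 and stride 1 keeps the size
print(conv_output_size(227, 11, 0, 4))  # 55: the classic AlexNet first-layer example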
Pooling, i.e. downsampling, is used to shrink the feature maps. The pooling operation is applied to each depth slice independently, usually over 2 × 2 regions. Whereas a convolutional layer performs convolutions, a pooling layer typically performs one of a few simple operations, the common ones being max pooling, mean (average) pooling and stochastic pooling, described in more detail below.
A fully connected layer and a convolutional layer can be converted into one another:
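The article does not spell this conversion out, so here is the standard argument in brief: a fully connected layer applied to an H × W × D volume is equivalent to a convolutional layer whose kernel covers the whole H × W extent (the classic CS231n example turns an FC layer with 4096 outputs over a 7 × 7 × 512 volume into a conv layer with 4096 filters of size 7 × 7); conversely, a conv layer can be written as a fully connected layer with a large, sparse, weight-shared matrix. A tiny Keras shape check with made-up sizes:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([1, 2, 2, 8])                    # a tiny 2x2x8 feature volume
dense_out = layers.Dense(4)(tf.reshape(x, [1, -1]))   # FC view: flatten, then 4 outputs
conv_out = layers.Conv2D(4, kernel_size=2)(x)         # conv view: 4 filters spanning the whole 2x2 extent
print(dense_out.shape, conv_out.shape)                # (1, 4) and (1, 1, 1, 4): the same 4 numbers per sample; both layers have 2*2*8*4 + 4 parameters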
In the forward pass, the input image goes through several convolution and pooling layers, which extract a feature vector; this feature vector is fed into the fully connected layers, which produce the classification result. When the output matches the expected value, that result is returned.
The forward pass of a convolutional layer convolves the input data with the convolution kernel to produce the layer's output. Taking Figure 3-4 as an example of the computation in a real network: the input is an image of 15 neurons and the kernel is a 2×2×1 filter, i.e. its weights are W1, W2, W3, W4. How this kernel convolves the input is shown in Figure 4-2. The kernel slides over the whole input image with a stride of 1, forming local receptive fields; at each position the weight matrix is multiplied element-wise with the corresponding image values and summed (plus a bias term), and the result is passed through an activation function to give the output.
When the image has depth 2, the forward pass of the convolutional layer is as shown in Figure 4-3. The input image is 4×4×2 and the kernel is 2×2×2. The forward pass computes the weighted sum of the first slice of the data with the first slice of the kernel, then the weighted sum of the second slice of the data with the second slice of the kernel, and adds the two weighted sums to obtain the layer's output.
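A minimal NumPy sketch of this forward pass (stride 1, no padding; the 4×4×2 input and 2×2×2 kernel match the example above, while the ReLU activation and the bias value are illustrative choices, not from the original):

import numpy as np

def conv_forward(x, w, b, stride=1):
    # x: (H, W, C) input volume, w: (kH, kW, C) single kernel, b: scalar bias
    H, W, C = x.shape
    kH, kW, _ = w.shape
    oH = (H - kH) // stride + 1
    oW = (W - kW) // stride + 1
    out = np.zeros((oH, oW))
    for i in range(oH):
        for j in range(oW):
            patch = x[i*stride:i*stride+kH, j*stride:j*stride+kW, :]
            out[i, j] = np.sum(patch * w) + b   # weighted sum over every channel, plus the bias
    return np.maximum(out, 0)                   # activation function (ReLU chosen for illustration)

x = np.random.randn(4, 4, 2)                    # 4x4 image with depth 2
w = np.random.randn(2, 2, 2)                    # 2x2x2 kernel
print(conv_forward(x, w, b=0.1).shape)          # (3, 3)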
The features extracted by the previous (convolutional) layer are fed into the subsampling layer, whose pooling operation reduces the dimensionality of the data and helps avoid overfitting. Figure 4-4 illustrates the common pooling methods. Max pooling simply takes the maximum value of each region of the feature map, and mean pooling takes its average. Stochastic pooling first converts the values in the region into selection probabilities and then randomly samples one of them as the region's output, so larger values are more likely to be chosen.
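A small NumPy sketch of 2×2 max and mean pooling with stride 2 (the feature-map values are made up; stochastic pooling is omitted for brevity):

import numpy as np

def pool2x2(x, mode='max'):
    # 2x2 pooling with stride 2 on one feature map (H and W assumed even)
    H, W = x.shape
    blocks = x.reshape(H // 2, 2, W // 2, 2)
    if mode == 'max':
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))     # mean pooling

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(pool2x2(fmap, 'max'))             # [[ 5.  7.] [13. 15.]]
print(pool2x2(fmap, 'mean'))            # [[ 2.5  4.5] [10.5 12.5]]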
After this feature extraction by the convolution and subsampling layers, the extracted features are passed to the fully connected layers, which perform the classification and produce the final result. The figure below shows a three-layer fully connected network. Suppose the features fed into the fully connected layer are x1 and x2, and the first fully connected layer has three neurons y1, y2, y3, whose weight matrix is W and whose biases are b1, b2, b3. In a fully connected layer, the number of parameters = (number of nodes in the layer × number of input features) + number of nodes (the biases). The forward pass proceeds as shown in the figure: the output matrix is computed, passed through the activation function f(y), and fed to the next layer.
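A small sketch of that forward step and the parameter count (the 2-input / 3-neuron sizes follow the example above; using ReLU for f(y) is an illustrative choice, not from the original):

import numpy as np

x = np.array([0.5, -1.2])        # input features x1, x2
W = np.random.randn(3, 2)        # weight matrix: 3 neurons (y1, y2, y3) x 2 inputs
b = np.random.randn(3)           # biases b1, b2, b3
y = np.maximum(W @ x + b, 0)     # f(y) applied to the weighted sums, passed on to the next layer
print(y.shape)                   # (3,)
print(W.size + b.size)           # parameters = 3*2 + 3 = 9, i.e. nodes*inputs + nodes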
When the network's output does not match the expected value, back-propagation is carried out: compute the error between the result and the expected value, pass the error back layer by layer, compute each layer's share of the error, and then update the weights. The purpose of this process is to adjust the network weights using the training samples and their target values. The error propagation can be pictured as follows: data flows from the input layer to the output layer through the convolution, subsampling and fully connected layers, and some information is inevitably lost along the way, which is what produces the error. Since each layer contributes a different amount, once the network's total error is known it has to be propagated back into the network to determine how much of it each layer is responsible for.
The first step of the back-propagation training process is to compute the network's total error, i.e. the error between the output a(n) of the output layer n and the target value y. The formula is:
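The formula itself is missing from the text; for tutorials of this kind it is usually the quadratic cost, so as an assumed reconstruction:

E_total = 1/2 · Σ_k (y_k − a_k(n))²

i.e. half the sum of squared differences between each target component y_k and the corresponding output activation a_k(n).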
Once the total error is known, back-propagation proceeds by passing the error to the fully connected layer just before the output layer and working out how much error arises there. Since the network's error is produced by the neurons that make it up, we need the error of each individual neuron. To find the previous layer's error, determine which of its nodes are connected to the output layer and multiply the error by the connecting weights; this gives each node's error, as shown in the figure:
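A minimal sketch of that step, assuming a sigmoid activation on the previous layer (the activation choice and the sizes are illustrative, not taken from the original):

import numpy as np

delta_out = np.array([0.2, -0.1, 0.05])    # error at the 3 output nodes
W = np.random.randn(3, 4)                  # weights from 4 previous-layer nodes to the 3 outputs
a_prev = np.random.rand(4)                 # activations of the previous layer (sigmoid outputs)

# Each previous-layer node collects the output errors through its outgoing weights,
# scaled by the derivative of its own activation: sigmoid'(z) = a * (1 - a).
delta_prev = (W.T @ delta_out) * a_prev * (1 - a_prev)
print(delta_prev.shape)                    # (4,)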
In the subsampling layer, the error is passed to the previous layer according to the pooling method that was used. If max pooling was used, the error is passed directly to the connected node of the previous layer (the one that held the maximum). If mean pooling was used, the error is distributed evenly over the corresponding region of the previous layer. Note also that the subsampling layer has no weights to update; it only needs to pass all of the error on to the previous layer correctly.
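A NumPy sketch of routing the error back through a 2×2 pooling layer (a rough illustration under the same stride-2 assumption as above):

import numpy as np

def unpool2x2(delta, x, mode='max'):
    # delta: error of the pooled layer, x: the pre-pooling feature map
    # (x is needed to locate where each maximum came from)
    grad = np.zeros_like(x)
    H, W = x.shape
    for i in range(H // 2):
        for j in range(W // 2):
            block = x[2*i:2*i+2, 2*j:2*j+2]
            if mode == 'max':
                # all of the error goes to the position that held the maximum
                r, c = np.unravel_index(np.argmax(block), (2, 2))
                grad[2*i + r, 2*j + c] = delta[i, j]
            else:
                # mean pooling: spread the error evenly over the 2x2 region
                grad[2*i:2*i+2, 2*j:2*j+2] = delta[i, j] / 4.0
    return grad

x = np.random.randn(4, 4)
delta = np.ones((2, 2))
print(unpool2x2(delta, x, 'max'))
print(unpool2x2(delta, x, 'mean'))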
A convolutional layer uses local connections, so its error is propagated differently from a fully connected layer: the error is passed back through the convolution kernel. During error propagation we use the kernel to find which nodes of the previous layer each node of the convolutional layer is connected to. The procedure for obtaining the previous layer's error is: zero-pad the convolutional layer's error map, rotate the convolution kernel by 180 degrees, and convolve the rotated kernel over the padded error map; the result is the previous layer's error. Figure 4-7 shows this error propagation: the upper right of the figure shows the forward convolution and the lower right shows the error propagation. As the figure shows, this convolution of the error follows exactly the same connections as the forward pass, carrying the error back to the previous layer.
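A NumPy sketch of that rule, assuming stride 1 and a 'valid' forward convolution (the sizes are made up):

import numpy as np

def conv2d_valid(x, k):
    # plain 'valid' 2-D correlation; any kernel flipping is done by the caller
    H, W = x.shape
    kH, kW = k.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kH, j:j+kW] * k)
    return out

def conv_backprop_error(delta, kernel):
    # previous layer's error: zero-pad the error map, rotate the kernel 180 degrees, convolve
    kH, kW = kernel.shape
    padded = np.pad(delta, ((kH - 1, kH - 1), (kW - 1, kW - 1)))
    rotated = np.rot90(kernel, 2)                  # 180-degree rotation
    return conv2d_valid(padded, rotated)

delta = np.random.randn(3, 3)                      # error at the 3x3 conv output
kernel = np.random.randn(2, 2)                     # the 2x2 forward kernel
print(conv_backprop_error(delta, kernel).shape)    # (4, 4) -- matches the 4x4 input of the forward pass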
The weight update of the convolutional layer works as follows: treat the error map as a kernel and convolve it over the input feature map; this yields the weight-gradient matrix, which is then combined with the original kernel weights to obtain the updated kernel. As Figure 4-8 shows, the connections used by this convolution are exactly the same as the weight connections in the forward pass.
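A sketch of that update under the same stride-1 'valid' assumption (the learning-rate step is an illustrative addition; the article itself simply combines the gradient matrix with the kernel):

import numpy as np

def conv_weight_update(x, delta, kernel, lr=0.01):
    # gradient of the kernel: convolve the input feature map with the output error
    kH, kW = kernel.shape
    dW = np.zeros_like(kernel)
    for i in range(kH):
        for j in range(kW):
            dW[i, j] = np.sum(x[i:i+delta.shape[0], j:j+delta.shape[1]] * delta)
    return kernel - lr * dW            # gradient-descent step on the kernel weights

x = np.random.randn(4, 4)              # input feature map
delta = np.random.randn(3, 3)          # error at the conv output
kernel = np.random.randn(2, 2)
print(conv_weight_update(x, delta, kernel))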
The weight update in the fully connected layer works as follows:
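The details do not survive in the text; a minimal sketch of the usual rule (the learning rate and the sizes are illustrative):

import numpy as np

lr = 0.01
a_prev = np.random.rand(4)            # activations entering the fully connected layer
delta = np.random.randn(3)            # error of the layer's 3 neurons
W = np.random.randn(3, 4)
b = np.random.randn(3)

W -= lr * np.outer(delta, a_prev)     # dL/dW = delta · a_prev^T
b -= lr * delta                       # dL/db = delta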
Common frameworks for implementing CNNs:
Caffe
Torch
TensorFlow
In general, the main ways to improve generalization are the following:
1. Use L2 regularization, dropout, data augmentation and similar techniques; they effectively reduce overfitting and improve performance (a small Keras sketch follows the notes below).
2. Use ReLU, whose derivative is constant over its active range; this alleviates the vanishing-gradient problem and speeds up training.
3. Add more Conv/Pooling and FC layers; this can improve performance (my own experiments agree).
Note:
1. Deeper is not necessarily better: with a plain Conv/Pooling/FC structure, once the network reaches a certain depth, performance actually drops because of overfitting.
2. The structure of the network matters more, e.g. using GoogLeNet, ResNet, and so on.
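As a quick illustration of point 1 of the generalization tips above (a sketch only; the layer sizes, L2 strength and dropout rate are arbitrary and are not part of the network built below):

from tensorflow.keras import layers, regularizers, Sequential

model = Sequential([
    layers.Conv2D(64, 3, padding='same', activation='relu',
                  kernel_regularizer=regularizers.l2(1e-4)),   # L2 weight decay on the kernel
    layers.MaxPool2D(2),
    layers.Flatten(),
    layers.Dropout(0.5),                                       # dropout before the classifier
    layers.Dense(100),
])
model.build(input_shape=[None, 32, 32, 3])
model.summary()

The article's full CIFAR-100 training script follows below.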
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # must be set before "import tensorflow as tf" to take effect
import tensorflow as tf
from tensorflow.keras import layers, optimizers, datasets, Sequential

# 1. Load the CIFAR-100 dataset
(X_train, Y_train), (X_val, Y_val) = datasets.cifar100.load_data()
print('X_train.shpae = {0},Y_train.shpae = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
Y_train = tf.squeeze(Y_train)  # (50000, 1) -> (50000,)
Y_val = tf.squeeze(Y_val)
print('X_train.shpae = {0},Y_train.shpae = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
# 2. Data preprocessing
# Preprocessing function: cast the numpy data to tensors of the right dtype
def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y

# 2.1 Training set
# print('X_train.shpae = {0},Y_train.shpae = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
db_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # this step automatically converts the numpy arrays to tensors
db_train = db_train.map(preprocess)  # map() applies the preprocessing function to every element
# shuffle() draws buffer_size samples from the dataset in order into a buffer and shuffles them;
# whenever the buffer runs low it is refilled in order from the dataset and shuffled again.
db_train = db_train.shuffle(buffer_size=1000)  # shuffle the training samples so their original order does not bias the network
print('db_train = {0},type(db_train) = {1}'.format(db_train, type(db_train)))
batch_size_train = 2000  # samples per batch (values around 100-200 are more typical; 2000 is what produced the log below)
db_batch_train = db_train.batch(batch_size_train)  # group every batch_size_train images into one batch; reading one batch reads that many images at once
print('db_batch_train = {0},type(db_batch_train) = {1}'.format(db_batch_train, type(db_batch_train)))

# 2.2 Validation/test set: no shuffling needed
db_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))  # this step automatically converts the numpy arrays to tensors
db_val = db_val.map(preprocess)  # apply the same preprocessing
batch_size_val = 2000  # samples per validation batch
db_batch_val = db_val.batch(batch_size_val)  # group every batch_size_val images into one batch
# 3. Build the networks
# 1) Convolutional part: Conv2D layers with ReLU activations
conv_layers = [  # 5 units of conv + max pooling
    # unit 1
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),  # 64 kernels -> the output has 64 channels; padding="same" keeps the spatial size unchanged
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 2
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 3
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 4
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
    # unit 5
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same')
]

# 2) Fully connected part: Dense layers with ReLU activations
fullcon_layers = [
    layers.Dense(300, activation=tf.nn.relu),  # 512 -> 300
    layers.Dense(200, activation=tf.nn.relu),  # 300 -> 200
    layers.Dense(100)  # 200 -> 100; no activation on the last layer, the loss applies softmax via from_logits=True
]

# 3) Assemble the convolutional and the fully connected networks
conv_network = Sequential(conv_layers)  # [b, 32, 32, 3] => [b, 1, 1, 512]
fullcon_network = Sequential(fullcon_layers)  # [b, 512] => [b, 100]
conv_network.build(input_shape=[None, 32, 32, 3])  # raw images are [32, 32, 3]; None is the (unknown) number of samples
fullcon_network.build(input_shape=[None, 512])  # the features coming from the conv network are [b, 512]; None is the (unknown) number of samples
# 4) Print the model summaries
conv_network.summary()  # summary of the convolutional network
fullcon_network.summary()  # summary of the fully connected network
# 4. Optimizer for gradient descent
optimizer = optimizers.Adam(lr=1e-4)  # lr is the older alias of learning_rate

# 5. One epoch = one full pass of gradient descent over the whole training set.
#    Each epoch consists of several steps, and each step processes one batch of samples.
def train_epoch(epoch_no):
    print('++++++++++++++++++++++++++++++++++++++++++++第{0}轮Epoch-->Training 阶段:开始++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
    for batch_step_no, (X_batch, Y_batch) in enumerate(db_batch_train):  # one iteration handles one batch; after the loop the whole dataset has been used once; the batch index is usually called a step
        print('epoch_no = {0}, batch_step_no = {1},X_batch.shpae = {2},Y_batch.shpae = {3}------------type(X_batch) = {4},type(Y_batch) = {5}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape, type(X_batch), type(Y_batch)))
        Y_batch_one_hot = tf.one_hot(Y_batch, depth=100)  # one-hot encoding over the 100 classes: [b] => [b, 100]
        print('\tY_train_one_hot.shpae = {0}'.format(Y_batch_one_hot.shape))
        # tf.GradientTape is a context manager that records the computation linking the "function"
        # (the loss) to the "variables" (all network parameters) so their gradients can be computed.
        with tf.GradientTape() as tape:
            # Step 1. Forward pass: compute the model's predictions under the current parameters
            out_logits_conv = conv_network(X_batch)  # [b, 32, 32, 3] => [b, 1, 1, 512]
            print('\tout_logits_conv.shape = {0}'.format(out_logits_conv.shape))
            out_logits_conv = tf.reshape(out_logits_conv, [-1, 512])  # [b, 1, 1, 512] => [b, 512]
            print('\tReshape之后:out_logits_conv.shape = {0}'.format(out_logits_conv.shape))
            out_logits_fullcon = fullcon_network(out_logits_conv)  # [b, 512] => [b, 100]
            print('\tout_logits_fullcon.shape = {0}'.format(out_logits_fullcon.shape))
            # Step 2. Loss between prediction and ground truth: cross-entropy
            # (the variable keeps the name MSE_Loss only so it matches the log printed below)
            MSE_Loss = tf.losses.categorical_crossentropy(Y_batch_one_hot, out_logits_fullcon, from_logits=True)  # the first argument is the ground truth and the second is the prediction; the order must not be swapped
            print('\tMSE_Loss.shape = {0}'.format(MSE_Loss.shape))
            MSE_Loss = tf.reduce_mean(MSE_Loss)
            print('\t求均值后:MSE_Loss.shape = {0}'.format(MSE_Loss.shape))
            print('\t第{0}个epoch-->第{1}个batch step的初始时的:MSE_Loss = {2}'.format(epoch_no, batch_step_no + 1, MSE_Loss))
        # Step 3. Backward pass: compute the gradients and update every layer's parameters W1, W2, W3, B1, B2, B3...
        variables = conv_network.trainable_variables + fullcon_network.trainable_variables  # list concatenation: [1, 2] + [3, 4] => [1, 2, 3, 4]
        # grads holds the gradient of MSE_Loss with respect to every trainable parameter, evaluated on X_batch
        grads = tape.gradient(MSE_Loss, variables)  # MSE_Loss is the objective; variables are all trainable parameters of both networks
        # grads, _ = tf.clip_by_global_norm(grads, 15)  # optional clipping against exploding/vanishing gradients
        # print('\t第{0}个epoch-->第{1}个batch step的初始时的参数:'.format(epoch_no, batch_step_no + 1))
        if batch_step_no == 0:
            index_variable = 1
            for grad in grads:
                print('\t\tgrad{0}:grad.shape = {1},grad.ndim = {2}'.format(index_variable, grad.shape, grad.ndim))
                index_variable = index_variable + 1
        # apply one gradient-descent step
        print('\t梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始')
        optimizer.apply_gradients(zip(grads, variables))  # every parameter takes one step w' = w - lr * grad; zip pairs each gradient with the parameter it belongs to
        print('\t梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束\n')
    print('++++++++++++++++++++++++++++++++++++++++++++第{0}轮Epoch-->Training 阶段:结束++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
# 6. Model evaluation (test)
def evaluation(epoch_no):
    print('++++++++++++++++++++++++++++++++++++++++++++第{0}轮Epoch-->Evluation 阶段:开始++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
    total_correct, total_num = 0, 0
    for batch_step_no, (X_batch, Y_batch) in enumerate(db_batch_val):
        print('epoch_no = {0}, batch_step_no = {1},X_batch.shpae = {2},Y_batch.shpae = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape))
        # forward pass of the validation batch through the trained model
        out_logits_conv = conv_network(X_batch)  # [b, 32, 32, 3] => [b, 1, 1, 512]
        print('\tout_logits_conv.shape = {0}'.format(out_logits_conv.shape))
        out_logits_conv = tf.reshape(out_logits_conv, [-1, 512])  # [b, 1, 1, 512] => [b, 512]
        print('\tReshape之后:out_logits_conv.shape = {0}'.format(out_logits_conv.shape))
        out_logits_fullcon = fullcon_network(out_logits_conv)  # [b, 512] => [b, 100]
        print('\tout_logits_fullcon.shape = {0}'.format(out_logits_fullcon.shape))
        # print('\tout_logits_fullcon[:1,:] = {0}'.format(out_logits_fullcon[:1, :]))
        # softmax() turns the logits into probabilities in [0, 1] that sum to 1 over the 100 classes
        out_logits_prob = tf.nn.softmax(out_logits_fullcon, axis=1)  # out_logits_prob: [b, 100] ~ [0, 1]
        # print('\tout_logits_prob[:1,:] = {0}'.format(out_logits_prob[:1, :]))
        out_logits_prob_max_index = tf.cast(tf.argmax(out_logits_prob, axis=1), dtype=tf.int32)  # [b, 100] => [b]; index of the largest probability, cast from int64 to int32
        # print('\t预测值:out_logits_prob_max_index = {0},\t真实值:Y_train_one_hot = {1}'.format(out_logits_prob_max_index, Y_batch))
        is_correct_boolean = tf.equal(out_logits_prob_max_index, Y_batch.numpy())
        # print('\tis_correct_boolean = {0}'.format(is_correct_boolean))
        is_correct_int = tf.cast(is_correct_boolean, dtype=tf.float32)
        # print('\tis_correct_int = {0}'.format(is_correct_int))
        is_correct_count = tf.reduce_sum(is_correct_int)  # number of correctly classified samples in this batch
        print('\tis_correct_count = {0}\n'.format(is_correct_count))
        total_correct += int(is_correct_count)
        total_num += X_batch.shape[0]
    print('total_correct = {0}---total_num = {1}'.format(total_correct, total_num))
    acc = total_correct / total_num
    print('第{0}轮Epoch迭代的准确度: acc = {1}'.format(epoch_no, acc))
    print('++++++++++++++++++++++++++++++++++++++++++++第{0}轮Epoch-->Evluation 阶段:结束++++++++++++++++++++++++++++++++++++++++++++'.format(epoch_no))
# 7. Run several epochs of gradient descent over the whole dataset
def train():
    epoch_count = 1  # epoch_count is the number of passes over the whole training set
    for epoch_no in range(1, epoch_count + 1):
        print('\n\n利用整体数据集进行模型的第{0}轮Epoch迭代开始:**********************************************************************************************************************************'.format(epoch_no))
        train_epoch(epoch_no)
        evaluation(epoch_no)
        print('利用整体数据集进行模型的第{0}轮Epoch迭代结束:**********************************************************************************************************************************'.format(epoch_no))

if __name__ == '__main__':
    train()
Output:
X_train.shpae = (50000, 32, 32, 3),Y_train.shpae = (50000, 1)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>
X_train.shpae = (50000, 32, 32, 3),Y_train.shpae = (50000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'tensorflow.python.framework.ops.EagerTensor'>
db_train = <ShuffleDataset shapes: ((32, 32, 3), ()), types: (tf.float32, tf.int32)>,type(db_train) = <class 'tensorflow.python.data.ops.dataset_ops.ShuffleDataset'>
db_batch_train = <BatchDataset shapes: ((None, 32, 32, 3), (None,)), types: (tf.float32, tf.int32)>,type(db_batch_train) = <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 32, 32, 64) 1792
_________________________________________________________________
conv2d_1 (Conv2D) (None, 32, 32, 64) 36928
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 16, 16, 128) 73856
_________________________________________________________________
conv2d_3 (Conv2D) (None, 16, 16, 128) 147584
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 128) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 8, 8, 256) 295168
_________________________________________________________________
conv2d_5 (Conv2D) (None, 8, 8, 256) 590080
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 256) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 4, 4, 512) 1180160
_________________________________________________________________
conv2d_7 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 512) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
conv2d_9 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 1, 1, 512) 0
=================================================================
Total params: 9,404,992
Trainable params: 9,404,992
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 300) 153900
_________________________________________________________________
dense_1 (Dense) (None, 200) 60200
_________________________________________________________________
dense_2 (Dense) (None, 100) 20100
=================================================================
Total params: 234,200
Trainable params: 234,200
Non-trainable params: 0
_________________________________________________________________
利用整体数据集进行模型的第1轮Epoch迭代开始:**********************************************************************************************************************************
++++++++++++++++++++++++++++++++++++++++++++第1轮Epoch-->Training 阶段:开始++++++++++++++++++++++++++++++++++++++++++++
epoch_no = 1, batch_step_no = 1,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第1个batch step的初始时的:MSE_Loss = 4.605105400085449
grad1:grad.shape = (3, 3, 3, 64),grad.ndim = 4
grad2:grad.shape = (64,),grad.ndim = 1
grad3:grad.shape = (3, 3, 64, 64),grad.ndim = 4
grad4:grad.shape = (64,),grad.ndim = 1
grad5:grad.shape = (3, 3, 64, 128),grad.ndim = 4
grad6:grad.shape = (128,),grad.ndim = 1
grad7:grad.shape = (3, 3, 128, 128),grad.ndim = 4
grad8:grad.shape = (128,),grad.ndim = 1
grad9:grad.shape = (3, 3, 128, 256),grad.ndim = 4
grad10:grad.shape = (256,),grad.ndim = 1
grad11:grad.shape = (3, 3, 256, 256),grad.ndim = 4
grad12:grad.shape = (256,),grad.ndim = 1
grad13:grad.shape = (3, 3, 256, 512),grad.ndim = 4
grad14:grad.shape = (512,),grad.ndim = 1
grad15:grad.shape = (3, 3, 512, 512),grad.ndim = 4
grad16:grad.shape = (512,),grad.ndim = 1
grad17:grad.shape = (3, 3, 512, 512),grad.ndim = 4
grad18:grad.shape = (512,),grad.ndim = 1
grad19:grad.shape = (3, 3, 512, 512),grad.ndim = 4
grad20:grad.shape = (512,),grad.ndim = 1
grad21:grad.shape = (512, 300),grad.ndim = 2
grad22:grad.shape = (300,),grad.ndim = 1
grad23:grad.shape = (300, 200),grad.ndim = 2
grad24:grad.shape = (200,),grad.ndim = 1
grad25:grad.shape = (200, 100),grad.ndim = 2
grad26:grad.shape = (100,),grad.ndim = 1
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 2,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第2个batch step的初始时的:MSE_Loss = 4.605042934417725
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 3,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第3个batch step的初始时的:MSE_Loss = 4.604988098144531
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 4,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第4个batch step的初始时的:MSE_Loss = 4.6049394607543945
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 5,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第5个batch step的初始时的:MSE_Loss = 4.604981899261475
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 6,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第6个batch step的初始时的:MSE_Loss = 4.604642391204834
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 7,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第7个batch step的初始时的:MSE_Loss = 4.604847431182861
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 8,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第8个batch step的初始时的:MSE_Loss = 4.604538440704346
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 9,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第9个batch step的初始时的:MSE_Loss = 4.604290008544922
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 10,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第10个batch step的初始时的:MSE_Loss = 4.603418827056885
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 11,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第11个batch step的初始时的:MSE_Loss = 4.603573322296143
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 12,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第12个batch step的初始时的:MSE_Loss = 4.6028618812561035
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 13,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第13个batch step的初始时的:MSE_Loss = 4.602911472320557
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 14,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第14个batch step的初始时的:MSE_Loss = 4.600880146026611
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 15,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第15个batch step的初始时的:MSE_Loss = 4.601419925689697
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 16,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第16个batch step的初始时的:MSE_Loss = 4.601880073547363
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 17,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第17个batch step的初始时的:MSE_Loss = 4.596737384796143
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 18,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第18个batch step的初始时的:MSE_Loss = 4.593438625335693
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 19,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第19个batch step的初始时的:MSE_Loss = 4.589395523071289
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 20,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第20个batch step的初始时的:MSE_Loss = 4.584603786468506
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 21,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第21个batch step的初始时的:MSE_Loss = 4.579631328582764
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 22,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第22个batch step的初始时的:MSE_Loss = 4.5727949142456055
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 23,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第23个batch step的初始时的:MSE_Loss = 4.568104267120361
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 24,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第24个batch step的初始时的:MSE_Loss = 4.55913782119751
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
epoch_no = 1, batch_step_no = 25,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
Y_train_one_hot.shpae = (2000, 100)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
MSE_Loss.shape = (2000,)
求均值后:MSE_Loss.shape = ()
第1个epoch-->第25个batch step的初始时的:MSE_Loss = 4.540274143218994
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):开始
梯度下降步骤-->optimizer.apply_gradients(zip(grads, network.trainable_variables)):结束
++++++++++++++++++++++++++++++++++++++++++++第1轮Epoch-->Training 阶段:结束++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++第1轮Epoch-->Evluation 阶段:开始++++++++++++++++++++++++++++++++++++++++++++
epoch_no = 1, batch_step_no = 1,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
is_correct_count = 51.0
epoch_no = 1, batch_step_no = 2,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
is_correct_count = 39.0
epoch_no = 1, batch_step_no = 3,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
is_correct_count = 41.0
epoch_no = 1, batch_step_no = 4,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
is_correct_count = 43.0
epoch_no = 1, batch_step_no = 5,X_batch.shpae = (2000, 32, 32, 3),Y_batch.shpae = (2000,)
out_logits_conv.shape = (2000, 1, 1, 512)
Reshape之后:out_logits_conv.shape = (2000, 512)
out_logits_fullcon.shape = (2000, 100)
is_correct_count = 47.0
total_correct = 221---total_num = 10000
第1轮Epoch迭代的准确度: acc = 0.0221
++++++++++++++++++++++++++++++++++++++++++++第1轮Epoch-->Evluation 阶段:结束++++++++++++++++++++++++++++++++++++++++++++
利用整体数据集进行模型的第1轮Epoch迭代结束:**********************************************************************************************************************************
Process finished with exit code 0