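Single-label classification on MNIST
This experiment trains the same network with several loss functions and compares how each one affects accuracy.
Loss functions tested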
softmax_cross_entropy
hinge_loss
sigmoid_cross_entropy
huber_loss
mean_squared_error
mean_pairwise_squared_error
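To make the six losses concrete, here is a minimal pure-Python sketch of their per-example formulas, applied to a one-hot label and a vector of network outputs. These are the commonly documented formulas and are only meant as an illustration: the function names are chosen to match the list above, but the code does not call TensorFlow, and the way tf.losses reduces the values over a batch may differ from the simple per-example means used here.

```python
import math


def softmax_cross_entropy(onehot, logits):
    # -sum(onehot * log_softmax(logits)); the max is subtracted for stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_z) for t, v in zip(onehot, logits))


def sigmoid_cross_entropy(onehot, logits):
    # Mean over classes of max(x, 0) - x*t + log(1 + exp(-|x|)).
    terms = [max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
             for t, x in zip(onehot, logits)]
    return sum(terms) / len(terms)


def hinge_loss(onehot, logits):
    # Labels are mapped from {0, 1} to {-1, +1}; mean of max(0, 1 - label*x).
    terms = [max(0.0, 1.0 - (2.0 * t - 1.0) * x)
             for t, x in zip(onehot, logits)]
    return sum(terms) / len(terms)


def huber_loss(onehot, preds, delta=1.0):
    # Quadratic for |error| <= delta, linear beyond it.
    terms = []
    for t, p in zip(onehot, preds):
        e = abs(p - t)
        terms.append(0.5 * e * e if e <= delta else delta * e - 0.5 * delta * delta)
    return sum(terms) / len(terms)


def mean_squared_error(onehot, preds):
    return sum((p - t) ** 2 for t, p in zip(onehot, preds)) / len(preds)


def mean_pairwise_squared_error(onehot, preds):
    # Mean over all pairs (i < j) of ((p_i - p_j) - (t_i - t_j))^2.
    d = [p - t for t, p in zip(onehot, preds)]
    pairs = [(d[i] - d[j]) ** 2
             for i in range(len(d)) for j in range(i + 1, len(d))]
    return sum(pairs) / len(pairs)


if __name__ == "__main__":
    label = [0.0, 0.0, 1.0]    # toy one-hot label with 3 classes
    outputs = [0.0, 0.0, 0.0]  # toy network outputs
    for fn in (softmax_cross_entropy, sigmoid_cross_entropy, hinge_loss,
               huber_loss, mean_squared_error, mean_pairwise_squared_error):
        print(fn.__name__, round(fn(label, outputs), 4))
```

Note that the pairwise loss only looks at differences between outputs, so adding the same constant to every output leaves it unchanged; the other regression-style losses penalize each output independently.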
Initial parameters
batch_size 10000
learning_rate 0.01
Activation tf.nn.elu
Training set MNIST
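With a batch size this large, each epoch contains very few parameter updates, which is worth keeping in mind when reading the epoch-10 ranking. The sketch below works out the count. It assumes the training split has 50,000 images and that incomplete final batches are dropped; both are assumptions about the data loader and training loop, not figures stated in these notes.

```python
TRAIN_EXAMPLES = 50_000  # assumption: size of the MNIST training split used here
BATCH_SIZE = 10_000      # from the initial parameters
N_EPOCH = 500            # total epochs in the experiment

# Assumption: a trailing partial batch is skipped, hence floor division.
steps_per_epoch = TRAIN_EXAMPLES // BATCH_SIZE
total_steps = steps_per_epoch * N_EPOCH
steps_at_epoch_10 = steps_per_epoch * 10

print("updates per epoch:", steps_per_epoch)
print("updates after 10 epochs:", steps_at_epoch_10)
print("updates after 500 epochs:", total_steps)
```

Under these assumptions the epoch-10 accuracies reflect only about 50 optimizer steps.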
train_acc ranked by speed of improvement, ordered by the value at epoch 10
softmax_cross_entropy 0.9797
hinge_loss 0.9380
sigmoid_cross_entropy 0.9300
huber_loss 0.9200
mean_squared_error 0.8863
mean_pairwise_squared_error 0.8789
Final train_acc ranking, after 500 epochs
softmax_cross_entropy 0.9999
huber_loss 0.9997
mean_pairwise_squared_error 0.9993
mean_squared_error 0.9986
hinge_loss 0.9978
sigmoid_cross_entropy 0.9974
Final val_acc ranking, after 500 epochs
mean_pairwise_squared_error 0.9841
mean_squared_error 0.9829
sigmoid_cross_entropy 0.9813
huber_loss 0.9794
hinge_loss 0.9792
softmax_cross_entropy 0.9789
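Preliminary conclusions
softmax_cross_entropy gives the best train_acc of the six (0.9999 after 500 epochs) and also rises fastest (0.9797 at epoch 10).
mean_pairwise_squared_error gives the best val_acc of the six (0.9841), although the gap between the best and worst val_acc is only about 0.005.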
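One way to read the two 500-epoch rankings together is to look at the difference between train_acc and val_acc for each loss. The following sketch simply does that arithmetic on the numbers reported above; the helper name generalization_gaps is made up for this illustration, and since each figure comes from a single run, small differences should be interpreted with care.

```python
# (train_acc, val_acc) after 500 epochs, copied from the rankings above.
results = {
    "softmax_cross_entropy": (0.9999, 0.9789),
    "huber_loss": (0.9997, 0.9794),
    "mean_pairwise_squared_error": (0.9993, 0.9841),
    "mean_squared_error": (0.9986, 0.9829),
    "hinge_loss": (0.9978, 0.9792),
    "sigmoid_cross_entropy": (0.9974, 0.9813),
}


def generalization_gaps(table):
    """Return train_acc - val_acc per loss, rounded to 4 decimals."""
    return {name: round(train - val, 4) for name, (train, val) in table.items()}


gaps = generalization_gaps(results)
val_values = [val for _, val in results.values()]
val_spread = round(max(val_values) - min(val_values), 4)

# Print losses from smallest to largest train/val gap.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{name:30s} gap={gap:.4f}")
print(f"val_acc spread across losses: {val_spread:.4f}")
```

On these numbers, the loss with the highest train_acc also shows the largest train/val gap, while the two squared-error losses show the smallest.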
Test network source code
import tensorflow as tf
import tensorlayer as tl
sess = tf.InteractiveSession()
# prepare data
X_train, y_train, X_val, y_val, X_test, y_test = \
    tl.files.load_mnist_dataset(shape=(-1, 784))
# define placeholder
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None], name='y_')
# define the network
network = tl.layers.InputLayer(x, name='input')
network = tl.layers.DenseLayer(network, 400, tf.nn.elu, name='relu1')
network = tl.layers.BatchNormLayer(network, is_train=True, name='batchnorm1')
network = tl.layers.DenseLayer(network, 400, tf.nn.elu, name='relu2')
network = tl.layers.BatchNormLayer(network, is_train=True, name='batchnorm2')
# network = tl.layers.DenseLayer(network, 400, tf.nn.leaky_relu, name='relu3')
# network = tl.layers.BatchNormLayer(network, is_train=True, name='batchnorm3')
# network = tl.layers.DenseLayer(network, 400, tf.nn.leaky_relu, name='relu4')
# network = tl.layers.BatchNormLayer(network, is_train=True, name='batchnorm4')
# network = tl.layers.DenseLayer(network, n_units=10, act=tf.identity, name='output')
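# note: the output layer applies tf.nn.elu, so every loss below receives
# ELU outputs (always greater than -1) rather than raw linear logits
network = tl.layers.DenseLayer(network, n_units=10, act=tf.nn.elu, name='output')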
# define cost function and metric.
# oa, ob = tf.split(network.outputs, 2, 1)
# y = tf.argmax(oa, 1) + tf.argmax(ob, 1)
y = network.outputs
# cost = tf.losses.huber_loss(tf.one_hot(y_, 10), network.outputs)
# cost = tf.losses.hinge_loss(tf.one_hot(y_, 10), network.outputs)
# cost = tf.losses.mean_pairwise_squared_error(tf.one_hot(y_, 10), network.outputs)
# cost = tf.losses.softmax_cross_entropy(tf.one_hot(y_, 10), network.outputs)
# cost = tf.losses.sigmoid_cross_entropy(tf.one_hot(y_, 10), network.outputs)
cost = tf.losses.mean_squared_error(tf.one_hot(y_, 10), y)
correct_prediction = tf.equal(tf.argmax(y, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# define the optimizer
train_params = network.all_params
train_op = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost, var_list=train_params)
# initialize all variables in the session
tl.layers.initialize_global_variables(sess)
# tl.files.load_and_assign_npz(sess, 'model.npz', network)
# print network information
network.print_params()
network.print_layers()
# train the network
tl.utils.fit(
    sess, network, train_op, cost, X_train, y_train, x, y_,
    acc=acc, batch_size=10000, n_epoch=500, print_freq=10,
    X_val=X_val, y_val=y_val, eval_train=True, tensorboard=True)
# evaluation
#tl.utils.test(sess, network, acc, X_test, y_test, x, y_, batch_size=5000, cost=cost)
# save the network to .npz file
tl.files.save_npz(network.all_params, name='model.npz')
sess.close()
Download the raw test results
Tianyi Cloud:
https://cloud.189.cn/t/UfE7FzFbeiIz (test_mnist_tf1.7_multi_losses_result.7z)