Learning TensorFlow in Practice
Table of Contents
Loss function (loss)
1. Activation functions (activation function)
2. NN complexity: usually measured by the number of NN layers and the number of NN parameters
NN optimization goal: minimize the loss
Cross-entropy (ce)
The softmax function
Learning rate (learning_rate): the magnitude of each parameter update
Exponentially decaying learning rate
Introducing an activation function avoids the output being a pure linear combination XW, improves the model's expressive power, and makes the model more discriminative.
1. relu activation function, written as tf.nn.relu()
2. sigmoid activation function, written as tf.nn.sigmoid()
3. tanh activation function, written as tf.nn.tanh()
The three can be compared on the same inputs, as in the small sketch below.
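A minimal sketch (not part of the original notes; the input values are made up) that applies the three activations to the same pre-activation values to show their output ranges:
import tensorflow as tf
z = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])   # hypothetical pre-activation values
with tf.Session() as sess:
    print(sess.run(tf.nn.relu(z)))      # negatives clipped to 0: [0. 0. 0. 0.5 2.]
    print(sess.run(tf.nn.sigmoid(z)))   # squashed into (0, 1)
    print(sess.run(tf.nn.tanh(z)))      # squashed into (-1, 1)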
When counting the layers of a neural network, only layers that perform computation are counted, so the input layer is not counted.
Number of layers = number of hidden layers + 1 output layer
Total parameters = total W + total b
e.g. for a network with 3 inputs, one 4-node hidden layer and 2 outputs: 3*4+4 + 4*2+2 = 26
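As a quick sanity check of that count, a sketch (assuming the same 3-input, 4-node-hidden, 2-output network) that sums the sizes of the trainable variables:
import numpy as np
import tensorflow as tf
w1 = tf.Variable(tf.random_normal([3, 4]))   # hidden layer weights: 3*4
b1 = tf.Variable(tf.zeros([4]))              # hidden layer biases:  4
w2 = tf.Variable(tf.random_normal([4, 2]))   # output layer weights: 4*2
b2 = tf.Variable(tf.zeros([2]))              # output layer biases:  2
total = sum(int(np.prod(v.get_shape().as_list())) for v in tf.trainable_variables())
print(total)   # 12 + 4 + 8 + 2 = 26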
Mainstream ways to compute the loss:
——————————————————————————————————————————
Construct a dataset X, Y_: X contains x1 and x2, y_ = x1 + x2 with noise in [-0.05, 0.05]; fit a function that can predict y.
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8   # number of samples fed to the network at a time
seed = 23455
rdm = np.random.RandomState(seed)   # reproducible random generator
X = rdm.rand(32, 2)                 # 32 samples, each with features x1 and x2
Y_ = [[x1 + x2 + (rdm.rand()/10.0 - 0.05)] for (x1, x2) in X]   # labels with noise in [-0.05, 0.05]
Define the network's inputs, outputs and the forward propagation:
x = tf.placeholder(tf.float32, shape=(None, 2))
y_ = tf.placeholder(tf.float32, shape=(None, 1))
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1))
y = tf.matmul(x, w1)                              # forward propagation
loss_mse = tf.reduce_mean(tf.square(y_ - y))      # mean squared error loss
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)
Create a session and train for STEPS rounds:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 32
        end = (i * BATCH_SIZE) % 32 + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 431 == 0:
            print(start, end)
            print("after", i)
            print(sess.run(w1))
    print("final", sess.run(w1))
——————————————————————————————————————
Cross-entropy characterizes the distance between two probability distributions.
e.g.: the known answer is y_=(1, 0); of the predictions y1=(0.6, 0.4) and y2=(0.8, 0.2), which is closer to the standard answer? (Checked numerically below.)
ce = -tf.reduce_mean(y_ * tf.log(tf.clip_by_value(y, 1e-12, 1.0)))
# values of y below 1e-12 are clipped to 1e-12, values above 1.0 are clipped to 1.0
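A quick numeric check of the example above in plain numpy, using the standard per-example cross-entropy -sum(y_ * log(y)) (the TF line above averages instead of summing, which only rescales the value by a constant):
import numpy as np
y_ = np.array([1.0, 0.0])
for y in (np.array([0.6, 0.4]), np.array([0.8, 0.2])):
    print(-np.sum(y_ * np.log(np.clip(y, 1e-12, 1.0))))
# prints ~0.51 for y1=(0.6, 0.4) and ~0.22 for y2=(0.8, 0.2), so y2 is closer to the standard answer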
Passing the n outputs of an n-class network through the softmax function makes them satisfy the requirements of a probability distribution: every element lies in (0, 1) and all elements sum to 1 (softmax(y)_i = e^{y_i} / sum_j e^{y_j}).
After the outputs go through softmax to form a probability distribution, the cross-entropy with the standard answer is taken; the result cem is the loss.
ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))   # softmax + cross-entropy in one op; labels are class indices
cem = tf.reduce_mean(ce)   # average over the batch -> loss
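A rough numpy sketch (the logits are made-up values) of what softmax does to the raw outputs before the cross-entropy is taken:
import numpy as np
logits = np.array([2.0, 1.0, 0.1])                # hypothetical raw outputs of an n-class network
probs = np.exp(logits) / np.sum(np.exp(logits))   # softmax: e^{y_i} / sum_j e^{y_j}
print(probs, probs.sum())                         # roughly [0.66 0.24 0.10], sums to 1.0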
updated parameter = current parameter - learning_rate * gradient of the loss function (its derivative with respect to the parameter)
Set loss = square(w+1) with initial value w = 5, and use backpropagation to find the optimal w, i.e. the w giving the smallest loss; the optimum should be w = -1.
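A one-step hand check of the update rule, using the same loss and the learning_rate = 0.2 run shown below; the gradient of square(w+1) is 2(w+1):
w, lr = 5.0, 0.2
grad = 2 * (w + 1)     # d/dw (w+1)^2 = 2(w+1) = 12
w = w - lr * grad      # 5 - 0.2*12 = 2.6, matching "after 0  w is 2.6" in the output below
print(w)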
#coding:utf-8
import tensorflow as tf
w = tf.Variable(tf.constant(5, dtype=tf.float32))
loss = tf.square(w + 1)
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(50):
        sess.run(train_step)
        w_val = sess.run(w)
        loss_val = sess.run(loss)
        if i % 5 == 0:
            print("after ", i, " w is ", w_val, " loss is", loss_val)
#learning_rate=0.2
# after 0 w is 2.6 loss is 12.959999
# after 5 w is -0.720064 loss is 0.07836417
# after 10 w is -0.9782322 loss is 0.0004738369
# after 15 w is -0.99830735 loss is 2.8650732e-06
# after 20 w is -0.9998684 loss is 1.7320417e-08
# after 25 w is -0.99998975 loss is 1.0510348e-10
# after 30 w is -0.9999992 loss is 6.004086e-13
# after 35 w is -0.99999994 loss is 3.5527137e-15
# after 40 w is -0.99999994 loss is 3.5527137e-15
# after 45 w is -0.99999994 loss is 3.5527137e-15
#learning_rate=1
# after 0 w is -7.0 loss is 36.0
# after 5 w is 5.0 loss is 36.0
# after 10 w is -7.0 loss is 36.0
# after 15 w is 5.0 loss is 36.0
# after 20 w is -7.0 loss is 36.0
# after 25 w is 5.0 loss is 36.0
# after 30 w is -7.0 loss is 36.0
# after 35 w is 5.0 loss is 36.0
# after 40 w is -7.0 loss is 36.0
# after 45 w is 5.0 loss is 36.0
If the learning rate is too large, the loss oscillates and does not converge; if it is too small, convergence is slow.
learning_rate = LEARNING_RATE_BASE * LEARNING_RATE_DECAY ^ (global_step/LEARNING_RATE_STEP)
import tensorflow as tf
LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
LEARNING_RATE_STEP = 1   # how many rounds of BATCH_SIZE are fed before the learning rate is updated once
trainable=False means the variable is excluded from training
global_step = tf.Variable(0,trainable=False)
# define the exponentially decaying learning rate
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE,global_step,LEARNING_RATE_STEP,LEARNING_RATE_DECAY,staircase=True)
w=tf.Variable(tf.constant(5,dtype=tf.float32))
loss = tf.square(w+1)
train_step=tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step)
Create a session and train for 40 rounds (a sketch of the loop follows).
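The loop itself is not written out in these notes; a minimal sketch of what it might look like, following the same pattern as the earlier examples:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(40):
        sess.run(train_step)
        # minimize(..., global_step=global_step) increments the counter, which drives the decayed learning_rate
        lr_val, step_val, w_val, loss_val = sess.run([learning_rate, global_step, w, loss])
        print("after", i, "global_step:", step_val, "learning_rate:", lr_val, "w:", w_val, "loss:", loss_val)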