1. First, build the graph and save the model and its values:
import tensorflow as tf

input = tf.placeholder(tf.float32, [], 'input')
with tf.name_scope('hans'):
    weights = tf.get_variable('weights', [], tf.float32, tf.ones_initializer(tf.float32))  # weights.name: weights:0 (get_variable ignores name_scope)
    ema = tf.train.ExponentialMovingAverage(0.5)
    # update = tf.assign(weights, 2)
    output = tf.add(input, weights)  # output.name: hans/Add:0
    m_op = ema.apply([output])  # updates the shadow variable hans/Add/ExponentialMovingAverage:0
    with tf.control_dependencies([m_op]):
        y = tf.identity(ema.average(output), 'y')  # y is the moving average of output
print(y.name)  # hans/y:0
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(ema.average(output)))  # the shadow variable for a Tensor starts at 0
    print(sess.run(y, {input: 19}))  # output = 19 + 1 = 20, so the average becomes 0.5*0 + 0.5*20 = 10
    saver.save(sess, './checkpoint/model0.ckpt')  # saves all variables, including the shadow variable
Output:
0.0
10.0
After running the code above, the value of output itself is not saved, because it depends on a placeholder; after loading the model, calling sess.run(output) directly without a feed will raise an error. However, the moving average of output lives in a variable and is saved, so it can be fetched directly via ema.average(output).
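To confirm which values actually end up in the checkpoint, you can list its contents. A minimal sketch, assuming the checkpoint path saved above (tf.train.NewCheckpointReader is the TF 1.x checkpoint-inspection API):

import tensorflow as tf

# Only variables are stored in the checkpoint, so we expect 'weights' and the
# EMA shadow variable here, but no entry for the 'output' tensor itself.
reader = tf.train.NewCheckpointReader('./checkpoint/model0.ckpt')
for name, shape in reader.get_variable_to_shape_map().items():
    print(name, shape, reader.get_tensor(name))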
2. Then load the model, rebuilding the graph with the original code:
import tensorflow as tf

input = tf.placeholder(tf.float32, [], 'input')
with tf.name_scope('hans'):
    weights = tf.get_variable('weights', [], tf.float32, tf.ones_initializer(tf.float32))  # weights.name: weights:0
    ema = tf.train.ExponentialMovingAverage(0.5)
    # update = tf.assign(weights, 2)
    output = tf.add(input, weights)  # output.name: hans/Add:0
    m_op = ema.apply([output])  # re-creates hans/Add/ExponentialMovingAverage:0 in the new graph
    with tf.control_dependencies([m_op]):
        y = tf.identity(ema.average(output), 'y')  # y is the moving average of output
print(y.name)  # hans/y:0
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, './checkpoint/model0.ckpt')
    print(sess.run(ema.average(output)))  # the restored value, i.e. 10
    print(sess.run(y, {input: 100}))      # fetching y runs apply: 0.5*10 + 0.5*101 = 55.5
    print(sess.run(ema.average(output)))  # the shadow variable has already been updated
Output:
10.0
55.5
55.5
So in the test phase, if you want to use the previously saved moving-average value, fetch it directly with ema.average and do not run anything that triggers apply (here, anything that depends on m_op), otherwise the saved average gets overwritten with new updates.
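For ordinary variables (the common case where the EMA is applied to weights rather than to an intermediate tensor), the usual test-time pattern is to restore the saved shadow values straight into the variables with ema.variables_to_restore(), so the test graph never contains apply at all. A minimal sketch under that assumption; the checkpoint path here is hypothetical and would have to contain a weights/ExponentialMovingAverage entry:

import tensorflow as tf

input = tf.placeholder(tf.float32, [], 'input')
weights = tf.get_variable('weights', [], tf.float32, tf.ones_initializer(tf.float32))
output = tf.add(input, weights)

# No apply() in the test graph: the Saver maps the checkpoint's shadow-variable
# names (e.g. 'weights/ExponentialMovingAverage') back onto the plain variables,
# so 'weights' is loaded with its moving-average value.
ema = tf.train.ExponentialMovingAverage(0.5)
saver = tf.train.Saver(ema.variables_to_restore())
with tf.Session() as sess:
    saver.restore(sess, './checkpoint/model_with_weight_ema.ckpt')  # hypothetical checkpoint
    print(sess.run(output, {input: 100}))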
3. Also pay attention to name_scope. The saved moving average is named hans/Add/ExponentialMovingAverage:0; if you drop name_scope('hans') in the test phase, the shadow variable created in the new graph is named Add/ExponentialMovingAverage:0 instead. When the model is loaded, the Saver cannot find a checkpoint entry matching that name, so restoring directly raises an error. The erroneous code:
import tensorflow as tf

input = tf.placeholder(tf.float32, [], 'input')
weights = tf.get_variable('weights', [], tf.float32, tf.ones_initializer(tf.float32))  # weights.name: weights:0
ema = tf.train.ExponentialMovingAverage(0.5)
# update = tf.assign(weights, 2)
output = tf.add(input, weights)  # output.name is now Add:0, no longer hans/Add:0
m_op = ema.apply([output])  # creates Add/ExponentialMovingAverage:0, which does not exist in the checkpoint
with tf.control_dependencies([m_op]):
    y = tf.identity(ema.average(output), 'y')  # y.name: y:0
print(y.name)
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, './checkpoint/model0.ckpt')  # fails: no matching variable name in the checkpoint
    print(sess.run(ema.average(output)))
    print(sess.run(y, {input: 100}))
    print(sess.run(ema.average(output)))
To fix this, either add back the original name_scope('hans'), or give the op its old name directly by changing the corresponding line to:
output = tf.add(input, weights, 'hans/Add')
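Another option, if you do not want to touch the graph at all, is to build the Saver with an explicit mapping from the names stored in the checkpoint to the variables in the current graph. A minimal sketch, assuming the saved name is hans/Add/ExponentialMovingAverage (which can be confirmed with the checkpoint reader from step 1):

import tensorflow as tf

input = tf.placeholder(tf.float32, [], 'input')
weights = tf.get_variable('weights', [], tf.float32, tf.ones_initializer(tf.float32))
ema = tf.train.ExponentialMovingAverage(0.5)
output = tf.add(input, weights)       # Add:0, renamed relative to the checkpoint
m_op = ema.apply([output])            # creates Add/ExponentialMovingAverage:0
shadow_var = ema.average(output)      # the shadow variable in the current graph

# Map checkpoint names to current-graph variables instead of relying on
# the variable names matching.
saver = tf.train.Saver({'hans/Add/ExponentialMovingAverage': shadow_var,
                        'weights': weights})
with tf.Session() as sess:
    saver.restore(sess, './checkpoint/model0.ckpt')
    print(sess.run(shadow_var))  # 10.0, even though the op was not renamed back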
4. In summary, after multi-GPU training you still need to rebuild the network with matching names for testing, otherwise the moving-average values cannot find their corresponding variables and restoring fails. But during multi-GPU training, the moving means and variances kept for batch_normalization on the different towers are not identical, so how should these differing moving averages be handled at test time? One common convention is sketched below.
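One common convention (stated here as an assumption rather than a universal answer) is to share the batch-norm variables across towers with a reused variable scope, so only a single set of moving means/variances exists, and to run the UPDATE_OPS collected from just one tower; the checkpoint then contains one set of moving averages, and the test graph restores that set as usual. A minimal sketch of that convention:

import tensorflow as tf

def tower(x, reuse):
    # All towers share variables, so only one set of BN moving mean/variance exists.
    with tf.variable_scope('model', reuse=reuse):
        h = tf.layers.batch_normalization(x, training=True, name='bn')
        return tf.reduce_mean(tf.square(h))

x = tf.placeholder(tf.float32, [None, 4], 'x')
losses, update_ops = [], None
for i, xi in enumerate(tf.split(x, 2)):
    with tf.device('/gpu:%d' % i):
        losses.append(tower(xi, reuse=(i > 0)))
    if i == 0:
        # Keep only the BN update ops from the first tower; the other towers'
        # batch statistics are simply not used to update the shared averages.
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

loss = tf.reduce_mean(losses)
with tf.control_dependencies(update_ops):
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)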