1. The goals of gradient reversal are:
(1) During forward propagation, values are passed through unchanged (the layer acts as an identity).
(2) During backpropagation, the sign of the gradient passed back to the preceding weights is negated, i.e., it points exactly opposite to the direction that would minimize the objective, which achieves the adversarial effect.
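Put differently, if the layer is written as y = flip_gradient(x, l), with grad_y denoting the gradient arriving from the layers above, its behaviour can be sketched as follows (l is the scale factor that appears in the implementation below):

forward:  y = x
backward: grad_x = -l * grad_y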
2. Implementation of gradient reversal:
import tensorflow as tf
from tensorflow.python.framework import ops


class FlipGradientBuilder(object):
    def __init__(self):
        self.num_calls = 0

    def __call__(self, x, l=1.0):
        # Register a new gradient function for each call, so the scale
        # factor l can differ between calls.
        grad_name = "FlipGradient%d" % self.num_calls

        @ops.RegisterGradient(grad_name)
        def _flip_gradients(op, grad):
            # Backward pass: negate (and scale) the incoming gradient.
            return [tf.negative(grad) * l]

        g = tf.get_default_graph()
        with g.gradient_override_map({"Identity": grad_name}):
            # Forward pass is a plain identity; only its gradient is overridden.
            y = tf.identity(x)

        self.num_calls += 1
        return y


flip_gradient = FlipGradientBuilder()
Here:
(1) @ops.RegisterGradient(grad_name) decorates the _flip_gradients(op, grad) function, i.e., it registers a custom gradient for this layer that negates the gradient.
(2) gradient_override_map is the mechanism for computing gradients with a user-defined function. Its argument is a dictionary, which means: for ops created inside the with block, the gradient function registered under each value replaces the default gradient of the op type given by the corresponding key. Here the default gradient of Identity is replaced by the FlipGradient<N> gradient registered above, so the forward pass stays an identity while the backward pass negates the gradient.
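RegisterGradient and gradient_override_map belong to the TensorFlow 1.x graph API. As a rough equivalent, the same layer can be written with tf.custom_gradient; the following is only a sketch assuming TensorFlow 2.x is available (the helper name flip_gradient_v2 is hypothetical), not the code used in this article:

import tensorflow as tf

def flip_gradient_v2(x, l=1.0):
    # Assumed TF 2.x sketch of the gradient reversal layer.
    @tf.custom_gradient
    def _flip(x):
        def grad(dy):
            # Backward pass: negate and scale the incoming gradient.
            return -l * dy
        # Forward pass: plain identity.
        return tf.identity(x), grad
    return _flip(x)

It would be called the same way as flip_gradient below, e.g. dom_reps_grl = flip_gradient_v2(y1).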
3. Testing gradient reversal:
import tensorflow as tf
from flip_gradient import flip_gradient

x = tf.placeholder(tf.float32, shape=[1, 1])
dec = tf.placeholder(tf.float32, shape=[1, 1])

# First layer: y1 = x * w + b
w = tf.get_variable(name='w', shape=[1, 1])
b = tf.get_variable(name='b', shape=[1])
y1 = tf.matmul(x, w) + b

# Gradient reversal layer: identity in the forward pass,
# negated gradient in the backward pass.
dom_reps_grl = flip_gradient(y1)
# dom_reps_grl = tf.negative(y1)

# Second layer: y = dom_reps_grl * w2 + b2
w2 = tf.get_variable(name='w2', shape=[1, 1])
b2 = tf.get_variable(name='b2', shape=[1])
y = tf.matmul(dom_reps_grl, w2) + b2

with tf.Session() as sess:
    a = [[2.]]
    d = [[2.]]
    sess.run(tf.global_variables_initializer())
    print('w2:', sess.run(w2))
    print('b2:', sess.run(b2))
    print('w1:', sess.run(w))

    loss = tf.square(dec - y)
    print('y1:', sess.run(y1, feed_dict={x: a, dec: d}))
    print('dom:', sess.run(dom_reps_grl, feed_dict={x: a, dec: d}))
    print('loss1:', sess.run(loss, feed_dict={x: a, dec: d}))

    # Theoretical gradients of loss w.r.t. w2 and w (the latter ignoring the reversal layer).
    lossw2 = -2 * (dec - y) * (w * x + b)
    lossw = -2 * (dec - y) * (w2 * x)
    print('lossw2:', sess.run(lossw2, feed_dict={x: a, dec: d}))
    print('lossw:', sess.run(lossw, feed_dict={x: a, dec: d}))

    train_op = tf.train.GradientDescentOptimizer(1).minimize(loss)

    # Actual gradients computed by TensorFlow.
    grad_W = tf.gradients(xs=w2, ys=loss)
    grad_b = tf.gradients(xs=b2, ys=loss)
    grad_w1 = tf.gradients(xs=w, ys=loss)
    print('w_grad:', sess.run(grad_W, feed_dict={x: a, dec: d}))
    print('bias_grad:', sess.run(grad_b, feed_dict={x: a, dec: d}))
    print('w1_grad:', sess.run(grad_w1, feed_dict={x: a, dec: d}))

    # Take one gradient-descent step and print the updated variables.
    sess.run(train_op, feed_dict={x: a, dec: d})
    print('w2:', sess.run(w2))
    print('b2:', sess.run(b2))
    print('w1:', sess.run(w))
(1) Define the function (the FlipGradientBuilder / flip_gradient above).
(2) Call the gradient-reversal function on y1 in the forward pass; the forward value of y1 passes through unchanged, and only the gradient flowing back through it is negated:
dom_reps_grl = flip_gradient(y1)
(3) The loss is the squared (L2) error loss = (dec - y)^2, to be minimized. Theoretically,
the gradient of loss with respect to w2 is -2 * (dec - y) * (w * x + b), and
(ignoring the reversal layer) the gradient with respect to w is -2 * (dec - y) * (w2 * x);
these are exactly the expressions lossw2 and lossw in the code. A short derivation is given after this list.
(4) Use tf.gradients() to inspect the actual gradient values:
grad_W = tf.gradients(xs=w2, ys=loss)
grad_w1 = tf.gradients(xs=w, ys=loss)
print('w_grad:', sess.run(grad_W, feed_dict={x: a, dec: d}))
print('w1_grad:', sess.run(grad_w1, feed_dict={x: a, dec: d}))
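Why w2's gradient is unchanged while w's gradient flips sign can be seen from the chain rule (a sketch using the same symbols as the code above):

y1   = x * w + b
dom  = flip_gradient(y1)        # forward: dom = y1
y    = dom * w2 + b2
loss = (dec - y)^2

d(loss)/d(w2) = d(loss)/d(y) * d(y)/d(w2)
              = -2 * (dec - y) * dom
              = -2 * (dec - y) * (w * x + b)      # w2 sits after the reversal layer, so nothing flips

d(loss)/d(w)  = d(loss)/d(y) * d(y)/d(dom) * d(dom)/d(y1) * d(y1)/d(w)
              = -2 * (dec - y) * w2 * (-1) * x    # the reversal layer contributes the factor -1
              = +2 * (dec - y) * (w2 * x)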
4. Results
Theoretically the gradient with respect to w2 is 2.8316033, and the actual value is 2.8316033.
Theoretically (without the reversal) the gradient with respect to w1 is 4.7898517, while the actual value is -4.7898517: the gradient flowing to the weights before y1 has been successfully reversed.