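When training a model you typically need many parameters, and some of them (such as filters) are meant to be shared. Rather than defining the same parameters repeatedly and wasting memory, you can use the variable-sharing mechanism that TensorFlow provides: tf.variable_scope() and tf.get_variable(). Consider a simple image filter model with two convolutions: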
import tensorflow as tf  # TensorFlow 1.x API

def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
                         strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)

    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
                         strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
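This model has four different variables: conv1_weights, conv1_biases, conv2_weights, and conv2_biases.
Suppose you want to apply the image filter to two different images, image1 and image2, and you want both to be processed by the same filter with the same parameters. Calling my_image_filter() twice does not do that: it creates two separate sets of variables.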
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
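A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example in a dictionary: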
variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                 name="conv1_weights"),
    "conv1_biases": tf.Variable(tf.zeros([32]), name="conv1_biases"),
    # ... etc. ...
}

def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
                         strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])

    conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
                         strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict["conv2_biases"])

# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
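Creating variables outside the function like this is convenient, but it breaks encapsulation: the calling code has to know the name, type, and shape of every variable the function needs.
One way to solve this is to use classes to build the model, where each class carefully manages the variables it needs. A lighter alternative that avoids classes is TensorFlow's variable scope mechanism, which makes it easy to share named variables while building a graph.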
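def conv_relu(input, kernel_shape, bias_shape):
    # Initializers are assumed here: random normal weights and zero biases,
    # as in the earlier example.
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
                              initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
                             initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
                        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)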
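This function uses the short names "weights" and "biases". We want to use it for both conv1 and conv2, but the variables need distinct names. This is where tf.variable_scope() comes in: it provides a namespace for variables.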
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
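As you can see, tf.get_variable() checks that existing variables are not shared by accident. If you do want to share them, you have to say so by calling reuse_variables() on the scope, as follows.
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)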
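tf.get_variable() is usually called like this:
v = tf.get_variable(name, shape, dtype, initializer)
What the call does depends on the reuse flag of the current variable scope.
① tf.get_variable_scope().reuse == False
In this case the call creates a new variable whose full name is the current variable scope name plus the provided name, after checking that no variable with that full name exists yet. If one already exists, it raises ValueError.
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
assert v.name == "foo/v:0"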
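② tf.get_variable_scope().reuse == True
In this case the call searches for an existing variable whose full name equals the current variable scope name plus the provided name. If no such variable exists, it raises ValueError; if one is found, it is returned. For example: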
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
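To make the two cases concrete, here is a minimal pure-Python sketch of a name-keyed variable registry. It is only a toy model of the create-or-reuse rule described above, not TensorFlow's implementation; ToyVariableStore and its get_variable(scope, name, shape, reuse) signature are hypothetical names made up for this illustration.

```python
class ToyVariableStore:
    """Toy stand-in for a name-keyed variable registry (hypothetical, not TensorFlow code)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, shape, reuse=False):
        # The full name is the scope name plus the provided name.
        full_name = f"{scope}/{name}" if scope else name
        if reuse:
            # Reuse mode: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError(f"Variable {full_name} does not exist")
            return self._vars[full_name]
        # Create mode: the full name must still be free.
        if full_name in self._vars:
            raise ValueError(f"Variable {full_name} already exists")
        variable = {"name": full_name, "shape": list(shape)}
        self._vars[full_name] = variable
        return variable


store = ToyVariableStore()
v = store.get_variable("foo", "v", [1])               # creates "foo/v"
v1 = store.get_variable("foo", "v", [1], reuse=True)  # looks up "foo/v"
print(v1 is v)  # True: reuse mode returns the stored object
```

Calling store.get_variable("foo", "v", [1]) a second time without reuse=True raises ValueError in this sketch, which mirrors the error seen earlier when my_image_filter() was called twice.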
# Variable scopes nest: the full name carries every enclosing scope.
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
assert v.name == "foo/bar/v:0"
The current variable scope can be retrieved with tf.get_variable_scope(), and its reuse flag can be set to True by calling tf.get_variable_scope().reuse_variables():
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
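You can enter a reusing variable scope and then exit it to return to a non-reusing one. The reuse flag is inherited, so once you open a reusing variable scope, all of its sub-scopes reuse as well: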
with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False
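The inheritance rule can be pictured as a stack of scopes in which each new scope copies its parent's reuse flag unless it turns reuse on itself. The following is a rough pure-Python sketch of that idea, assuming a simple stack model; toy_variable_scope and current_scope are hypothetical helpers invented here and say nothing about how TensorFlow stores scopes internally.

```python
from contextlib import contextmanager

# Stack of open scopes; the bottom entry plays the role of the root scope.
_scope_stack = [{"name": "", "reuse": False}]


def current_scope():
    return _scope_stack[-1]


@contextmanager
def toy_variable_scope(name, reuse=None):
    parent = current_scope()
    full_name = f"{parent['name']}/{name}" if parent["name"] else name
    # reuse=True switches reuse on; anything else inherits the parent's flag.
    effective_reuse = True if reuse else parent["reuse"]
    _scope_stack.append({"name": full_name, "reuse": effective_reuse})
    try:
        yield current_scope()
    finally:
        # Leaving the scope restores the parent's settings.
        _scope_stack.pop()


with toy_variable_scope("root"):
    with toy_variable_scope("foo", reuse=True):
        with toy_variable_scope("bar") as bar:
            inner = (bar["name"], bar["reuse"])
    after_exit = current_scope()["reuse"]

print(inner, after_exit)  # ('root/foo/bar', True) False
```

Because the flag lives on the stack entry, popping the reusing scope is all it takes to get back to a non-reusing one.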
You can also capture a variable scope as an object with the as keyword, and later pass that object instead of a name:
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 == v
assert w1 == w

# Opening a captured scope jumps to that scope, regardless of where it is opened.
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.
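A variable scope can also carry a default initializer for all variables created in it. Sub-scopes inherit it and pass it on to each tf.get_variable() call, and it can be overridden: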
# eval() needs a default session and initialized variables.
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Changed default initializer.
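The precedence at work here can be summarized as: an initializer passed to the call wins, otherwise the nearest enclosing scope that sets a default wins. A small pure-Python sketch of that lookup order follows, with plain floats standing in for initializer objects; resolve_initializer is a hypothetical helper written for this note, not a TensorFlow function.

```python
def resolve_initializer(scope_defaults, explicit=None):
    """Pick an initializer: the explicit argument first, else the innermost scope default.

    scope_defaults lists the default of each enclosing scope, outermost first;
    None means that scope did not set one.
    """
    if explicit is not None:
        return explicit
    for default in reversed(scope_defaults):
        if default is not None:
            return default
    return None  # No default anywhere in the chain.


in_foo = resolve_initializer([0.4])                   # scope "foo" sets 0.4
overridden = resolve_initializer([0.4], explicit=0.3) # per-call override
in_bar = resolve_initializer([0.4, None])             # "foo/bar" sets nothing
in_baz = resolve_initializer([0.4, 0.2])              # "foo/baz" sets 0.2

print(in_foo, overridden, in_bar, in_baz)  # 0.4 0.3 0.4 0.2
```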
Opening with tf.variable_scope("name") also implicitly opens a tf.name_scope("name"). For example:
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"

# A name scope opened inside a variable scope affects op names only, not variable names.
with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
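The naming rule in the last example can be written down as a tiny pure-Python function. This is a simplified sketch that assumes all variable scopes are opened before any name scope, and it leaves out the ":0" output suffix; toy_names is a hypothetical helper, not part of TensorFlow.

```python
def toy_names(variable_scopes, name_scopes, var_name, op_name):
    """Return (variable full name, op full name) under the rule described above.

    Variable scopes contribute to both names; name scopes only to the op name.
    """
    variable_full = "/".join(variable_scopes + [var_name])
    op_full = "/".join(variable_scopes + name_scopes + [op_name])
    return variable_full, op_full


names = toy_names(["foo"], ["bar"], "v", "add")
print(names)  # ('foo/v', 'foo/bar/add')
```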