[Translation] TensorBoard

Click here to view the original article.

Training massive deep neural networks with TensorFlow can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we developed a suite of visualization tools called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data, such as images, that pass through it.

Serializing the data

TensorBoard operates by reading TensorFlow events files, which contain the summary data that TensorFlow generates as it runs. Here is the general lifecycle of summary data within TensorBoard.

First, create the TensorFlow graph from which you want to collect summary data, and decide which nodes you would like to annotate with summary operations.

For example, suppose you are training a CNN (convolutional neural network) to recognize MNIST handwritten digits. You might care about how the learning rate varies over time and how the objective function changes. Collect these by attaching tf.summary.scalar ops to the nodes that output the learning rate and the loss, respectively. Then give each scalar_summary a meaningful tag, such as learning rate or loss function.
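
A minimal sketch of this step might look as follows; note that the learning_rate and loss tensors below are placeholder examples, not part of the tutorial's code:

import tensorflow as tf

# Hypothetical scalar tensors; in a real model these would come from your
# optimizer schedule and your loss computation.
learning_rate = tf.constant(0.001)
loss = tf.reduce_mean(tf.square(tf.random_normal([100])))

# Attach a scalar summary op to each node and give it a meaningful tag.
tf.summary.scalar('learning_rate', learning_rate)
tf.summary.scalar('loss_function', loss)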

You might also want to visualize the distribution of activations coming off each layer, or the distribution of gradients or weights. Collect this data by attaching tf.summary.histogram ops to the gradient outputs and to the variables that hold your weights.
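
A similar sketch for histogram summaries; the weights variable and activations tensor here are stand-ins for illustration only:

import tensorflow as tf

# Hypothetical weight variable and activation tensor for one layer.
weights = tf.Variable(tf.truncated_normal([784, 500], stddev=0.1))
activations = tf.nn.relu(tf.matmul(tf.zeros([1, 784]), weights))

# Record the full distribution of weights and activations each time summaries are written.
tf.summary.histogram('layer1/weights', weights)
tf.summary.histogram('layer1/activations', activations)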

For details on all the available summary operations, see the docs on summary operations.

Operations in TensorFlow don't do anything until you run them, or run an op that depends on their output. The summary nodes we just created are peripheral to the graph: none of the ops you are currently running depend on them. So, to generate summaries, we need to run all of these summary nodes. Managing them by hand would be tedious, so use tf.summary.merge_all to combine them into a single op that generates all the summary data.
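
In code, merging is a single call; a sketch, assuming some summary ops (such as the ones above) have already been defined in the graph:

import tensorflow as tf

tf.summary.scalar('example', tf.constant(1.0))  # any previously defined summaries are picked up too

# merge_all gathers every summary op in the graph into one op, so a single
# run of `merged` produces all of the summary data for that step.
merged = tf.summary.merge_all()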

Then you can just run the merged summary op, which will generate a serialized Summary protobuf object containing all of your summary data at a given step. Finally, to write this summary data to disk, pass the summary protobuf to a tf.summary.FileWriter.

The FileWriter takes a logdir in its constructor; this logdir is quite important, as it is the directory where all of the events will be written out. The FileWriter can also optionally take a Graph in its constructor. If it receives a Graph object, TensorBoard will visualize your graph along with tensor shape information. For more details, see Tensor shape information.
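
Putting the last few steps together, here is a minimal, self-contained sketch (the /tmp/example_logs path is an arbitrary placeholder): it builds a tiny graph, runs the merged summary op once, and writes the resulting protobuf to disk.

import tensorflow as tf

# A tiny graph with one summary op, just to have something to serialize.
x = tf.constant(3.0)
tf.summary.scalar('x', x)
merged = tf.summary.merge_all()

with tf.Session() as sess:
  # Passing sess.graph lets TensorBoard render the graph, with tensor shape info.
  writer = tf.summary.FileWriter('/tmp/example_logs', sess.graph)
  summary = sess.run(merged)      # a serialized Summary protobuf for this step
  writer.add_summary(summary, 0)  # write it out, tagged with the global step
  writer.close()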

Now that you've modified your graph and have a FileWriter, you're ready to start running your network. If you want, you could run the merged summary op every single step and record a ton of training data, although that is likely more data than you need. Instead, consider running the merged summary op every n steps.

The code example below is a modification of the simple MNIST tutorial, in which we have added some summary ops and run the merged op every step. If you run it and then launch tensorboard --logdir=/tmp/mnist_logs, you'll be able to visualize statistics, such as how the weights and accuracy varied during training. The code below is an excerpt; the full source is here.

def variable_summaries(var):
  """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
  with tf.name_scope('summaries'):
    mean = tf.reduce_mean(var)
    tf.summary.scalar('mean', mean)
    with tf.name_scope('stddev'):
      stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
    tf.summary.scalar('stddev', stddev)
    tf.summary.scalar('max', tf.reduce_max(var))
    tf.summary.scalar('min', tf.reduce_min(var))
    tf.summary.histogram('histogram', var)

def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu):
  """Reusable code for making a simple neural net layer.

  It does a matrix multiply, bias add, and then uses relu to nonlinearize.
  It also sets up name scoping so that the resultant graph is easy to read,
  and adds a number of summary ops.
  """
  # Adding a name scope ensures logical grouping of the layers in the graph.
  with tf.name_scope(layer_name):
    # This Variable will hold the state of the weights for the layer
    with tf.name_scope('weights'):
      weights = weight_variable([input_dim, output_dim])
      variable_summaries(weights)
    with tf.name_scope('biases'):
      biases = bias_variable([output_dim])
      variable_summaries(biases)
    with tf.name_scope('Wx_plus_b'):
      preactivate = tf.matmul(input_tensor, weights) + biases
      tf.summary.histogram('pre_activations', preactivate)
    activations = act(preactivate, name='activation')
    tf.summary.histogram('activations', activations)
    return activations

hidden1 = nn_layer(x, 784, 500, 'layer1')

with tf.name_scope('dropout'):
  keep_prob = tf.placeholder(tf.float32)
  tf.summary.scalar('dropout_keep_probability', keep_prob)
  dropped = tf.nn.dropout(hidden1, keep_prob)

# Do not apply softmax activation yet, see below.
y = nn_layer(dropped, 500, 10, 'layer2', act=tf.identity)

with tf.name_scope('cross_entropy'):
  # The raw formulation of cross-entropy,
  #
  # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)),
  #                               reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the
  # raw outputs of the nn_layer above, and then average across
  # the batch.
  diff = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
  with tf.name_scope('total'):
    cross_entropy = tf.reduce_mean(diff)
tf.summary.scalar('cross_entropy', cross_entropy)

with tf.name_scope('train'):
  train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(
      cross_entropy)

with tf.name_scope('accuracy'):
  with tf.name_scope('correct_prediction'):
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
  with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
tf.summary.scalar('accuracy', accuracy)

# Merge all the summaries and write them out to /tmp/mnist_logs (by default)
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
                                      sess.graph)
test_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/test')
tf.global_variables_initializer().run()

After initializing the FileWriters, we have to add summaries to them as we train and test the model.

# Train the model, and also write summaries.
# Every 10th step, measure test-set accuracy, and write test summaries
# All other steps, run train_step on training data, & add training summaries

def feed_dict(train):
  """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
  if train or FLAGS.fake_data:
    xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
    k = FLAGS.dropout
  else:
    xs, ys = mnist.test.images, mnist.test.labels
    k = 1.0
  return {x: xs, y_: ys, keep_prob: k}

for i in range(FLAGS.max_steps):
  if i % 10 == 0:  # Record summaries and test-set accuracy
    summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
    test_writer.add_summary(summary, i)
    print('Accuracy at step %s: %s' % (i, acc))
  else:  # Record train set summaries, and train
    summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
    train_writer.add_summary(summary, i)

You are now all set to visualize this data using TensorBoard.

Launching TensorBoard

To start TensorBoard, run the following command (or, alternatively, python -m tensorflow.tensorboard):

tensorboard --logdir=path/to/log-directory

Here logdir points to the directory where the FileWriter serialized its data. If the logdir directory contains subdirectories with serialized data from separate runs, TensorBoard will visualize the data from all of those runs. Once TensorBoard is running, navigate your browser to localhost:6006 to view it.

When you open TensorBoard, you will see the navigation tabs in the top right corner. Each tab represents a set of serialized data that can be visualized.

For more on how to use the graph tab to visualize your graph, see here.

For more information on TensorBoard, see here.
