Training and Evaluation Routines in TF-Slim

Training Routines in TF-Slim

Training a TensorFlow model requires:
- a model represented as a computational graph.
- a loss function to minimize.
- the gradients of the model weights with respect to the loss, used to backpropagate the error signal.
- a training routine that iteratively performs all of the above and updates the weights accordingly.

# TF-Slim ships with TensorFlow 1.x under tf.contrib.slim
import tensorflow as tf

slim = tf.contrib.slim

# Load data
images, labels = LoadData(...)

# Create a model and make predictions
predictions = MyModel(images)

# Define a loss function
slim.losses.log_loss(predictions, labels)

# Get the total loss (model loss plus regularization losses)
total_loss = slim.losses.get_total_loss()

# Define an optimization method (SGD, Momentum, RMSProp, AdaGrad, Adam, ...)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that at each step we compute the loss,
# compute the gradients and run the update_ops
train_op = slim.learning.create_train_op(total_loss, optimizer)

# Where checkpoints and event files are stored.
logdir = "/logdir/path" 

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000, # number of gradient steps
    save_summaries_secs=60, # compute summaries every 60 secs
    save_interval_secs=300) # save model checkpoint every 5 min
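
get_total_loss() sums the losses created through slim.losses together with any regularization losses registered by the layers. Extra hand-built loss terms can be folded into the same total; below is a minimal sketch, continuing the snippet above and assuming a hypothetical auxiliary prediction tensor aux_predictions:

# Hypothetical auxiliary loss term, computed alongside the log loss above.
aux_loss = slim.losses.mean_squared_error(aux_predictions, labels)

# Losses created via slim.losses (or registered with slim.losses.add_loss)
# are collected automatically, so the total becomes:
#   log_loss + aux_loss + sum of the regularization losses
total_loss = slim.losses.get_total_loss(add_regularization_losses=True)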

Evaluation Routines in TF-Slim

It is important to monitor the ‘health’ of training, because the optimization can go wrong in several ways, for example:

  • overfitting (use early stopping, regularization, more data)
  • vanishing or exploding gradients (clip the gradient norm, as sketched after this list; change activation function, residual skip connections)
  • non-converging learning (bad initialization, large learning rate, tune optimizer, bug in network)
  • reaching local minima (update learning rate, dropout)
  • covariate shift in very deep network (batch normalization)
  • low performance, high bias (modify model architecture, larger network)
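
Some of these remedies plug directly into the training routine shown above. For example, gradient clipping can be enabled through the clip_gradient_norm argument of create_train_op; a minimal sketch (the threshold 4.0 is only an illustrative value):

# Clip gradients whose norm exceeds the threshold before the update is applied.
# clip_gradient_norm=0 (the default) disables clipping; 4.0 is illustrative.
train_op = slim.learning.create_train_op(
    total_loss,
    optimizer,
    clip_gradient_norm=4.0)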

Evaluation for a Single Run

In the simplest use case, we use the model to create predictions, specify the metrics to compute, and finally call the evaluation method: slim.evaluation.evaluation() performs a single evaluation run.

# Create model and obtain the predictions:
images, labels = LoadData(...)
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "accuracy": slim.metrics.streaming_accuracy(predictions, labels),
    "mse": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Initialize variables
initial_op = tf.group(
    tf.global_variables_initializer(),
    tf.local_variables_initializer())

with tf.Session() as sess:
    # Run evaluation
    metric_values = slim.evaluation.evaluation(
        sess,
        num_evals=10,
        initial_op=initial_op,
        eval_op=list(names_to_updates.values()),
        final_op=list(names_to_values.values()))

    # Print the final metric values
    for metric, value in zip(names_to_values.keys(), metric_values):
        tf.logging.info('Metric %s has value: %f', metric, value)
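
For context on what aggregate_metric_map is doing: every slim streaming metric returns a (value, update_op) pair, and the map simply collects those pairs into the two dictionaries used above. A minimal sketch of the underlying pattern:

# A streaming metric returns a (value, update_op) pair: running update_op
# accumulates statistics batch by batch, while value reads out the aggregate.
accuracy_value, accuracy_update = slim.metrics.streaming_accuracy(
    predictions, labels)

# slim.evaluation.evaluation runs eval_op (the update ops) num_evals times,
# then runs final_op (the value tensors) once to obtain the final results.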

Evaluating a Checkpointed Model with Metrics

Often, one wants to evaluate a model checkpoint saved on disk.

The evaluation can be performed periodically during training on a set schedule.

Instead of calling the evaluation() method, we now call the evaluation_loop() method, additionally providing the checkpoint and logging directories as well as an evaluation time interval.

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})


# Define the summaries to write and collect them so they can be merged below:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
    summary_ops.append(tf.summary.scalar(metric_name, metric_value))

# Define other summaries to write (loss, activations, gradients)
# and append them to summary_ops as well
summary_ops.append(tf.summary.scalar(...))
summary_ops.append(tf.summary.histogram(...))

checkpoint_dir = '/tmp/my_model_dir/'
log_dir = '/tmp/my_model_eval/'

# evaluate for 1000 batches:
num_evals = 1000

# Setup the global step.
slim.get_or_create_global_step()

slim.evaluation.evaluation_loop(
    master='',
    checkpoint_dir=checkpoint_dir,
    logdir=log_dir,
    num_evals=num_evals,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops), # merge the list of summary operations
    eval_interval_secs=600) # how often to run the evaluation
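
By default, evaluation_loop() keeps evaluating every eval_interval_secs until the process is stopped. If you want it to return after a fixed number of evaluations, or to keep the evaluator off the GPU used for training, the call also accepts max_number_of_evaluations and session_config; a minimal sketch, assuming your TF-Slim version exposes both arguments:

# Evaluate the latest checkpoints a limited number of times, on CPU only.
slim.evaluation.evaluation_loop(
    master='',
    checkpoint_dir=checkpoint_dir,
    logdir=log_dir,
    num_evals=num_evals,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=600,
    max_number_of_evaluations=5,  # return after 5 evaluations instead of looping forever
    session_config=tf.ConfigProto(device_count={'GPU': 0}))  # CPU-only session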

Evaluating at a Given Checkpoint

When a model has already been trained and we only wish to evaluate it from its latest checkpoint, TF-Slim provides the evaluate_once() method. It evaluates the model once, at the given checkpoint path.

logits, nodes = CNN_model(inputs, dropout=0.5, is_training=False)
predictions = tf.argmax(logits, 1)

# Define streaming metrics
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'eval/Accuracy': slim.metrics.streaming_accuracy(predictions, targets),
    'eval/Recall@3': slim.metrics.streaming_sparse_recall_at_k(
        tf.to_float(logits), tf.expand_dims(targets, 1), 3),
    'eval/Precision': slim.metrics.streaming_precision(predictions, targets),
    'eval/Recall': slim.metrics.streaming_recall(predictions, targets)
})


print('Running evaluation Loop...')
# Only load latest checkpoint
checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)

metric_values = slim.evaluation.evaluate_once(
    num_evals=num_evals,
    master='',
    checkpoint_path=checkpoint_path,
    logdir=checkpoint_dir,
    eval_op=list(names_to_updates.values()),
    final_op=list(names_to_values.values()))

# Print the final metric values
metric_results = dict(zip(names_to_values.keys(), metric_values))
for name, value in metric_results.items():
    print('%s: %f' % (name, value))
