TensorFlow Implementations of Classic Deep Learning Networks (1): AlexNet in TensorFlow

    The classic convolutional neural network introduced in this article is AlexNet, proposed by Alex Krizhevsky, a student of Hinton. Published in 2012, AlexNet is a true classic: it can be seen as a deeper and wider version of LeNet, it achieved the best ImageNet result that year, and it ignited the deep learning boom. In the years that followed, many deeper networks were proposed, such as the excellent VGGNet, Google Inception Net, and ResNet. AlexNet carried the ideas of LeNet forward, applying the basic principles of CNNs to a much deeper and wider network. It can be regarded as the first major statement of neural networks after their long trough: it established the dominance of deep learning in computer vision, decisively outperforming traditional hand-crafted feature methods, and freed computer vision practitioners from heavy feature engineering, shifting the field toward automatically learning the required features from data, i.e. becoming data-driven.

[Figure: AlexNet network architecture]

         AlexNet contains eight weight layers in total: the first five are convolutional layers and the last three are fully connected layers. As shown in the figure above, an LRN layer follows the first and the second convolutional layer; later work has shown that LRN is not an indispensable part of a CNN, and some networks even perform worse after adding it (readers can verify this themselves). A max-pooling layer follows each of the two LRN layers as well as the last convolutional layer, and every weight layer is followed by a ReLU activation. Dropout is applied after the fully connected layers: it randomly ignores a fraction of the units during training, which helps reduce overfitting.
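
        The benchmark code later in this article only builds the network up to pool5, so the fully connected part is omitted there. As a minimal sketch, assuming the conventional 4096-4096-1000 layer sizes from the original paper, the [batch, 6, 6, 256] pool5 output produced by the code below, and `import tensorflow as tf` as in the full listing, the three fully connected layers with Dropout might look like the following (the function name fc_layers and the initialization constants are illustrative choices of my own):

# A sketch only, not part of the benchmark below. Layer sizes (4096, 4096, 1000)
# follow the original AlexNet paper; the flattened size 6*6*256 matches the
# pool5 output of the inference() function defined later in this article.
def fc_layers(pool5, keep_prob):
    # Flatten the pool5 feature map into one vector per example
    flattened = tf.reshape(pool5, [-1, 6 * 6 * 256])
    with tf.name_scope('fc6') as scope:
        weights = tf.Variable(tf.truncated_normal([6 * 6 * 256, 4096],
                                                  dtype=tf.float32, stddev=1e-2), name='weights')
        biases = tf.Variable(tf.constant(0.1, shape=[4096], dtype=tf.float32), name='biases')
        fc6 = tf.nn.relu(tf.matmul(flattened, weights) + biases, name=scope)
        fc6_drop = tf.nn.dropout(fc6, keep_prob)  # randomly drop units to reduce overfitting
    with tf.name_scope('fc7') as scope:
        weights = tf.Variable(tf.truncated_normal([4096, 4096],
                                                  dtype=tf.float32, stddev=1e-2), name='weights')
        biases = tf.Variable(tf.constant(0.1, shape=[4096], dtype=tf.float32), name='biases')
        fc7 = tf.nn.relu(tf.matmul(fc6_drop, weights) + biases, name=scope)
        fc7_drop = tf.nn.dropout(fc7, keep_prob)
    with tf.name_scope('fc8') as scope:
        weights = tf.Variable(tf.truncated_normal([4096, 1000],
                                                  dtype=tf.float32, stddev=1e-2), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[1000], dtype=tf.float32), name='biases')
        logits = tf.matmul(fc7_drop, weights) + biases  # raw class scores, no ReLU here
    return logits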

         The main new techniques used in AlexNet include:

        (1) Training what was then the largest convolutional neural network; by comparison, the earlier LeNet-5 has only 3 convolutional layers and 1 fully connected layer;

        (2) Successfully applying many tricks, such as Dropout, ReLU, data augmentation, and max pooling, to address the problems that arise when training deep networks, giving the network its excellent performance (a short augmentation sketch follows this list);

        (3) Implementing an efficient GPU convolution structure.
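
        As an illustration of the data augmentation mentioned in (2), here is a minimal sketch; the helper name augment is my own, it assumes a single decoded image tensor of shape [256, 256, 3] and `import tensorflow as tf` as in the full listing below, and it is not used by the benchmark code in this article:

# A sketch only: random 224x224 crop plus horizontal flip, the two augmentations
# most commonly associated with AlexNet-style training. `image` is assumed to be
# a decoded [256, 256, 3] float tensor.
def augment(image):
    cropped = tf.random_crop(image, [224, 224, 3])        # take a random 224x224 patch
    flipped = tf.image.random_flip_left_right(cropped)    # randomly mirror horizontally
    return flipped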

        Because training on the ImageNet dataset is very time-consuming, this article only runs a speed test of the complete AlexNet network. Interested readers can download the ImageNet dataset themselves for training and testing.

        With the preparation done, we can build the AlexNet network. The following code was put together based on my own understanding and has been verified by me; the comments reflect my own understanding, so please point out any mistakes.

# -*- coding: utf-8 -*-
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# Import common libraries and TensorFlow
from datetime import datetime
import math
import time
import tensorflow as tf

# Set benchmark parameters
batch_size = 32
num_batches = 100

# Helper that displays the network structure: the name and shape of an output tensor
def print_activations(t):
    print(t.op.name, ' ', t.get_shape().as_list())

# Build the AlexNet network structure (convolutional part only)
def inference(images):
    parameters = []
    # conv1
    with tf.name_scope('conv1') as scope:
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64], dtype=tf.float32,
                                                 stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[64], dtype=tf.float32),
                             trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope)
        print_activations(conv1)
        parameters += [kernel, biases]

    # lrn1 and pool1
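    # tf.nn.lrn normalizes each activation across nearby channels; the arguments are
    # depth_radius=4, bias=1.0, alpha=0.001/9.0, beta=0.75 (common TensorFlow example
    # values; the original paper uses slightly different constants)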
    lrn1 = tf.nn.lrn(conv1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='lrn1')
    pool1 = tf.nn.max_pool(lrn1,
                           ksize=[1, 3, 3, 1],
                           strides=[1, 2, 2, 1],
                           padding='VALID',
                           name='pool1')
    print_activations(pool1)

    # conv2
    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype=tf.float32,
                                                 stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[192], dtype=tf.float32),
                             trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
    print_activations(conv2)

    # lrn2 and pool2
    lrn2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='lrn2')
    pool2 = tf.nn.max_pool(lrn2,
                           ksize=[1, 3, 3, 1],
                           strides=[1, 2, 2, 1],
                           padding='VALID',
                           name='pool2')
    print_activations(pool2)

    # conv3
    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384],
                                                 dtype=tf.float32,
                                                 stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32),
                             trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv3)

    # conv4
    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256],
                                                 dtype=tf.float32,
                                                 stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32),
                             trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv4)

    # conv5
    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256],
                                                 dtype=tf.float32,
                                                 stddev=1e-1), name='weights')
        conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding='SAME')
        biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32),
                             trainable=True, name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name=scope)
        parameters += [kernel, biases]
        print_activations(conv5)

    # pool5
    pool5 = tf.nn.max_pool(conv5,
                           ksize=[1, 3, 3, 1],
                           strides=[1, 2, 2, 1],
                           padding='VALID',
                           name='pool5')
    print_activations(pool5)

    return pool5, parameters


# Measure the per-batch computation time of AlexNet
def time_tensorflow_run(session, target, info_string):

    # Warm up for num_steps_burn_in iterations before timing, then print the duration every 10 steps
    num_steps_burn_in = 10
    total_duration = 0.0
    total_duration_squared = 0.0

    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' %
                      (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration
    mn = total_duration / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
          (datetime.now(), info_string, num_batches, mn, sd))


# Main function: run_benchmark
def run_benchmark():
    with tf.Graph().as_default():
        image_size = 224
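        # Random image data stands in for real ImageNet images here, since this
        # script only measures computation speed, not accuracy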
        images = tf.Variable(tf.random_normal([batch_size,
                                               image_size,
                                               image_size, 3],
                                              dtype=tf.float32,
                                              stddev=1e-1))
        pool5, parameters = inference(images)

        init = tf.global_variables_initializer()

        # Start running operations on the Graph.
        config = tf.ConfigProto()
        config.gpu_options.allocator_type = 'BFC'
        sess = tf.Session(config=config)
        sess.run(init)


        # Benchmark the forward and forward-backward passes
        time_tensorflow_run(sess, pool5, "Forward")

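        # Use the L2 loss of pool5 as a dummy scalar objective so that gradients
        # with respect to all parameters can be computed for the backward benchmark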
        objective = tf.nn.l2_loss(pool5)
        grad = tf.gradients(objective, parameters)
        time_tensorflow_run(sess, grad, "Forward-backward")


# Run the main function
run_benchmark()

        Running the program produces output like the following:

conv1   [32, 56, 56, 64]
pool1   [32, 27, 27, 64]
conv2   [32, 27, 27, 192]
pool2   [32, 13, 13, 192]
conv3   [32, 13, 13, 384]
conv4   [32, 13, 13, 256]
conv5   [32, 13, 13, 256]
pool5   [32, 6, 6, 256]
2017-10-12 20:46:36.770538: step 0, duration = 0.456
2017-10-12 20:46:41.341906: step 10, duration = 0.456
2017-10-12 20:46:45.911911: step 20, duration = 0.454
2017-10-12 20:46:50.475582: step 30, duration = 0.455
2017-10-12 20:46:55.042899: step 40, duration = 0.457
2017-10-12 20:46:59.687746: step 50, duration = 0.457
2017-10-12 20:47:04.291188: step 60, duration = 0.461
2017-10-12 20:47:09.032564: step 70, duration = 0.472
2017-10-12 20:47:13.620677: step 80, duration = 0.456
2017-10-12 20:47:18.195206: step 90, duration = 0.454
2017-10-12 20:47:22.303328: Forward across 100 steps, 0.460 +/- 0.009 sec / batch
2017-10-12 20:47:38.932348: step 0, duration = 1.512
2017-10-12 20:47:54.137846: step 10, duration = 1.524
2017-10-12 20:48:09.268915: step 20, duration = 1.519
2017-10-12 20:48:24.390340: step 30, duration = 1.523
2017-10-12 20:48:39.538181: step 40, duration = 1.508
2017-10-12 20:48:54.735946: step 50, duration = 1.553
2017-10-12 20:49:10.095126: step 60, duration = 1.515
2017-10-12 20:49:25.401976: step 70, duration = 1.513
2017-10-12 20:49:40.519296: step 80, duration = 1.506
2017-10-12 20:49:55.673532: step 90, duration = 1.513
2017-10-12 20:50:09.469297: Forward-backward across 100 steps, 1.520 +/- 0.020 sec / batch


        The output above shows the AlexNet layer shapes printed during the run, together with the forward and forward-backward computation times.

        This completes the TensorFlow implementation of AlexNet. The appearance of AlexNet drew enormous attention from the research community, laid the foundation for the rise of deep learning, and spurred the development of more advanced networks. In later posts I will discuss other classic deep learning networks with you.

       In follow-up work I will continue to show the endless fun that TensorFlow and deep learning networks can bring, and explore the mysteries of deep learning together with you. If you are interested, my Weibo will share the latest developments in artificial intelligence, machine learning, deep learning, and computer vision with you.
