TensorFlow Study Notes 5: Logistic Regression

Logistic Regression with TensorFlow

This tutorial covers training a logistic regression model for binary classification with TensorFlow.

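Introduction

In Linear Regression using TensorFlow, we described how to predict continuous-valued parameters by linearly modeling the system. What if the objective is to decide between two choices? The answer is simple: we are dealing with a classification problem. In this tutorial, the objective is to use logistic regression to decide whether the input image is the digit "0" or the digit "1". In other words, whether it is the digit "1" or not! The full source code is available in the associated GitHub repository.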

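Dataset

The dataset we use in this tutorial is the MNIST dataset. The main dataset consists of 55,000 training images and 10,000 test images. Each image is 28x28x1 and represents a handwritten digit from 0 to 9. We turn each image into a feature vector of size 784. For our setup we use only the images of the digits 0 and 1.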
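As a minimal sketch of where the number 784 comes from, the snippet below flattens a batch of 28x28x1 arrays into feature vectors. The `images` array is a hypothetical stand-in filled with zeros, not real MNIST data.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 5 images of shape 28x28x1.
images = np.zeros((5, 28, 28, 1), dtype=np.float32)

# Flatten each image into one feature vector of length 28 * 28 * 1 = 784.
features = images.reshape(images.shape[0], -1)

print(features.shape)  # (5, 784)
```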

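Logistic Regression

In linear regression, the goal is to predict a continuous value using the linear function $ y = W^{T}x $. In logistic regression, by contrast, we want to predict a binary label $ y \in \{0,1\} $, which calls for a different prediction process than linear regression. In logistic regression, the predicted output is the probability that the input sample belongs to the target class, which in our case is the digit "1". In a binary classification problem, obviously if $ P(x \in \{target\_class\}) = M $, then $ P(x \in \{non\_target\_class\}) = 1 - M $.
So the hypothesis can be written as follows: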

$$ P(y=1|x)=h_{W}(x)={{1}\over{1+exp(-W^{T}x)}}=Sigmoid(W^{T}x) \ \ \ (1) $$

$$ P(y=0|x)=1 - P(y=1|x) = 1 - h_{W}(x) \ \ \ (2) $$
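Equations (1) and (2) can be sketched in a few lines of NumPy. This is only an illustration: the function names `sigmoid` and `predict_proba` and the values of `W` and `x` are assumptions chosen for the example, not part of the tutorial's code.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(W, x):
    """Return (P(y=1|x), P(y=0|x)) for weight vector W and feature vector x."""
    p1 = sigmoid(np.dot(W, x))  # equation (1)
    return p1, 1.0 - p1         # equation (2)

# Illustrative values only, not taken from MNIST.
W = np.array([0.5, -0.25, 1.0])
x = np.array([2.0, 4.0, 0.0])   # W^T x = 1.0 - 1.0 + 0.0 = 0.0
p1, p0 = predict_proba(W, x)
print(p1, p0)  # 0.5 0.5
```

A score of zero sits exactly on the decision boundary, so both classes receive probability 0.5.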

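In the equations above, the Sigmoid function maps the predicted output into probability space, where the values lie in the range $ (0,1) $. The main objective is to find a model whose output is a high probability when the input sample is "1" and a low one otherwise. The key is to design an appropriate cost function that yields a small loss when the output is the desired one and a large loss otherwise.
For a set of data such as $ (x^{i}, y^{i}) $, the cost function can be defined as follows: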

$$ Loss(W) = \sum_{i}{y^{(i)}log{1\over{h_{W}(x^{i})}}+(1-y^{(i)})log{1\over{1-h_{W}(x^{i})}}} $$

As the equation shows, the loss function consists of two terms, and for each sample only one of them is non-zero because the label is binary.
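The following NumPy sketch evaluates the two terms of the loss separately to make that point visible. The predictions `h`, the labels `y`, and the clipping constant `eps` are assumptions added for illustration; `eps` only guards against taking the logarithm of zero.

```python
import numpy as np

def logistic_loss(h, y, eps=1e-12):
    """Sum over samples of y*log(1/h) + (1-y)*log(1/(1-h)), plus both terms."""
    h = np.clip(h, eps, 1.0 - eps)                  # assumed safeguard against log(0)
    term_pos = y * np.log(1.0 / h)                  # active only when y == 1
    term_neg = (1.0 - y) * np.log(1.0 / (1.0 - h))  # active only when y == 0
    return np.sum(term_pos + term_neg), term_pos, term_neg

# Illustrative predictions h_W(x) and binary labels.
h = np.array([0.9, 0.2, 0.6])
y = np.array([1.0, 0.0, 1.0])
loss, term_pos, term_neg = logistic_loss(h, y)

print(term_pos)  # non-zero only for the samples labelled 1
print(term_neg)  # non-zero only for the sample labelled 0
print(loss)
```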

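So far, we have defined the formulation and the objective function of logistic regression. In the next part, we show how to do this in code using mini-batch optimization.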
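Before moving to TensorFlow, here is a minimal NumPy sketch of mini-batch gradient descent on the logistic loss. It is not the tutorial's implementation: it uses plain gradient descent rather than Adam, averages the gradient over each batch, and runs on synthetic toy data. The function name `train_minibatch` and all hyperparameter values are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_minibatch(X, y, batch_size=32, learning_rate=0.1, num_epochs=20, seed=0):
    """Fit the weight vector W by mini-batch gradient descent on the logistic loss."""
    rng = np.random.default_rng(seed)
    W = np.zeros(X.shape[1])
    for _ in range(num_epochs):
        order = rng.permutation(X.shape[0])            # reshuffle the samples every epoch
        for start in range(0, X.shape[0], batch_size):
            idx = order[start:start + batch_size]
            h = sigmoid(X[idx] @ W)                    # predicted P(y=1|x) for the batch
            grad = X[idx].T @ (h - y[idx]) / len(idx)  # gradient of the batch-averaged loss
            W -= learning_rate * grad
    return W

# Synthetic, linearly separable toy data (not MNIST).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

W = train_minibatch(X, y)
accuracy = np.mean((sigmoid(X @ W) > 0.5) == (y == 1))
print(W, accuracy)
```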

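Description of the Overall Process

First, we process the dataset and extract only the "0" and "1" digits. The code for logistic regression is heavily inspired by our post Train a Convolutional Neural Network as a Classifier. We refer to that post for a better understanding of the implementation details. In this tutorial, we only explain how we process the dataset and how we implement logistic regression; the rest is clear from the CNN classifier post mentioned above.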

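How to Do It in Code?

In this part, we explain how to extract the desired samples from the dataset and how to implement logistic regression using Softmax.

Processing the Dataset

First, we need to extract the "0" and "1" digits from the MNIST dataset: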

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", reshape=True, one_hot=False)

########################
### Data Processing ####
########################
# Organize the data and feed it to associated dictionaries.
data={}

data['train/image'] = mnist.train.images
data['train/label'] = mnist.train.labels
data['test/image'] = mnist.test.images
data['test/label'] = mnist.test.labels

# Get only the samples with zero and one label for training.
index_list_train = []
for sample_index in range(data['train/label'].shape[0]):
    label = data['train/label'][sample_index]
    if label == 1 or label == 0:
        index_list_train.append(sample_index)

# Reform the train data structure.
data['train/image'] = mnist.train.images[index_list_train]
data['train/label'] = mnist.train.labels[index_list_train]


# Get only the samples with zero and one label for test set.
index_list_test = []
for sample_index in range(data['test/label'].shape[0]):
    label = data['test/label'][sample_index]
    if label == 1 or label == 0:
        index_list_test.append(sample_index)

# Reform the test data structure.
data['test/image'] = mnist.test.images[index_list_test]
data['test/label'] = mnist.test.labels[index_list_test]

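The code looks verbose but is actually very simple. All we need happens in the two loops that collect the indices of the samples labelled 0 or 1; those indices are then used to extract the desired data samples. Next, we dig into the logistic regression architecture.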
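The same filtering can also be written with a boolean mask instead of an explicit loop. The sketch below is an alternative, not the tutorial's code: the helper name `select_binary_digits` is hypothetical, and the small `labels` and `images` arrays stand in for the MNIST arrays.

```python
import numpy as np

def select_binary_digits(images, labels, keep=(0, 1)):
    """Keep only the samples whose label is one of the values in `keep`."""
    mask = np.isin(labels, keep)
    return images[mask], labels[mask]

# Tiny stand-in arrays (hypothetical values) in place of the MNIST arrays.
labels = np.array([5, 0, 4, 1, 9, 1, 0])
images = np.arange(len(labels) * 784, dtype=np.float32).reshape(len(labels), 784)

binary_images, binary_labels = select_binary_digits(images, labels)
print(binary_labels)        # [0 1 1 0]
print(binary_images.shape)  # (4, 784)
```

With the real data, the call would take `mnist.train.images` and `mnist.train.labels` as arguments.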

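Logistic Regression Implementation

The logistic regression structure simply feeds the input features forward through a fully connected layer whose output has only two classes. The fully connected architecture can be defined as follows: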

###############################################
########### Defining place holders ############
###############################################
image_place = tf.placeholder(tf.float32, shape=([None, num_features]), name='image')
label_place = tf.placeholder(tf.int32, shape=([None,]), name='gt')
label_one_hot = tf.one_hot(label_place, depth=FLAGS.num_classes, axis=-1)
dropout_param = tf.placeholder(tf.float32)

##################################################
########### Model + Loss + Accuracy ##############
##################################################
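# A single fully connected layer with two outputs followed by a Softmax is equivalent to Logistic Regression.
# activation_fn=None keeps the outputs linear; the default for this layer is ReLU.
logits = tf.contrib.layers.fully_connected(inputs=image_place, num_outputs=FLAGS.num_classes, activation_fn=None, scope='fc')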

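The first few lines define the placeholders through which the required values are fed into the graph. See this post for details. The desired loss function can be implemented easily in TensorFlow with the following script: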

# Define loss
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=label_one_hot))

# Accuracy
with tf.name_scope('accuracy'):
    # Evaluate the model
    correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(label_one_hot, 1))

    # Accuracy calculation
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

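The tf.nn.softmax_cross_entropy_with_logits function does the job. It optimizes the previously defined cost function with a subtle difference: it works with two outputs, one per class, so that even when the sample is the digit "0" the probability of the corresponding class should be high. In other words, tf.nn.softmax_cross_entropy_with_logits predicts a probability for each class, and the decision follows directly from those probabilities.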
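The "subtle difference" can be checked numerically. With two classes, the Softmax probability of class 1 equals the Sigmoid of the difference between the two logits, so the Softmax cross-entropy matches the binary loss defined earlier. The NumPy sketch below uses made-up logits and labels and does not call TensorFlow.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits, axis=1, keepdims=True)  # for numerical stability
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative two-class logits for three samples, with their labels.
logits = np.array([[2.0, -1.0], [0.3, 0.8], [-4.0, 1.5]])
labels = np.array([0, 1, 1])

probs = softmax(logits)
p1_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])  # P(y=1|x) from the logit difference

# Cross-entropy from the Softmax outputs versus the binary logistic loss.
softmax_ce = -np.log(probs[np.arange(len(labels)), labels])
binary_ce = -(labels * np.log(p1_sigmoid) + (1 - labels) * np.log(1 - p1_sigmoid))

print(np.allclose(probs[:, 1], p1_sigmoid))  # True
print(np.allclose(softmax_ce, binary_ce))    # True
```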

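Summary

In this tutorial, we described logistic regression and showed how to implement it in code. Instead of making a decision based on the output probability of the target class alone, we extended the problem to a two-class problem in which we predict a probability for each class. In future posts, we will extend this problem to the multi-class case and show that it can be done with a similar approach.

Complete Code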

import os

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

######################################
######### Necessary Flags ############
######################################

tf.app.flags.DEFINE_string(
    'train_path', os.path.dirname(os.path.abspath(__file__)) + '/train_logs',
    'Directory where event logs are written to.')

tf.app.flags.DEFINE_string(
    'checkpoint_path',
    os.path.dirname(os.path.abspath(__file__)) + '/checkpoints',
    'Directory where checkpoints are written to.')

tf.app.flags.DEFINE_integer('max_num_checkpoint', 10,
                            'Maximum number of checkpoints that TensorFlow will keep.')

tf.app.flags.DEFINE_integer('num_classes', 2,
                            'Number of output classes.')

tf.app.flags.DEFINE_integer('batch_size', int(np.power(2, 9)),
                            'Number of samples per training batch.')

tf.app.flags.DEFINE_integer('num_epochs', 10,
                            'Number of epochs for training.')

##########################################
######## Learning rate flags #############
##########################################
tf.app.flags.DEFINE_float('initial_learning_rate', 0.001, 'Initial learning rate.')

tf.app.flags.DEFINE_float(
    'learning_rate_decay_factor', 0.95, 'Learning rate decay factor.')

tf.app.flags.DEFINE_float(
    'num_epochs_per_decay', 1, 'Number of epoch pass to decay learning rate.')

#########################################
########## status flags #################
#########################################
tf.app.flags.DEFINE_boolean('is_training', False,
                            'Training/Testing.')

tf.app.flags.DEFINE_boolean('fine_tuning', False,
                            'Fine tuning is desired or not?.')

tf.app.flags.DEFINE_boolean('online_test', True,
                            'Online test is desired or not?')

tf.app.flags.DEFINE_boolean('allow_soft_placement', True,
                            'Automatically put the variables on CPU if there is no GPU support.')

tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            'Demonstrate which variables are on what device.')

# Store all elements in the FLAGS structure!
FLAGS = tf.app.flags.FLAGS


################################################
################# handling errors!##############
################################################
if not os.path.isabs(FLAGS.train_path):
    raise ValueError('You must assign absolute path for --train_path')

if not os.path.isabs(FLAGS.checkpoint_path):
    raise ValueError('You must assign absolute path for --checkpoint_path')

# Download and get the MNIST dataset (available in tensorflow.contrib.learn.python.learn.datasets.mnist).
# It checks whether MNIST is already downloaded; if not, it downloads and extracts it.
# 'reshape=True' flattens each image into a feature vector; 'one_hot=False' keeps the labels as integers.
mnist = input_data.read_data_sets("./MNIST_data/", reshape=True, one_hot=False)

########################
### Data Processing ####
########################
# Organize the data and feed it to associated dictionaries.
data={}

data['train/image'] = mnist.train.images
data['train/label'] = mnist.train.labels
data['test/image'] = mnist.test.images
data['test/label'] = mnist.test.labels

def extract_samples_Fn(data):
    index_list = []
    for sample_index in range(data.shape[0]):
        label = data[sample_index]
        if label == 1 or label == 0:
            index_list.append(sample_index)
    return index_list


# Get only the samples with zero and one label for training.
index_list_train = extract_samples_Fn(data['train/label'])


# Get only the samples with zero and one label for test set.
index_list_test = extract_samples_Fn(data['test/label'])

# Reform the train data structure.
data['train/image'] = mnist.train.images[index_list_train]
data['train/label'] = mnist.train.labels[index_list_train]

# Reform the test data structure.
data['test/image'] = mnist.test.images[index_list_test]
data['test/label'] = mnist.test.labels[index_list_test]

# Dimensionality of the training set
dimensionality_train = data['train/image'].shape

# Dimensions
num_train_samples = dimensionality_train[0]
num_features = dimensionality_train[1]

#######################################
########## Defining Graph ############
#######################################

graph = tf.Graph()
with graph.as_default():
    ###################################
    ########### Parameters ############
    ###################################

    # global step
    global_step = tf.Variable(0, name="global_step", trainable=False)

    # learning rate policy
    decay_steps = int(num_train_samples / FLAGS.batch_size *
                      FLAGS.num_epochs_per_decay)
    learning_rate = tf.train.exponential_decay(FLAGS.initial_learning_rate,
                                               global_step,
                                               decay_steps,
                                               FLAGS.learning_rate_decay_factor,
                                               staircase=True,
                                               name='exponential_decay_learning_rate')

    ###############################################
    ########### Defining place holders ############
    ###############################################
    image_place = tf.placeholder(tf.float32, shape=([None, num_features]), name='image')
    label_place = tf.placeholder(tf.int32, shape=([None,]), name='gt')
    label_one_hot = tf.one_hot(label_place, depth=FLAGS.num_classes, axis=-1)
    dropout_param = tf.placeholder(tf.float32)

    ##################################################
    ########### Model + Loss + Accuracy ##############
    ##################################################
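    # A single fully connected layer with two outputs followed by a Softmax is equivalent to Logistic Regression.
    # activation_fn=None keeps the outputs linear; the default for this layer is ReLU.
    logits = tf.contrib.layers.fully_connected(inputs=image_place, num_outputs=FLAGS.num_classes, activation_fn=None, scope='fc')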

    # Define loss
    with tf.name_scope('loss'):
        loss_tensor = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=label_one_hot))

    # Accuracy
    # Evaluate the model
    prediction_correct = tf.equal(tf.argmax(logits, 1), tf.argmax(label_one_hot, 1))

    # Accuracy calculation
    accuracy = tf.reduce_mean(tf.cast(prediction_correct, tf.float32))

    #############################################
    ########### training operation ##############
    #############################################

    # Define optimizer by its default values
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

    # 'train_op' is a operation that is run for gradient update on parameters.
    # Each execution of 'train_op' is a training step.
    # By passing 'global_step' to the optimizer, each time that the 'train_op' is run, Tensorflow
    # update the 'global_step' and increment it by one!

    # gradient update.
    with tf.name_scope('train_op'):
        gradients_and_variables = optimizer.compute_gradients(loss_tensor)
        train_op = optimizer.apply_gradients(gradients_and_variables, global_step=global_step)


    ############################################
    ############ Run the Session ###############
    ############################################
    session_conf = tf.ConfigProto(
        allow_soft_placement=FLAGS.allow_soft_placement,
        log_device_placement=FLAGS.log_device_placement)
    sess = tf.Session(graph=graph, config=session_conf)

    with sess.as_default():

        # The saver op.
        saver = tf.train.Saver()

        # Initialize all variables
        sess.run(tf.global_variables_initializer())

        # The prefix for checkpoint files
        checkpoint_prefix = 'model'

        # If the fine-tuning flag is 'True', the model will be restored.
        if FLAGS.fine_tuning:
            saver.restore(sess, os.path.join(FLAGS.checkpoint_path, checkpoint_prefix))
            print("Model restored for fine-tuning...")

        ###################################################################
        ########## Run the training and loop over the batches #############
        ###################################################################

        # go through the batches
        test_accuracy = 0
        for epoch in range(FLAGS.num_epochs):
            total_batch_training = int(data['train/image'].shape[0] / FLAGS.batch_size)

            # go through the batches
            for batch_num in range(total_batch_training):
                #################################################
                ########## Get the training batches #############
                #################################################

                start_idx = batch_num * FLAGS.batch_size
                end_idx = (batch_num + 1) * FLAGS.batch_size

                # Fit training using batch data
                train_batch_data, train_batch_label = data['train/image'][start_idx:end_idx], data['train/label'][
                                                                                             start_idx:end_idx]

                ########################################
                ########## Run the session #############
                ########################################

                # Run optimization op (backprop) and Calculate batch loss and accuracy
                # Each run of 'train_op' increments 'global_step' by one; its value is fetched here.
                batch_loss, _, training_step = sess.run(
                    [loss_tensor, train_op,
                     global_step],
                    feed_dict={image_place: train_batch_data,
                               label_place: train_batch_label,
                               dropout_param: 0.5})

                ########################################
                ########## Write summaries #############
                ########################################


                #################################################
                ########## Plot the progressive bar #############
                #################################################

            print("Epoch " + str(epoch + 1) + ", Training Loss= " + \
                  "{:.5f}".format(batch_loss))

        ###########################################################
        ############ Saving the model checkpoint ##################
        ###########################################################

        # # The model will be saved when the training is done.

        # Create the path for saving the checkpoints.
        if not os.path.exists(FLAGS.checkpoint_path):
            os.makedirs(FLAGS.checkpoint_path)

        # save the model
        save_path = saver.save(sess, os.path.join(FLAGS.checkpoint_path, checkpoint_prefix))
        print("Model saved in file: %s" % save_path)

        ############################################################################
        ########## Run the session for pure evaluation on the test data ############
        ############################################################################

        # The prefix for checkpoint files
        checkpoint_prefix = 'model'

        # Restoring the saved weights.
        saver.restore(sess, os.path.join(FLAGS.checkpoint_path, checkpoint_prefix))
        print("Model restored...")

        # Evaluation of the model
        test_accuracy = 100 * sess.run(accuracy, feed_dict={
            image_place: data['test/image'],
            label_place: data['test/label'],
            dropout_param: 1.})

        print("Final Test Accuracy is %% %.2f" % test_accuracy)
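The learning-rate policy in the listing above uses tf.train.exponential_decay with staircase=True, which multiplies the initial rate by the decay factor once every decay_steps training steps. The plain-Python sketch below reproduces that schedule for illustration. The helper name and the value of `num_train_samples` are assumptions; the real sample count depends on how many "0" and "1" images the training split contains.

```python
def staircase_exponential_decay(initial_lr, global_step, decay_steps, decay_rate):
    """Learning rate after `global_step` steps with a staircase schedule."""
    return initial_lr * decay_rate ** (global_step // decay_steps)

# Flag values from the listing; num_train_samples is a hypothetical figure.
num_train_samples = 11000
batch_size = 512
num_epochs_per_decay = 1
decay_steps = int(num_train_samples / batch_size * num_epochs_per_decay)

for step in (0, decay_steps - 1, decay_steps, 2 * decay_steps):
    lr = staircase_exponential_decay(0.001, step, decay_steps, 0.95)
    print(step, lr)
```

The rate stays constant within each block of decay_steps steps and drops by 5% at the start of the next block, which with these flags is roughly once per epoch.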
