Logistic Regression: A TensorFlow Source Code Implementation


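Logistic regression is a probabilistic linear classifier. It is parameterized by a weight matrix W and a bias vector b. Classification works by projecting an input vector onto a set of hyperplanes, each of which corresponds to one class. The distance from the input to a hyperplane reflects the probability that the input belongs to the corresponding class.
Mathematically, the probability that an input vector x belongs to class i, that is, that the random variable Y takes the value i, is defined as: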

$$P(Y=i \mid x, W, b) = \mathrm{softmax}_i(Wx+b) = \frac{e^{W_i x + b_i}}{\sum_j e^{W_j x + b_j}}$$
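The softmax definition above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `softmax` and `class_probabilities` and the toy numbers are made up for this example, and W is stored as one row per class to match the notation W_i x + b_i.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(W, b, x):
    """Return P(Y=i | x, W, b) for every class i (W holds one row per class)."""
    scores = [sum(w_ij * x_j for w_ij, x_j in zip(W_i, x)) + b_i
              for W_i, b_i in zip(W, b)]
    return softmax(scores)

# Toy example (hypothetical numbers): 3 classes, 2 input features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [2.0, 1.0]
probs = class_probabilities(W, b, x)
print(probs)  # three probabilities; the first class has the largest score
```

The scores here are 2, 1 and -3, so the first class receives most of the probability mass.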

The model's prediction y_pred is the class with the highest probability:
$$y_{pred} = \operatorname{argmax}_i P(Y=i \mid x, W, b)$$
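As a small illustration of the argmax rule, the sketch below picks the predicted class from a list of class probabilities. The function name `predict` and the probability values are hypothetical.

```python
def predict(probs):
    """Return the index of the largest probability (the argmax)."""
    # On ties, max() returns the first index that reaches the maximum.
    return max(range(len(probs)), key=lambda i: probs[i])

# Hypothetical class probabilities for a single input.
probs = [0.05, 0.70, 0.25]
y_pred = predict(probs)
print(y_pred)  # prints 1
```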

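To learn the optimal model parameters, we define a loss function and minimize it. For multi-class classification, the negative log-likelihood is the usual choice; minimizing it is equivalent to maximizing the likelihood of the data set D under the model parameters θ. We first define the log-likelihood L and the loss ℓ:
$$L(\theta=\{W,b\}, D) = \sum_{i=0}^{|D|} \log\left(P(Y=y^{(i)} \mid x^{(i)}, W, b)\right)$$

$$\ell(\theta=\{W,b\}, D) = -L(\theta=\{W,b\}, D)$$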
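To make the loss concrete, here is a minimal sketch that evaluates the negative log-likelihood for a tiny, made-up data set. The function name and the numbers are assumptions for illustration. Note that the TensorFlow script in this post averages the per-example losses with `tf.reduce_mean` instead of summing them, which only rescales the loss by the batch size.

```python
import math

def negative_log_likelihood(probs_per_example, labels):
    """Sum of -log P(Y = y | x) over all examples.

    probs_per_example[k] is the predicted class distribution for example k,
    and labels[k] is the index of its true class.
    """
    log_likelihood = 0.0
    for probs, y in zip(probs_per_example, labels):
        log_likelihood += math.log(probs[y])
    return -log_likelihood

# Two hypothetical examples with three classes each.
probs_per_example = [[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]]
labels = [0, 1]
loss = negative_log_likelihood(probs_per_example, labels)
print(loss)  # equals -(log 0.7 + log 0.8)
```

The loss shrinks toward 0 as the probability assigned to each true class approaches 1.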

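In machine learning and deep learning, training generally comes down to minimizing a loss function in order to obtain the model's optimal parameters and improve its accuracy. Gradient descent is among the simplest general-purpose methods for minimizing a differentiable nonlinear function.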
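The idea can be sketched without any framework. The code below is a rough illustration, not the TensorFlow implementation: it runs full-batch gradient descent on the mean negative log-likelihood of a softmax classifier, using the standard gradient p_i - 1[y = i] with respect to each class score. The toy data, the learning rate of 0.1 and all helper names are assumptions made for this example.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, b, x):
    """Compute W_i x + b_i for every class i (W holds one row per class)."""
    return [sum(w * v for w, v in zip(W_i, x)) + b_i for W_i, b_i in zip(W, b)]

def mean_nll(W, b, xs, ys):
    """Average negative log-likelihood over the data set."""
    total = 0.0
    for x, y in zip(xs, ys):
        total -= math.log(softmax(class_scores(W, b, x))[y])
    return total / len(xs)

def gradient_step(W, b, xs, ys, learning_rate):
    """Apply one full-batch gradient descent update to W and b in place."""
    n_classes, n_features, n = len(W), len(W[0]), len(xs)
    grad_W = [[0.0] * n_features for _ in range(n_classes)]
    grad_b = [0.0] * n_classes
    for x, y in zip(xs, ys):
        probs = softmax(class_scores(W, b, x))
        for i in range(n_classes):
            # Gradient of the per-example loss with respect to score i.
            delta = probs[i] - (1.0 if i == y else 0.0)
            grad_b[i] += delta / n
            for j in range(n_features):
                grad_W[i][j] += delta * x[j] / n
    for i in range(n_classes):
        b[i] -= learning_rate * grad_b[i]
        for j in range(n_features):
            W[i][j] -= learning_rate * grad_W[i][j]

# Toy, linearly separable data (hypothetical): class 0 when the first feature is larger.
xs = [[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]]
ys = [0, 1, 0, 1]

# Zero initialization, as in the TensorFlow script.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

loss_before = mean_nll(W, b, xs, ys)
for _ in range(100):
    gradient_step(W, b, xs, ys, learning_rate=0.1)
loss_after = mean_nll(W, b, xs, ys)
print(loss_before, loss_after)
```

With zero weights every class is equally likely, so the initial loss is log 2; each update then moves the parameters against the gradient and the loss decreases.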
Below we implement logistic regression with TensorFlow, training on mini-batches of 100 examples:

# -*- coding: utf-8 -*-
'''
@Author: Zhang Zhan
@License: (C) Copyright 2013-2017, Revenue Management R & D Department, JD.COM.
@Email: [email protected]
@Software: PyCharm Community Edition
@File: logistic_regression.py
@Time: 2017/11/20 11:16
@Desc: A logistic regression learning algorithm example using TensorFlow library.
This example is using the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/)
'''

from __future__ import print_function
import tensorflow as tf

# Import MNIST data
# (this helper module ships with TensorFlow 1.x only; it was removed in TensorFlow 2.x)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

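# Construct model
# Inputs are row vectors, so this computes xW + b (the transpose of Wx + b)
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy (mean negative log-likelihood over the batch)
# Note: tf.log(pred) becomes -inf if a probability underflows to 0;
# tf.nn.softmax_cross_entropy_with_logits on the raw logits is a more stable alternative.
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))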
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
