Logistic Regression and the Gradient Ascent Algorithm

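I. Principles of Logistic Regression

Logistic regression is a generalized linear model. To turn a linear score into a class probability it typically uses the sigmoid function, defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$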

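Here, $z$ is given by:

$$z = w_0 x_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$$

In vector notation this can be written as:

$$z = w^T x$$

Our main task is to find the parameters $w$ that make the classifier as accurate as possible.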
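
As a minimal sketch of the two formulas above, the following NumPy snippet computes z = w^T x and passes it through the sigmoid. The helper name `predict_proba` and the example weights and inputs are illustrative assumptions, not part of the original post.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, x):
    """Hypothetical helper: sigma(w^T x) for one sample x, where x[0] is the constant 1."""
    return sigmoid(np.dot(w, x))

# Illustrative values only (assumptions)
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 0.75])   # x0 = 1 multiplies the bias weight w0
z = np.dot(w, x)                 # 0.5 - 2.0 + 1.5 = 0.0
p = predict_proba(w, x)          # sigmoid(0) = 0.5
print(z, p)
```

A score of z = 0 lands exactly on the decision boundary, which is why the probability comes out as 0.5.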

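II. The Gradient Ascent Algorithm

1. Gradient

The gradient of a function $f(x, y)$ is the vector of its partial derivatives:

$$\nabla f(x, y) = \begin{pmatrix} \dfrac{\partial f(x, y)}{\partial x} \\[2ex] \dfrac{\partial f(x, y)}{\partial y} \end{pmatrix}$$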
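
To make the definition concrete, the sketch below compares an analytic gradient with a finite-difference approximation. The function `f` is an arbitrary example chosen for illustration (an assumption, not taken from the original post), and `numeric_grad` is a hypothetical helper.

```python
import numpy as np

def f(x, y):
    # Illustrative function (an assumption): a concave paraboloid with its maximum at (1, -2)
    return -(x - 1.0) ** 2 - (y + 2.0) ** 2

def grad_f(x, y):
    # Analytic gradient: (df/dx, df/dy)
    return np.array([-2.0 * (x - 1.0), -2.0 * (y + 2.0)])

def numeric_grad(func, x, y, eps=1e-6):
    # Central finite differences approximate each partial derivative
    dfdx = (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    dfdy = (func(x, y + eps) - func(x, y - eps)) / (2 * eps)
    return np.array([dfdx, dfdy])

analytic = grad_f(0.0, 0.0)          # (2, -4): points from (0, 0) toward the maximum
numeric = numeric_grad(f, 0.0, 0.0)
print(analytic, numeric)
```

The gradient points in the direction in which the function increases fastest, which is the property gradient ascent relies on.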

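2. How gradient ascent works

Gradient ascent uses the following update rule, where $\alpha$ is the step size:

$$w := w + \alpha \nabla_w f(w)$$

This update is applied repeatedly until a stopping condition is met, such as reaching a fixed number of iterations or the change in $w$ falling below a tolerance.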
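
The update rule can be sketched in a few lines. The example below maximizes a simple concave function; the function, the step size, the tolerance, and the helper name `gradient_ascent` are all assumptions chosen for illustration.

```python
import numpy as np

def grad_f(w):
    # Gradient of the illustrative function f(w) = -(w0 - 1)^2 - (w1 + 2)^2
    return np.array([-2.0 * (w[0] - 1.0), -2.0 * (w[1] + 2.0)])

def gradient_ascent(grad, w0, alpha=0.1, max_steps=1000, tol=1e-8):
    """Hypothetical helper: repeat w := w + alpha * grad(w) until the step is tiny."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_steps):
        step = alpha * grad(w)
        w = w + step
        if np.linalg.norm(step) < tol:   # stopping condition: negligible update
            break
    return w

w_star = gradient_ascent(grad_f, [0.0, 0.0])
print(w_star)   # close to the maximizer (1, -2)
```

Both stopping conditions appear here: the loop ends after `max_steps` iterations at the latest, or earlier once the update becomes smaller than `tol`.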

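III. Key Points

1. What is the objective function?

Consider a binary classification problem with two classes, class 1 and class 0.
The prediction function is:

$$h_w(x) = \sigma(w^T x) = \frac{1}{1 + e^{-w^T x}}$$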

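The value of $h_w(x)$ is the probability that $y = 1$, so the probabilities of class 1 and class 0 are:

$$P(y = 1 \mid x; w) = h_w(x)$$

$$P(y = 0 \mid x; w) = 1 - h_w(x)$$

The two cases can be combined into a single expression:

$$P(y \mid x; w) = \left(h_w(x)\right)^{y} \left(1 - h_w(x)\right)^{1 - y}$$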

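For $m$ training samples, the likelihood function is:

$$L(w) = \prod_{i=1}^{m} P\left(y^{(i)} \mid x^{(i)}; w\right) = \prod_{i=1}^{m} \left(h_w(x^{(i)})\right)^{y^{(i)}} \left(1 - h_w(x^{(i)})\right)^{1 - y^{(i)}}$$

The log-likelihood function is:

$$l(w) = \log L(w) = \sum_{i=1}^{m} \left( y^{(i)} \log h_w(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_w(x^{(i)})\right) \right)$$

Maximum likelihood estimation seeks the $w$ that maximizes $l(w)$, so our objective function is $l(w)$.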
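
As a rough sketch, the log-likelihood can be evaluated directly from its definition. The function name `log_likelihood` is an assumption; the three rows are copied from the experimental data set in this post, each prefixed with x0 = 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """l(w) = sum_i [ y_i * log(h_i) + (1 - y_i) * log(1 - h_i) ], with h_i = sigmoid(w^T x_i)."""
    h = sigmoid(X @ w)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Three samples from the data set, with a leading 1.0 for the bias term
X = np.array([[1.0, -0.017612, 14.053064],
              [1.0, -1.395634, 4.662541],
              [1.0, -0.752157, 6.538620]])
y = np.array([0, 1, 0])

# With w = 0 every h_i equals 0.5, so l(w) = 3 * log(0.5)
ll_zero = log_likelihood(np.zeros(3), X, y)
print(ll_zero)
```

Note that this direct form can produce `log(0)` when a prediction saturates at exactly 0 or 1; a production implementation would usually guard against that.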

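2. How is the gradient computed?

Using $\sigma'(z) = \sigma(z)\left(1 - \sigma(z)\right)$ and the chain rule:

$$
\begin{aligned}
\frac{\partial l(w)}{\partial w_j}
&= \sum_{i=1}^{m} \left( y^{(i)} \frac{1}{h_w(x^{(i)})} \frac{\partial h_w(x^{(i)})}{\partial w_j} - \left(1 - y^{(i)}\right) \frac{1}{1 - h_w(x^{(i)})} \frac{\partial h_w(x^{(i)})}{\partial w_j} \right) \\
&= \sum_{i=1}^{m} \left( y^{(i)} \frac{1}{\sigma(w^T x^{(i)})} - \left(1 - y^{(i)}\right) \frac{1}{1 - \sigma(w^T x^{(i)})} \right) \frac{\partial \sigma(w^T x^{(i)})}{\partial w_j} \\
&= \sum_{i=1}^{m} \left( y^{(i)} \frac{1}{\sigma(w^T x^{(i)})} - \left(1 - y^{(i)}\right) \frac{1}{1 - \sigma(w^T x^{(i)})} \right) \sigma(w^T x^{(i)}) \left(1 - \sigma(w^T x^{(i)})\right) \frac{\partial\, w^T x^{(i)}}{\partial w_j} \\
&= \sum_{i=1}^{m} \left( y^{(i)} \left(1 - \sigma(w^T x^{(i)})\right) - \left(1 - y^{(i)}\right) \sigma(w^T x^{(i)}) \right) x_j^{(i)} \\
&= \sum_{i=1}^{m} \left( y^{(i)} - \sigma(w^T x^{(i)}) \right) x_j^{(i)} \\
&= \sum_{i=1}^{m} \left( y^{(i)} - h_w(x^{(i)}) \right) x_j^{(i)}
\end{aligned}
$$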
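
The final expression can be checked numerically. In the sketch below, the data, weights, and function names are assumptions made up for illustration; the point is only that the vectorized form X^T (y - h) agrees with a finite-difference approximation of the log-likelihood's partial derivatives.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    h = sigmoid(X @ w)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(w, X, y):
    # dl/dw_j = sum_i (y_i - h_w(x_i)) * x_ij, computed for all j at once as X^T (y - h)
    return X.T @ (y - sigmoid(X @ w))

# Small illustrative problem (all values are assumptions)
X = np.array([[1.0, 0.5, 1.5],
              [1.0, -1.0, 0.3],
              [1.0, 2.0, -0.7],
              [1.0, 0.1, 0.9]])
y = np.array([0.0, 1.0, 1.0, 0.0])
w = np.array([0.1, -0.2, 0.3])

# Central-difference approximation of each partial derivative
eps = 1e-6
numeric = np.array([
    (log_likelihood(w + eps * e, X, y) - log_likelihood(w - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
analytic = gradient(w, X, y)
print(analytic, numeric)
```

This matrix form is the same quantity that a batch update multiplies by the step size in each iteration.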

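IV. Experimental Data

[Figure 1]

The data file data.txt contains one sample per line: two feature values (x1 and x2) followed by the class label (0 or 1), separated by whitespace.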

-0.017612   14.053064   0
-1.395634   4.662541    1
-0.752157   6.538620    0
-1.322371   7.152853    0
0.423363    11.054677   0
0.406704    7.067335    1
0.667394    12.741452   0
-2.460150   6.866805    1
0.569411    9.548755    0
-0.026632   10.427743   0
0.850433    6.920334    1
1.347183    13.175500   0
1.176813    3.167020    1
-1.781871   9.097953    0
-0.566606   5.749003    1
0.931635    1.589505    1
-0.024205   6.151823    1
-0.036453   2.690988    1
-0.196949   0.444165    1
1.014459    5.754399    1
1.985298    3.230619    1
-1.693453   -0.557540   1
-0.576525   11.778922   0
-0.346811   -1.678730   1
-2.124484   2.672471    1
1.217916    9.597015    0
-0.733928   9.098687    0
-3.642001   -1.618087   1
0.315985    3.523953    1
1.416614    9.619232    0
-0.386323   3.989286    1
0.556921    8.294984    1
1.224863    11.587360   0
-1.347803   -2.406051   1
1.196604    4.951851    1
0.275221    9.543647    0
0.470575    9.332488    0
-1.889567   9.542662    0
-1.527893   12.150579   0
-1.185247   11.309318   0
-0.445678   3.297303    1
1.042222    6.105155    1
-0.618787   10.320986   0
1.152083    0.548467    1
0.828534    2.676045    1
-1.237728   10.549033   0
-0.683565   -2.166125   1
0.229456    5.921938    1
-0.959885   11.555336   0
0.492911    10.993324   0
0.184992    8.721488    0
-0.355715   10.325976   0
-0.397822   8.058397    0
0.824839    13.730343   0
1.507278    5.027866    1
0.099671    6.835839    1
-0.344008   10.717485   0
1.785928    7.718645    1
-0.918801   11.560217   0
-0.364009   4.747300    1
-0.841722   4.119083    1
0.490426    1.960539    1
-0.007194   9.075792    0
0.356107    12.447863   0
0.342578    12.281162   0
-0.810823   -1.466018   1
2.530777    6.476801    1
1.296683    11.607559   0
0.475487    12.040035   0
-0.783277   11.009725   0
0.074798    11.023650   0
-1.337472   0.468339    1
-0.102781   13.763651   0
-0.147324   2.874846    1
0.518389    9.887035    0
1.015399    7.571882    0
-1.658086   -0.027255   1
1.319944    2.171228    1
2.056216    5.019981    1
-0.851633   4.375691    1
-1.510047   6.061992    0
-1.076637   -3.181888   1
1.821096    10.283990   0
3.010150    8.401766    1
-1.099458   1.688274    1
-0.834872   -1.733869   1
-0.846637   3.849075    1
1.400102    12.628781   0
1.752842    5.468166    1
0.078557    0.059736    1
0.089392    -0.715300   1
1.825662    12.693808   0
0.197445    9.744638    0
0.126117    0.922311    1
-0.679797   1.220530    1
0.677983    2.556666    1
0.761349    10.693862   0
-2.168791   0.143632    1
1.388610    9.341997    0
0.317029    14.739025   0

V. Experimental Code

#encoding: utf-8
import numpy
import matplotlib.pyplot as plt

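def load_data():
    """Read data.txt; return (data, label). Each sample gets a leading 1.0 for the bias weight w0."""
    data = []
    label = []
    # Use a context manager so the file is closed after reading
    with open('data.txt') as file_object:
        for line in file_object:
            arr = line.strip().split()
            if len(arr) < 3:
                continue  # skip blank or malformed lines
            data.append([1.0, float(arr[0]), float(arr[1])])
            label.append(int(arr[2]))
    return data, label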

def sigmoid(x):
    return 1.0 / (1 + numpy.exp(-x))

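# Batch gradient ascent
def grad_ascent(data, label):
    data_mat = numpy.mat(data)                 # n x m matrix of samples
    label_mat = numpy.mat(label).transpose()   # n x 1 column of labels
    n, m = data_mat.shape
    alpha = 0.001      # step size
    max_step = 500     # number of iterations
    weights = numpy.ones((m, 1))
    for i in range(max_step):
        h = sigmoid(data_mat * weights)        # predicted P(y=1|x) for every sample
        err = (label_mat - h)                  # y - h_w(x)
        # w := w + alpha * X^T (y - h), i.e. a step along the gradient of l(w)
        weights = weights + alpha * data_mat.transpose() * err
    return weights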

# Stochastic gradient ascent (a single pass over the data, one sample per update)
def stoc_grad_ascent0(data, label):
    data = numpy.array(data)
    n, m = data.shape
    alpha = 0.01
    weights = numpy.ones(m)
    for i in range(n):
        h = sigmoid(numpy.sum(data[i] * weights))
        err = label[i] - h
        weights = weights + data[i] * alpha * err
    return weights

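# Improved stochastic gradient ascent
def stoc_grad_ascent1(data, label, max_step):
    data = numpy.array(data)
    n, m = data.shape
    weights = numpy.ones(m)
    for i in range(max_step):
        # Indices of samples not yet used in this pass (list() so items can be deleted in Python 3)
        data_index = list(range(n))
        for j in range(n):
            # Step size decays as training proceeds but never drops below 0.01
            alpha = 4 / (1.0 + i + j) + 0.01
            # Pick a random position among the remaining samples
            rand_pos = int(numpy.random.uniform(0, len(data_index)))
            sample = data_index[rand_pos]   # map the position back to a sample index
            h = sigmoid(numpy.sum(data[sample] * weights))
            error = label[sample] - h
            weights = weights + alpha * error * data[sample]
            del data_index[rand_pos]        # sample without replacement within a pass
    return weights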

def plot(weights):
    data, label = load_data()
    n = len(data)
    x1 = []
    y1 = []
    x2 = []
    y2 = []
    for i in range(n):
        if label[i] == 1:
            x1.append(data[i][1])
            y1.append(data[i][2])
        else:
            x2.append(data[i][1])
            y2.append(data[i][2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(x1, y1, s = 30, c = 'red', marker = 's')
    ax.scatter(x2, y2, s = 30, c = 'green')
    x = numpy.arange(-3.0, 3.0, 0.1)
    y = (-weights[0] - weights[1] * x) / weights[2]
    ax.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()

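if __name__ == "__main__":
    data, label = load_data()

    # Batch gradient ascent; getA() converts the numpy matrix to a plain array for plotting
    weights = grad_ascent(data, label)
    # print(weights)
    plot(weights.getA())

    # Stochastic gradient ascent
    # weights = stoc_grad_ascent0(data, label)
    # print(weights)
    # plot(weights)

    # Improved stochastic gradient ascent
    # weights = stoc_grad_ascent1(data, label, 500)
    # print(weights)
    # plot(weights)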
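
Once weights have been learned, one way to sanity-check them is to measure training accuracy. The sketch below is not part of the original code: `predict` and `accuracy` are hypothetical helpers, the six rows are copied from the data set above, and the weight vector is an illustrative assumption rather than actual output of `grad_ascent`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(weights, X):
    """Hypothetical helper: class 1 when P(y=1|x) >= 0.5, otherwise class 0."""
    return (sigmoid(X @ weights) >= 0.5).astype(int)

def accuracy(weights, X, y):
    """Hypothetical helper: fraction of samples classified correctly."""
    return float(np.mean(predict(weights, X) == y))

# Six samples from the data set, each with a leading 1.0 for the bias term
X = np.array([[1.0, -0.017612, 14.053064],
              [1.0, -1.395634, 4.662541],
              [1.0, -0.752157, 6.538620],
              [1.0, 0.931635, 1.589505],
              [1.0, 0.423363, 11.054677],
              [1.0, 1.176813, 3.167020]])
y = np.array([0, 1, 0, 1, 0, 1])

weights = np.array([4.0, 0.5, -0.6])   # assumed values, for illustration only
acc = accuracy(weights, X, y)
print(predict(weights, X), acc)
```

Thresholding the probability at 0.5 is equivalent to checking the sign of w^T x, which is the same line that `plot` draws as the decision boundary.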

VI. Experimental Results

[Figure 2]
If you spot any mistakes, please point them out.
