Linear Regression Derivation (Part 3): Gradient Descent and a Pure-Python Implementation

1. Gradient Descent

Hypothesis function:
$$h_{\theta}(x)=\theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}+\dots+\theta_{n}x_{n}=\sum_{i=0}^{n}\theta_{i}x_{i}\quad (x_{0}=1)$$
Cost function:
$$J(\theta)=\frac{1}{2}\sum_{i=1}^{m}\left[h_{\theta}(x^{(i)})-y^{(i)}\right]^{2}=\frac{1}{2}\sum_{i=1}^{m}\left[\theta^{T}x^{(i)}-y^{(i)}\right]^{2}$$
Gradient descent update rule:
$$\theta:=\theta-\alpha\,\frac{\partial J(\theta)}{\partial \theta}$$
The gradient with respect to $\theta$ is derived as follows:
$$\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta}&=\frac{1}{2}\frac{\partial}{\partial \theta}\sum_{i=1}^{m}\left[h_{\theta}(x^{(i)})-y^{(i)}\right]^{2}\\
&=\frac{1}{2}\frac{\partial}{\partial \theta}\left[(\theta^{T}x^{(1)}-y^{(1)})^{2}+(\theta^{T}x^{(2)}-y^{(2)})^{2}+\dots+(\theta^{T}x^{(m)}-y^{(m)})^{2}\right]\\
&=\frac{1}{2}\cdot 2\sum_{i=1}^{m}\left[(\theta^{T}x^{(i)}-y^{(i)})\cdot\frac{\partial}{\partial \theta}(\theta^{T}x^{(i)}-y^{(i)})\right]\\
&=\sum_{i=1}^{m}\left[(\theta^{T}x^{(i)}-y^{(i)})\cdot\frac{\partial}{\partial \theta}(\theta^{T}x^{(i)})\right]\\
&=\sum_{i=1}^{m}\left[(\theta^{T}x^{(i)}-y^{(i)})\cdot x^{(i)}\right]\\
&=\sum_{i=1}^{m}\left[(h_{\theta}(x^{(i)})-y^{(i)})\cdot x^{(i)}\right]\quad {\color{red}(\text{error}\times\text{feature})}
\end{aligned}$$
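Taken component-wise, the $j$-th entry of this gradient vector is

$$\frac{\partial J(\theta)}{\partial \theta_{j}}=\sum_{i=1}^{m}\left[(h_{\theta}(x^{(i)})-y^{(i)})\cdot x_{j}^{(i)}\right]$$

which is exactly the quantity the code in the next section accumulates for $\theta_{0}$ (where $x_{0}^{(i)}=1$) and $\theta_{1}$.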

$$\theta:=\theta-\alpha\sum_{i=1}^{m}\left[(h_{\theta}(x^{(i)})-y^{(i)})\cdot x^{(i)}\right]$$
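In matrix form this update reads $\theta := \theta - \alpha X^{T}(X\theta - y)$. A minimal NumPy sketch of one step (the names X, y, theta, alpha are illustrative; X is assumed to be the m×(n+1) design matrix with a leading column of ones):

import numpy as np

def gd_step(theta, X, y, alpha):
    # One batch gradient-descent step: theta := theta - alpha * X^T (X theta - y)
    error = X @ theta - y      # per-sample error h_theta(x^(i)) - y^(i), shape (m,)
    gradient = X.T @ error     # error * feature, summed over all samples, shape (n+1,)
    return theta - alpha * gradient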

2. Python Implementation

The full code is as follows:

import numpy as np
import matplotlib.pyplot as plt


# Load the data: year (feature) and housing price (target)
def load_data():
    X = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013]
    X_p = np.array(X)
    Y = [2.000, 2.500, 2.900, 3.147, 4.515, 4.903, 5.365, 5.704, 6.853, 7.971, 8.561, 10.000, 11.280, 12.900]
    Y_p = np.array(Y)
    return X_p, Y_p


# Solve linear regression with batch gradient descent
def gradient_descent(X, Y, learn_rate=0.15):
    theta0, theta1 = 0, 0
    loss, temploss = 1, 0
    iter_ = 0
    length = len(X)
    loss_list = []
    iteration = []

    # Standardize the feature: subtract the mean, divide by the standard deviation
    mean = np.mean(X)
    std = np.std(X)
    X_new = (X - mean) / std

    # Gradient descent; stop once the change in loss drops below 1e-10
    while abs(loss - temploss) > 1e-10:
        iter_ += 1
        temploss = loss
        loss = 0
        gradient0, gradient1 = 0, 0
        # Accumulate the gradient: sum of error * feature over all samples
        for i in range(length):
            gradient0 += theta0 + theta1 * X_new[i] - Y[i]
            gradient1 += (theta0 + theta1 * X_new[i] - Y[i]) * X_new[i]
        # Update the parameters
        theta0 = theta0 - learn_rate * gradient0
        theta1 = theta1 - learn_rate * gradient1
        # Decay the learning rate
        learn_rate = learn_rate * 0.99
        # Recompute the loss with the updated parameters
        for i in range(length):
            loss += (theta0 + theta1 * X_new[i] - Y[i]) ** 2
        loss_list.append(loss)
        iteration.append(iter_)
    # Predict the price for 2014 (standardize the input with the training mean/std)
    x = (2014 - mean) / std
    print("the housing price in 2014 is %f" % (x * theta1 + theta0))
    return theta0, theta1, mean, std, loss_list, iteration


if __name__ == "__main__":
    X, Y = load_data()
    print("\n--------------gradient descent----------------")
    theta0, theta1, mean, std, loss_list, iteration = gradient_descent(X, Y)

The fitting process was shown in a figure here (image from the original post, not preserved).
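Since the figure is missing, here is a minimal sketch, appended to the __main__ block above, that reproduces a comparable picture from the returned values (the two-panel layout is my assumption, not necessarily the original figure):

    # Left: data with the fitted line; right: loss versus iteration
    plt.figure(figsize=(10, 4))
    plt.subplot(1, 2, 1)
    plt.scatter(X, Y, label="data")
    # Map the fit, learned on the standardized feature, back onto the year axis
    plt.plot(X, theta1 * (X - mean) / std + theta0, color="red", label="fit")
    plt.xlabel("year")
    plt.ylabel("price")
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(iteration, loss_list)
    plt.xlabel("iteration")
    plt.ylabel("loss")
    plt.show()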

(These are my personal machine-learning study notes; if you spot any mistakes, please point them out so I can correct them.)
