Machine Learning: Univariate Linear Regression

Univariate Linear Regression

Example

Boston housing price prediction: given the prices of houses with various floor areas, predict the prices of houses with other areas.

Model Representation

$h_\theta(x) = \theta_0 + \theta_1 x$
where $\theta_0$ and $\theta_1$ are the parameters to be determined.
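As a quick sanity check, the hypothesis is nothing more than a straight line; a minimal sketch (the parameter values below are made up for illustration):

def h(theta0, theta1, x):
    # Hypothesis: h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

print(h(1.0, 2.0, 3.0))  # with theta0 = 1 and theta1 = 2, an input of 3 predicts 7.0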

Cost Function

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
where $m$ is the number of training samples.
Goal: choose the parameters so that the cost function is minimized.
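For intuition, the cost of a candidate parameter pair can be computed by hand on a tiny dataset; a minimal sketch (the three data points are invented for illustration):

import numpy as np

def J(theta0, theta1, x, y):
    # Squared-error cost with the conventional 1/(2m) factor
    m = len(x)
    residuals = theta0 + theta1 * x - y
    return (residuals ** 2).sum() / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(J(0.0, 2.0, x, y))  # 0.0: the line y = 2x fits these points exactly
print(J(0.0, 1.0, x, y))  # (1 + 4 + 9) / 6, roughly 2.33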

Gradient Descent

Purpose: find the minimum of a function.
Idea: start from a randomly chosen combination of parameters, compute the cost function and its gradient, then move to the next parameter combination that decreases the cost the most. Repeating this yields a local minimum (different initial parameter combinations may lead to different local minima).
The update rule is:

$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \quad \text{for } j = 0 \text{ and } j = 1$
When applying this rule, the parameters must be updated simultaneously.
Here $\alpha$ is the learning rate, which determines the step size. If $\alpha$ is too small, gradient descent is slow; if $\alpha$ is too large, it may overshoot the minimum and fail to converge, or even diverge.
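The simultaneous-update requirement means both partial derivatives must be evaluated at the old parameter values before either parameter is overwritten. A minimal sketch of one update step on invented toy data (the derivative formulas are the ones worked out for linear regression just below):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
alpha = 0.1
theta0, theta1 = 0.0, 0.0

def grad(theta0, theta1):
    # Partial derivatives of J with respect to theta0 and theta1
    err = theta0 + theta1 * x - y
    return err.mean(), (err * x).mean()

# Correct: both gradients come from the old parameters, then both are assigned
g0, g1 = grad(theta0, theta1)
theta0, theta1 = theta0 - alpha * g0, theta1 - alpha * g1

# Wrong: updating theta0 first would make the theta1 update mix old and new values:
#   theta0 = theta0 - alpha * grad(theta0, theta1)[0]
#   theta1 = theta1 - alpha * grad(theta0, theta1)[1]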
Applying batch gradient descent to the linear regression problem:
$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
For $j = 0$: $\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
For $j = 1$: $\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$
That is:
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$
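These two update rules translate almost line for line into code. A minimal sketch on an invented dataset (the vectorized implementation in the next section is equivalent, but treats X as a matrix):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])  # toy inputs, invented for illustration
y = np.array([3.0, 5.0, 7.0, 9.0])  # toy targets lying on y = 2x + 1
alpha = 0.05
theta0, theta1 = 0.0, 0.0

for _ in range(2000):
    err = theta0 + theta1 * x - y                   # h_theta(x^(i)) - y^(i)
    new_theta0 = theta0 - alpha * err.mean()        # theta_0 update rule
    new_theta1 = theta1 - alpha * (err * x).mean()  # theta_1 update rule
    theta0, theta1 = new_theta0, new_theta1         # simultaneous update

print(theta0, theta1)  # approaches 1.0 and 2.0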

Python Implementation

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np


# Build the design matrix X (with a leading column of ones for the
# intercept term theta_0) and the target vector y
def getX(df):
    ones = pd.DataFrame({'ones': np.ones(len(df))})
    data = pd.concat([ones, df], axis=1)
    return data.iloc[:, :-1].values  # every column except the target


def gety(df):
    return np.array(df.iloc[:, -1])  # the last column is the target


# Cost function J(theta), vectorized over all m samples
def cost(theta, X, y):
    m = X.shape[0]
    inner = X @ theta - y  # residuals h_theta(x^(i)) - y^(i)
    cost = (inner.T @ inner) / (2 * m)
    return cost


# Gradient of the cost function with respect to theta
def gradientDescent(theta, X, y):
    return X.T @ (X @ theta - y) / X.shape[0]


# Batch gradient descent: take `epoch` steps, recording the cost after each
def batchDo(theta, X, y, epoch, learningRate):
    costData = [cost(theta, X, y)]
    for i in range(epoch):
        theta = theta - learningRate * gradientDescent(theta, X, y)
        costData.append(cost(theta, X, y))
    return theta, costData


# Main
if __name__ == '__main__':
    # Load the data and plot it
    data = pd.read_csv('ex1data1.txt', names=['population', 'profit'])
    sns.set_theme(context="notebook", style="white", palette="dark")
    sns.lmplot(data=data, x='population', y='profit', height=10, fit_reg=True)
    plt.show()
    # Build X and y, and randomly initialize theta
    X = getX(data)
    y = gety(data)
    theta = np.random.rand(X.shape[1])
    # Run batch gradient descent
    epoch = 800
    learningRate = 0.01
    theta, costData = batchDo(theta, X, y, epoch, learningRate)
    # Plot cost against the iteration count
    # (sns.tsplot was removed from seaborn; lineplot is its modern replacement)
    a = sns.lineplot(x=np.arange(epoch + 1), y=costData)
    a.set_xlabel("epoch")
    a.set_ylabel("cost")
    plt.show()
    # Plot the training data together with the fitted regression line
    plt.scatter(data.population, data.profit, label="Training data")
    plt.plot(data.population, data.population * theta[1] + theta[0], label="Prediction")
    plt.legend(loc=2)
    plt.show()
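With training finished, a prediction for a new input is just the hypothesis evaluated at the learned parameters. For example (assuming the script above has run, so theta holds the fitted values):

prediction = theta[0] + theta[1] * 7.0  # predicted profit for population = 7
print(prediction)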
