Implementing Simple Linear Regression in Python with Least Squares (Univariate Linear Model)

I. Mathematical Principles

(1) Model definition (univariate linear equation):

$$\hat{y} = wx + b$$

(2) Loss function:

$$L(w, b) = \sum_{i=1}^{n} \left( w x_i + b - y_i \right)^2$$

Goal: minimize $L(w, b)$, i.e.

$$(w^*, b^*) = \arg\min_{w,\,b} \sum_{i=1}^{n} \left( w x_i + b - y_i \right)^2$$

We therefore need to find a suitable w and b that make the loss as small as possible.

Approach: find the extrema of this function ==> locate the stationary points where its derivatives are 0, then determine which stationary points are minima.

Differentiating with respect to w:

$$\frac{\partial L}{\partial w} = 2 \sum_{i=1}^{n} \left( w x_i + b - y_i \right) x_i$$

Differentiating with respect to b:

$$\frac{\partial L}{\partial b} = 2 \sum_{i=1}^{n} \left( w x_i + b - y_i \right)$$
II. Python Implementation

(1) Prepare the data (save the following as linear_data.txt; each line is one comma-separated x,y pair):

1.1,39343.00
1.3,46205.00
1.5,37731.00
2.0,43525.00
2.2,39891.00
2.9,56642.00
3.0,60150.00
3.2,54445.00
3.2,64445.00
3.7,57189.00
3.9,63218.00
4.0,55794.00
4.0,56957.00
4.1,57081.00
4.5,61111.00
4.9,67938.00
5.1,66029.00
5.3,83088.00
5.9,81363.00
6.0,93940.00
6.8,91738.00
7.1,98273.00
7.9,101302.00
8.2,113812.00
8.7,109431.00
9.0,105582.00
9.5,116969.00
9.6,112635.00
10.3,122391.00
10.5,121872.00

(2) Read and display the data

import numpy as np
import matplotlib.pyplot as plt

def read_data(datafile):
    # Read comma-separated "x,y" pairs into two lists of floats.
    x = []
    y = []
    with open(datafile) as file:
        for line in file:
            cur_line = line.strip().split(",")
            x.append(float(cur_line[0]))
            y.append(float(cur_line[1]))
    return x, y

def show_points(x, y):
    # Set the chart title and label the axes.
    plt.title('Linear Data', fontsize=24)
    plt.xlabel('X_Value', fontsize=14)
    plt.ylabel('Y_Value', fontsize=14)
    plt.scatter(x, y, c='r')  # plot the data points
    plt.show()
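As an aside, because the file is plain CSV, the same two columns can be loaded in a single call with NumPy. A minimal alternative sketch, assuming the file layout shown above:

import numpy as np

data = np.loadtxt("linear_data.txt", delimiter=",")  # array of shape (n, 2)
point_x, point_y = data[:, 0], data[:, 1]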

(3) Parameter update

def train(learning_rate, w, b):
    # point_x and point_y are the arrays loaded in the __main__ block below.
    # Gradients of L = sum((w*x + b - y)^2), matching the derivation above:
    dw = 2 * np.sum((point_x * w + b - point_y) * point_x)
    db = 2 * np.sum(point_x * w + b - point_y)
    # One gradient-descent step.
    w = w - learning_rate * dw
    b = b - learning_rate * db
    return w, b
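If you want to convince yourself that the two gradient expressions are right, a quick finite-difference check on a tiny made-up dataset works well. A minimal sketch (the toy x/y values here are arbitrary, not the data above):

import numpy as np

x = np.array([1.0, 2.0, 3.0])  # hypothetical toy inputs
y = np.array([2.0, 3.9, 6.1])  # hypothetical toy targets

def loss(w, b):
    return np.sum((w * x + b - y) ** 2)

w, b, eps = 0.5, 0.5, 1e-6
dw_analytic = 2 * np.sum((w * x + b - y) * x)
db_analytic = 2 * np.sum(w * x + b - y)
dw_numeric = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
db_numeric = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
print(dw_analytic, dw_numeric)  # the two values should agree closely
print(db_analytic, db_numeric)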


(4) Training

if __name__ == '__main__':
    point_x, point_y = read_data("linear_data.txt")
    # Convert to NumPy arrays so the vectorized math in train() works cleanly.
    point_x = np.array(point_x)
    point_y = np.array(point_y)
    # show_points(point_x, point_y)
    w = np.random.rand(1)
    b = np.random.rand(1)
    learning_rate = 0.0001
    # w, b = train(learning_rate, w, b)  # a single training step
    for i in range(1000):
        w, b = train(learning_rate, w, b)
    y_pred = w * point_x + b
    loss = np.power(y_pred - point_y, 2).sum()
    print("final loss:", loss)
    plt.scatter(point_x, point_y)
    plt.plot(point_x, y_pred)
    plt.show()
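Finally, because the stationarity conditions from Part I can be solved exactly, the gradient-descent result is easy to cross-check. A minimal sketch, assuming read_data and linear_data.txt from above (closed_form_fit is a hypothetical helper name, and np.polyfit provides an independent reference):

import numpy as np

def closed_form_fit(x, y):
    # Solve dL/dw = 0 and dL/db = 0 exactly (see the formulas in Part I).
    x_mean, y_mean = x.mean(), y.mean()
    w = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b = y_mean - w * x_mean
    return w, b

x, y = map(np.array, read_data("linear_data.txt"))
print(closed_form_fit(x, y))  # exact least-squares (w, b)
print(np.polyfit(x, y, 1))    # independent check: returns [w, b]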
