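1. Li Hang Machine Learning: Polynomial Regression Derivation and Python Implementation

Polynomial Regression

Project link: https://github.com/Wchenguang/gglearn/blob/master/PolynomialClassifier/李航机器学习-讲解/PolynomialClassifier.ipynb

Derivation

  • The loss function is defined as the squared loss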

$$
\begin{array}{l}
\text{cost} = \frac{1}{2}\left(f(X) - Y\right)^{2} \\
L = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{2}\left(\sum_{j=0}^{M} w_{j} x_{i}^{j} - y_{i}\right)^{2}
\end{array}
$$
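As a minimal sketch of how this loss could be evaluated, the snippet below builds the powers $x_i^j$ with `np.vander` and averages the half squared residuals. The function name `polynomial_loss` and the sample data are assumptions made for illustration; they do not come from the linked project.

```python
import numpy as np

def polynomial_loss(w, x, y):
    """Mean of the half squared errors for a polynomial with coefficients w.

    w: coefficients w_0..w_M, shape (M+1,)
    x: scalar inputs, shape (N,)
    y: targets, shape (N,)
    """
    # Column j of the design matrix holds x**j, so row i is [x_i^0, ..., x_i^M].
    design = np.vander(x, len(w), increasing=True)
    residual = design @ w - y
    return 0.5 * np.mean(residual ** 2)

# Hypothetical data: y = 1 + 2x exactly, so the true coefficients give zero loss.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
loss_true = polynomial_loss(np.array([1.0, 2.0]), x, y)
loss_zero = polynomial_loss(np.array([0.0, 0.0]), x, y)
```

With these inputs the true coefficients give a loss of zero, while the all-zero coefficients give a strictly larger value.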

  • Differentiate with respect to each weight and set the derivative to 0 to solve for the weights directly

$$
\begin{array}{c}
\begin{array}{l}
\frac{\partial L}{\partial w_{k}} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{2} \times 2 \left(x_{i}^{k} \left(\sum_{j=0}^{M} w_{j} x_{i}^{j} - y_{i}\right)\right) \\
= \frac{1}{N} \sum_{i=1}^{N} \left(x_{i}^{k} \sum_{j=0}^{M} w_{j} x_{i}^{j} - x_{i}^{k} y_{i}\right) \\
= 0
\end{array} \\
\downarrow \\
x_{1}^{k} \sum_{j=0}^{M} w_{j} x_{1}^{j} + x_{2}^{k} \sum_{j=0}^{M} w_{j} x_{2}^{j} + \cdots + x_{N}^{k} \sum_{j=0}^{M} w_{j} x_{N}^{j} = \sum_{i=1}^{N} x_{i}^{k} y_{i} \\
\downarrow \\
\begin{bmatrix} x_{1}^{k} & \cdots & x_{N}^{k} \end{bmatrix}
\begin{bmatrix} x_{1}^{0} & \cdots & x_{1}^{M} \\ \vdots & \ddots & \vdots \\ x_{N}^{0} & \cdots & x_{N}^{M} \end{bmatrix}
\begin{bmatrix} w_{0} \\ \vdots \\ w_{M} \end{bmatrix}
=
\begin{bmatrix} x_{1}^{k} & \cdots & x_{N}^{k} \end{bmatrix}
\begin{bmatrix} y_{1} \\ \vdots \\ y_{N} \end{bmatrix}
\end{array}
$$
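One way to sanity-check the derivative is to compare it with a central finite difference. The sketch below assumes random illustrative data and hypothetical helper names (`loss`, `analytic_gradient`, `numeric_gradient`); it is a check under those assumptions, not part of the original notebook.

```python
import numpy as np

def loss(w, design, y):
    # L = (1/N) * sum_i 0.5 * (sum_j w_j x_i^j - y_i)^2
    return 0.5 * np.mean((design @ w - y) ** 2)

def analytic_gradient(w, design, y):
    # dL/dw_k = (1/N) * sum_i x_i^k * (sum_j w_j x_i^j - y_i), for all k at once
    return design.T @ (design @ w - y) / len(y)

def numeric_gradient(w, design, y, eps=1e-6):
    # Central difference on each coordinate of w.
    grad = np.zeros_like(w)
    for k in range(len(w)):
        step = np.zeros_like(w)
        step[k] = eps
        grad[k] = (loss(w + step, design, y) - loss(w - step, design, y)) / (2 * eps)
    return grad

# Hypothetical data: 20 random points and a cubic polynomial (M = 3).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20)
y = rng.normal(size=20)
design = np.vander(x, 4, increasing=True)  # row i is [x_i^0, ..., x_i^3]
w = rng.normal(size=4)

max_gap = np.max(np.abs(analytic_gradient(w, design, y) - numeric_gradient(w, design, y)))
```

Because the loss is quadratic in the weights, the two gradients should agree up to floating-point rounding, so `max_gap` is expected to be very small.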

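  • The equation above expresses $w_{k}$ in terms of the other weights $w_{0}, w_{1}, \cdots, w_{k-1}, w_{k+1}, \cdots, w_{M}$, each taken at its optimal value. Writing one such equation for every $k = 0, \dots, M$ and stacking them gives a linear system, so all the weights can be found at once with linear algebra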

$$
\begin{array}{c}
\begin{bmatrix} x_{1}^{k} & \cdots & x_{N}^{k} \end{bmatrix}
\begin{bmatrix} x_{1}^{0} & \cdots & x_{1}^{M} \\ \vdots & \ddots & \vdots \\ x_{N}^{0} & \cdots & x_{N}^{M} \end{bmatrix}
\begin{bmatrix} w_{0} \\ \vdots \\ w_{M} \end{bmatrix}
=
\begin{bmatrix} x_{1}^{k} & \cdots & x_{N}^{k} \end{bmatrix}
\begin{bmatrix} y_{1} \\ \vdots \\ y_{N} \end{bmatrix} \\
\downarrow \\
\begin{bmatrix} x_{1}^{0} & \cdots & x_{N}^{0} \\ \vdots & \ddots & \vdots \\ x_{1}^{M} & \cdots & x_{N}^{M} \end{bmatrix}
\begin{bmatrix} x_{1}^{0} & \cdots & x_{1}^{M} \\ \vdots & \ddots & \vdots \\ x_{N}^{0} & \cdots & x_{N}^{M} \end{bmatrix}
\begin{bmatrix} w_{0} \\ \vdots \\ w_{M} \end{bmatrix}
=
\begin{bmatrix} x_{1}^{0} & \cdots & x_{N}^{0} \\ \vdots & \ddots & \vdots \\ x_{1}^{M} & \cdots & x_{N}^{M} \end{bmatrix}
\begin{bmatrix} y_{1} \\ \vdots \\ y_{N} \end{bmatrix}
\end{array}
$$

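  • This gives the normal equation, the closed-form alternative to gradient descent. Here $x$ is the matrix whose row $i$ is $[x_{i}^{0}, \cdots, x_{i}^{M}]$, and the last step requires $x^{\top} x$ to be invertible

$$
\begin{array}{c}
x^{\top} x w = x^{\top} y \\
\downarrow \\
w = \left(x^{\top} x\right)^{-1} x^{\top} y
\end{array}
$$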
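To make the normal equation concrete for an actual polynomial, here is a minimal sketch that builds the matrix of powers with `np.vander` and solves $x^{\top} x w = x^{\top} y$. The function name `fit_polynomial` and the noise-free quadratic data are assumptions chosen for illustration.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Return coefficients w_0..w_degree that minimise the squared loss."""
    # Row i of the design matrix is [x_i^0, x_i^1, ..., x_i^degree].
    design = np.vander(x, degree + 1, increasing=True)
    # Solve (X^T X) w = X^T y; solve() avoids forming the explicit inverse.
    return np.linalg.solve(design.T @ design, design.T @ y)

# Hypothetical noise-free data generated from y = 1 - 2x + 3x^2.
x = np.linspace(-1.0, 1.0, 9)
y = 1.0 - 2.0 * x + 3.0 * x ** 2
w = fit_polynomial(x, y, degree=2)
```

On this data the recovered coefficients should match 1, -2 and 3 up to rounding. Using `np.linalg.solve` rather than an explicit inverse is a design choice in this sketch; it still assumes $x^{\top} x$ is invertible, which holds here because the nine sample points are distinct.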

Implementing the polynomial regressor PolynomialRegression

import numpy as np
import matplotlib.pyplot as plt


class PolynomialRegression:
    '''
    Least-squares regression solved with the normal equation.
    Only NumPy arrays are supported as input.
    '''
    def __init__(self):
        self.w = None

    @staticmethod
    def _add_bias(x):
        # Prepend a column of ones so that w[0] acts as the intercept.
        return np.hstack((np.ones((len(x), 1)), x))

    def fit(self, x, y):
        '''
        Only 2-D arrays are supported:
        x has shape (n_samples, n_features), y has shape (n_samples, 1).
        The columns of x are used as the feature terms as given.
        '''
        x = self._add_bias(x)
        # w = (x^T x)^{-1} x^T y. pinv replaces the plain inverse because
        # x^T x is singular when x has more columns than rows.
        self.w = np.linalg.pinv(x.T @ x) @ x.T @ y
        return self

    def transform(self, x):
        # Predict with the fitted weights.
        return self._add_bias(x) @ self.w

    def fit_transform(self, x, y):
        return self.fit(x, y).transform(x)


if __name__ == '__main__':
    # Fit random data with 10, 20, ..., 50 feature columns and plot each fit.
    for i in range(10, 51, 10):
        x = np.random.randint(1, 100, (40, i))
        y = np.random.randint(1, 100, (40, 1))
        reg_result = PolynomialRegression().fit(x, y).transform(x)
        fig = plt.figure(num=1, figsize=(15, 8))
        plt.plot(np.arange(len(y)), y, label='real_y')
        plt.plot(np.arange(len(reg_result)), reg_result, label='pred_y')
        plt.legend()
        plt.show()
  • Result
    [Figure: real_y and pred_y plotted against sample index]
