Machine Learning: Mean Squared Error (MSE) Loss [with code]

Table of Contents

  • 1. Introduction
  • 2. Mathematical Derivation
    • 2.1 Derivative Calculation
        • First, solve for b
        • Then, solve for w
  • 3. Code Implementation
    • 3.1 Python implementation
    • 3.2 PyTorch implementation

1. Introduction

  • The squared-error (Euclidean distance) loss is commonly used in linear regression, i.e. problems that predict continuous values.
  • Regression problems predict concrete numerical values, such as house prices or sales volumes.
  • A neural network that solves a regression problem generally has a single output node, and the output of that node is the predicted value (see the sketch below).
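As a minimal illustration of the single-output-node idea, here is a sketch of such a model in PyTorch; the feature count and batch size are arbitrary choices for illustration:

import torch

# A regression "network" in its simplest form: one linear layer
# with a single output node (in_features=4 is arbitrary).
model = torch.nn.Linear(in_features=4, out_features=1)

x = torch.randn(8, 4)   # a batch of 8 samples with 4 features each
y_hat = model(x)        # shape (8, 1): one predicted value per sample
print(y_hat.shape)      # torch.Size([8, 1])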

2. Mathematical Derivation

  1. Assume training data $X$ with labels $Y$.
  2. Prediction function: $f(x_i) = \hat{y}_i = w x_i + b$
  3. Loss function:
    $$\mathrm{Loss}_{MSE}(y, \hat{y}) = \frac{1}{m} \sum_{i=1}^m (f(x_i) - y_i)^2 = \frac{1}{m} \sum_{i=1}^m (y_i - w x_i - b)^2$$
    and the parameters are the minimizers $(w^*, b^*) = \mathop{\arg\min}\limits_{(w,b)} \sum_{i=1}^m (y_i - w x_i - b)^2$.
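As a quick sanity check of the formula, the sketch below computes the loss by hand on three made-up points (all numbers are arbitrary, chosen only for illustration):

import numpy as np

# Toy data (illustrative): three points roughly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.1, 4.9, 7.2])

w, b = 2.0, 1.0
y_hat = w * x + b                # f(x_i) = w * x_i + b

mse = np.mean((y_hat - y) ** 2)  # (1/m) * sum of squared errors
print(mse)                       # ≈ 0.02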

2.1 Derivative Calculation

  1. Compute the derivatives $\frac{\partial loss}{\partial w}$ and $\frac{\partial loss}{\partial b}$. (The constant factor $\frac{1}{m}$ is dropped from here on; it scales the gradients but does not change the minimizer.)

    $$\frac{\partial loss}{\partial w} = 2 \sum_{i=1}^m x_i [f(x_i) - y_i]$$
    $$\frac{\partial loss}{\partial b} = 2 \sum_{i=1}^m [f(x_i) - y_i]$$

    These analytic gradients can be verified numerically; see the sketch after this list.

  2. Setting the derivatives to zero yields the closed-form solution for $w$ and $b$ (derived step by step below):

    $$\begin{cases} b = \frac{1}{m} \sum_{i=1}^m [y_i - w x_i] \\ \\ w = \dfrac{\sum_{i=1}^m y_i (x_i - \bar{x})}{ \sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2 } \end{cases}$$
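A minimal gradient check, comparing the analytic gradients above against central finite differences (the data and parameter values are arbitrary):

import numpy as np

# Arbitrary toy data and parameters for the check
x = np.array([0.5, 1.5, 2.0, 3.0])
y = np.array([1.8, 4.1, 5.2, 7.1])
w, b = 1.5, 0.5

def loss(w, b):
    # Sum of squared errors (the 1/m factor is dropped, as above)
    return np.sum((w * x + b - y) ** 2)

# Analytic gradients, with the factor of 2 from the derivation
dw = 2 * np.sum(x * (w * x + b - y))
db = 2 * np.sum(w * x + b - y)

# Central finite differences
eps = 1e-6
dw_num = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
db_num = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
print(np.isclose(dw, dw_num), np.isclose(db, db_num))  # True True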


First, solve for $b$

$$\begin{aligned} \frac{\partial loss}{\partial b} &= 2 \sum_{i=1}^m [f(x_i) - y_i] \\ &= 2 \sum_{i=1}^m [w x_i + b - y_i] \\ &= 2\left(mb - \sum_{i=1}^m [y_i - w x_i]\right) \end{aligned}$$

Setting $\frac{\partial loss}{\partial b} = 0$:

$$\begin{aligned} mb &= \sum_{i=1}^m [y_i - w x_i] \\ b &= \frac{1}{m} \sum_{i=1}^m [y_i - w x_i] \end{aligned}$$

That is, $b = \bar{y} - w\bar{x}$.


Then, solve for $w$

$$\begin{aligned} \frac{\partial loss}{\partial w} &= 2 \sum_{i=1}^m x_i [f(x_i) - y_i] \\ &= 2 \left(w \sum_{i=1}^m x_i^2 - \sum_{i=1}^m (y_i - b)x_i \right) \end{aligned}$$

Setting $\frac{\partial loss}{\partial w} = 0$:

$$w \sum_{i=1}^m x_i^2 = \sum_{i=1}^m (y_i - b)x_i$$

Substituting $b = \frac{1}{m} \sum_{i=1}^m [y_i - w x_i]$ into the above:

$$\begin{aligned} w \sum_{i=1}^m x_i^2 &= \sum_{i=1}^m y_i x_i - \sum_{i=1}^m x_i \cdot \frac{1}{m} \sum_{i=1}^m [y_i - w x_i] \\ w \sum_{i=1}^m x_i^2 &= \sum_{i=1}^m y_i x_i - \frac{1}{m}\sum_{i=1}^m x_i \sum_{i=1}^m y_i + \frac{w}{m} \left(\sum_{i=1}^m x_i\right)^2 \\ w\left(\sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2\right) &= \sum_{i=1}^m y_i x_i - \frac{1}{m}\sum_{i=1}^m x_i \sum_{i=1}^m y_i \\ w\left(\sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2\right) &= \sum_{i=1}^m y_i x_i - \bar{x} \sum_{i=1}^m y_i \\ w &= \frac{\sum_{i=1}^m y_i (x_i - \bar{x})}{ \sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2 } \end{aligned}$$
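The closed-form result can be verified numerically. The sketch below computes $w$ and $b$ with the formulas derived above and compares them against numpy.polyfit; the data is made up for illustration:

import numpy as np

# Toy data (illustrative): noisy samples of y = 3x + 2
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3 * x + 2 + rng.normal(0, 0.5, size=50)
m = len(x)

# Closed-form solution from the derivation above
x_bar = x.mean()
w = np.sum(y * (x - x_bar)) / (np.sum(x ** 2) - np.sum(x) ** 2 / m)
b = np.mean(y - w * x)

# Reference: least-squares fit of a degree-1 polynomial
w_ref, b_ref = np.polyfit(x, y, deg=1)
print(np.isclose(w, w_ref), np.isclose(b, b_ref))  # True True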

3. Code Implementation

3.1 Python implementation

import numpy as np

# Toy inputs (illustrative): X has shape (num_train, 1), y has shape (num_train, 1)
num_train = 100
X = np.random.rand(num_train, 1)
y = 3 * X + 2
w = np.zeros((1, 1))
b = 0.0

# Model: y_hat = Xw + b
y_hat = np.dot(X, w) + b

# MSE loss
loss = np.sum((y_hat - y) ** 2) / num_train

# Gradients of the loss w.r.t. the parameters (the constant factor 2
# from the derivation is dropped; it can be absorbed into the learning rate)
dw = np.dot(X.T, (y_hat - y)) / num_train
db = np.sum(y_hat - y) / num_train
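Putting the pieces together, here is a minimal gradient descent loop built on the snippet above (the data, learning rate, and iteration count are arbitrary choices for illustration):

import numpy as np

# Toy data (illustrative): y ≈ 3x + 2 with a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 1))
y = 3 * X + 2 + 0.05 * rng.normal(size=(100, 1))
num_train = X.shape[0]

w = np.zeros((1, 1))
b = 0.0
lr = 0.5  # learning rate (arbitrary)

for _ in range(2000):
    y_hat = np.dot(X, w) + b
    dw = np.dot(X.T, (y_hat - y)) / num_train
    db = np.sum(y_hat - y) / num_train
    w -= lr * dw
    b -= lr * db

print(w.ravel(), b)  # close to [3.] and 2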

3.2 PyTorch implementation

import torch

input = torch.randn(3, 5)   # illustrative tensors
target = torch.randn(3, 5)

# reduction replaces the deprecated size_average / reduce arguments
loss_fn = torch.nn.MSELoss(reduction='mean')
loss = loss_fn(input.float(), target.float())

'''
reduction takes one of three values:
	none: no reduction is applied (an element-wise loss tensor is returned);
	mean: returns the mean of the losses;
	sum: returns the sum of the losses.
Default: mean.
'''
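A quick check of how the three reduction modes relate to each other (the tensor shapes are arbitrary):

import torch

input = torch.randn(3, 5)
target = torch.randn(3, 5)

loss_none = torch.nn.MSELoss(reduction='none')(input, target)  # shape (3, 5)
loss_sum = torch.nn.MSELoss(reduction='sum')(input, target)    # scalar
loss_mean = torch.nn.MSELoss(reduction='mean')(input, target)  # scalar

print(torch.allclose(loss_none.sum(), loss_sum))    # True
print(torch.allclose(loss_none.mean(), loss_mean))  # True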

