【Machine Learning Notes 20】Neural Networks (Chain Rule and Backpropagation)

【References】
【1】*TensorFlow for Machine Intelligence*, §4.7

Suppose we have a network with the following structure.

(Figure 1: network structure)

The output of each layer is defined as

$L_1 = \mathrm{sigmoid}(w_1 \cdot x)$
$L_2 = \mathrm{sigmoid}(w_2 \cdot L_1)$
$L_3 = \mathrm{sigmoid}(w_3 \cdot L_2)$
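As a minimal sketch of this forward pass (the scalar weights and input value below are illustrative assumptions, not values given in the text):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Three stacked sigmoid layers with scalar weights, as defined above.
w1, w2, w3 = 0.5, -0.3, 0.8   # illustrative values
x = 1.0                        # illustrative input

L1 = sigmoid(w1 * x)
L2 = sigmoid(w2 * L1)
L3 = sigmoid(w3 * L2)
```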

Define the final loss of the whole network as
$loss = \mathrm{Loss}(L_3, y_{expect})$
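The text leaves $\mathrm{Loss}$ abstract. For a concrete example (an assumption here, not a choice made in the text), the squared error works well because its derivative is simple:

$\mathrm{Loss}(L_3, y_{expect}) = \tfrac{1}{2}(L_3 - y_{expect})^2, \qquad \mathrm{Loss}'(L_3, y_{expect}) = L_3 - y_{expect}$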

Taking the partial derivative of the loss function with respect to $w_3$ gives

$\dfrac{\partial loss}{\partial w_3}=\mathrm{Loss}'(L_3, y_{expect})\,\mathrm{sigmoid}'(w_3 \cdot L_2)\,L_2$
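Note that the sigmoid derivative can be computed from the cached forward-pass output alone, since $\mathrm{sigmoid}'(z) = \mathrm{sigmoid}(z)\,(1 - \mathrm{sigmoid}(z))$; here $\mathrm{sigmoid}'(w_3 \cdot L_2) = L_3\,(1 - L_3)$.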

Similarly we obtain the partial derivatives with respect to $w_2$ and $w_1$, chaining one more layer per step (note that each weight also enters as a factor, since $\partial (w_3 \cdot L_2)/\partial L_2 = w_3$):

$\dfrac{\partial loss}{\partial w_2}=\mathrm{Loss}'(L_3, y_{expect})\,\mathrm{sigmoid}'(w_3 \cdot L_2)\,w_3\,\mathrm{sigmoid}'(w_2 \cdot L_1)\,L_1$

$\dfrac{\partial loss}{\partial w_1}=\mathrm{Loss}'(L_3, y_{expect})\,\mathrm{sigmoid}'(w_3 \cdot L_2)\,w_3\,\mathrm{sigmoid}'(w_2 \cdot L_1)\,w_2\,\mathrm{sigmoid}'(w_1 \cdot x)\,x$

In summary, writing $L_i'$ for the sigmoid derivative at layer $i$, the derivatives abbreviate to
$\dfrac{\partial loss}{\partial w_3}=\mathrm{Loss}'\,L_3'\,L_2$
$\dfrac{\partial loss}{\partial w_2}=\mathrm{Loss}'\,L_3'\,w_3\,L_2'\,L_1$
$\dfrac{\partial loss}{\partial w_1}=\mathrm{Loss}'\,L_3'\,w_3\,L_2'\,w_2\,L_1'\,x$

The pattern is clear: in the backward pass, the gradient for each layer reuses the product already computed for the layer above it, extending it by only one more local factor. This reuse of intermediate results is what is known as the backpropagation algorithm.
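As a minimal sketch of this reuse (the squared-error loss, scalar weights, and concrete values are assumptions carried over from the sketches above; the original text leaves them unspecified), a running factor `delta` is extended by one local derivative per layer, and each gradient reads it off:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Forward pass: cache every layer output; they are all we need later.
w1, w2, w3 = 0.5, -0.3, 0.8   # illustrative scalar weights
x, y_expect = 1.0, 0.0        # illustrative input and target

L1 = sigmoid(w1 * x)
L2 = sigmoid(w2 * L1)
L3 = sigmoid(w3 * L2)

# Backward pass: delta accumulates Loss' * L3' * w3 * L2' * ... ,
# so each gradient reuses all factors computed for the layers above.
delta = (L3 - y_expect) * L3 * (1 - L3)   # Loss' * sigmoid'(w3 * L2)
grad_w3 = delta * L2

delta *= w3 * L2 * (1 - L2)               # append w3 * sigmoid'(w2 * L1)
grad_w2 = delta * L1

delta *= w2 * L1 * (1 - L1)               # append w2 * sigmoid'(w1 * x)
grad_w1 = delta * x

print(grad_w3, grad_w2, grad_w1)
```

A quick finite-difference check (perturb one weight by a small $\epsilon$ and recompute the loss) confirms each of these gradients.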
