Softmax Derivative

The softmax function:

$$
S_{i}=\frac{e^{a_{i}}}{\sum_{j} e^{a_{j}}}
$$
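As a quick illustration of the definition, here is a minimal sketch in plain Python. The function name `softmax` and the max-subtraction step are assumptions added for this example and are not part of the original text; subtracting the maximum does not change the result mathematically, it only keeps `exp` from overflowing for large inputs.

```python
import math

def softmax(a):
    """Return S_i = exp(a_i) / sum_j exp(a_j) for a list of floats."""
    # Shifting by the maximum cancels in the ratio, so the output is
    # unchanged, but it keeps math.exp from overflowing on large inputs.
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # three positive values, increasing with the input
print(sum(probs))  # sums to 1 up to floating-point rounding
```

The outputs are positive and sum to 1, which is why softmax is commonly read as a probability distribution over classes.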

Derivative of softmax

Write $\Sigma=\sum_{k} e^{a_{k}}$ for the denominator. By the quotient rule:

$$
\frac{\partial S_{i}}{\partial a_{j}}=\frac{\frac{\partial e^{a_{i}}}{\partial a_{j}} \cdot \Sigma-\frac{\partial \Sigma}{\partial a_{j}} \cdot e^{a_{i}}}{\Sigma^{2}}
$$

When $i = j$:

$$
\frac{\partial S_{i}}{\partial a_{j}}=\frac{e^{a_{i}} \cdot \Sigma-e^{a_{j}} e^{a_{i}}}{\Sigma^{2}}=\frac{e^{a_{i}}}{\Sigma} \cdot \frac{\Sigma-e^{a_{j}}}{\Sigma}=S_{i} \cdot\left(1-S_{j}\right)
$$

When $i \neq j$, the numerator term $\frac{\partial e^{a_{i}}}{\partial a_{j}}$ is zero, so:

$$
\frac{\partial S_{i}}{\partial a_{j}}=-\frac{e^{a_{j}} \cdot e^{a_{i}}}{\Sigma^{2}}=-S_{i} \cdot S_{j}
$$
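The two cases can be combined as $\frac{\partial S_{i}}{\partial a_{j}}=S_{i}(\delta_{ij}-S_{j})$, where $\delta_{ij}$ is 1 if $i=j$ and 0 otherwise. The sketch below checks this against central finite differences. The helper names `softmax_jacobian` and `numerical_jacobian`, the test input, and the step size `eps` are illustrative choices, not something taken from the original text.

```python
import math

def softmax(a):
    m = max(a)  # shift for numerical stability; does not change the result
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(a):
    """Analytic Jacobian: J[i][j] = dS_i/da_j = S_i * (delta_ij - S_j)."""
    s = softmax(a)
    n = len(s)
    return [[s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
            for i in range(n)]

def numerical_jacobian(a, eps=1e-6):
    """Approximate the same Jacobian with central finite differences."""
    n = len(a)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(a), list(a)
        plus[j] += eps
        minus[j] -= eps
        s_plus, s_minus = softmax(plus), softmax(minus)
        for i in range(n):
            jac[i][j] = (s_plus[i] - s_minus[i]) / (2 * eps)
    return jac

a = [0.5, -1.0, 2.0]
analytic = softmax_jacobian(a)
numeric = numerical_jacobian(a)
max_err = max(abs(analytic[i][j] - numeric[i][j])
              for i in range(3) for j in range(3))
print(max_err)  # expected to be tiny; the two Jacobians should agree
```

Each column of the Jacobian sums to zero, which reflects the fact that the softmax outputs always sum to 1.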

Cross-entropy loss function

$$
L=-\sum_{i} y_{i} \log S_{i}
$$

$$
\frac{\partial L}{\partial S_{i}}=-y_{i} \cdot \frac{1}{S_{i}}
$$

Differentiating with respect to the softmax input:


$$
\begin{aligned}
\frac{\partial L}{\partial a_{i}} &=\sum_{j} \frac{\partial L}{\partial S_{j}} \cdot \frac{\partial S_{j}}{\partial a_{i}}=\frac{\partial L}{\partial S_{i}} \cdot \frac{\partial S_{i}}{\partial a_{i}}+\sum_{j \neq i} \frac{\partial L}{\partial S_{j}} \cdot \frac{\partial S_{j}}{\partial a_{i}} \\
&=-y_{i} \cdot \frac{1}{S_{i}} \cdot S_{i}\left(1-S_{i}\right)+\sum_{j \neq i} \frac{-y_{j}}{S_{j}} \cdot(-1) S_{i} S_{j} \\
&=-y_{i}\left(1-S_{i}\right)+\sum_{j \neq i} y_{j} \cdot S_{i} \\
&=-y_{i}+y_{i} S_{i}+\sum_{j \neq i} y_{j} \cdot S_{i} \\
&=-y_{i}+S_{i} \sum_{j} y_{j} \\
&=S_{i}-y_{i}
\end{aligned}
$$

The last step uses $\sum_{j} y_{j}=1$, which holds when $y$ is a one-hot label or, more generally, a probability distribution.
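The result $\frac{\partial L}{\partial a_{i}}=S_{i}-y_{i}$ can be sanity-checked numerically. The following is a minimal sketch, assuming a one-hot label `y`; the function names, the example input, and the step size `eps` are hypothetical choices made for this illustration. It compares three things: the closed form, the chain-rule sum from the first line of the derivation, and a central finite-difference estimate of the loss gradient.

```python
import math

def softmax(a):
    m = max(a)  # shift for numerical stability; does not change the result
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(a, y):
    """L = -sum_i y_i * log(S_i), with S = softmax(a)."""
    s = softmax(a)
    return -sum(yi * math.log(si) for yi, si in zip(y, s))

def grad_closed_form(a, y):
    """Closed-form gradient: dL/da_i = S_i - y_i."""
    return [si - yi for si, yi in zip(softmax(a), y)]

def grad_chain_rule(a, y):
    """Chain-rule sum: dL/da_i = sum_j dL/dS_j * dS_j/da_i."""
    s = softmax(a)
    n = len(a)
    grad = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            dL_dSj = -y[j] / s[j]
            dSj_dai = s[j] * ((1.0 if i == j else 0.0) - s[i])
            total += dL_dSj * dSj_dai
        grad.append(total)
    return grad

def grad_numeric(a, y, eps=1e-6):
    """Central finite-difference estimate of dL/da_i."""
    grad = []
    for i in range(len(a)):
        plus, minus = list(a), list(a)
        plus[i] += eps
        minus[i] -= eps
        grad.append((cross_entropy(plus, y) - cross_entropy(minus, y)) / (2 * eps))
    return grad

a = [0.5, -1.0, 2.0]
y = [0.0, 1.0, 0.0]  # one-hot label, so sum(y) == 1
closed = grad_closed_form(a, y)
chained = grad_chain_rule(a, y)
numeric = grad_numeric(a, y)
print(closed)
print(chained)
print(numeric)  # all three should agree up to small numerical error
```

Because the gradient reduces to a simple difference, implementations usually compute softmax and cross-entropy together and never form the full Jacobian.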
