Summary of Tensor Gradient Formulas

Conclusion 1. If $C=AB$, then
$$\frac{\partial C}{\partial A}=B^T, \quad \frac{\partial C}{\partial B}=A^T$$
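One way to read Conclusion 1 in a backpropagation setting (an interpretation, not something the text states explicitly): if a scalar loss has upstream gradient $G=\partial Loss/\partial C$, then $B^T$ multiplies $G$ on the right to give the gradient for $A$, and $A^T$ multiplies $G$ on the left to give the gradient for $B$. The sketch below checks this numerically with NumPy. The matrix `G`, the shapes, and the helper `numerical_grad` are all assumptions made for illustration.

```python
import numpy as np

def numerical_grad(f, X, eps=1e-6):
    """Central-difference gradient of the scalar function f with respect to X."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
G = rng.standard_normal((3, 2))  # hypothetical upstream gradient dLoss/dC

# A scalar loss whose gradient with respect to C = AB is exactly G.
loss_of_A = lambda A_: np.sum(G * (A_ @ B))
loss_of_B = lambda B_: np.sum(G * (A @ B_))

grad_A = G @ B.T  # B^T is applied on the right
grad_B = A.T @ G  # A^T is applied on the left

ok_A = np.allclose(grad_A, numerical_grad(loss_of_A, A), atol=1e-6)
ok_B = np.allclose(grad_B, numerical_grad(loss_of_B, B), atol=1e-6)
print(ok_A, ok_B)
```

Note that the side on which the transposed factor is applied matters: swapping left and right would not even produce matrices of the right shape here.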
Conclusion 2. If $L=ABCDEF$, then
$$\frac{\partial L}{\partial A}=(BCDEF)^T, \quad \frac{\partial L}{\partial B}=A^T(CDEF)^T$$
$$\frac{\partial L}{\partial C}=(AB)^T(DEF)^T, \quad \frac{\partial L}{\partial D}=(ABC)^T(EF)^T$$
$$\frac{\partial L}{\partial E}=(ABCD)^TF^T, \quad \frac{\partial L}{\partial F}=(ABCDE)^T$$
The pattern is easy to spot: the partial derivative of $L$ with respect to any tensor in the product on the right-hand side equals the transpose of the product of everything to its left, multiplied by the transpose of the product of everything to its right. For the first and last factors, the empty side is simply omitted.
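The left-transpose-times-right-transpose rule can be checked numerically. The sketch below assumes shapes chosen so that $L$ is a $1\times 1$ scalar (the text does not fix any shapes), and compares the formulas for $\partial L/\partial C$ and $\partial L/\partial A$ against central differences.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed shapes, picked so that L = ABCDEF is 1x1 (a scalar).
A = rng.standard_normal((1, 2))
B = rng.standard_normal((2, 3))
C = rng.standard_normal((3, 2))
D = rng.standard_normal((2, 3))
E = rng.standard_normal((3, 2))
F = rng.standard_normal((2, 1))

def numerical_grad(f, X, eps=1e-6):
    """Central-difference gradient of the scalar function f with respect to X."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

L_of_C = lambda C_: (A @ B @ C_ @ D @ E @ F).item()
L_of_A = lambda A_: (A_ @ B @ C @ D @ E @ F).item()

# (everything to the left)^T times (everything to the right)^T
grad_C = (A @ B).T @ (D @ E @ F).T
# Nothing stands to the left of A, so only the right-hand product remains.
grad_A = (B @ C @ D @ E @ F).T

ok_C = np.allclose(grad_C, numerical_grad(L_of_C, C), atol=1e-6)
ok_A = np.allclose(grad_A, numerical_grad(L_of_A, A), atol=1e-6)
print(ok_C, ok_A)
```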

Conclusion 3. If $O_{p\times n}=V_{p\times m}H_{m\times n}$ and $Loss$ is a scalar, then
$$\frac{\partial Loss}{\partial H}=\frac{\partial O}{\partial H}\frac{\partial Loss}{\partial O}$$
$$\frac{\partial Loss}{\partial V}=\frac{\partial Loss}{\partial O}\frac{\partial O}{\partial V}$$
Proof:
Since $Loss \in \mathbb{R}$, let $Loss = A_{1\times p}O_{p\times n}B_{n\times 1}$. We are also given $O_{p\times n}=V_{p\times m}H_{m\times n}$, so
$$Loss=A_{1\times p}V_{p\times m}H_{m\times n}B_{n\times 1}$$
By Conclusion 2,
$$\frac{\partial Loss}{\partial H}=(AV)^TB^T=V^TA^TB^T=V^T(A^TB^T)$$
Applying Conclusion 1 to $O=VH$ and Conclusion 2 to $Loss=AOB$ gives
$$\frac{\partial O}{\partial H}=V^T, \quad \frac{\partial Loss}{\partial O}=A^TB^T$$
$$\therefore \frac{\partial Loss}{\partial H}=\frac{\partial O}{\partial H}\frac{\partial Loss}{\partial O}$$
In the same way, one can show that
$$\frac{\partial Loss}{\partial V}=\frac{\partial Loss}{\partial O}\frac{\partial O}{\partial V}$$
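The proof above works with a loss of the form $AOB$. As a sanity check on a loss that is not linear in $O$, the sketch below uses $Loss=\tfrac{1}{2}\sum_{ij}O_{ij}^2$, a choice made purely for illustration, for which $\partial Loss/\partial O = O$. The dimensions `p`, `m`, `n` and the helper `numerical_grad` are likewise assumptions.

```python
import numpy as np

def numerical_grad(f, X, eps=1e-6):
    """Central-difference gradient of the scalar function f with respect to X."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
p, m, n = 3, 4, 2
V = rng.standard_normal((p, m))
H = rng.standard_normal((m, n))

def loss(V_, H_):
    O = V_ @ H_
    return 0.5 * np.sum(O ** 2)

G = V @ H         # dLoss/dO for this particular loss
grad_H = V.T @ G  # dO/dH = V^T sits on the left of dLoss/dO
grad_V = G @ H.T  # dO/dV = H^T sits on the right of dLoss/dO

ok_H = np.allclose(grad_H, numerical_grad(lambda H_: loss(V, H_), H), atol=1e-5)
ok_V = np.allclose(grad_V, numerical_grad(lambda V_: loss(V_, H), V), atol=1e-5)
print(ok_H, ok_V)
```

The ordering of the factors in Conclusion 3 is what makes the shapes line up: `V.T @ G` is $m\times n$ like $H$, and `G @ H.T` is $p\times m$ like $V$.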
Conclusion 4.
$$\left(\frac{\partial C}{\partial A^T}\right)^T=\frac{\partial C}{\partial A}$$
Conclusion 5. If $C=A^TB$, then it follows easily from Conclusion 4 that
$$\frac{\partial C}{\partial A} = B$$
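A possible way to make Conclusions 4 and 5 concrete, again under the assumption of a scalar loss with a hypothetical upstream gradient `G`: treat $A^T$ as the left factor, apply Conclusion 1 to get the gradient with respect to $A^T$, then transpose. In this reading $B$ ends up on the left of $G^T$. In the vector case, where $c=a^Tb$ is itself a scalar, the gradient with respect to $a$ is simply $b$.

```python
import numpy as np

def numerical_grad(f, X, eps=1e-6):
    """Central-difference gradient of the scalar function f with respect to X."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # A^T is 3x4
B = rng.standard_normal((4, 2))
G = rng.standard_normal((3, 2))  # hypothetical upstream gradient dLoss/dC

loss_of_A = lambda A_: np.sum(G * (A_.T @ B))

grad_AT = G @ B.T   # Conclusion 1, with A^T as the left factor
grad_A = grad_AT.T  # Conclusion 4: transpose to get the gradient w.r.t. A

ok_matrix = np.allclose(grad_A, numerical_grad(loss_of_A, A), atol=1e-6)
ok_form = np.allclose(grad_A, B @ G.T)  # B appears on the left

# Vector special case: c = a^T b is a scalar and dc/da = b.
a = rng.standard_normal(4)
b = rng.standard_normal(4)
ok_vector = np.allclose(b, numerical_grad(lambda a_: a_ @ b, a), atol=1e-6)
print(ok_matrix, ok_form, ok_vector)
```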
Conclusion 6. If $y=w^TXw$, then
$$\frac{\partial y}{\partial w}=(X+X^T)w$$
In particular, if $X$ is a real symmetric matrix, then $X=X^T$, and so
$$\frac{\partial y}{\partial w}=2Xw$$
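Both forms of the quadratic-form gradient can be verified with a small finite-difference check. The dimension `n`, the random test data, and the symmetric matrix `S` built as `X + X.T` are assumptions for illustration only.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference gradient of the scalar function f with respect to the vector x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))  # general, not symmetric
w = rng.standard_normal(n)

# General case: y = w^T X w, gradient (X + X^T) w
grad_general = (X + X.T) @ w
ok_general = np.allclose(
    grad_general, numerical_grad(lambda w_: w_ @ X @ w_, w), atol=1e-5
)

# Symmetric case: y = w^T S w with S = S^T, gradient 2 S w
S = X + X.T
grad_symmetric = 2 * S @ w
ok_symmetric = np.allclose(
    grad_symmetric, numerical_grad(lambda w_: w_ @ S @ w_, w), atol=1e-5
)
print(ok_general, ok_symmetric)
```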
