Common Formulas for Derivatives with Respect to a Vector

Lu Peng (鲁鹏)
School of Aerospace Engineering, Beijing Institute of Technology
2019.05.09

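Lately I keep running into derivatives of scalars and of vectors with respect to a vector, so it seemed time to write up a summary. I will keep adding new formulas to this document.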

Preliminaries and Definitions

We start with the following definitions. Let $f(\boldsymbol{x})$ be a scalar function of the column vector $\boldsymbol{x} = [x_{1}\quad x_{2}\quad \dots\quad x_{n}]^{T}$. Its derivative with respect to $\boldsymbol{x}$ is the column vector
$$
\frac{\partial f(\boldsymbol{x})}{\partial \boldsymbol{x}} = \left[\begin{matrix} \frac{\partial f}{\partial x_{1}} \\ \frac{\partial f}{\partial x_{2}} \\ \vdots \\ \frac{\partial f}{\partial x_{n}} \end{matrix}\right]
$$

The gradient of $f(\boldsymbol{x})$ at $\boldsymbol{x}$ is written $\nabla f(\boldsymbol{x})$, with $\nabla f(\boldsymbol{x}) = \partial f(\boldsymbol{x})/\partial \boldsymbol{x}$.
The Hessian matrix of $f(\boldsymbol{x})$ at $\boldsymbol{x}$ is the $n \times n$ matrix written $\nabla^2 f(\boldsymbol{x})$:
$$
\nabla^2 f(\boldsymbol{x}) = \left[\begin{matrix}
\frac{\partial^{2} f}{\partial x_{1}^{2}} & \frac{\partial^{2} f}{\partial x_{1}\partial x_{2}} & \cdots & \frac{\partial^{2} f}{\partial x_{1}\partial x_{n}} \\
\frac{\partial^{2} f}{\partial x_{2}\partial x_{1}} & \frac{\partial^{2} f}{\partial x_{2}^{2}} & \cdots & \frac{\partial^{2} f}{\partial x_{2}\partial x_{n}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^{2} f}{\partial x_{n}\partial x_{1}} & \frac{\partial^{2} f}{\partial x_{n}\partial x_{2}} & \cdots & \frac{\partial^{2} f}{\partial x_{n}^{2}}
\end{matrix}\right]
$$

Let $F(\boldsymbol{x}) = [f_{1}(\boldsymbol{x})\quad f_{2}(\boldsymbol{x})\quad \dots\quad f_{m}(\boldsymbol{x})]^{T}$ be a vector-valued function of the column vector $\boldsymbol{x}$. Its derivative with respect to $\boldsymbol{x}$ is the $m \times n$ matrix
$$
\frac{\partial F(\boldsymbol{x})}{\partial \boldsymbol{x}} = \left[\begin{matrix}
\frac{\partial f_{1}}{\partial x_{1}} & \frac{\partial f_{1}}{\partial x_{2}} & \cdots & \frac{\partial f_{1}}{\partial x_{n}} \\
\frac{\partial f_{2}}{\partial x_{1}} & \frac{\partial f_{2}}{\partial x_{2}} & \cdots & \frac{\partial f_{2}}{\partial x_{n}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_{m}}{\partial x_{1}} & \frac{\partial f_{m}}{\partial x_{2}} & \cdots & \frac{\partial f_{m}}{\partial x_{n}}
\end{matrix}\right]
$$

The Jacobian matrix of $F(\boldsymbol{x})$ at the point $\boldsymbol{x}$ is written $J$, so $J = \partial F(\boldsymbol{x})/\partial \boldsymbol{x}$. The Jacobian $J$ is called the derivative of the vector-valued function $F(\boldsymbol{x})$ at $\boldsymbol{x}$, and is also written $F^{\prime}(\boldsymbol{x})$ or $\nabla F(\boldsymbol{x})^{T}$, where $\nabla F = (\nabla f_{1}, \nabla f_{2}, \dots, \nabla f_{m})$.
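
The gradient and Jacobian layouts defined above can be checked numerically. The following is a minimal sketch using central differences in plain Python; the helper names `num_gradient` and `num_jacobian`, the step size `h`, and the test functions `f` and `F` are assumptions made for illustration and do not come from the original text.

```python
def num_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def num_jacobian(F, x, h=1e-6):
    """Central-difference estimate of the m x n Jacobian of F at x."""
    m = len(F(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(m):
            # Row i holds the partial derivatives of f_i, column j is x_j.
            J[i][j] = (Fp[i] - Fm[i]) / (2.0 * h)
    return J


# Hypothetical test functions, chosen only for illustration.
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]          # exact gradient: [2*x0 + 3*x1, 3*x0]


def F(x):
    return [x[0] * x[1], x[0] + x[1], x[1] ** 2]  # exact rows: [x1, x0], [1, 1], [0, 2*x1]


x0 = [1.0, 2.0]
g = num_gradient(f, x0)   # close to [8.0, 3.0]
J = num_jacobian(F, x0)   # close to [[2.0, 1.0], [1.0, 1.0], [0.0, 4.0]]
```

With $m = 3$ and $n = 2$, `J` has three rows and two columns, matching the convention that row $i$ of the Jacobian is $\nabla f_{i}^{T}$.
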
Given a vector $\boldsymbol{a} = [a_{1}\quad a_{2}\quad a_{3}]^{T}$, the cross-product matrix $\boldsymbol{a}^{\times}$ of $\boldsymbol{a}$ is defined as
$$
\boldsymbol{a}^{\times} = \left[\begin{matrix} 0 & -a_{3} & a_{2} \\ a_{3} & 0 & -a_{1} \\ -a_{2} & a_{1} & 0 \end{matrix}\right]
$$

With the cross-product matrix, a vector cross product can be written as in equation (1):
$$
\boldsymbol{a} \times \boldsymbol{b} = \boldsymbol{a}^{\times} \boldsymbol{b} \tag{1}
$$
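
Equation (1) is easy to confirm on a concrete pair of vectors. The sketch below builds the cross-product matrix and compares both sides; the function names `skew`, `matvec`, and `cross` and the sample vectors are illustrative assumptions.

```python
def skew(a):
    """Cross-product (skew-symmetric) matrix of a 3-vector a."""
    return [[0.0, -a[2], a[1]],
            [a[2], 0.0, -a[0]],
            [-a[1], a[0], 0.0]]


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def cross(a, b):
    """Componentwise definition of the cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


a = [1.0, 2.0, 3.0]
b = [-4.0, 0.5, 2.0]
lhs = cross(a, b)          # a x b
rhs = matvec(skew(a), b)   # a^x b, should agree with lhs
```

Both sides evaluate to `[2.5, -14.0, 8.5]` for these sample vectors.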

Common Derivative Formulas

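Building on the definitions above, the following commonly used derivative formulas can be collected. Here the non-bold $a$ denotes the norm $\lVert \boldsymbol{a}\rVert$, which must be nonzero in (2).
$$
\frac{\partial \lVert \boldsymbol{a}\rVert}{\partial \boldsymbol{a}} = \frac{\partial a}{\partial \boldsymbol{a}} = \frac{\boldsymbol{a}}{a} \tag{2}
$$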

$$
\frac{\partial \boldsymbol{a}^{T}\boldsymbol{a}}{\partial \boldsymbol{a}} = \frac{\partial a^{2}}{\partial \boldsymbol{a}} = 2\boldsymbol{a} \tag{3}
$$
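
Formulas (2) and (3) can be checked one component at a time. The short derivation below is a sketch that assumes $\boldsymbol{a} \neq \boldsymbol{0}$ for (2) and writes $a = \lVert\boldsymbol{a}\rVert$; the indices $i$ and $k$ are introduced here only for illustration.

$$
\begin{aligned}
a^{2} &= \boldsymbol{a}^{T}\boldsymbol{a} = \sum_{k=1}^{n} a_{k}^{2}
&&\Rightarrow\quad \frac{\partial a^{2}}{\partial a_{i}} = 2a_{i}, \\
a &= \sqrt{\boldsymbol{a}^{T}\boldsymbol{a}}
&&\Rightarrow\quad \frac{\partial a}{\partial a_{i}}
 = \frac{1}{2\sqrt{\boldsymbol{a}^{T}\boldsymbol{a}}}\cdot 2a_{i} = \frac{a_{i}}{a}.
\end{aligned}
$$

Stacking the components $i = 1, \dots, n$ into a column gives $2\boldsymbol{a}$ and $\boldsymbol{a}/a$ respectively.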

$$
\frac{\partial (A \boldsymbol{x})}{\partial \boldsymbol{x}} = A \tag{4}
$$

$$
\frac{\partial \boldsymbol{x}^{T} A \boldsymbol{x}}{\partial \boldsymbol{x}} = (A + A^{T})\boldsymbol{x} \tag{5}
$$
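
Formulas (4) and (5) can be compared against finite differences. The sketch below assumes a constant square matrix `A` that is deliberately non-symmetric, so that $(A + A^{T})\boldsymbol{x}$ differs from $2A\boldsymbol{x}$; the helper names, the step size, and the sample values are assumptions made for illustration.

```python
def matvec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def quad_form(A, x):
    """Scalar x^T A x."""
    Ax = matvec(A, x)
    return sum(x[i] * Ax[i] for i in range(len(x)))


def num_jacobian(F, x, h=1e-6):
    """Central-difference estimate of the Jacobian of F at x."""
    m = len(F(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2.0 * h)
    return J


def num_gradient(f, x, h=1e-6):
    """Gradient of a scalar f, taken as the single row of the Jacobian of [f]."""
    return num_jacobian(lambda v: [f(v)], x, h)[0]


# Sample data (hypothetical): A is not symmetric on purpose.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, -2.0]]
x = [0.3, -1.2, 2.0]

J_num = num_jacobian(lambda v: matvec(A, v), x)        # formula (4): should match A
g_num = num_gradient(lambda v: quad_form(A, v), x)     # numerical gradient of x^T A x
A_plus_AT = [[A[i][j] + A[j][i] for j in range(3)] for i in range(3)]
g_formula = matvec(A_plus_AT, x)                       # formula (5): (A + A^T) x
```

When `A` is symmetric, (5) reduces to $2A\boldsymbol{x}$, which is the form most often seen in least-squares derivations.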

Let $\boldsymbol{y} = \boldsymbol{a} \times \boldsymbol{b} + \boldsymbol{c}$, and let $y = \lVert\boldsymbol{y}\rVert$. Then
$$
\frac{\partial y}{\partial \boldsymbol{b}} = (\boldsymbol{a}^{\times})^{T}\frac{\boldsymbol{y}}{y} = -\boldsymbol{a}^{\times} \frac{\boldsymbol{y}}{y} \tag{6}
$$
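
Formula (6) follows from (2) and (4) together with the skew symmetry $(\boldsymbol{a}^{\times})^{T} = -\boldsymbol{a}^{\times}$. A minimal numerical check is sketched below, assuming $\boldsymbol{y} \neq \boldsymbol{0}$ at the evaluation point; the helper names and the sample vectors `a`, `b0`, `c` are illustrative assumptions.

```python
import math


def skew(a):
    """Cross-product (skew-symmetric) matrix of a 3-vector a."""
    return [[0.0, -a[2], a[1]],
            [a[2], 0.0, -a[0]],
            [-a[1], a[0], 0.0]]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def num_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


# Sample data (hypothetical).
a = [1.0, 2.0, 3.0]
b0 = [-4.0, 0.5, 2.0]
c = [0.1, -0.2, 0.3]


def y_vec(b):
    """y = a x b + c, written with the cross-product matrix."""
    return [p + q for p, q in zip(matvec(skew(a), b), c)]


def y_norm(b):
    return norm(y_vec(b))


y0 = y_vec(b0)
# Right-hand side of formula (6): -a^x y / ||y||
g_formula = [-t / norm(y0) for t in matvec(skew(a), y0)]
# Left-hand side, estimated by finite differences
g_numeric = num_gradient(y_norm, b0)
```

Note that $\boldsymbol{c}$ shifts $\boldsymbol{y}$ but does not appear in the Jacobian $\partial\boldsymbol{y}/\partial\boldsymbol{b} = \boldsymbol{a}^{\times}$; it enters (6) only through $\boldsymbol{y}/y$.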

For a scalar $\alpha$ that depends on $\boldsymbol{r}$ only through the vector $\boldsymbol{V}(\boldsymbol{r})$, the chain rule gives
$$
\frac{\partial\alpha}{\partial\boldsymbol{r}} = \left[\begin{matrix}
\frac{\partial V_{1}}{\partial r_{1}} & \frac{\partial V_{1}}{\partial r_{2}} & \frac{\partial V_{1}}{\partial r_{3}} \\
\frac{\partial V_{2}}{\partial r_{1}} & \frac{\partial V_{2}}{\partial r_{2}} & \frac{\partial V_{2}}{\partial r_{3}} \\
\frac{\partial V_{3}}{\partial r_{1}} & \frac{\partial V_{3}}{\partial r_{2}} & \frac{\partial V_{3}}{\partial r_{3}}
\end{matrix}\right]^{T}
\left[\begin{matrix} \frac{\partial\alpha}{\partial V_{1}} \\ \frac{\partial\alpha}{\partial V_{2}} \\ \frac{\partial\alpha}{\partial V_{3}} \end{matrix}\right]
= \left(\frac{\partial\boldsymbol{V}}{\partial\boldsymbol{r}}\right)^{T} \frac{\partial\alpha}{\partial\boldsymbol{V}} \tag{7}
$$
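
The chain rule in (7) can be exercised on a small example. The sketch below assumes a scalar $\alpha$ that depends on $\boldsymbol{r}$ only through $\boldsymbol{V}(\boldsymbol{r})$; the particular functions `V` and `alpha`, their hand-derived derivatives, and the evaluation point `r0` are hypothetical choices made for illustration.

```python
import math


def V(r):
    """Hypothetical vector function V(r)."""
    return [r[0] * r[1], r[1] + r[2] ** 2, math.sin(r[0])]


def alpha(v):
    """Hypothetical scalar function alpha(V)."""
    return v[0] ** 2 + v[1] * v[2]


def dV_dr(r):
    """Jacobian of V: row i is V_i, column j is r_j."""
    return [[r[1], r[0], 0.0],
            [0.0, 1.0, 2.0 * r[2]],
            [math.cos(r[0]), 0.0, 0.0]]


def dalpha_dV(v):
    """Gradient of alpha with respect to V."""
    return [2.0 * v[0], v[2], v[1]]


def chain_rule_gradient(r):
    """Right-hand side of (7): (dV/dr)^T (dalpha/dV)."""
    J = dV_dr(r)
    g = dalpha_dV(V(r))
    return [sum(J[i][j] * g[i] for i in range(3)) for j in range(3)]


def num_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


r0 = [0.5, -1.0, 2.0]
g_formula = chain_rule_gradient(r0)
g_numeric = num_gradient(lambda r: alpha(V(r)), r0)   # direct derivative of the composite
```

The transpose matters here: because gradients are column vectors in this document, the Jacobian must be transposed before it multiplies $\partial\alpha/\partial\boldsymbol{V}$.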
