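The Jacobian matrix is the derivative of a vector-valued function with respect to its vector argument: the matrix of all its first-order partial derivatives. Suppose $F: \mathbb{R}^n \to \mathbb{R}^m$, that is, a mapping from $n$-dimensional Euclidean space to $m$-dimensional Euclidean space.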
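For reference, the general shape can be written out explicitly. The component names $y_1, \dots, y_m$ for the outputs and $x_1, \dots, x_n$ for the inputs are assumed here to match the examples; with that notation the Jacobian is conventionally the $m \times n$ matrix

$$
J_F(x_1, \dots, x_n) = \begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n}
\end{bmatrix}
$$

Row $i$ collects the partial derivatives of the $i$-th output, and column $j$ collects the derivatives with respect to the $j$-th input.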
Examples:
1. $F: \mathbb{R} \times [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$
The coordinates $(x_1, x_2, x_3)$ in terms of $(r, \theta, \phi)$:
$x_1 = r\cos\theta\sin\phi$
$x_2 = r\sin\theta\cos\phi$
$x_3 = r\cos\phi$
Its Jacobian matrix is:
$$
J_F(r, \theta, \phi) = \begin{bmatrix}
\dfrac{\partial x_1}{\partial r} & \dfrac{\partial x_1}{\partial \theta} & \dfrac{\partial x_1}{\partial \phi} \\
\dfrac{\partial x_2}{\partial r} & \dfrac{\partial x_2}{\partial \theta} & \dfrac{\partial x_2}{\partial \phi} \\
\dfrac{\partial x_3}{\partial r} & \dfrac{\partial x_3}{\partial \theta} & \dfrac{\partial x_3}{\partial \phi}
\end{bmatrix} = \begin{bmatrix}
\cos\theta\sin\phi & -r\sin\theta\sin\phi & r\cos\theta\cos\phi \\
\sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\
\cos\phi & 0 & -r\sin\phi
\end{bmatrix}
$$
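As a sanity check on the matrix above, the following sketch compares its entries against a central-difference approximation. The function names, the test point, and the step size `h` are illustrative assumptions, not part of the original example.

```python
import math

def F(r, theta, phi):
    """The map from the example: (r, theta, phi) -> (x1, x2, x3)."""
    return [
        r * math.cos(theta) * math.sin(phi),
        r * math.sin(theta) * math.cos(phi),
        r * math.cos(phi),
    ]

def analytic_jacobian(r, theta, phi):
    """Jacobian entries copied from the closed-form matrix in the text."""
    c_t, s_t = math.cos(theta), math.sin(theta)
    c_p, s_p = math.cos(phi), math.sin(phi)
    return [
        [c_t * s_p, -r * s_t * s_p, r * c_t * c_p],
        [s_t * c_p, r * c_t * c_p, -r * s_t * s_p],
        [c_p, 0.0, -r * s_p],
    ]

def numerical_jacobian(func, point, h=1e-6):
    """Approximate the Jacobian column by column with central differences."""
    m = len(func(*point))
    n = len(point)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac

point = (2.0, 0.7, 1.1)  # arbitrary test point (r, theta, phi)
exact = analytic_jacobian(*point)
approx = numerical_jacobian(F, point)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3))
print(max_error)  # should be tiny if the closed form is right
```

Finite differences only confirm the formula at the sampled point, so this is a spot check rather than a proof.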
2. Consider the following function $F: \mathbb{R}^3 \to \mathbb{R}^4$ (the Jacobian matrix is not necessarily square):
$y_1 = x_1$
$y_2 = 5x_3$
$y_3 = 4x_2^2 - 2x_3$
$y_4 = x_3\sin x_1$
$$
J_F(x_1, x_2, x_3) = \begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} & \dfrac{\partial y_1}{\partial x_3} \\
\dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} & \dfrac{\partial y_2}{\partial x_3} \\
\dfrac{\partial y_3}{\partial x_1} & \dfrac{\partial y_3}{\partial x_2} & \dfrac{\partial y_3}{\partial x_3} \\
\dfrac{\partial y_4}{\partial x_1} & \dfrac{\partial y_4}{\partial x_2} & \dfrac{\partial y_4}{\partial x_3}
\end{bmatrix} = \begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 5 \\
0 & 8x_2 & -2 \\
x_3\cos x_1 & 0 & \sin x_1
\end{bmatrix}
$$
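A quick worked evaluation makes the non-square shape concrete. The point $(x_1, x_2, x_3) = (0, 1, 2)$ is chosen here purely for illustration. Since $\partial y_3 / \partial x_2 = 8x_2$, $\cos 0 = 1$ and $\sin 0 = 0$:

$$
J_F(0, 1, 2) = \begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 5 \\
0 & 8 & -2 \\
2 & 0 & 0
\end{bmatrix}
$$

The result has 4 rows (one per output $y_i$) and 3 columns (one per input $x_j$).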
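The Hessian matrix is the $n \times n$ matrix of second-order partial derivatives of the objective function $f$ at a point $X$; it is symmetric when those second partial derivatives are continuous.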
The gradient of $f$ at $X^{(0)}$ is:
$$
\nabla f(X^{(0)}) = \left[ \dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \cdots, \dfrac{\partial f}{\partial x_n} \right]^T
$$
The Hessian matrix of $f$ at $X^{(0)}$ is:
$$
G(X^{(0)}) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}
$$
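To make the definition concrete, here is a minimal sketch that approximates the Hessian with central second differences. The test function $f(x_1, x_2) = x_1^2 + x_1 x_2 + 2x_2^2$, the evaluation point, and the step size `h` are all assumptions chosen for illustration; the exact Hessian of this function is $\begin{bmatrix} 2 & 1 \\ 1 & 4 \end{bmatrix}$ everywhere.

```python
def f(x):
    """Hypothetical test function f(x1, x2) = x1^2 + x1*x2 + 2*x2^2."""
    x1, x2 = x
    return x1 ** 2 + x1 * x2 + 2 * x2 ** 2

def numerical_hessian(func, x, h=1e-4):
    """Approximate each second partial derivative with central differences."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate func with coordinate i moved by di and coordinate j by dj.
                y = list(x)
                y[i] += di
                y[j] += dj
                return func(y)
            hess[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4 * h * h)
    return hess

H = numerical_hessian(f, [1.0, -2.0])
for row in H:
    print([round(value, 4) for value in row])
```

The off-diagonal entries agree with each other, which reflects the symmetry of the Hessian for smooth functions.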
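The following conclusions hold when $X^{(0)}$ is a stationary point, that is, $\nabla f(X^{(0)}) = 0$:
When $G$ is positive definite, $f$ has a local minimum at $X^{(0)}$.
When $G$ is negative definite, $f$ has a local maximum at $X^{(0)}$.
When $G$ is indefinite, $X^{(0)}$ is not an extremum of $f$ (it is a saddle point).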
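The three cases can be sketched in code for the two-variable case, where the eigenvalues of a symmetric $2 \times 2$ matrix have a closed form. The function names and the sample Hessians are illustrative assumptions, and the sketch presumes the gradient already vanishes at the point being classified.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the characteristic polynomial."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

def classify_stationary_point(hessian):
    """Classify a stationary point from its symmetric 2x2 Hessian."""
    (a, b), (_, d) = hessian
    low, high = eigenvalues_2x2_symmetric(a, b, d)
    if low > 0:
        return "local minimum"   # positive definite
    if high < 0:
        return "local maximum"   # negative definite
    if low < 0 < high:
        return "saddle point"    # indefinite
    return "inconclusive"        # a zero eigenvalue: the test does not decide

# Hessians at the origin of x^2 + y^2, -(x^2 + y^2), x^2 - y^2 and x^2 + y^4.
results = [
    classify_stationary_point([[2.0, 0.0], [0.0, 2.0]]),
    classify_stationary_point([[-2.0, 0.0], [0.0, -2.0]]),
    classify_stationary_point([[2.0, 0.0], [0.0, -2.0]]),
    classify_stationary_point([[2.0, 0.0], [0.0, 0.0]]),
]
print(results)
```

The last case shows a situation the three conclusions do not cover: when the Hessian is only semidefinite, higher-order information is needed.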
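The Hessian matrix supplies the second-order term of the multivariate Taylor expansion, so it can be used in Newton's method to find extrema.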
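As an illustration, minimizing the second-order Taylor model $f(X) \approx f(X_k) + \nabla f(X_k)^T (X - X_k) + \tfrac{1}{2}(X - X_k)^T G(X_k)(X - X_k)$ gives the Newton update $X_{k+1} = X_k - G(X_k)^{-1} \nabla f(X_k)$. The sketch below applies it to a hypothetical two-variable function, $f(x_1, x_2) = x_1^4 + x_1 x_2 + (1 + x_2)^2$. The function, the starting point, and the tolerances are assumptions chosen for illustration, and no line search or other safeguard is included.

```python
def gradient(x1, x2):
    """Gradient of the assumed test function f = x1^4 + x1*x2 + (1 + x2)^2."""
    return (4 * x1 ** 3 + x2, x1 + 2 * (1 + x2))

def hessian(x1, x2):
    """Hessian of the same test function."""
    return ((12 * x1 ** 2, 1.0), (1.0, 2.0))

def newton_minimize(x1, x2, tol=1e-10, max_iter=50):
    """Plain Newton iteration: x <- x - G^{-1} * gradient."""
    for _ in range(max_iter):
        g1, g2 = gradient(x1, x2)
        if abs(g1) + abs(g2) < tol:
            break
        (a, b), (c, d) = hessian(x1, x2)
        det = a * d - b * c
        if det == 0:
            raise ZeroDivisionError("singular Hessian, Newton step undefined")
        # Solve G * step = gradient using the closed-form 2x2 inverse.
        s1 = (d * g1 - b * g2) / det
        s2 = (-c * g1 + a * g2) / det
        x1, x2 = x1 - s1, x2 - s2
    return x1, x2

x_star = newton_minimize(0.75, -1.25)
print(x_star, gradient(*x_star))
```

Newton's method only finds a stationary point; whether it is a minimum still depends on the definiteness of the Hessian there.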
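Note:
Positive definite matrix: a real symmetric $n \times n$ matrix $A$ is positive definite if $x^T A x > 0$ for every nonzero vector $x$. If $A$ is positive definite, all of its eigenvalues are positive and all of its leading principal minors are positive.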
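The leading-principal-minor condition (Sylvester's criterion, which is also sufficient for symmetric matrices) can be checked directly. The sketch below is a minimal illustration in plain Python; the helper names and the sample matrices are assumptions, and the input is presumed symmetric.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det

def leading_principal_minors(matrix):
    """Determinants of the top-left k x k submatrices, k = 1..n."""
    n = len(matrix)
    return [determinant([row[:k] for row in matrix[:k]]) for k in range(1, n + 1)]

def is_positive_definite(matrix):
    """Sylvester's criterion for a symmetric matrix."""
    return all(minor > 0 for minor in leading_principal_minors(matrix))

A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
B = [[1.0, 2.0], [2.0, 1.0]]
print(leading_principal_minors(A), is_positive_definite(A))
print(leading_principal_minors(B), is_positive_definite(B))
```

For large or ill-conditioned matrices, comparing floating-point determinants against zero is fragile, so this check is best treated as a teaching aid.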