[Machine Learning Summary] Derivative Formulas for Vectors and Matrices

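Vector differentiation involves far too many formulas, and I often get stuck halfway through a derivation, so here is a summary of them all in one place.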

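Contents

  • 0. Introduction
  • 1. Derivative of a vector with respect to a scalar
  • 2. Derivative of a vector with respect to a vector
  • 3. Derivative of a matrix with respect to a vector
  • 4. Derivatives of expressions combining matrices and vectors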

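0. Introduction

  Throughout this post, scalars (single elements) are written as letters such as $a, b, c$; vectors are written as lowercase $x, y, z$ and are column vectors by default; matrices are written as uppercase $A, B, C$.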

1. Derivative of a vector with respect to a scalar

  • Row vector with respect to a scalar
    $\frac{\partial x^T}{\partial a} = \begin{bmatrix} \frac{\partial x_1}{\partial a} & \frac{\partial x_2}{\partial a} & \cdots & \frac{\partial x_n}{\partial a} \end{bmatrix}$
  • Column vector with respect to a scalar
    $\frac{\partial x}{\partial a} = \begin{bmatrix} \frac{\partial x_1}{\partial a} \\ \frac{\partial x_2}{\partial a} \\ \vdots \\ \frac{\partial x_n}{\partial a} \end{bmatrix}$
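
A quick numerical check of the two formulas above. This is only a sketch: the function `x_of_a` and the helper `dx_da` are made up for illustration, and the derivative is approximated with a central difference rather than computed symbolically.

```python
import math

def x_of_a(a):
    """Hypothetical vector-valued function x(a) = [a, a^2, sin(a)]."""
    return [a, a * a, math.sin(a)]

def dx_da(f, a, h=1e-6):
    """Central-difference estimate of the entrywise derivative of f at a."""
    plus, minus = f(a + h), f(a - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

a0 = 0.5
numeric = dx_da(x_of_a, a0)
exact = [1.0, 2 * a0, math.cos(a0)]  # derivatives worked out by hand

# The entries are the same in both cases; only the shape differs.
column = [[v] for v in numeric]  # n x 1, layout of the column-vector formula
row = [numeric]                  # 1 x n, layout of the row-vector formula

print(numeric)
print(exact)
```

The result keeps the shape of the vector being differentiated: a row vector gives a row, a column vector gives a column.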

2. Derivative of a vector with respect to a vector

  • Row vector with respect to a column vector
    $\frac{\partial y^T}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_n}{\partial x_1} \\ \frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_n}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_n}{\partial x_n} \end{bmatrix}$
  • Column vector with respect to a row vector
    $\frac{\partial y}{\partial x^T} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_n}{\partial x_1} & \frac{\partial y_n}{\partial x_2} & \cdots & \frac{\partial y_n}{\partial x_n} \end{bmatrix}$
  • Row vector with respect to a row vector
    $\frac{\partial y^T}{\partial x^T} = \begin{bmatrix} \frac{\partial y^T}{\partial x_1} & \frac{\partial y^T}{\partial x_2} & \cdots & \frac{\partial y^T}{\partial x_n} \end{bmatrix}$
  • Column vector with respect to a column vector
    $\frac{\partial y}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_n}{\partial x} \end{bmatrix}$
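
The first two layouts above are transposes of each other, which is easy to confirm numerically. The sketch below assumes a made-up map `y_of_x` from R^2 to R^2 and a hypothetical finite-difference helper `jacobian`; neither comes from a library.

```python
def y_of_x(x):
    """Hypothetical map R^2 -> R^2: y1 = x1 * x2, y2 = x1 + 3 * x2."""
    x1, x2 = x
    return [x1 * x2, x1 + 3.0 * x2]

def jacobian(f, x, h=1e-6):
    """Central-difference estimate with J[i][j] ~ d y_i / d x_j,
    i.e. the layout of dy / dx^T (column vector by row vector)."""
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def transpose(M):
    return [list(col) for col in zip(*M)]

x0 = [2.0, 5.0]
dy_dxT = jacobian(y_of_x, x0)  # approximately [[5, 2], [1, 3]]
dyT_dx = transpose(dy_dxT)     # approximately [[5, 1], [2, 3]], layout of dy^T / dx

print(dy_dxT)
print(dyT_dx)
```

In $\frac{\partial y}{\partial x^T}$ the row index follows $y$ and the column index follows $x$; in $\frac{\partial y^T}{\partial x}$ it is the other way round.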

3. Derivative of a matrix with respect to a vector

  • Matrix with respect to a row vector
    $\frac{\partial A}{\partial x^T} = \begin{bmatrix} \frac{\partial A}{\partial x_1} & \frac{\partial A}{\partial x_2} & \cdots & \frac{\partial A}{\partial x_n} \end{bmatrix}$
  • Matrix with respect to a column vector
    $\frac{\partial A}{\partial x} = \begin{bmatrix} \frac{\partial A_{11}}{\partial x} & \frac{\partial A_{12}}{\partial x} & \cdots & \frac{\partial A_{1n}}{\partial x} \\ \vdots & \vdots & & \vdots \\ \frac{\partial A_{n1}}{\partial x} & \frac{\partial A_{n2}}{\partial x} & \cdots & \frac{\partial A_{nn}}{\partial x} \end{bmatrix}$
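
To make the block structure of $\frac{\partial A}{\partial x^T}$ concrete, here is a small sketch. The matrix-valued function `A_of_x` (the outer product $xx^T$) and the helper `dA_dxk` are assumptions chosen for illustration, and the derivatives are finite-difference estimates.

```python
def A_of_x(x):
    """Hypothetical matrix-valued function A(x) = x x^T (outer product)."""
    return [[xi * xj for xj in x] for xi in x]

def dA_dxk(F, x, k, h=1e-6):
    """Central-difference estimate of dA/dx_k, taken entry by entry."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Fp, Fm = F(xp), F(xm)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]

x0 = [2.0, 3.0]
blocks = [dA_dxk(A_of_x, x0, k) for k in range(len(x0))]

# dA/dx^T places the blocks dA/dx_1, ..., dA/dx_n side by side.
dA_dxT = [sum((blk[i] for blk in blocks), []) for i in range(len(blocks[0]))]

for r in dA_dxT:
    print(r)  # approximately [4, 3, 0, 2] and [3, 0, 2, 6]
```

Each block $\frac{\partial A}{\partial x_k}$ has the same shape as $A$, so for a 2 x 2 matrix and a 2-dimensional $x$ the result is 2 x 4.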

4. Derivatives of expressions combining matrices and vectors

  • $\frac{d}{dx}x^TA = A$
  • $\frac{d}{dx^T}Ax = A$
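  • $\frac{d}{dx}xA = A^T$ (for the product $xA$ to be defined, $x$ is read here as a row vector; entry $(i, j)$ of the result is $\frac{\partial (xA)_i}{\partial x_j}$)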
  • $\frac{d}{dx}Ax = A^T$ (read as $\frac{\partial (Ax)^T}{\partial x}$: entry $(i, j)$ of the result is $\frac{\partial (Ax)_j}{\partial x_i}$)
  • $\frac{d}{dx}x^T = I$
  • $\frac{d}{dx^T}x = I$
  • $\frac{d}{dx}x^Ty = \frac{d}{dx}y^Tx = y$
  • $\frac{d}{dA}x^TAy = xy^T$
  • $\frac{d}{dA}x^TAx = xx^T$
  • $\frac{d}{dA}x^TA^Ty = yx^T$
  • $\frac{d}{dx}x^TAx = (A+A^T)x = 2Ax$ (the second equality holds only when $A$ is symmetric)
  • $\frac{d}{dx}x^Tx = 2x$
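
When one of these identities looks suspicious in the middle of a derivation, a finite-difference check is a cheap way to test it. The sketch below checks the quadratic-form rule $\frac{d}{dx}x^TAx = (A+A^T)x$ on a deliberately non-symmetric $A$. All helper names (`matvec`, `dot`, `quad`, `grad`) are made up for this example, and the matrix and point are arbitrary.

```python
def matvec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def transpose(M):
    return [list(c) for c in zip(*M)]

def quad(A, x):
    """Scalar quadratic form x^T A x."""
    return dot(x, matvec(A, x))

def grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

A = [[1.0, 2.0], [0.0, 3.0]]  # deliberately non-symmetric
x0 = [1.0, -2.0]

numeric = grad(lambda x: quad(A, x), x0)
Ax = matvec(A, x0)
ATx = matvec(transpose(A), x0)
closed_form = [p + q for p, q in zip(Ax, ATx)]  # (A + A^T) x = [-2, -10]
two_Ax = [2 * p for p in Ax]                    # 2 A x = [-6, -12]

print(numeric)      # approximately [-2, -10]
print(closed_form)
print(two_Ax)       # differs, because A is not symmetric
```

For this $A$ the numerical gradient agrees with $(A+A^T)x$ but not with $2Ax$, which shows why the symmetry condition matters.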
