Vector Derivatives

Scalar dependent variable, vector independent variable

$y$ is the dependent variable, a scalar; $X=[x_1,x_2,\dots,x_n]^T$ is the independent variable, an $n$-dimensional vector.
$y=f(X)$, that is, $y = f(x_1,x_2,\dots,x_n)$.
So we can differentiate directly, one component at a time:
$$\frac{\partial y}{\partial X} = \left(\frac{\partial y}{\partial x_1};\frac{\partial y}{\partial x_2};\dots;\frac{\partial y}{\partial x_n}\right)$$
The result is an $n$-dimensional vector.
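As a rough numerical check of this definition, the sketch below approximates each $\frac{\partial y}{\partial x_i}$ with a central difference and collects the results into a list of length $n$. The function `f`, the helper `numerical_gradient`, and the step size `h` are illustrative choices, not part of the original post.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar-valued f at x (a list of floats)
    with central differences, perturbing one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Hypothetical example: y = x1^2 + 3*x2 + x2*x3
def f(x):
    return x[0] ** 2 + 3 * x[1] + x[1] * x[2]


point = [1.0, 2.0, 3.0]
grad = numerical_gradient(f, point)

# Analytic partial derivatives for comparison: (2*x1, 3 + x3, x2)
analytic = [2 * point[0], 3 + point[2], point[1]]
print(grad, analytic)
```

The gradient has one entry per component of the input, which matches the statement that the result is an $n$-dimensional vector.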
Consider $y = \vec a^T\vec x$: $y$ is the inner product of two vectors, so the result is a scalar.
To find $\frac{\partial y}{\partial \vec x}$, it suffices to find every $\frac{\partial y}{\partial x_i}$.
The method is:
Expand the expression for $y$ into a sum, then apply the ordinary scalar differentiation rules. This approach works for derivatives in any multi-dimensional setting.
Solution:
$$y = \vec a^T\vec x=\sum_{i=1}^n a_i x_i$$
So for all $i$:
$$\frac{\partial y}{\partial x_i} = a_i$$
Hence:
$$\begin{aligned} \frac{\partial y}{\partial \vec x}&=\left(\frac{\partial y}{\partial x_1};\frac{\partial y}{\partial x_2};\dots;\frac{\partial y}{\partial x_n}\right) \\ &=(a_1;a_2;\dots;a_n) \\ &=\vec a \end{aligned}$$
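The result $\frac{\partial y}{\partial \vec x} = \vec a$ can be sanity-checked numerically. This is a minimal sketch using only the standard library; the vectors `a` and `x` are arbitrary sample values, and `dot` and `numerical_gradient` are helper names introduced here for illustration.

```python
def dot(a, x):
    """Inner product a^T x, written as the explicit sum over a_i * x_i."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of scalar-valued f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


a = [2.0, -1.0, 0.5]   # arbitrary sample coefficients
x = [0.3, 1.7, -4.0]   # arbitrary evaluation point

grad = numerical_gradient(lambda v: dot(a, v), x)
print(grad)  # each entry should be close to the matching entry of a
```

Because $y$ is linear in $\vec x$, the gradient does not depend on the evaluation point: changing `x` should leave the result essentially unchanged.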

Note: if $y=\vec x \cdot \vec x$ (the dot product of $\vec x$ with itself), then the derivative is $2\vec x$.
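This follows from the same expand-into-a-sum approach used for $\vec a^T\vec x$. A short derivation, written out as a sketch:

$$\begin{aligned} y &= \vec x \cdot \vec x = \sum_{i=1}^n x_i^2 \\ \frac{\partial y}{\partial x_i} &= 2x_i \quad \text{for all } i \\ \frac{\partial y}{\partial \vec x} &= (2x_1;2x_2;\dots;2x_n) = 2\vec x \end{aligned}$$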

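Example:
[Image: Vector Derivatives, figure 1]
Note that in the figure the vectors $x$ and $w$ are both written as $1 \times n$ row vectors rather than the usual $n \times 1$ column vectors, so the final result contains $x^T$ rather than $x$.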

Vector dependent variable, vector independent variable

When both the independent and the dependent variables are vectors, the derivative is a matrix, called the Jacobian matrix.
[Image: Vector Derivatives, figure 2]
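Since the image is not available here, the following is a sketch of the usual definition. It assumes an $n$-dimensional output $\vec z$ and an $m$-dimensional input $\vec w$, with row $i$ holding the partial derivatives of $z_i$; some texts use the transposed layout.

$$\frac{\partial \vec z}{\partial \vec w} = \begin{bmatrix} \frac{\partial z_1}{\partial w_1}&\frac{\partial z_1}{\partial w_2}&\dots&\frac{\partial z_1}{\partial w_m}\\ \frac{\partial z_2}{\partial w_1}&\frac{\partial z_2}{\partial w_2}&\dots&\frac{\partial z_2}{\partial w_m}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial z_n}{\partial w_1}&\frac{\partial z_n}{\partial w_2}&\dots&\frac{\partial z_n}{\partial w_m} \end{bmatrix}$$

Under this layout the Jacobian is an $n \times m$ matrix whose $(i,j)$ entry is $\frac{\partial z_i}{\partial w_j}$.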

In particular, if $X$ is an $n \times m$ matrix and $\vec w$ is an $m$-dimensional vector, then
$$\frac{\partial X\vec w}{\partial \vec w} = X$$
Proof:

$$X = \begin{bmatrix} x_{11}&x_{12}&\dots&x_{1m}\\ x_{21}&x_{22}&\dots&x_{2m}\\ \vdots&\vdots&\ddots&\vdots\\ x_{n1}&x_{n2}&\dots&x_{nm} \end{bmatrix}, \quad \vec w = \begin{bmatrix} w_{1}\\ w_2\\ \vdots\\ w_m \end{bmatrix}$$
Then,
$$\vec z=X\vec w=\begin{bmatrix} x_{11}w_1+x_{12}w_2+\dots+x_{1m}w_m\\ x_{21}w_1+x_{22}w_2+\dots+x_{2m}w_m\\ \vdots\\ x_{n1}w_1+x_{n2}w_2+\dots+x_{nm}w_m \end{bmatrix}=\begin{bmatrix} z_1\\ z_2\\ \vdots\\ z_n \end{bmatrix}$$

$$\begin{aligned} \frac{\partial X\vec w}{\partial \vec w} &= \frac{\partial \vec z}{\partial \vec w}\\ &=\begin{bmatrix} \frac{\partial z_1}{\partial w_1}&\frac{\partial z_1}{\partial w_2}&\dots&\frac{\partial z_1}{\partial w_m}\\ \frac{\partial z_2}{\partial w_1}&\frac{\partial z_2}{\partial w_2}&\dots&\frac{\partial z_2}{\partial w_m}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial z_n}{\partial w_1}&\frac{\partial z_n}{\partial w_2}&\dots&\frac{\partial z_n}{\partial w_m} \end{bmatrix}\\ &=\begin{bmatrix} x_{11}&x_{12}&\dots&x_{1m}\\ x_{21}&x_{22}&\dots&x_{2m}\\ \vdots&\vdots&\ddots&\vdots\\ x_{n1}&x_{n2}&\dots&x_{nm} \end{bmatrix}\\ &=X \end{aligned}$$
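The identity can also be checked numerically. The sketch below builds the Jacobian of $\vec z = X\vec w$ column by column with central differences and compares it with $X$ entry by entry. The sample `X` and `w`, and the helper names `matvec` and `numerical_jacobian`, are assumptions made for illustration.

```python
def matvec(X, w):
    """Compute z = X w, where X is a list of rows and w is a list of floats."""
    return [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) for row in X]


def numerical_jacobian(f, w, h=1e-6):
    """Approximate J[i][j] = d f_i / d w_j with central differences.
    Rows index the outputs, columns index the inputs."""
    n_out = len(f(w))
    J = [[0.0] * len(w) for _ in range(n_out)]
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += h
        w_minus[j] -= h
        f_plus = f(w_plus)
        f_minus = f(w_minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


# Arbitrary sample data: X is 3 x 2 (n = 3, m = 2), w has m = 2 entries
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
w = [0.5, -1.5]

J = numerical_jacobian(lambda v: matvec(X, v), w)
for row in J:
    print(row)  # each row should be close to the matching row of X
```

The Jacobian has the same $n \times m$ shape as $X$, and each entry $\frac{\partial z_i}{\partial w_j}$ picks out the single coefficient $x_{ij}$ from the sum defining $z_i$.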
Example:
[Image: Vector Derivatives, figure 3]
