AutoCV Lesson 9: ML Basics

Contents

  • Matrix Operations
    • Preface
    • 1. Matrix Multiplication and Differentiation
    • Summary

Matrix Operations

Preface

These are my personal study notes, for my own reference only, on the brand-new beginner-friendly from-scratch autonomous driving CV course launched by 手写AI (link).

This lesson covers the basics of matrix operations, with an eye toward using matrices to express multiple linear regression models at once.

The course outline can be seen in the mind map below.

[Figure: mind map of the course outline]

1. Matrix Multiplication and Differentiation

First, a quick review of the relevant matrix background.

Matrix multiplication is defined as:

$$\begin{bmatrix} a & b \\ d & e \end{bmatrix} \times \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} a \cdot 1 + b \cdot 2 & a \cdot 3 + b \cdot 4 \\ d \cdot 1 + e \cdot 2 & d \cdot 3 + e \cdot 4 \end{bmatrix}$$
Mnemonic: C[r][c] = multiply-accumulate of (row r of A, column c of B)

[Figure 1: matrix multiplication example]

Reference: https://www.cnblogs.com/ljy-endl/p/11411665.html
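To make the C[r][c] rule concrete, here is a small Python sketch of my own (not from the course) that implements matrix multiplication with explicit loops and checks the result against NumPy:

```python
# Naive matrix multiplication following the mnemonic:
# C[r][c] = multiply-accumulate of (row r of A, column c of B)
import numpy as np

def matmul(A, B):
    m, k = len(A), len(A[0])
    k2, n = len(B), len(B[0])
    assert k == k2, "inner dimensions must match"
    C = [[0] * n for _ in range(m)]
    for r in range(m):              # take row r of A
        for c in range(n):          # take column c of B
            for t in range(k):      # multiply-accumulate along the shared axis
                C[r][c] += A[r][t] * B[t][c]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))               # [[19, 22], [43, 50]]
print(np.array(A) @ np.array(B))  # NumPy agrees
```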

Matrix Differentiation

For $A \cdot B = C$, define $L$ to be a loss function of $C$.

Let $G = \dfrac{\partial L}{\partial C}$. (If instead we differentiate $C$ itself directly with respect to $A$, $G$ is defined as an all-ones matrix the size of $C$.) Then:
$$\dfrac{\partial L}{\partial A} = G \cdot B^T \qquad \dfrac{\partial L}{\partial B} = A^T \cdot G$$
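Before walking through the derivation, here is a quick numerical sanity check of both formulas. This is my own sketch using PyTorch autograd (the lesson does not prescribe a framework); taking $L = \sum_{ij} C_{ij}$ makes $G$ exactly the all-ones matrix mentioned above:

```python
# Verify dL/dA = G @ B^T and dL/dB = A^T @ G with autograd
import torch

A = torch.randn(2, 3, requires_grad=True)
B = torch.randn(3, 4, requires_grad=True)
C = A @ B

L = C.sum()        # L = sum of all entries, so G = dL/dC is all ones
L.backward()

G = torch.ones_like(C)
print(torch.allclose(A.grad, G @ B.T))  # True: dL/dA == G . B^T
print(torch.allclose(B.grad, A.T @ G))  # True: dL/dB == A^T . G
```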
Derivation of the matrix gradient (a symbolic check with SymPy follows the steps below):

  1. Consider the matrix product $A \cdot B = C$

  2. Consider the loss function $L = \sum^m_{i}\sum^n_{j}(C_{ij} - p)^2$

  3. Consider the derivative of $L$ with respect to each entry of $C$: $\triangledown C_{ij} = \frac{\partial L}{\partial C_{ij}}$

  4. When $A$, $B$, $C$ are all 2×2 matrices, define $G$ as the derivative of $L$ with respect to $C$:

$$A = \begin{bmatrix} a & b\\ c & d \end{bmatrix} \quad B = \begin{bmatrix} e & f \\ g & h \end{bmatrix} \quad C = \begin{bmatrix} i & j \\ k & l \end{bmatrix} \quad G = \frac{\partial L}{\partial C} = \begin{bmatrix} \frac{\partial L}{\partial i} & \frac{\partial L}{\partial j} \\ \frac{\partial L}{\partial k} & \frac{\partial L}{\partial l} \end{bmatrix} = \begin{bmatrix} w & x \\ y & z \end{bmatrix}$$

  5. Expand the left-hand side $A \cdot B$:

$$C = \begin{bmatrix} i = ae + bg & j = af + bh\\ k = ce + dg & l = cf + dh \end{bmatrix}$$

  6. The derivative of $L$ with respect to each entry of $A$ is $\triangledown A_{ij} = \frac{\partial L}{\partial A_{ij}}$. By the chain rule, each entry of $A$ influences $L$ only through the two entries of $C$ in the same row:

$$\begin{aligned} \frac{\partial L}{\partial a} &= \frac{\partial L}{\partial i} \cdot \frac{\partial i}{\partial a} + \frac{\partial L}{\partial j} \cdot \frac{\partial j}{\partial a} \\ \frac{\partial L}{\partial b} &= \frac{\partial L}{\partial i} \cdot \frac{\partial i}{\partial b} + \frac{\partial L}{\partial j} \cdot \frac{\partial j}{\partial b} \\ \frac{\partial L}{\partial c} &= \frac{\partial L}{\partial k} \cdot \frac{\partial k}{\partial c} + \frac{\partial L}{\partial l} \cdot \frac{\partial l}{\partial c} \\ \frac{\partial L}{\partial d} &= \frac{\partial L}{\partial k} \cdot \frac{\partial k}{\partial d} + \frac{\partial L}{\partial l} \cdot \frac{\partial l}{\partial d} \end{aligned}$$

Substituting the entries of $G$ and $B$:

$$\begin{aligned} \frac{\partial L}{\partial a} &= we + xf \\ \frac{\partial L}{\partial b} &= wg + xh \\ \frac{\partial L}{\partial c} &= ye + zf \\ \frac{\partial L}{\partial d} &= yg + zh \end{aligned}$$

  7. Therefore the derivative with respect to $A$ is

$$\frac{\partial L}{\partial A} = \begin{bmatrix} we + xf & wg + xh\\ ye + zf & yg + zh \end{bmatrix} = \begin{bmatrix} w & x\\ y & z \end{bmatrix} \begin{bmatrix} e & g\\ f & h \end{bmatrix}$$

$$\frac{\partial L}{\partial A} = G \cdot B^T$$

  8. Similarly, for the derivative with respect to $B$:
$$\begin{aligned} \frac{\partial L}{\partial e} &= wa + yc \\ \frac{\partial L}{\partial f} &= xa + zc \\ \frac{\partial L}{\partial g} &= wb + yd \\ \frac{\partial L}{\partial h} &= xb + zd \end{aligned}$$

$$\frac{\partial L}{\partial B} = \begin{bmatrix} wa + yc & xa + zc\\ wb + yd & xb + zd \end{bmatrix} = \begin{bmatrix} a & c\\ b & d \end{bmatrix} \begin{bmatrix} w & x\\ y & z \end{bmatrix}$$

$$\frac{\partial L}{\partial B} = A^T \cdot G$$
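As promised above, here is a short SymPy sketch (my own check, not from the course) that reproduces the 2×2 derivation symbolically. $L$ is written as a generic functional whose gradient with respect to $C$ is exactly $G$, mirroring step 4:

```python
# Symbolic check: build C = A*B, differentiate L entry by entry,
# and compare against the closed forms G * B^T and A^T * G.
import sympy as sp

a, b, c, d, e, f, g, h = sp.symbols('a b c d e f g h')
w, x, y, z = sp.symbols('w x y z')

A = sp.Matrix([[a, b], [c, d]])
B = sp.Matrix([[e, f], [g, h]])
G = sp.Matrix([[w, x], [y, z]])
C = A * B

# L with dL/dC = G: a linear functional with coefficients w, x, y, z
L = sum(G[i, j] * C[i, j] for i in range(2) for j in range(2))

dL_dA = sp.Matrix(2, 2, lambda i, j: sp.diff(L, A[i, j]))
dL_dB = sp.Matrix(2, 2, lambda i, j: sp.diff(L, B[i, j]))

print(sp.expand(dL_dA - G * B.T))  # zero matrix
print(sp.expand(dL_dB - A.T * G))  # zero matrix
```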

Summary

This lesson covered the basics of matrix differentiation. When we later implement multiple linear regression or logistic regression models, we can consider expressing them in matrix form, as in the closing sketch below.
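As a closing sketch of that idea (my own example, assuming plain NumPy), k independent linear regression models can share one weight matrix and be trained together with the gradient rule $\partial L/\partial W = X^T \cdot G$ derived above:

```python
# k linear regression models as a single matrix multiply: P = X @ W,
# trained by gradient descent with dL/dW = X^T @ G
import numpy as np

np.random.seed(0)
n, d, k = 100, 3, 5                 # samples, features, number of models
X = np.random.randn(n, d)
W_true = np.random.randn(d, k)
Y = X @ W_true                      # targets for all k models at once

W = np.zeros((d, k))                # one weight column per model
lr = 0.1
for _ in range(500):
    P = X @ W                       # predictions for all models, shape (n, k)
    G = 2 * (P - Y) / n             # dL/dP under mean squared error
    W -= lr * (X.T @ G)             # the A^T . G rule with A = X

print(np.allclose(W, W_true, atol=1e-4))  # True: all k models recovered
```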
