Deep Learning Basics - The Chain Rule
flyfish
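Composite functions
Function composition literally means combining functions.
Intuitively, several functions are chained together, with the output of one used as the input of the next.
In code: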
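#include <iostream>

// f adds 1 to its argument.
int f(int x) {
    return x + 1;
}
// g adds 2 to its argument.
int g(int x) {
    return x + 2;
}
int main() {
    int x = 3;
    // Composition: apply g first, then f. Prints 6.
    std::cout << f(g(x)) << std::endl;
    return 0;
}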
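The same idea can be sketched with a generic helper. The names `compose`, `add_one`, and `twice` are made up for this illustration and are not part of the original listing, and the sketch assumes a C++14 (or later) compiler. Two different functions are used so that the order of composition is visible.

```cpp
// Hypothetical helper: returns the function x -> outer(inner(x)),
// i.e. the composition "outer after inner".
template <typename F, typename G>
auto compose(F outer, G inner) {
    return [outer, inner](auto x) { return outer(inner(x)); };
}

int add_one(int x) { return x + 1; }  // plays the role of f
int twice(int x) { return 2 * x; }    // plays the role of g
```

With these definitions, `compose(add_one, twice)(3)` evaluates `add_one(twice(3))`, which is 7, while `compose(twice, add_one)(3)` evaluates `twice(add_one(3))`, which is 8. Composition is generally not commutative.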
The formal definition of a composite function is written in notation that is far removed from everyday language.
The derivative of the composite function $(f \circ g)(x)$ is:
$$(f \circ g)'(x) = f'(g(x))\, g'(x)$$
The circle in $f \circ g$ can be read in several ways:
f circle g
f round g
f about g
f composed with g
f after g
f following g
f of g
f on g
Other notations
$$F(x) = f(g(x)) \qquad F'(x) = f'(g(x))\, g'(x)$$
If $z = f(y)$ and $y = g(x)$, then
$$\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx} = f'(y)\, g'(x) = f'(g(x))\, g'(x)$$
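A quick numerical check can make the formula more concrete. The sketch below is only an illustration: the pair $f(y) = y^2$, $g(x) = \sin x$ and all function names are assumptions chosen for the example, and the comparison uses a central-difference approximation rather than an exact derivative.

```cpp
#include <cmath>

// Central-difference approximation of the derivative of fn at x.
template <typename Fn>
double numeric_derivative(Fn fn, double x, double h = 1e-5) {
    return (fn(x + h) - fn(x - h)) / (2.0 * h);
}

// Example pair chosen for illustration: f(y) = y^2, g(x) = sin(x).
double g(double x) { return std::sin(x); }
double g_prime(double x) { return std::cos(x); }
double f(double y) { return y * y; }
double f_prime(double y) { return 2.0 * y; }

// Chain rule: (f o g)'(x) = f'(g(x)) * g'(x).
double chain_rule_derivative(double x) {
    return f_prime(g(x)) * g_prime(x);
}

// Direct numerical derivative of the composite x -> f(g(x)).
double numeric_composite_derivative(double x) {
    return numeric_derivative([](double t) { return f(g(t)); }, x);
}
```

For any `x`, `chain_rule_derivative(x)` and `numeric_composite_derivative(x)` should agree up to the error of the finite-difference approximation.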
How to decompose
$$\begin{array}{ll} y = \sin 2x & y = \sin u,\ u = 2x \\ y = \ln \sin x & y = \ln u,\ u = \sin x \\ y = \ln \cos e^{x} & y = \ln u,\ u = \cos v,\ v = e^{x} \end{array}$$
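As a sketch of where these decompositions lead, multiplying the derivatives of the individual layers gives the derivative of each composite. The results below are worked out here for illustration and are not part of the original list.

```latex
\begin{aligned}
y = \sin 2x:        &\quad \frac{dy}{dx} = \cos u \cdot 2 = 2\cos 2x \\
y = \ln \sin x:     &\quad \frac{dy}{dx} = \frac{1}{u} \cdot \cos x = \frac{\cos x}{\sin x} = \cot x \\
y = \ln \cos e^{x}: &\quad \frac{dy}{dx} = \frac{1}{u} \cdot (-\sin v) \cdot e^{x} = -e^{x} \tan e^{x}
\end{aligned}
```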
How to differentiate
Examples
Find the derivative of $f(g(x)) = (3x+1)^2$.
Solution:
$$f(g) = g^2, \quad g(x) = 3x + 1$$
$$f'(g) = 2g, \quad g'(x) = 3$$
$$\big(f(g(x))\big)' = 2(3x+1) \cdot 3 = 18x + 6$$
Find the derivative of $f(g(x)) = \sin(x^2 + 2)$.
Solution:
$$f(g) = \sin(g), \quad g(x) = x^2 + 2$$
$$f'(g) = \cos(g), \quad g'(x) = 2x$$
$$\big(f(g(x))\big)' = \cos(x^2 + 2) \cdot 2x = 2x \cos(x^2 + 2)$$
Find the derivative of $H(x) = (2x+1)^3$.
Here $H(x) = g(f(x))$ with $f(x) = 2x + 1$ and $g(x) = x^3$.
$$f'(x) = 2$$
$$g'(x) = 3x^2$$
$$\begin{aligned} H'(x) &= g'(f(x))\, f'(x) \\ &= g'(2x+1) \cdot 2 \\ &= 3(2x+1)^2 \cdot 2 = 6(2x+1)^2 \end{aligned}$$
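The three results above can be sanity-checked numerically. The following is a minimal sketch, assuming a central-difference approximation is accurate enough for this purpose; the function names are made up for the illustration.

```cpp
#include <cmath>

// Central-difference approximation of the derivative of fn at x.
template <typename Fn>
double central_difference(Fn fn, double x, double h = 1e-5) {
    return (fn(x + h) - fn(x - h)) / (2.0 * h);
}

// The three composite functions from the examples.
double example1(double x) { return (3.0 * x + 1.0) * (3.0 * x + 1.0); }
double example2(double x) { return std::sin(x * x + 2.0); }
double example3(double x) { return std::pow(2.0 * x + 1.0, 3); }

// Their derivatives as obtained with the chain rule.
double example1_derivative(double x) { return 18.0 * x + 6.0; }
double example2_derivative(double x) { return 2.0 * x * std::cos(x * x + 2.0); }
double example3_derivative(double x) { return 6.0 * (2.0 * x + 1.0) * (2.0 * x + 1.0); }

// Largest gap between the closed forms and the numerical estimates at x.
double max_error_at(double x) {
    double e1 = std::fabs(example1_derivative(x) - central_difference(example1, x));
    double e2 = std::fabs(example2_derivative(x) - central_difference(example2, x));
    double e3 = std::fabs(example3_derivative(x) - central_difference(example3, x));
    return std::fmax(e1, std::fmax(e2, e3));
}
```

Calling `max_error_at` at a few sample points should return values close to zero if the closed-form derivatives are right.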
Composition of three functions
$$\begin{aligned} (f \circ g \circ h)'(a) &= f'((g \circ h)(a)) \cdot (g \circ h)'(a) \\ &= f'((g \circ h)(a)) \cdot g'(h(a)) \cdot h'(a) \\ &= (f' \circ g \circ h)(a) \cdot (g' \circ h)(a) \cdot h'(a) \end{aligned}$$
In shorthand, with $y = f(u)$, $u = g(v)$, $v = h(x)$:
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx}$$
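The three-factor product can be sketched in code for $y = \ln \cos e^{x}$, decomposed as $v = e^x$, $u = \cos v$, $y = \ln u$. This is only an illustration: `ChainResult` and `ln_cos_exp` are hypothetical names, and the function is meaningful only where $\cos e^{x} > 0$ (for example at $x = 0$).

```cpp
#include <cmath>

// Value and derivative of y = ln(cos(e^x)),
// decomposed as v = e^x, u = cos(v), y = ln(u).
struct ChainResult {
    double y;      // function value
    double dy_dx;  // derivative obtained with the chain rule
};

ChainResult ln_cos_exp(double x) {
    // Forward pass: evaluate from the innermost function outward.
    double v = std::exp(x);
    double u = std::cos(v);
    double y = std::log(u);

    // Local derivative of each step.
    double dv_dx = std::exp(x);   // d(e^x)/dx
    double du_dv = -std::sin(v);  // d(cos v)/dv
    double dy_du = 1.0 / u;       // d(ln u)/du

    // Chain rule: multiply the local derivatives.
    return {y, dy_du * du_dv * dv_dx};
}
```

The structure, a forward pass that stores intermediate values followed by a product of local derivatives, is the same pattern that backpropagation applies to the layers of a neural network.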
The chain rule is easier to prove using nonstandard calculus.
Let $y = f(x)$ and $x = g(t)$, and let $\Delta t \neq 0$ be infinitesimal. Then
$$\Delta x = g(t + \Delta t) - g(t), \qquad \Delta y = f(x + \Delta x) - f(x),$$
so, when $\Delta x \neq 0$,
$$\frac{\Delta y}{\Delta t} = \frac{\Delta y}{\Delta x} \cdot \frac{\Delta x}{\Delta t}.$$
Taking the standard part of both sides gives
$$\frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt}.$$
Nonstandard calculus is as simple as adding one more function call at the end of a computer program.
Here st denotes the standard part function (some books write std), and $h$ is a nonzero infinitesimal:
$$f'(x) = \operatorname{st}\left(\frac{f(x+h) - f(x)}{h}\right)$$