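The previous post gave a rough account of the necessary conditions for a functional to attain an extremum in the various settings of optimal control. The "functional" there was fairly abstract, though, so this post makes the problem concrete: how to solve an optimal control problem with the calculus of variations. That is, minimize
$$
J=\varphi\left[x(t_1),t_1\right]+\int_{t_0}^{t_1}L(x,u,t)\,dt
$$
where $\varphi(\cdot)$ is the terminal-state penalty, $u$ is the control input, and the system must satisfy the differential-equation constraint $\dot{x}=f(x,u,t)$ at all times.
As in the previous post, the cases are treated one by one. Note: in every optimal control problem the initial time and the initial state are known. This is both reasonable and necessary (if you don't know when or where you start, what's the point...).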
The mathematical description of this class of problems is:
$$
\begin{aligned}
u^* &= \arg\min_{u}\ \varphi\left[x(t_1)\right]+\int_{t_0}^{t_1}L\left(x,u,t\right)dt\\
&\quad \text{s.t.}\quad \dot{x}=f(x,u,t),\quad x(t_0)=x_0,\quad \psi\left[x(t_1)\right]=0
\end{aligned}
\tag{1}
$$
This is in fact an extremum problem for a functional with equality constraints. It can be solved by introducing the Lagrange multipliers $\gamma$ and $\lambda(t)$, constructing an augmented functional $J_a$, and defining the Hamiltonian.
Construct the augmented functional as follows (writing $x_1=x(t_1)$):
$$
J_a = \varphi\left(x_1\right)+\gamma\psi(x_1)+\int_{t_0}^{t_1}\left\{L\left(x,u,t\right)+\lambda\left[f(x,u,t)-\dot{x}\right]\right\}dt
\tag{2}
$$
Define the Hamiltonian function as follows:
$$
H(x,u,\lambda,t)=L(x,u,t)+\lambda f(x,u,t)
\tag{3}
$$
Substituting the Hamiltonian into (2) and integrating by parts once gives:
$$
\begin{aligned}
J_a &= \varphi\left(x_1\right)+\gamma\psi(x_1)+\int_{t_0}^{t_1}\left[H(x,u,\lambda,t)-\lambda\dot{x}\right]dt\\
&=\varphi\left(x_1\right)+\gamma\psi(x_1)-\left.\lambda x\right|_{t_0}^{t_1}+\int_{t_0}^{t_1}\left[H(x,u,\lambda,t)+\dot{\lambda}x\right]dt
\end{aligned}
\tag{4}
$$
Next we need the variation of $J_a$. Note that only the variations $\delta x$ and $\delta u$ enter; the multipliers $\lambda$ and $\gamma$ are not varied. I'll derive it once here, and the remaining cases are all similar.
$$
\begin{aligned}
J_a(x+\delta x,u+\delta u) &= \varphi(x_1+\delta x_1)+\gamma\psi(x_1+\delta x_1)-\left.\lambda (x+\delta x)\right|_{t_0}^{t_1}\\
&\quad+\int_{t_0}^{t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)+\dot{\lambda}(x+\delta x)\right]dt
\end{aligned}
\tag{5}
$$
This is too long to fit on one line, so denote the first line of (5) by $J_{a_1}(x+\delta x,u+\delta u)=\tilde{J}_{a_1}$ and the second line by $J_{a_2}(x+\delta x,u+\delta u)=\tilde{J}_{a_2}$. Keeping first-order terms, and using $\delta x(t_0)=0$ because the initial state is fixed, we have
$$
\begin{aligned}
\tilde{J}_{a_1} &= \varphi(x_1+\delta x_1)+\gamma\psi(x_1+\delta x_1)-\left.\lambda x\right|_{t_0}^{t_1}-\lambda(t_1)\delta x_1\\
&=\varphi(x_1)+\left.\frac{\partial\varphi}{\partial x}\right|_{x=x_1}\delta x_1+\gamma\psi(x_1)+\gamma\left.\frac{\partial\psi}{\partial x}\right|_{x=x_1}\delta x_1-\left.\lambda x\right|_{t_0}^{t_1}-\lambda(t_1)\delta x_1
\end{aligned}
\tag{6}
$$
$$
\begin{aligned}
\tilde{J}_{a_2} &= \int_{t_0}^{t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)+\dot{\lambda}(x+\delta x)\right]dt\\
&=\int_{t_0}^{t_1}\left[H(x,u,\lambda,t)+\dot{\lambda}x\right]dt+\int_{t_0}^{t_1}\left[\frac{\partial H}{\partial x}\delta x+\frac{\partial H}{\partial u}\delta u +\dot{\lambda}\delta x\right]dt
\end{aligned}
\tag{7}
$$
The variation of $J_a$ is then:
$$
\begin{aligned}
\delta J_a&=\tilde{J}_{a_1}+\tilde{J}_{a_2}-J_a\\
&=\left[\frac{\partial\varphi}{\partial x} + \gamma\frac{\partial\psi}{\partial x} - \lambda\right]_{t=t_1,\,x=x_1}\delta x_1\\
&\quad+\int_{t_0}^{t_1}\left\{\left[\frac{\partial H}{\partial x}+\dot{\lambda}\right]\delta x+\frac{\partial H}{\partial u}\delta u\right\}dt
\end{aligned}
\tag{8}
$$
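As a sanity check on the first-variation formula, here is a minimal numerical sketch. Everything in it is an assumption made for illustration and does not come from the derivation itself: the scalar system $\dot{x}=-x+u$, $x(0)=1$, the cost $J=\tfrac12 x(1)^2+\int_0^1 \tfrac12 u^2\,dt$, the nominal control $\sin t$, and the perturbation direction $\cos 3t$. To keep it short the terminal constraint $\psi$ is dropped (so $\gamma$ plays no role). If $\lambda$ is chosen so that the $\delta x$ and $\delta x_1$ coefficients vanish, i.e. $\dot{\lambda}=-\partial H/\partial x=\lambda$ and $\lambda(1)=x(1)$, then only $\int \frac{\partial H}{\partial u}\delta u\,dt$ with $\partial H/\partial u=u+\lambda$ should remain, and that can be compared with a finite difference of $J$.

```python
import math

N = 1000          # number of time steps (illustrative choice)
T = 1.0           # fixed terminal time t1
DT = T / N

def u_nom(t):
    """Nominal control (arbitrary choice for the check)."""
    return math.sin(t)

def v_dir(t):
    """Direction of the control perturbation, delta u = eps * v_dir."""
    return math.cos(3.0 * t)

def cost(u):
    """Return (J, x(T)) for x' = -x + u, x(0) = 1, using Heun steps and the trapezoid rule."""
    x, running = 1.0, 0.0
    for k in range(N):
        ta, tb = k * DT, (k + 1) * DT
        fa = -x + u(ta)
        x_pred = x + DT * fa
        fb = -x_pred + u(tb)
        x += 0.5 * DT * (fa + fb)
        running += 0.5 * DT * (0.5 * u(ta) ** 2 + 0.5 * u(tb) ** 2)
    return 0.5 * x * x + running, x

def first_variation_formula():
    """Integral of dH/du * v, with dH/du = u + lambda and lambda(t) = x(T) * exp(t - T)."""
    _, x_T = cost(u_nom)
    total = 0.0
    for k in range(N):
        ta, tb = k * DT, (k + 1) * DT
        ga = (u_nom(ta) + x_T * math.exp(ta - T)) * v_dir(ta)
        gb = (u_nom(tb) + x_T * math.exp(tb - T)) * v_dir(tb)
        total += 0.5 * DT * (ga + gb)
    return total

def finite_difference(eps=1e-4):
    """Central difference of J along the direction v_dir."""
    j_plus, _ = cost(lambda t: u_nom(t) + eps * v_dir(t))
    j_minus, _ = cost(lambda t: u_nom(t) - eps * v_dir(t))
    return (j_plus - j_minus) / (2.0 * eps)

if __name__ == "__main__":
    print(first_variation_formula(), finite_difference())
```

The two printed numbers should agree up to discretization error, which is what the variational formula predicts for this toy problem.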
Naturally, for (8) to vanish identically for arbitrary $\delta x$, $\delta u$ and $\delta x_1$, the following necessary conditions must hold:
$$
\frac{\partial H\left(x^*,u^*,\lambda^*,t\right)}{\partial x}+\dot{\lambda}^*(t)=0
\tag{9}
$$
$$
\frac{\partial H\left(x^*,u^*,\lambda^*,t\right)}{\partial u}=0
\tag{10}
$$
$$
\left[\frac{\partial\varphi\left[x^*(t)\right]}{\partial x} + \gamma^*\frac{\partial\psi\left[x^*(t)\right]}{\partial x} - \lambda^*(t)\right]_{t=t_1}=0
\tag{11}
$$
Also, from the definition of the Hamiltonian in (3), we have
$$
\dot{x}^*=f\left(x^*,u^*,t\right)=\frac{\partial H\left(x^*,u^*,\lambda^*,t\right)}{\partial \lambda}
\tag{12}
$$
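To see how these conditions turn into a two-point boundary value problem, here is a small shooting sketch on a toy problem. The problem data are assumptions chosen for illustration: $\dot{x}=-x+u$, $x(0)=0$, $L=\tfrac12u^2$, $\varphi=0$, and the terminal constraint $\psi=x(1)-1=0$. With $H=\tfrac12u^2+\lambda(-x+u)$, stationarity gives $u=-\lambda$, the costate equation gives $\dot{\lambda}=\lambda$, and the terminal condition gives $\gamma=\lambda(t_1)$. The unknown is $\lambda(t_0)$, which is adjusted until $x(1)=1$; for this example the closed form is $\lambda(t_0)=-1/\sinh(1)$.

```python
import math

def simulate(lam0, n=1000, t1=1.0):
    """Integrate state and costate forward with RK4, eliminating u via dH/du = 0."""
    h = t1 / n
    x, lam = 0.0, lam0          # x(t0) = 0 is given, lambda(t0) is the guess

    def rhs(x, lam):
        u = -lam                # stationarity: dH/du = u + lambda = 0
        return -x + u, lam      # x' = dH/dlambda, lambda' = -dH/dx = lambda

    for _ in range(n):
        k1x, k1l = rhs(x, lam)
        k2x, k2l = rhs(x + 0.5 * h * k1x, lam + 0.5 * h * k1l)
        k3x, k3l = rhs(x + 0.5 * h * k2x, lam + 0.5 * h * k2l)
        k4x, k4l = rhs(x + h * k3x, lam + h * k3l)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        lam += h * (k1l + 2 * k2l + 2 * k3l + k4l) / 6.0
    return x, lam               # x(t1), lambda(t1)

def shoot(target=1.0):
    """Find lambda(t0) with x(t1) = target.

    The dynamics are linear, so x(t1) is affine in lambda(t0) and a single
    secant step through two trial values is enough.
    """
    xa, _ = simulate(0.0)
    xb, _ = simulate(1.0)
    return (target - xa) / (xb - xa)

if __name__ == "__main__":
    lam0 = shoot()
    x1, lam1 = simulate(lam0)
    gamma = lam1                # terminal condition with phi = 0, dpsi/dx = 1
    print(lam0, x1, gamma)
```

For nonlinear dynamics the single secant step would have to be replaced by an iterative root finder, but the structure (guess the initial costate, integrate, match the terminal condition) stays the same.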
Never mind, I'll write it out once anyway, so that I can look it up if I forget it later. Compared with the fixed-terminal-time case, the free-terminal-time case has one extra variation: that of the terminal time, $\delta t_1$.
$$
\begin{aligned}
J_a(x+\delta x,u+\delta u,t_1+\delta t_1)&=\varphi(x_1+\delta x_1,t_1+\delta t_1)+\gamma\psi(x_1+\delta x_1,t_1+\delta t_1)\\
&\quad+\int_{t_0}^{t_1+\delta t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)-\lambda(\dot{x}+\delta \dot{x})\right]dt
\end{aligned}
\tag{19}
$$
As before, denote the first line of (19) by $\tilde{J}_{a_1}$ and the second line by $\tilde{J}_{a_2}$. Subtracting the unperturbed terms and keeping first-order terms gives
$$
\begin{aligned}
\delta\tilde{J}_{a_1} &= \varphi(x_1+\delta x_1,t_1+\delta t_1)+\gamma\psi(x_1+\delta x_1,t_1+\delta t_1)-\varphi(x_1,t_1)-\gamma\psi(x_1,t_1) \\
&= \left.\frac{\partial\varphi}{\partial x}\right|_{x=x_1}\delta x_1+\frac{\partial\varphi}{\partial t_1}\delta t_1+\gamma\left.\frac{\partial\psi}{\partial x}\right|_{x=x_1}\delta x_1+\gamma\frac{\partial\psi}{\partial t_1}\delta t_1
\end{aligned}
\tag{20}
$$
Similarly,
$$
\begin{aligned}
\delta\tilde{J}_{a_2} &= \int_{t_0}^{t_1+\delta t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)-\lambda(\dot{x}+\delta \dot{x})\right]dt-\int_{t_0}^{t_1}\left[H(x,u,\lambda,t)-\lambda\dot{x}\right]dt\\
&= \int_{t_0}^{t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)-\lambda(\dot{x}+\delta \dot{x})-H(x,u,\lambda,t)+\lambda\dot{x}\right]dt\\
&\quad+\int_{t_1}^{t_1+\delta t_1}\left[H(x+\delta x,u+\delta u,\lambda,t)-\lambda(\dot{x}+\delta \dot{x})\right]dt\\
&=\int_{t_0}^{t_1}\left[\frac{\partial H}{\partial x}\delta x+\frac{\partial H}{\partial u}\delta u-\lambda\delta \dot{x}\right]dt+\left[H(x_1+\theta\delta x_1,u+\theta\delta u,\lambda,t_1)-\lambda\dot{x}_1\right]\delta t_1\\
&=-\lambda(t_1)\delta x(t_1)+\int_{t_0}^{t_1}\left[\left(\frac{\partial H}{\partial x}+\dot{\lambda}\right)\delta x+\frac{\partial H}{\partial u}\delta u\right]dt+\left[H(x_1,u,\lambda,t_1)-\lambda\dot{x}_1\right]\delta t_1
\end{aligned}
\tag{21}
$$
Here it is worth recalling the approximation that goes with the second figure of the previous post:
$$
\delta x(t_1)=\delta x_1-\dot{x}(t_1)\cdot\delta t_1
$$
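A quick numerical illustration of this first-order relation may help. The trajectory $x(t)=t^2$, the perturbation $\varepsilon\sin t$, and the step sizes below are arbitrary choices made only for this sketch. Here $\delta x(t_1)$ is the difference between the perturbed and nominal trajectories at the old terminal time, while $\delta x_1$ is the change of the terminal state itself, evaluated at the shifted terminal time.

```python
import math

def x_nominal(t):
    return t * t

def x_dot_nominal(t):
    return 2.0 * t

def x_perturbed(t, eps):
    # nominal trajectory plus a small perturbation eps * sin(t)
    return x_nominal(t) + eps * math.sin(t)

def endpoint_variations(t1=1.0, eps=1e-3, dt1=1e-3):
    """Return (delta x(t1), delta x_1 - x_dot(t1) * delta t1)."""
    dx_at_t1 = x_perturbed(t1, eps) - x_nominal(t1)   # variation at the fixed time t1
    dx1 = x_perturbed(t1 + dt1, eps) - x_nominal(t1)  # variation of the terminal state
    return dx_at_t1, dx1 - x_dot_nominal(t1) * dt1

if __name__ == "__main__":
    exact, approx = endpoint_variations()
    print(exact, approx, abs(exact - approx))
```

The gap between the two values is second order in the perturbation sizes, so halving both `eps` and `dt1` should shrink it by roughly a factor of four.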
Substituting this approximation for $\delta x(t_1)$ into (21), we then get
$$
\begin{aligned}
\delta J_a &= \delta\tilde{J}_{a_1}+\delta\tilde{J}_{a_2}\\
&= \left.\frac{\partial\varphi}{\partial x}\right|_{x=x_1}\delta x_1+\frac{\partial\varphi}{\partial t_1}\delta t_1+\gamma\left.\frac{\partial\psi}{\partial x}\right|_{x=x_1}\delta x_1+\gamma\frac{\partial\psi}{\partial t_1}\delta t_1\\
&\quad-\lambda(t_1)\left[\delta x_1-\dot{x}_1\delta t_1\right]+\int_{t_0}^{t_1}\left[\left(\frac{\partial H}{\partial x}+\dot{\lambda}\right)\delta x+\frac{\partial H}{\partial u}\delta u\right]dt+\left[H(x_1,u,\lambda,t_1)-\lambda\dot{x}_1\right]\delta t_1\\
&=\left[\frac{\partial\varphi(x_1)}{\partial x_1}+\gamma\frac{\partial\psi(x_1)}{\partial x_1}-\lambda(t_1)\right]\delta x_1+\left[\frac{\partial\varphi(t_1)}{\partial t_1}+\gamma\frac{\partial\psi(t_1)}{\partial t_1}+H(t_1)\right]\delta t_1\\
&\quad+\int_{t_0}^{t_1}\left[\left(\frac{\partial H}{\partial x}+\dot{\lambda}\right)\delta x+\frac{\partial H}{\partial u}\delta u\right]dt
\end{aligned}
\tag{22}
$$
As before, for the control to be optimal, both parts of the variation must vanish identically. They are discussed separately below.
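As a small illustration of the extra condition that comes from the $\delta t_1$ term, consider a toy free-terminal-time problem. All of its data are assumptions made for this sketch: $\dot{x}=u$, $x(0)=0$, $L=\tfrac12u^2$, $\varphi=t_1$, and $\psi=x(t_1)-1=0$. Here $H=\tfrac12u^2+\lambda u$, so $u=-\lambda$ and $\dot{\lambda}=0$, i.e. the control is constant and must equal $1/t_1$ to reach $x(t_1)=1$. The cost then becomes $J(t_1)=t_1+\frac{1}{2t_1}$, and requiring the coefficient of $\delta t_1$ to vanish gives $1+H(t_1)=1-\tfrac12u^2=0$, i.e. $t_1=1/\sqrt{2}$. The code below minimizes $J(t_1)$ directly and checks that the same $t_1$ zeroes that coefficient.

```python
import math

def cost_given_t1(t1):
    """Cost of the constant control u = 1/t1 that steers x(0)=0 to x(t1)=1."""
    u = 1.0 / t1
    return t1 + 0.5 * u * u * t1

def golden_section_min(f, a, b, tol=1e-10):
    """Minimize a unimodal function f on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d               # minimum lies in [a, d]
        else:
            a = c               # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

def dt1_coefficient(t1):
    """d(phi)/d(t1) + gamma * d(psi)/d(t1) + H(t1) for this toy problem."""
    u = 1.0 / t1
    lam = -u                        # from dH/du = u + lambda = 0
    hamiltonian = 0.5 * u * u + lam * u
    return 1.0 + 0.0 + hamiltonian  # psi does not depend on t1 explicitly

if __name__ == "__main__":
    t1_star = golden_section_min(cost_given_t1, 0.1, 5.0)
    print(t1_star, dt1_coefficient(t1_star))
```

The direct minimization and the variational condition pick out the same terminal time, which is the point of the $\delta t_1$ term.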
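At this point, this post has analyzed the necessary conditions for optimal control in the following six cases:
Together with the previous post, this essentially wraps up the topic of deriving the necessary conditions for optimal control with the calculus of variations.
If any of the derivations are wrong, corrections and criticism are welcome.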