Source: CMU 16-745 Study Notes, taught by Prof. Zac Manchester
Lecture 5 Optimization Part 3
$$\min_{x(t),\, u(t)} \; J(x(t), u(t)) = \int_{t_0}^{t_f} \ell(x(t), u(t)) \, dt + \ell_F(x(t_f)) \quad \text{s.t.} \quad \dot{x}(t) = f(x(t), u(t))$$
This is an infinite-dimensional optimization problem in the following sense:
$$u(t) = \lim_{N \to \infty} u_{1:N}$$
Solutions are open-loop trajectories.
We will focus on the discrete-time setting, which leads to tractable algorithms.
$$\min_{x_{1:N},\, u_{1:N-1}} \; J(x_{1:N}, u_{1:N-1}) = \sum_{k=1}^{N-1} \ell(x_k, u_k) + \ell_F(x_N) \quad \text{s.t.} \quad x_{k+1} = f(x_k, u_k)$$
This is a finite-dimensional optimization problem.
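To make the discrete-time objective concrete, here is a minimal sketch that rolls out the dynamics and evaluates $J$ for a given control sequence. The scalar dynamics, the quadratic costs, and the helper names `rollout` and `total_cost` are assumptions chosen for illustration; they are not from the lecture.

```python
def rollout(x1, us, f):
    """Simulate x_{k+1} = f(x_k, u_k) from x1; returns the states x_1..x_N."""
    xs = [x1]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs


def total_cost(xs, us, stage_cost, terminal_cost):
    """J = sum_{k=1}^{N-1} l(x_k, u_k) + l_F(x_N)."""
    running = sum(stage_cost(x, u) for x, u in zip(xs[:-1], us))
    return running + terminal_cost(xs[-1])


# Hypothetical scalar problem (illustration only).
f = lambda x, u: x + 0.1 * u                   # dynamics
stage_cost = lambda x, u: 0.5 * (x**2 + u**2)  # l(x, u)
terminal_cost = lambda x: 0.5 * x**2           # l_F(x)

us = [0.0, 0.0, 0.0]      # N - 1 = 3 controls
xs = rollout(1.0, us, f)  # N = 4 states
J = total_cost(xs, us, stage_cost, terminal_cost)
```

The decision variables are just the finite lists `xs` and `us`, which is what makes the problem finite-dimensional.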
Pontryagin's minimum principle (also known as the maximum principle if you maximize a reward) gives the first-order necessary conditions for deterministic optimal control problems.
We can form the Lagrangian:
$$\mathcal{L} = \sum_{k=1}^{N-1} \left[ \ell(x_k, u_k) + \lambda_{k+1}^\top \left( f(x_k, u_k) - x_{k+1} \right) \right] + \ell_F(x_N)$$
The Hamiltonian:
$$\mathcal{H}(x, u, \lambda) = \ell(x, u) + \lambda^\top f(x, u)$$
Plugging into $\mathcal{L}$:
$$\mathcal{L} = \mathcal{H}(x_1, u_1, \lambda_2) + \sum_{k=2}^{N-1} \left[ \mathcal{H}(x_k, u_k, \lambda_{k+1}) - \lambda_k^\top x_k \right] + \ell_F(x_N) - \lambda_N^\top x_N$$
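The only non-obvious step in this rewrite is an index shift in the constraint term. Spelled out, using only the definitions of the Lagrangian and Hamiltonian given above:

$$\sum_{k=1}^{N-1} \lambda_{k+1}^\top x_{k+1} = \sum_{k=2}^{N} \lambda_k^\top x_k = \sum_{k=2}^{N-1} \lambda_k^\top x_k + \lambda_N^\top x_N$$

So each $x_k$ with $2 \le k \le N-1$ appears in exactly one Hamiltonian term and one $-\lambda_k^\top x_k$ term, $x_N$ appears only in $\ell_F(x_N) - \lambda_N^\top x_N$, and $x_1$ appears only inside $\mathcal{H}(x_1, u_1, \lambda_2)$, which is why that term is written separately.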
Taking derivatives with respect to $x_k$ and $\lambda_k$:
For $\lambda_k$:
$$\frac{\partial \mathcal{L}}{\partial \lambda_k} = \frac{\partial \mathcal{H}}{\partial \lambda_k} - x_k = f(x_{k-1}, u_{k-1}) - x_k = 0$$
For $x_k$:
$$\frac{\partial \mathcal{L}}{\partial x_k} = \frac{\partial \mathcal{H}}{\partial x_k} - \lambda_k^\top = \frac{\partial \ell}{\partial x_k} + \lambda_{k+1}^\top \frac{\partial f}{\partial x_k} - \lambda_k^\top = 0$$
For the $N$-th state:
$$\frac{\partial \mathcal{L}}{\partial x_N} = \frac{\partial \ell_F}{\partial x_N} - \lambda_N^\top = 0$$
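The two state conditions define a backward recursion for the multipliers: start from $\lambda_N$ and step back to earlier $k$. By the chain rule, with the controls held fixed, the resulting $\lambda_k$ is the derivative of the remaining cost (stage costs from step $k$ onward plus the terminal cost) with respect to $x_k$. The sketch below checks this numerically. The scalar linear dynamics, the quadratic costs, and all names (`a`, `b`, `q`, `r`, `qf`, `tail_cost`, `costates`) are assumptions for illustration.

```python
# Hypothetical scalar problem: x_{k+1} = a*x_k + b*u_k,
# l(x, u) = 0.5*(q*x^2 + r*u^2), l_F(x) = 0.5*qf*x^2.
a, b, q, r, qf = 1.0, 0.1, 1.0, 1.0, 1.0


def rollout(x_start, us):
    """States generated by x_{k+1} = a*x_k + b*u_k, starting at x_start."""
    xs = [x_start]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs


def tail_cost(x_start, us):
    """Stage costs along the rollout from x_start plus the terminal cost."""
    xs = rollout(x_start, us)
    stage = sum(0.5 * (q * x * x + r * u * u) for x, u in zip(xs[:-1], us))
    return stage + 0.5 * qf * xs[-1] ** 2


def costates(xs):
    """Backward pass: lam_N = qf*x_N, then lam_k = q*x_k + a*lam_{k+1}."""
    lam = [0.0] * len(xs)
    lam[-1] = qf * xs[-1]                  # terminal condition
    for i in range(len(xs) - 2, -1, -1):   # 0-based index i = N-2, ..., 0
        lam[i] = q * xs[i] + a * lam[i + 1]
    return lam


us = [0.3, -0.2, 0.5, 0.1]  # arbitrary (non-optimal) controls, N = 5
xs = rollout(1.0, us)
lam = costates(xs)

# Central finite difference of the tail cost with respect to each x_k.
eps = 1e-6
fd = [
    (tail_cost(xs[i] + eps, us[i:]) - tail_cost(xs[i] - eps, us[i:])) / (2 * eps)
    for i in range(len(xs))
]
max_err = max(abs(l - g) for l, g in zip(lam, fd))
```

Note that `lam[0]` is computed by the same recursion for convenience, even though $\lambda_1$ is not a multiplier in the Lagrangian above.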
For $u_k$:
$$u_k = \arg\min_{\tilde{u}} \mathcal{H}(x_k, \tilde{u}, \lambda_{k+1}) \quad \text{s.t.} \quad \tilde{u} \in U$$
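As a worked special case (the assumptions here are for illustration and are not part of the notes): suppose the controls are unconstrained, $U = \mathbb{R}^m$, the stage cost is $\ell(x, u) = \ell_x(x) + \tfrac{1}{2} u^\top R u$ with $R \succ 0$, and the dynamics are control-affine, $f(x, u) = f_0(x) + B(x) u$. The symbols $\ell_x$, $R$, $f_0$, and $B$ are hypothetical. Then $\mathcal{H}$ is a strictly convex quadratic in $u$, so the minimizer is the stationary point:

$$\nabla_u \mathcal{H}(x_k, u_k, \lambda_{k+1}) = R u_k + B(x_k)^\top \lambda_{k+1} = 0 \quad \Longrightarrow \quad u_k = -R^{-1} B(x_k)^\top \lambda_{k+1}$$

When $U$ is a proper subset (for example, torque limits), the stationarity condition no longer applies directly, which is why the condition is stated as a constrained $\arg\min$ rather than as a derivative set to zero.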
Summary
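Collecting the conditions derived above, and transposing the costate equations into column (gradient) form, gives the following sketch of the discrete-time conditions:

$$x_{k+1} = \nabla_\lambda \mathcal{H}(x_k, u_k, \lambda_{k+1}) = f(x_k, u_k)$$

$$\lambda_k = \nabla_x \mathcal{H}(x_k, u_k, \lambda_{k+1}) = \nabla_x \ell(x_k, u_k) + \left( \frac{\partial f}{\partial x_k} \right)^\top \lambda_{k+1}$$

$$u_k = \arg\min_{\tilde{u}} \mathcal{H}(x_k, \tilde{u}, \lambda_{k+1}) \quad \text{s.t.} \quad \tilde{u} \in U$$

$$\lambda_N = \nabla_x \ell_F(x_N)$$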
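This forward-backward structure (states propagate forward from $x_1$, multipliers propagate backward from $\lambda_N$) is where the shooting method comes from.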
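A minimal sketch of one such shooting scheme follows. It repeats a forward rollout, a backward multiplier pass, and a control update. To keep it short, it takes a plain gradient step on the controls using $\partial \mathcal{H} / \partial u$ instead of minimizing $\mathcal{H}$ exactly, and it assumes unconstrained controls. The scalar problem, the step size, the iteration count, and all function names are assumptions for illustration, not the lecture's implementation.

```python
# Hypothetical scalar problem: x_{k+1} = a*x_k + b*u_k,
# l(x, u) = 0.5*(q*x^2 + r*u^2), l_F(x) = 0.5*qf*x^2, no control limits.
a, b, q, r, qf = 1.0, 0.1, 1.0, 1.0, 1.0


def rollout(x1, us):
    """Forward pass: x_{k+1} = a*x_k + b*u_k."""
    xs = [x1]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs


def cost(xs, us):
    """J = sum of stage costs + terminal cost."""
    stage = sum(0.5 * (q * x * x + r * u * u) for x, u in zip(xs[:-1], us))
    return stage + 0.5 * qf * xs[-1] ** 2


def costates(xs):
    """Backward pass: lam_N = qf*x_N, lam_k = q*x_k + a*lam_{k+1}."""
    lam = [0.0] * len(xs)
    lam[-1] = qf * xs[-1]
    for i in range(len(xs) - 2, -1, -1):
        lam[i] = q * xs[i] + a * lam[i + 1]
    return lam


def control_gradient(us, lam):
    """dH/du at (x_k, u_k, lam_{k+1}) = r*u_k + b*lam_{k+1}."""
    return [r * u + b * lam[i + 1] for i, u in enumerate(us)]


def shoot(x1, us, step=0.3, iters=200):
    """Gradient descent on the controls; step and iters are hand-picked."""
    for _ in range(iters):
        xs = rollout(x1, us)                 # forward in time
        lam = costates(xs)                   # backward in time
        grad = control_gradient(us, lam)
        us = [u - step * g for u, g in zip(us, grad)]
    return us


x1 = 1.0
us0 = [0.0] * 20                             # N - 1 = 20 controls
J0 = cost(rollout(x1, us0), us0)
us_opt = shoot(x1, us0)
xs_opt = rollout(x1, us_opt)
J_opt = cost(xs_opt, us_opt)
grad_max = max(abs(g) for g in control_gradient(us_opt, costates(xs_opt)))
```

Only the controls are iterated on; the states are always regenerated by simulation, so the dynamics constraint holds at every iteration. The fixed step size works here because the problem is a small, well-conditioned quadratic; a general problem would likely need a line search.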
Continuous-time version
$$\dot{x}(t) = \nabla_\lambda \mathcal{H}(x(t), u(t), \lambda(t)) = f(x(t), u(t))$$
$$-\dot{\lambda}(t) = \nabla_x \mathcal{H}(x(t), u(t), \lambda(t)) = \nabla_x \ell(x(t), u(t)) + \left( \frac{\partial f}{\partial x} \right)^\top \lambda(t)$$
$$u(t) = \arg\min_{\tilde{u}} \mathcal{H}(x(t), \tilde{u}, \lambda(t)) \quad \text{s.t.} \quad \tilde{u} \in U$$
$$\lambda(t_f) = \frac{\partial \ell_F}{\partial x}(t_f)$$
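Thanks for reading these study notes! They are a record of what I have learned. If anything is wrong or inaccurate, feel free to contact me so I can correct it, and we can all improve together!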