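I had previously read Cui Huakun's 《VINS论文推导及代码解析》 and taken the 深蓝学院 (Shenlan Academy) VIO course, which gave me a fairly clear picture of the nonlinear optimization in the VINS backend, but I never found time to write it up as notes. I recently came across Manni's blog post "VINS-Mono理论学习——后端非线性优化", which summarizes the topic well, so here I reorganize these three sources together with some of my own understanding. Thanks to the authors for the excellent reference material.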
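Nonlinear least squares is a very common problem; see Chapter 6 of 《SLAM14讲》, Appendix 6 of Multiple View Geometry, and Numerical Analysis by Timothy Sauer of George Mason University (I forget which chapter, and I did not bring the book home). My earlier blog post on nonlinear least squares also introduced Gauss-Newton and LM, so this is only a brief review. Because VINS-Mono applies a robust kernel to the visual reprojection error, I also work out how adding a robust kernel to the residual (residual and error mean the same thing here and are used interchangeably below) changes the increment equation for the states being optimized.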
The most common least-squares problem is
$$\underset{x}{\min}\,F(x) = \underset{x}{\min}\,\frac{1}{2} \left\| f(x) \right\|_{2}^{2}$$
Writing the residuals as a stacked vector $f(x) = \begin{bmatrix} f_{1}(x) \\ \vdots \\ f_{m}(x)\end{bmatrix}$, we have
$$F(x)=\frac{1}{2} \left\| f(x) \right\|_{2}^{2}=\frac{1}{2} f^{T}(x)f(x)=\frac{1}{2} \sum_{i=1}^{m}(f_{i}(x))^{2}$$
Similarly, with $J_{i}(x) = \frac{\partial f_{i}(x)}{\partial x}$,
$$\frac{\partial f(x)}{\partial x} = J^{m \times n} = \begin{bmatrix} J_{1}(x) \\ \vdots \\ J_{m}(x)\end{bmatrix}, \quad J_{i}(x)^{1 \times n} = \begin{bmatrix} \frac{\partial f_{i}(x)}{\partial x_{1}} & \frac{\partial f_{i}(x)}{\partial x_{2}} & \cdots & \frac{\partial f_{i}(x)}{\partial x_{n}}\end{bmatrix}$$
Take the first-order Taylor expansion of the residual function $f(x)$: $f(x+\Delta x) \approx f(x) + J \Delta x$, where $J$ is the Jacobian of $f$. Substituting into the loss function (writing $f$ for $f(x)$ below):
$$\begin{aligned} F(x+\Delta x) \approx L(\Delta x) &= \frac{1}{2} (f + J \Delta x)^{T}(f + J \Delta x) = \frac{1}{2} f^{T}f + \Delta x^{T} J^{T} f + \frac{1}{2} \Delta x^{T}J^{T}J\Delta x \\ &= F(x) + \Delta x^{T} J^{T} f + \frac{1}{2} \Delta x^{T}J^{T}J\Delta x \end{aligned} \tag{1}$$
The loss function is thus approximated by a quadratic in $\Delta x$. If the Jacobian has full column rank (which is not guaranteed), $J^{T}J$ is positive definite and the quadratic has a unique minimum. It also follows that $F'(x) = (J^{T}f)^{T}$ and, under this approximation, $F''(x) \approx J^{T}J$.
Gauss-Newton Method:
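Setting the first derivative of Eq. (1) with respect to $\Delta x$ to zero gives
$$(J^{T}J) \Delta x_{gn} = -J^{T}f$$
Because $H = J^{T}J$ may be singular or ill-conditioned, the Gauss-Newton increment can be unstable and the algorithm may fail to converge. There is also a step-size issue: if the computed $\Delta x_{gn}$ is too large, the local approximation may no longer be accurate, so convergence of the iteration cannot be guaranteed.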
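To make the Gauss-Newton update concrete, here is a minimal sketch on a toy problem that is not from VINS-Mono: fitting the scalar parameter $a$ in $y = e^{at}$. The model, the data and the function name `gauss_newton` are all assumptions chosen for illustration; with a single parameter, $J^{T}J$ and $J^{T}f$ reduce to scalars.

```python
import math

def gauss_newton(ts, ys, a0, iters=20, tol=1e-12):
    """Fit y = exp(a * t) for a scalar parameter a with plain Gauss-Newton."""
    a = a0
    for _ in range(iters):
        # Residuals f_i(a) = exp(a * t_i) - y_i and Jacobians J_i = d f_i / d a.
        f = [math.exp(a * t) - y for t, y in zip(ts, ys)]
        J = [t * math.exp(a * t) for t in ts]
        H = sum(j * j for j in J)              # J^T J (a scalar here)
        g = sum(j * r for j, r in zip(J, f))   # J^T f
        if H == 0.0:
            break                              # singular normal equations
        da = -g / H                            # solve (J^T J) da = -J^T f
        a += da
        if abs(da) < tol:
            break
    return a

# Noise-free toy data generated with a = 0.7 (hypothetical values).
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * t) for t in ts]
a_gn = gauss_newton(ts, ys, a0=0.2)
print(a_gn)
```

Note that the first step from `a0=0.2` overshoots the true value before later steps settle, which is the step-size issue mentioned above in miniature.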
Levenberg-Marquardt Method (LM):
LM adds a damping factor to Gauss-Newton:
$$(J^{T}J + \mu I) \Delta x_{lm} = -J^{T}f, \quad \text{s.t.} \ \mu \geq 0$$
where $\mu$ is called the damping factor.
Role of the damping factor: $\mu > 0$ guarantees that $(J^{T}J+\mu I)$ is positive definite, so the iteration proceeds in a descent direction.
Initialization of the damping factor: $\mu_{0} = \tau \max \left\{ (J^{T}J)_{ij} \right\}$, with $\tau \in [10^{-8}, 1]$ chosen as needed.
Update strategy for the damping factor $\mu$:
Relationship between $\Delta x$ and $\mu$: when $\mu$ is large, $\Delta x_{lm} \approx -\frac{1}{\mu}J^{T}f$, a short step close to steepest descent; when $\mu$ is small, $\Delta x_{lm} \approx \Delta x_{gn}$.
The update of the damping factor is driven by the gain ratio:
$$\rho = \frac{F(x) - F(x+\Delta x_{lm})}{L(0) - L(\Delta x_{lm})} = \frac{\text{actual decrease}}{\text{predicted decrease}}$$
where
$$\begin{aligned} L(0)-L(\Delta x_{lm}) &= - \Delta x_{lm}^{T} J^{T} f - \frac{1}{2} \Delta x_{lm}^{T}J^{T}J\Delta x_{lm} \\ &= -\frac{1}{2} \Delta x_{lm}^{T} \left(2 J^{T}f + (J^{T}J+\mu I - \mu I) \Delta x_{lm}\right) \\ &= -\frac{1}{2} \Delta x_{lm}^{T} \left(J^{T}f - \mu \Delta x_{lm}\right) \\ &= \frac{1}{2} \Delta x_{lm}^{T}\left(\mu \Delta x_{lm} - J^{T}f\right) \end{aligned}$$
The denominator is always positive: substituting $-J^{T}f = (J^{T}J+\mu I)\Delta x_{lm}$ gives $L(0)-L(\Delta x_{lm}) = \frac{1}{2}\Delta x_{lm}^{T}(J^{T}J + 2\mu I)\Delta x_{lm} > 0$ for $\mu > 0$ and $\Delta x_{lm} \neq 0$, so $L(0) > L(\Delta x_{lm})$ and the sign of $\rho$ is decided by the numerator. If $\rho$ is large, the quadratic model predicts the actual decrease well and $\mu$ can be reduced so that the next step is closer to Gauss-Newton; if $\rho$ is small or negative, the model is poor, and $\mu$ should be increased to shorten the step and move toward steepest descent.
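The following sketch shows one way the gain ratio can drive the damping update, again on the toy model $y = e^{at}$ with a scalar parameter. The specific update rule (Nielsen's strategy, $\mu \leftarrow \mu \max\{1/3,\ 1-(2\rho-1)^{3}\}$ on acceptance and $\mu \leftarrow \nu\mu,\ \nu \leftarrow 2\nu$ on rejection) is an assumption on my part; the text above does not say which rule is used, and Ceres uses its own trust-region logic. The function name and data are hypothetical.

```python
import math

def levenberg_marquardt(ts, ys, a0, tau=1e-3, iters=50):
    """Fit y = exp(a * t) (scalar a) with LM and a gain-ratio damping update."""
    def residuals(a):
        return [math.exp(a * t) - y for t, y in zip(ts, ys)]

    def cost(a):
        return 0.5 * sum(r * r for r in residuals(a))

    a = a0
    J = [t * math.exp(a * t) for t in ts]
    mu = tau * sum(j * j for j in J)   # mu_0 = tau * max (J^T J)_ij; J^T J is 1x1 here
    nu = 2.0
    for _ in range(iters):
        f = residuals(a)
        J = [t * math.exp(a * t) for t in ts]
        H = sum(j * j for j in J)                # J^T J
        g = sum(j * r for j, r in zip(J, f))     # J^T f
        if abs(g) < 1e-14:
            break                                # gradient is (numerically) zero
        da = -g / (H + mu)                       # (J^T J + mu I) da = -J^T f
        if abs(da) < 1e-15:
            break
        predicted = 0.5 * da * (mu * da - g)     # L(0) - L(da), positive
        rho = (cost(a) - cost(a + da)) / predicted   # gain ratio
        if rho > 0:                              # accept the step, relax damping
            a += da
            mu *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
            nu = 2.0
        else:                                    # reject the step, increase damping
            mu *= nu
            nu *= 2.0
    return a

ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * t) for t in ts]
a_lm = levenberg_marquardt(ts, ys, a0=2.0)
print(a_lm)
```

A rejected step leaves `a` unchanged and only grows `mu`, so the next trial step is shorter; that is the "increase $\mu$ when $\rho$ is small or negative" branch in code form.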
From above, $F(x)=\frac{1}{2} \left\| f(x) \right\|_{2}^{2}=\frac{1}{2} f^{T}(x)f(x)=\frac{1}{2} \sum_{i=1}^{m}(f_{i}(x))^{2}$. The robust kernel is applied to each residual $f_{i}(x)$ (more precisely, to its squared norm), which gives
$$\underset{x}{\min}\, \frac{1}{2} \sum_{i} \rho \left(\left\| f_{i}(x) \right\|_{2}^{2}\right)$$
Denote the squared error by $s_{i}=\left\|f_{i}(x) \right\|_{2}^{2}$. The second-order Taylor expansion of the robust kernel is:
$$\rho(s+\Delta s) \approx \rho(s) + \rho'(s) \Delta s + \frac{1}{2} \rho''(s) (\Delta s)^{2} \tag{2}$$
For $\Delta s$ we have:
$$\begin{aligned} \Delta s &= \left\| f_{i}(x + \Delta x) \right\|^{2}_{2} - \left\| f_{i}(x) \right\|^{2}_{2} \approx \left\| f_{i}(x) + J_{i} \Delta x \right\|^{2}_{2} - \left\| f_{i}(x) \right\|_{2}^{2} \\ &= (f_{i}(x) + J_{i} \Delta x)^{T}(f_{i}(x) + J_{i} \Delta x) - f_{i}^{T}(x)f_{i}(x) = 2f^{T}_{i}(x)J_{i}\Delta x + (\Delta x)^{T}J^{T}_{i}J_{i}\Delta x \end{aligned} \tag{3.1}$$
If the error term uses the $\Sigma$-norm instead of the 2-norm ($\Sigma$ is the covariance matrix of the error term; the covariance is symmetric, so its inverse, the information matrix, is symmetric as well), we have:
$$\begin{aligned} \Delta s &= \left\| f_{i}(x + \Delta x) \right\|^{2}_{\Sigma_{i}} - \left\| f_{i}(x) \right\|^{2}_{\Sigma_{i}} \approx \left\| f_{i}(x) + J_{i} \Delta x \right\|^{2}_{\Sigma_{i}} - \left\| f_{i}(x) \right\|_{\Sigma_{i}}^{2} \\ &= (f_{i}(x) + J_{i} \Delta x)^{T}\Sigma_{i}^{-1}(f_{i}(x) + J_{i} \Delta x) - f_{i}^{T}(x)\Sigma_{i}^{-1}f_{i}(x) \\ &= (\Sigma_{i}^{-T}f_{i}(x))^{T}J_{i}\Delta x + (J_{i}\Delta x)^{T}\Sigma_{i}^{-1}f_{i}(x) + \Delta x^{T}J_{i}^{T}\Sigma^{-1}_{i}J_{i}\Delta x \\ &= 2f^{T}_{i}(x)\Sigma_{i}^{-1}J_{i}\Delta x + \Delta x^{T}J_{i}^{T}\Sigma^{-1}_{i}J_{i}\Delta x \end{aligned} \tag{3.2}$$
Substituting Eq. (3) into Eq. (2) and writing $f_{i} = f_{i}(x)$, $\rho = \rho(s)$, we obtain
$$\begin{aligned} \frac{1}{2} \rho(s+\Delta s) &\approx \frac{1}{2} \left(\rho(s) + \rho'(s) \Delta s + \frac{1}{2} \rho''(s) (\Delta s)^{2} \right)\\ &= \frac{1}{2}\left(\rho + \rho'\left[2f^{T}_{i}J_{i}\Delta x + (\Delta x)^{T}J^{T}_{i}J_{i}\Delta x\right]+ \frac{1}{2} \rho''\left[2f^{T}_{i}J_{i}\Delta x + (\Delta x)^{T}J^{T}_{i}J_{i}\Delta x\right]^{2}\right) \\ &\approx \frac{1}{2}\rho + \rho'f_{i}^{T}J_{i}\Delta x + \frac{1}{2} \rho'(\Delta x)^{T}J_{i}^{T}J_{i}\Delta x + \rho''(\Delta x)^{T}J_{i}^{T}f_{i}f_{i}^{T}J_{i}\Delta x \\ &= \frac{1}{2}\rho + \rho' f_{i}^{T}J_{i}\Delta x + \frac{1}{2} (\Delta x)^{T}J_{i}^{T}\left(\rho' I + 2 \rho'' f_{i}f_{i}^{T}\right)J_{i}\Delta x \end{aligned} \tag{4.1}$$
If the error term uses the $\Sigma$-norm:
$$\begin{aligned} \frac{1}{2} \rho(s+\Delta s) &\approx \frac{1}{2} \left(\rho(s) + \rho'(s) \Delta s + \frac{1}{2} \rho''(s) (\Delta s)^{2} \right)\\ &= \frac{1}{2}\left(\rho + \rho'\left[2f^{T}_{i}\Sigma_{i}^{-1}J_{i}\Delta x + (\Delta x)^{T}J^{T}_{i}\Sigma_{i}^{-1}J_{i}\Delta x\right]+ \frac{1}{2} \rho''\left[2f^{T}_{i}\Sigma_{i}^{-1}J_{i}\Delta x + (\Delta x)^{T}J^{T}_{i}\Sigma_{i}^{-1}J_{i}\Delta x\right]^{2}\right) \\ &\approx \frac{1}{2}\rho + \rho'f_{i}^{T}\Sigma_{i}^{-1}J_{i}\Delta x + \frac{1}{2} \rho'(\Delta x)^{T}J_{i}^{T}\Sigma_{i}^{-1}J_{i}\Delta x + \rho''(\Delta x)^{T}J_{i}^{T}\Sigma_{i}^{-1}f_{i}f_{i}^{T}\Sigma_{i}^{-1}J_{i}\Delta x \\ &= \frac{1}{2}\rho + \rho' f_{i}^{T}\Sigma_{i}^{-1}J_{i}\Delta x + \frac{1}{2} (\Delta x)^{T}J_{i}^{T}\left(\rho' \Sigma_{i}^{-1} + 2 \rho'' \Sigma_{i}^{-1}f_{i}f_{i}^{T}\Sigma_{i}^{-1}\right)J_{i}\Delta x \end{aligned} \tag{4.2}$$
Since $\Delta x$ is a very small quantity, its terms of third order and above are numerically tiny and have been dropped in the approximations above.
Differentiating Eq. (4) with respect to $\Delta x$ and setting the result to zero gives:
$$\sum_{i} J_{i}^{T}\left(\rho' I+2 \rho'' f_{i}f_{i}^{T}\right)J_{i} \Delta x = -\sum_{i}\rho' J_{i}^{T}f_{i}$$
which simplifies to
$$\sum_{i} J_{i}^{T} W J_{i}\Delta x = -\sum_{i}\rho' J_{i}^{T}f_{i}, \quad W=\rho' I+2 \rho'' f_{i}f_{i}^{T}$$
If the error term uses the $\Sigma$-norm:
$$\sum_{i} J_{i}^{T}\left(\rho' \Sigma_{i}^{-1}+2 \rho'' \Sigma_{i}^{-1}f_{i}f_{i}^{T}\Sigma_{i}^{-1}\right)J_{i} \Delta x = -\sum_{i}\rho' J_{i}^{T}\Sigma_{i}^{-1}f_{i}$$
which simplifies to
$$\sum_{i} J_{i}^{T} W J_{i}\Delta x = -\sum_{i}\rho' J_{i}^{T}\Sigma_{i}^{-1}f_{i}, \quad W=\rho' \Sigma_{i}^{-1}+2 \rho'' \Sigma_{i}^{-1}f_{i}f_{i}^{T}\Sigma_{i}^{-1}$$
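As a small illustration of how the kernel reweights the normal equations, the sketch below accumulates $H=\sum_i J_i^{T}WJ_i$ and $b=-\sum_i \rho' J_i^{T}f_i$ for scalar residuals and a scalar state, where $W=\rho' + 2\rho'' f_i^{2}$. The soft-L1 loss $\rho(s)=2(\sqrt{1+s}-1)$ is used purely as an example kernel and is my assumption, as are all function names and numbers.

```python
import math

def trivial_derivs(s):
    """rho(s) = s, i.e. no robust kernel: returns (rho', rho'')."""
    return 1.0, 0.0

def soft_l1_derivs(s):
    """(rho', rho'') for the soft-L1 loss rho(s) = 2 * (sqrt(1 + s) - 1)."""
    return 1.0 / math.sqrt(1.0 + s), -0.5 / (1.0 + s) ** 1.5

def robust_normal_equations(fs, Js, loss_derivs):
    """Accumulate H = sum J W J and b = -sum rho' J f (scalar residuals, scalar state)."""
    H, b = 0.0, 0.0
    for f, J in zip(fs, Js):
        s = f * f                      # squared error s_i
        d1, d2 = loss_derivs(s)
        W = d1 + 2.0 * d2 * s          # rho' I + 2 rho'' f f^T in the scalar case
        H += J * W * J
        b += -d1 * J * f
    return H, b

# One small residual and one large (outlier-like) residual, hypothetical values.
fs = [0.1, math.sqrt(3.0)]
Js = [1.0, 2.0]
H_plain, b_plain = robust_normal_equations(fs, Js, trivial_derivs)
H_rob, b_rob = robust_normal_equations(fs, Js, soft_l1_derivs)
print(H_plain, b_plain)
print(H_rob, b_rob)
```

With the trivial kernel this reduces to the plain Gauss-Newton system; with the robust kernel the large residual contributes far less to both $H$ and $b$, which is the intended down-weighting of outliers.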
The state vector to be optimized contains all IMU states $\mathbf{x}_{k}$ in the sliding window (position $P$, velocity $V$, rotation $Q$, accelerometer bias $b_{a}$, gyroscope bias $b_{w}$, written $\mathbf{b}_{g}$ in the paper's notation), the camera-to-IMU extrinsics $\mathbf{x}_{c}^{b}$, and the inverse depths $\lambda$ of all 3D points [paper Eq. (21)]:
$$\mathcal{X} = [\mathbf{x}_{0}, \mathbf{x}_{1}, \cdots , \mathbf{x}_{n}, \mathbf{x}_{c}^{b}, \lambda_{0}, \lambda_{1}, \cdots, \lambda_{m}]$$
$$\mathbf{x}_{k} = [\mathbf{p}_{b_{k}}^{w}, \mathbf{v}_{b_{k}}^{w}, \mathbf{q}_{b_{k}}^{w}, \mathbf{b}_{a}, \mathbf{b}_{g}], \quad k\in[0,n]$$
$$\mathbf{x}_{c}^{b} = [\mathbf{p}_{c}^{b}, \mathbf{q}_{c}^{b}]$$
Here $\mathbf{x}_{k}$ is the IMU state at time $k$, the moment the camera captures an image; $n$ is the number of keyframes in the sliding window; $m$ is the number of 3D visual features (landmarks) in the sliding window; and $\lambda_{m}$ is the inverse depth of the $m$-th 3D feature (the value computed at its first observation).
The objective function is [paper Eqs. (22)(23)]:
$$\underset{\mathcal{X}}{\min}\left\{\left\| \mathbf{r}_{p}-\mathbf{H}_{p} \mathcal{X} \right\|^{2} + \sum_{k \in \mathcal{B}} \left\| \mathbf{r}_{\mathcal{B}}(\hat{\mathbf{z}}_{b_{k+1}}^{b_{k}}, \mathcal{X} )\right\|_{\mathbf{P}_{b_{k+1}}^{b_{k}}}^{2} + \sum_{(l,j)\in \mathcal{C}} \rho\left(\left\| \mathbf{r}_{\mathcal{C}}(\hat{\mathbf{z}}_{l}^{c_{j}}, \mathcal{X})\right\|_{\mathbf{P}_{l}^{c_{j}}}^{2}\right) \right\} \tag{5}$$
$$\rho(s)=\begin{cases} s, & s\leq 1\\ 2\sqrt{s}-1, & s>1 \end{cases}$$
The three terms are the marginalization prior (produced when keyframes are marginalized out of the sliding window), the IMU measurement error, and the visual reprojection error; all three residuals are expressed as Mahalanobis distances. $\mathbf{P}_{b_{k+1}}^{b_{k}}$ is the covariance of the IMU residual and $\mathbf{P}_{l}^{c_{j}}$ is the covariance of the visual reprojection error; the inverse of each covariance (the information matrix) weights the corresponding residual. $\rho(s)$ is the Huber kernel, which leaves small squared errors unchanged and grows only like $\sqrt{s}$ for large ones.
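For reference, here is a minimal sketch of the standard Huber kernel acting on the squared error $s$, together with the first and second derivatives that enter the weighted normal equations: $\rho(s)=s$ for $s\le\delta^{2}$ and $\rho(s)=2\delta\sqrt{s}-\delta^{2}$ otherwise, with $\delta=1$ as the default. The function name and the `delta` parameter are my own choices for illustration, not identifiers from VINS-Mono.

```python
import math

def huber(s, delta=1.0):
    """Return (rho, rho', rho'') of the Huber kernel at s = squared residual norm."""
    if s <= delta * delta:
        # Inlier region: the kernel is the identity, so the cost stays quadratic.
        return s, 1.0, 0.0
    r = math.sqrt(s)
    # Outlier region: cost grows linearly in the residual norm r = sqrt(s).
    return 2.0 * delta * r - delta * delta, delta / r, -0.5 * delta / (s * r)

for s in (0.25, 1.0, 4.0, 100.0):
    rho, d1, d2 = huber(s)
    # d1 * sqrt(s) is the magnitude of the gradient weight rho' * |f|; it saturates at delta.
    print(s, rho, d1, d2, d1 * math.sqrt(s))
```

The last printed column shows why the kernel is robust: beyond the threshold the gradient contribution $\rho'\,|f|$ no longer grows with the residual.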
From IMU pre-integration we already have:
$$\begin{aligned} R^{b_{k}}_{w}p^{w}_{b_{k+1}} &= R^{b_{k}}_{w}\left(p_{b_{k}}^{w}+v^{w}_{b_{k}}\Delta t_{k}-\frac{1}{2}g^{w}\Delta t_{k}^{2}\right) + \alpha^{b_{k}}_{b_{k+1}} \\ R^{b_{k}}_{w}v_{b_{k+1}}^{w} &= R_{w}^{b_{k}}\left(v_{b_{k}}^{w}-g^{w}\Delta t_{k}\right)+\beta^{b_{k}}_{b_{k+1}} \\ q_{w}^{b_{k}} \otimes q_{b_{k+1}}^{w} &= \gamma_{b_{k+1}}^{b_{k}} \end{aligned}$$
The error of the change (increment) in PVQ and bias between two frames is therefore [paper Eq. (24)]:
$$r_{\mathcal{B}}^{15 \times 1}(\hat{z}_{b_{k+1}}^{b_{k}},\mathcal{X}) = \begin{bmatrix} \delta \alpha_{b_{k+1}}^{b_{k}} \\ \delta \beta_{b_{k+1}}^{b_{k}} \\ \delta \theta_{b_{k+1}}^{b_{k}} \\ \delta b_{a} \\ \delta b_{w} \end{bmatrix} = \begin{bmatrix} R^{b_{k}}_{w}\left(p^{w}_{b_{k+1}}-p_{b_{k}}^{w} - v_{b_{k}}^{w} \Delta t_{k} + \frac{1}{2}g^{w} \Delta t_{k}^{2}\right) - \alpha_{b_{k+1}}^{b_{k}} \\ R^{b_{k}}_{w}\left(v_{b_{k+1}}^{w}-v_{b_{k}}^{w}+g^{w}\Delta t_{k}\right) - \beta^{b_{k}}_{b_{k+1}} \\ 2\left[ (\gamma_{b_{k+1}}^{b_{k}})^{-1} \otimes (q^{w}_{b_{k}})^{-1} \otimes q^{w}_{b_{k+1}} \right]_{xyz} \\ b_{a b_{k+1}}-b_{a b_{k}} \\ b_{w b_{k+1}}-b_{w b_{k}}\end{bmatrix} \tag{6}$$
Here $[\cdot]_{xyz}$ takes the imaginary (vector) part of the quaternion to represent the rotation error, and $[\alpha_{b_{k+1}}^{b_{k}}, \beta_{b_{k+1}}^{b_{k}}, \gamma_{b_{k+1}}^{b_{k}}]^{T}$ are the IMU pre-integrated terms; see the previous post, "VINS-Mono之IMU预积分,预积分对应的误差、协方差及雅克比矩阵递推方程的推导", for their derivation. The accelerometer and gyroscope biases are also included in the residual so that they can be corrected online.
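The 15-dimensional residual of Eq. (6) can be sketched in plain Python as follows. This is an illustration, not the `IMUFactor` code: quaternions are stored as `(w, x, y, z)` tuples, gravity is assumed to be $g^{w}=(0,0,9.8)$, and all function names and test values are hypothetical. The usage part builds a state at $k+1$ that is exactly consistent with the pre-integrated terms, so the residual should vanish up to rounding.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def q_inv(q):
    """Inverse of a unit quaternion (its conjugate)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def q_rot(q, v):
    """Rotate the 3-vector v by the unit quaternion q."""
    return q_mul(q_mul(q, (0.0,) + tuple(v)), q_inv(q))[1:]

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def imu_residual(state_k, state_k1, preint, dt, g_w=(0.0, 0.0, 9.8)):
    """Residual of Eq. (6); each state is (p, v, q, ba, bw), preint is (alpha, beta, gamma)."""
    p0, v0, q0, ba0, bw0 = state_k
    p1, v1, q1, ba1, bw1 = state_k1
    alpha, beta, gamma = preint
    q0_inv = q_inv(q0)  # plays the role of R_w^{b_k}
    dp = tuple(p1[i] - p0[i] - v0[i] * dt + 0.5 * g_w[i] * dt * dt for i in range(3))
    dv = tuple(v1[i] - v0[i] + g_w[i] * dt for i in range(3))
    r_alpha = sub(q_rot(q0_inv, dp), alpha)
    r_beta = sub(q_rot(q0_inv, dv), beta)
    q_err = q_mul(q_inv(gamma), q_mul(q0_inv, q1))
    r_theta = tuple(2.0 * c for c in q_err[1:])   # 2 * [.]_xyz
    return r_alpha + r_beta + r_theta + sub(ba1, ba0) + sub(bw1, bw0)

# Build a consistent pair of states from made-up pre-integration terms.
dt, g = 0.1, (0.0, 0.0, 9.8)
q0 = (math.cos(0.15), 0.0, 0.0, math.sin(0.15))      # 0.3 rad about z
gamma = (math.cos(0.05), math.sin(0.05), 0.0, 0.0)   # 0.1 rad about x
alpha, beta = (0.1, 0.02, -0.03), (0.2, 0.0, 0.05)
p0, v0 = (1.0, 2.0, 3.0), (0.5, 0.0, 0.0)
ba, bw = (0.01, 0.02, 0.03), (0.001, 0.002, 0.003)
Ra, Rb = q_rot(q0, alpha), q_rot(q0, beta)
p1 = tuple(p0[i] + v0[i] * dt - 0.5 * g[i] * dt * dt + Ra[i] for i in range(3))
v1 = tuple(v0[i] - g[i] * dt + Rb[i] for i in range(3))
q1 = q_mul(q0, gamma)
res = imu_residual((p0, v0, q0, ba, bw), (p1, v1, q1, ba, bw), (alpha, beta, gamma), dt)
print(res)
```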
The optimization variables are the PVQ, the accelerometer bias $b_{a}$ and the gyroscope bias $b_{w}$ at time $k$ and time $k+1$ (where $k$ and $k+1$ are the times at which two consecutive image frames are captured):
$$[p_{b_{k}}^{w}, q_{b_{k}}^{w}], \ [v_{b_{k}}^{w}, b_{a_{k}}, b_{w_{k}}], \ [p_{b_{k+1}}^{w}, q_{b_{k+1}}^{w}], \ [v_{b_{k+1}}^{w}, b_{a_{k+1}}, b_{w_{k+1}}]$$
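This part is implemented in the function `virtual bool Evaluate()` of `class IMUFactor : public ceres::SizedCostFunction<15, 7, 9, 7, 9>` in `imu_factor.h`, where `parameters[0~3]` are the parameter blocks of the 4 groups of optimization variables above, with lengths 7, 9, 7 and 9 respectively.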
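To make the 7/9/7/9 block sizes concrete, the sketch below splits such blocks into their components. The element order inside each block (position then quaternion stored as `qx, qy, qz, qw`; velocity, then accelerometer bias, then gyroscope bias) is my understanding of the VINS-Mono convention and should be treated as an assumption; the helper names and numbers are hypothetical.

```python
def unpack_pose(block7):
    """Split a 7-vector into position and quaternion; assumed order [px, py, pz, qx, qy, qz, qw]."""
    assert len(block7) == 7
    p = tuple(block7[0:3])
    qx, qy, qz, qw = block7[3:7]
    return p, (qw, qx, qy, qz)        # quaternion returned as (w, x, y, z)

def unpack_speed_bias(block9):
    """Split a 9-vector into velocity, accelerometer bias and gyroscope bias (3 each)."""
    assert len(block9) == 9
    return tuple(block9[0:3]), tuple(block9[3:6]), tuple(block9[6:9])

# Four parameter blocks with sizes 7, 9, 7, 9 (dummy values).
parameters = [
    [1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 1.0],
    [0.1, 0.2, 0.3, 0.01, 0.02, 0.03, 0.001, 0.002, 0.003],
    [1.1, 2.0, 3.0, 0.0, 0.0, 0.0, 1.0],
    [0.1, 0.2, 0.2, 0.01, 0.02, 0.03, 0.001, 0.002, 0.003],
]
p_i, q_i = unpack_pose(parameters[0])
v_i, ba_i, bw_i = unpack_speed_bias(parameters[1])
p_j, q_j = unpack_pose(parameters[2])
v_j, ba_j, bw_j = unpack_speed_bias(parameters[3])
print([len(b) for b in parameters])
```

The pose block has 7 entries because the rotation is stored as a 4-element quaternion, even though its perturbation has only 3 degrees of freedom.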
In `IMUFactor::Evaluate()` the `residual` is additionally multiplied by `sqrt_info`, a square-root factor of the information matrix. This is because the IMU measurement error and the visual reprojection error are weighted by the inverse covariance, so what is actually optimized is $\min(d) = \min(r_{k}^{T}P^{-1}r_{k})$, where $r_{k}$ is the residual at time $k$ and $P$ is the covariance. Ceres only accepts problems of the form $\min(r_{k}^{T}r_{k})$, so the code performs an $LL^{T}$ decomposition of the information matrix, $LL^{T} = P^{-1}$, and `sqrt_info` holds the factor $L^{T}$ that multiplies the residual. The problem then becomes:
$$\min(d) = \min(r_{k}^{T}P^{-1}r_{k}) = \min\left((L^{T}r_{k})^{T}(L^{T}r_{k})\right) = \min(r_{k}'^{T} r_{k}'), \quad r_{k}' = L^{T}r_{k}$$
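The whitening identity $r^{T}P^{-1}r = \|L^{T}r\|^{2}$ with $LL^{T}=P^{-1}$ is easy to check numerically. The sketch below does so for a 2×2 covariance with hand-written Cholesky and inverse helpers; the matrices and helper names are made up for illustration and are unrelated to the 15×15 matrices in the actual code.

```python
import math

def inverse_2x2(P):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cholesky_2x2(A):
    """Lower-triangular L with L L^T = A for a symmetric positive-definite 2x2 A."""
    (a, b), (_, c) = A
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

P = [[0.04, 0.01], [0.01, 0.09]]   # covariance of the residual (hypothetical)
r = [0.2, -0.3]                    # raw residual (hypothetical)

info = inverse_2x2(P)              # information matrix P^-1
L = cholesky_2x2(info)             # L L^T = P^-1

# Whitened residual r' = L^T r, where (L^T)[i][j] = L[j][i].
r_white = [L[0][0] * r[0] + L[1][0] * r[1],
           L[0][1] * r[0] + L[1][1] * r[1]]

mahalanobis = sum(r[i] * info[i][j] * r[j] for i in range(2) for j in range(2))
whitened_sq_norm = sum(c * c for c in r_white)
print(mahalanobis, whitened_sq_norm)
```

Because the solver squares and sums whatever residual it is handed, passing $L^{T}r$ instead of $r$ is enough to make it minimize the Mahalanobis distance.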
When computing the Jacobians, the residual is differentiated with respect to the optimization variables above, but the computation uses perturbations, namely
$$[\delta p_{b_{k}}^{w}, \delta \theta_{b_{k}}^{w}], \ [\delta v_{b_{k}}^{w}, \delta b_{a_{k}}, \delta b_{w_{k}}], \ [\delta p_{b_{k+1}}^{w}, \delta \theta_{b_{k+1}}^{w}], \ [\delta v_{b_{k+1}}^{w}, \delta b_{a_{k+1}}, \delta b_{w_{k+1}}]$$
The resulting Jacobians $J$ are stated first (one per parameter block, `parameters[0]` through `parameters[3]` above):
$$J[0]^{15 \times 7} = \left[\frac{\partial r_{\mathcal{B}}}{\partial p_{b_{k}}^{w}}, \frac{\partial r_{\mathcal{B}}}{\partial q_{b_{k}}^{w}}\right] = \begin{bmatrix} -R_{w}^{b_{k}} & \left[R_{w}^{b_{k}}\left(p^{w}_{b_{k+1}}-p_{b_{k}}^{w} - v_{b_{k}}^{w} \Delta t_{k}+\frac{1}{2}g^{w}\Delta t_{k}^{2}\right)\right]_{\times} \\ 0 & \left[R_{w}^{b_{k}}\left(v_{b_{k+1}}^{w} - v_{b_{k}}^{w} + g^{w}\Delta t_{k}\right)\right]_{\times} \\ 0 & -\mathcal{L}_{3 \times 3}\left[(q^{w}_{b_{k+1}})^{-1} \otimes q_{b_{k}}^{w}\right] \mathcal{R}_{3 \times 3}\left[\gamma_{b_{k+1}}^{b_{k}}\right] \\ 0 & 0 \\ 0 & 0\end{bmatrix} \tag{7}$$
$$J[1]^{15 \times 9} = \left[\frac{\partial r_{\mathcal{B}}}{\partial v_{b_{k}}^{w}}, \frac{\partial r_{\mathcal{B}}}{\partial b_{a_{k}}}, \frac{\partial r_{\mathcal{B}}}{\partial b_{w_{k}}}\right] = \begin{bmatrix} -R_{w}^{b_{k}}\Delta t_{k} & -J_{b_{a}}^{\alpha} & -J_{b_{w}}^{\alpha} \\ -R_{w}^{b_{k}} & -J_{b_{a}}^{\beta} & -J_{b_{w}}^{\beta}\\ 0 & 0 & -\mathcal{L}_{3 \times 3}\left[(q_{b_{k+1}}^{w})^{-1} \otimes q_{b_{k}}^{w} \otimes \gamma_{b_{k+1}}^{b_{k}}\right]J_{b_{w}}^{\gamma}\\ 0 & -I & 0 \\ 0 & 0 & -I \end{bmatrix} \tag{8}$$
$$J[2]^{15 \times 7} = \left[\frac{\partial r_{\mathcal{B}}}{\partial p_{b_{k+1}}^{w}}, \frac{\partial r_{\mathcal{B}}}{\partial q_{b_{k+1}}^{w}}\right] = \begin{bmatrix} R_{w}^{b_{k}} & 0 \\ 0 & 0 \\ 0 & \mathcal{L}_{3 \times 3}\left[(\gamma_{b_{k+1}}^{b_{k}})^{-1}\otimes (q_{b_{k}}^{w})^{-1} \otimes q_{b_{k+1}}^{w}\right] \\ 0 & 0 \\ 0 & 0\end{bmatrix} \tag{9}$$
$$J[3]^{15 \times 9} = \left[\frac{\partial r_{\mathcal{B}}}{\partial v_{b_{k+1}}^{w}}, \frac{\partial r_{\mathcal{B}}}{\partial b_{a_{k+1}}}, \frac{\partial r_{\mathcal{B}}}{\partial b_{w_{k+1}}}\right] = \begin{bmatrix} 0 & 0 & 0 \\ R_{w}^{b_{k}} & 0 & 0\\ 0 & 0 & 0\\ 0 & I & 0\\ 0 & 0 & I \end{bmatrix} \tag{10}$$
In the expressions above, the derivatives with respect to PVQ can be computed directly from the error increments. For $b_{a}$ and $b_{g}$, however, the pre-integrated terms depend on the bias at time $k$ through a step-by-step recursion, and differentiating through it directly is too complicated. Instead, the pre-integrated terms are approximated by a first-order Taylor expansion around the bias at time $k$, rather than actually re-running the recursion:
$$\begin{aligned} \alpha_{b_{k+1}}^{b_{k}} &\approx \hat{\alpha}_{b_{k+1}}^{b_{k}} + J_{b_{a}}^{\alpha} \delta b_{a} + J_{b_{w}}^{\alpha} \delta b_{w} \\ \beta_{b_{k+1}}^{b_{k}} &\approx \hat{\beta}_{b_{k+1}}^{b_{k}} + J_{b_{a}}^{\beta} \delta b_{a} + J_{b_{w}}^{\beta} \delta b_{w} \\ \gamma_{b_{k+1}}^{b_{k}} &\approx \hat{\gamma}_{b_{k+1}}^{b_{k}} \otimes \begin{bmatrix} 1 \\ \frac{1}{2}J_{b_{w}}^{\gamma}\delta b_{w} \end{bmatrix} \end{aligned}$$
where $J^{\alpha}_{b_{a}} = \frac{\partial \alpha_{b_{k+1}}^{b_{k}}}{\partial \delta b_{a}}$, $J^{\alpha}_{b_{w}} = \frac{\partial \alpha_{b_{k+1}}^{b_{k}}}{\partial \delta b_{w}}$, $J^{\beta}_{b_{a}} = \frac{\partial \beta_{b_{k+1}}^{b_{k}}}{\partial \delta b_{a}}$, $J^{\beta}_{b_{w}} = \frac{\partial \beta_{b_{k+1}}^{b_{k}}}{\partial \delta b_{w}}$ and $J^{\gamma}_{b_{w}} = \frac{\partial \gamma_{b_{k+1}}^{b_{k}}}{\partial \delta b_{w}}$ are the derivatives of the pre-integrated terms with respect to the biases at time $k$. These bias Jacobians are obtained step by step alongside the covariance propagation discussed in the IMU pre-integration post (VINS-Mono之IMU预积分,预积分对应的误差、协方差及雅克比矩阵递推方程的推导). The recursion is:
$$J_{k+1}=FJ_{k}, \quad J_{0}=I$$
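The idea of propagating a bias Jacobian alongside the pre-integration, and then correcting the pre-integrated terms to first order instead of re-integrating, can be illustrated with a deliberately simplified 1-D toy with no rotation. This is an assumption-laden sketch, not the 15×15 recursion of the real system: only the bias columns of the Jacobian are tracked (they start at zero, matching the off-diagonal blocks of $J_0 = I$), and every name and number is hypothetical.

```python
def preintegrate_1d(acc_meas, dt, b_a):
    """Pre-integrate 1-D accelerometer samples; also propagate d(alpha)/d(b_a) and d(beta)/d(b_a)."""
    alpha, beta = 0.0, 0.0        # pre-integrated position and velocity
    J_alpha, J_beta = 0.0, 0.0    # their Jacobians w.r.t. the accelerometer bias
    for a_m in acc_meas:
        a = a_m - b_a             # bias-corrected acceleration
        # State recursion (alpha uses the old beta).
        alpha += beta * dt + 0.5 * a * dt * dt
        beta += a * dt
        # Jacobian recursion: derivative of the two updates above w.r.t. b_a.
        J_alpha += J_beta * dt - 0.5 * dt * dt
        J_beta += -dt
    return alpha, beta, J_alpha, J_beta

acc_meas = [0.3, 0.5, -0.2, 0.1, 0.4]   # made-up measurements
dt, b_hat, delta_b = 0.01, 0.05, 0.01

alpha_hat, beta_hat, J_alpha, J_beta = preintegrate_1d(acc_meas, dt, b_hat)

# First-order correction instead of re-integration.
alpha_corr = alpha_hat + J_alpha * delta_b
beta_corr = beta_hat + J_beta * delta_b

# Reference: actually re-integrate with the updated bias.
alpha_ref, beta_ref, _, _ = preintegrate_1d(acc_meas, dt, b_hat + delta_b)
print(alpha_corr, alpha_ref)
print(beta_corr, beta_ref)
```

In this toy the correction is exact because the terms are linear in the bias; in the real system rotation makes the dependence nonlinear, so the first-order correction is only an approximation that holds for small bias changes.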
The derivation of Eqs. (7), (8), (9) and (10) follows.
Because $\delta \alpha_{b_{k+1}}^{b_{k}}$ (the pre-integrated position error) and $\delta \beta_{b_{k+1}}^{b_{k}}$ (the pre-integrated velocity error) have very similar forms, their Jacobians with respect to each state are also very similar.
First, recall some properties of the skew-symmetric operator:
$$[\delta \theta]_{\times}^{T} = -[\delta \theta]_{\times}$$
$$[\delta \theta]_{\times}(Rp) = -[Rp]_{\times}\delta \theta$$
$$\left(R\exp([\delta \theta]_{\times})\right)^{-1} = \exp(-[\delta \theta]_{\times})R^{-1}$$
$$\left(\exp([\delta \theta]_{\times})R\right)^{-1}=R^{-1}\exp(-[\delta \theta]_{\times})$$
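Two of the identities used in this kind of derivation can be checked numerically: the swap rule $[\delta\theta]_{\times}(Rp) = -[Rp]_{\times}\delta\theta$, and the standard adjoint relation $R[\delta\theta]_{\times}R^{T} = [R\,\delta\theta]_{\times}$ for a rotation matrix $R$ (the latter is a well-known property I am adding for the check). The helpers and the numbers below are purely illustrative.

```python
import math

def skew(v):
    """Skew-symmetric matrix [v]_x such that [v]_x w = v x w."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

dtheta = [0.01, -0.02, 0.03]
p = [1.0, 2.0, 3.0]
R = rot_z(0.4)
Rp = matvec(R, p)

# [dtheta]_x (R p) == -[R p]_x dtheta
lhs = matvec(skew(dtheta), Rp)
rhs = [-c for c in matvec(skew(Rp), dtheta)]

# R [dtheta]_x R^T == [R dtheta]_x
A = matmul(matmul(R, skew(dtheta)), transpose(R))
B = skew(matvec(R, dtheta))

print(lhs, rhs)
print(A, B)
```

The swap rule is what lets a perturbation $\delta\theta$ be moved to the right of an expression so that the remaining factor can be read off as the Jacobian.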
1) Jacobian of the pre-integrated position error $\delta \alpha_{b_{k+1}}^{b_{k}}$ with respect to the states at time $k$ (the Jacobian at time $k+1$ is derived in the same way):