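The sample code walks through the complete process of estimating the parameters a, b, c of the curve y = exp(ax^2 + bx + c) with the LM (Levenberg-Marquardt) algorithm.
1 Plot the LM damping factor µ against the iteration number for the sample code.
2 Change the curve to y = ax^2 + bx + c, modify the residual and Jacobian functions in the sample code accordingly, and complete the parameter estimation.
3 Implementing other damping-factor update strategies earns extra credit (optional).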
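(2) The code changes are as follows.
Residual computation:
    virtual void ComputeResidual() override
    {
        Vec3 abc = verticies_[0]->Parameters();  // current parameter estimate (a, b, c)
        // residual = model prediction - measurement
        residual_(0) = abc(0)*x_*x_ + abc(1)*x_ + abc(2) - y_;
    }
Jacobian computation:
    virtual void ComputeJacobians() override
    {
        Vec3 abc = verticies_[0]->Parameters();
        // The residual is 1-dimensional and there are 3 state variables,
        // so the Jacobian is a 1x3 matrix.
        Eigen::Matrix<double, 1, 3> jaco_abc;
        // d(residual)/da = x^2, d(residual)/db = x, d(residual)/dc = 1
        jaco_abc << x_ * x_, x_, 1;
        jacobians_[0] = jaco_abc;
    }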
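A quick way to gain confidence in a hand-written Jacobian is to compare it with a finite-difference approximation. The following is a minimal standalone sketch, not part of the sample code: the names `Residual`, `AnalyticJacobian`, `NumericJacobian` and `MaxJacobianError` are hypothetical, and plain `std::array` is used instead of Eigen so the snippet compiles on its own.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Residual of one sample (x, y) for the model y = a*x^2 + b*x + c.
double Residual(const std::array<double, 3>& abc, double x, double y) {
    return abc[0] * x * x + abc[1] * x + abc[2] - y;
}

// Analytic 1x3 Jacobian of the residual with respect to (a, b, c).
std::array<double, 3> AnalyticJacobian(double x) {
    return std::array<double, 3>{{x * x, x, 1.0}};
}

// Central-difference approximation of the same Jacobian.
std::array<double, 3> NumericJacobian(std::array<double, 3> abc, double x,
                                      double y, double eps = 1e-6) {
    std::array<double, 3> jac{{0.0, 0.0, 0.0}};
    for (int i = 0; i < 3; ++i) {
        const double saved = abc[i];
        abc[i] = saved + eps;
        const double plus = Residual(abc, x, y);
        abc[i] = saved - eps;
        const double minus = Residual(abc, x, y);
        abc[i] = saved;  // restore the parameter before moving on
        jac[i] = (plus - minus) / (2.0 * eps);
    }
    return jac;
}

// Largest absolute difference between the analytic and numeric Jacobians.
double MaxJacobianError(const std::array<double, 3>& abc, double x, double y) {
    const std::array<double, 3> ja = AnalyticJacobian(x);
    const std::array<double, 3> jn = NumericJacobian(abc, x, y);
    double err = 0.0;
    for (int i = 0; i < 3; ++i) {
        err = std::fmax(err, std::fabs(ja[i] - jn[i]));
    }
    return err;
}
```

Because the residual is linear in (a, b, c), the two Jacobians should agree up to rounding error for any sample point.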
The results are as follows (to get good convergence, the number of data points was increased from 100 to 1000):
Test CurveFitting start...
iter: 0 , chi= 3.21386e+06 , Lambda= 19.95
iter: 1 , chi= 974.658 , Lambda= 6.65001
iter: 2 , chi= 973.881 , Lambda= 2.21667
iter: 3 , chi= 973.88 , Lambda= 1.47778
problem solve cost: 35.4232 ms
makeHessian cost: 26.0736 ms
-------After optimization, we got these parameters :
0.999588 2.0063 0.968786
-------ground truth:
1.0, 2.0, 1.0
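The Lambda column of the log above is the quantity task 1 asks to plot. One possible way to get it into a plotting tool is to collect the per-iteration values and dump them as CSV. This is only a sketch: `LambdaHistoryToCsv` is a hypothetical helper, not a function of the sample code, and how the values are collected inside the solver is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Format one "iter,lambda" row per LM iteration as CSV text.
std::string LambdaHistoryToCsv(const std::vector<double>& lambdas) {
    std::ostringstream out;
    out << "iter,lambda\n";
    for (std::size_t i = 0; i < lambdas.size(); ++i) {
        out << i << ',' << lambdas[i] << '\n';
    }
    return out.str();
}
```

For example, passing the four values logged above (19.95, 6.65001, 2.21667, 1.47778) yields a four-row table that any plotting tool can read.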
(3) The damping-factor update strategy tried is:

$$
\begin{array}{l}
\text{if } \rho > 0 \\
\qquad \mu := \mu * \max\left\{\frac{1}{3},\ 1-(2\rho-1)^{3}\right\}; \quad \nu := 2 \\
\text{else} \\
\qquad \mu := \mu * \nu; \quad \nu := 2 * \nu
\end{array}
$$
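The update rule above can be written as a small pure function. This is a minimal sketch under the assumption that µ and ν are kept together in one struct; `DampingState` and `NielsenUpdate` are hypothetical names, not identifiers from the sample code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Damping factor mu and its growth factor nu.
struct DampingState {
    double mu;
    double nu;
};

// One application of the rule above, given the gain ratio rho.
DampingState NielsenUpdate(DampingState s, double rho) {
    if (rho > 0.0) {
        // Accepted step: scale mu by max{1/3, 1 - (2*rho - 1)^3}, reset nu.
        const double t = 2.0 * rho - 1.0;
        s.mu *= std::max(1.0 / 3.0, 1.0 - t * t * t);
        s.nu = 2.0;
    } else {
        // Rejected step: grow mu by nu, and make the next growth larger.
        s.mu *= s.nu;
        s.nu *= 2.0;
    }
    return s;
}
```

With ρ close to 1 the factor saturates at 1/3, which matches the first two accepted steps in the log above (19.95 to 6.65001 to 2.21667).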
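The damping-factor update code was modified as follows. Note that it implements a simpler threshold rule (shrink µ by a factor of 3 when ρ > 0.75, otherwise double it) rather than the formula above: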
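    bool Problem::IsGoodStepInLM() {
        double scale = 0;
        scale = delta_x_.transpose() * (currentLambda_ * delta_x_ + b_);
        scale += 1e-3;    // make sure it's non-zero :)
        // recompute residuals after update state
        // accumulate the chi2 of all residuals
        double tempChi = 0.0;
        for (auto edge: edges_) {
            edge.second->ComputeResidual();
            tempChi += edge.second->Chi2();
        }
        // gain ratio: actual cost decrease / decrease predicted by the linear model
        double rho = (currentChi_ - tempChi) / scale;
        if (rho > 0.75 && std::isfinite(tempChi))   // last step was good, the error is decreasing
        {
            // 1/3 is integer division in C++ and evaluates to 0; divide by 3.0 instead
            currentLambda_ /= 3.0;
            currentChi_ = tempChi;
            return true;
        } else {
            currentLambda_ *= 2;
            // Step rejected: keep currentChi_ unchanged, assuming the caller
            // rolls the state back as the sample solver does.
            return false;
        }
    }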
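For comparison, the three-zone rule commonly attributed to Marquardt also leaves µ unchanged when the gain ratio is mediocre. The sketch below is standalone and uses a hypothetical function name `ThresholdUpdate`; the thresholds 0.25 and 0.75 and the factors 2 and 1/3 are the usual textbook values, stated here as an assumption rather than taken from the sample code. Floating-point literals are used throughout because `1/3` in C++ is integer division and evaluates to 0.

```cpp
#include <cassert>
#include <cmath>

// Three-zone damping update: shrink mu after a very good step,
// grow it after a poor one, and leave it unchanged in between.
double ThresholdUpdate(double mu, double rho) {
    if (rho > 0.75) {
        return mu / 3.0;  // good agreement with the linear model
    }
    if (rho < 0.25) {
        return mu * 2.0;  // poor agreement: move toward gradient descent
    }
    return mu;            // acceptable agreement: keep the damping
}
```

Unlike the Nielsen-style formula, this rule changes µ in discrete jumps, which can make it oscillate around the 0.25 and 0.75 thresholds.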
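The results are as follows (this log comes from the code as originally run, where the update was written `currentLambda_ *= 1/3`; that is integer division, so Lambda becomes 0 after the first accepted step):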
Test CurveFitting start...
iter: 0 , chi= 3.21386e+06 , Lambda= 19.95
iter: 1 , chi= 974.658 , Lambda= 0
iter: 2 , chi= 973.88 , Lambda= 0
problem solve cost: 27.2341 ms
makeHessian cost: 20.6501 ms
-------After optimization, we got these parameters :
0.999589 2.00628 0.968821
-------ground truth:
1.0, 2.0, 1.0
Formula derivation: using the course material, derive the following two entries of F and G:

$$
\mathbf{f}_{15}=\frac{\partial \boldsymbol{\alpha}_{b_{i} b_{k+1}}}{\partial \delta \mathbf{b}_{k}^{g}}=-\frac{1}{4}\left(\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right]_{\times} \delta t^{2}\right)(-\delta t)
$$

$$
\mathbf{g}_{12}=\frac{\partial \boldsymbol{\alpha}_{b_{i} b_{k+1}}}{\partial \mathbf{n}_{k}^{g}}=-\frac{1}{4}\left(\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right]_{\times} \delta t^{2}\right)\left(\frac{1}{2} \delta t\right)
$$
Note: the derivation follows the same conventions as the course slides (PPT). For example, the derivative

$$
\frac{\partial \mathbf{x}_{a}}{\partial \delta \boldsymbol{\theta}}=\lim _{\delta \boldsymbol{\theta} \rightarrow 0} \frac{\mathbf{R}_{a b} \exp \left([\delta \boldsymbol{\theta}]_{\times}\right) \mathbf{x}_{b}-\mathbf{R}_{a b} \mathbf{x}_{b}}{\delta \boldsymbol{\theta}}
$$

is abbreviated from here on as

$$
\frac{\partial \mathbf{x}_{a}}{\partial \delta \boldsymbol{\theta}}=\frac{\partial \mathbf{R}_{a b} \exp \left([\delta \boldsymbol{\theta}]_{\times}\right) \mathbf{x}_{b}}{\partial \delta \boldsymbol{\theta}}
$$
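As a short worked instance of this convention, the abbreviated derivative can be evaluated in closed form. The sketch below uses only the first-order expansion $\exp([\delta\boldsymbol{\theta}]_\times) \approx \mathbf{I} + [\delta\boldsymbol{\theta}]_\times$ and the cross-product identity $[\mathbf{a}]_\times \mathbf{b} = -[\mathbf{b}]_\times \mathbf{a}$; the constant term $\mathbf{R}_{ab}\mathbf{x}_b$ drops out of the derivative.

```latex
\begin{aligned}
\frac{\partial \mathbf{R}_{ab} \exp\left([\delta\boldsymbol{\theta}]_{\times}\right) \mathbf{x}_{b}}{\partial \delta\boldsymbol{\theta}}
&\approx \frac{\partial \mathbf{R}_{ab}\left(\mathbf{I} + [\delta\boldsymbol{\theta}]_{\times}\right) \mathbf{x}_{b}}{\partial \delta\boldsymbol{\theta}} \\
&= \frac{\partial \left(-\mathbf{R}_{ab} [\mathbf{x}_{b}]_{\times}\, \delta\boldsymbol{\theta}\right)}{\partial \delta\boldsymbol{\theta}} \\
&= -\mathbf{R}_{ab} [\mathbf{x}_{b}]_{\times}
\end{aligned}
```

Both derivations that follow are this pattern applied with a scaled perturbation in place of $\delta\boldsymbol{\theta}$, so the chain rule contributes one extra scalar factor.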
The derivations follow.
(1) Derivation of $\mathbf{f}_{15}$:

$$
\boldsymbol{\alpha}_{b_{i} b_{k+1}}=\boldsymbol{\alpha}_{b_{i} b_{k}}+\boldsymbol{\beta}_{b_{i} b_{k}} \delta t+\frac{1}{2} \mathbf{a}\, \delta t^{2}
$$

where

$$
\begin{aligned}
\mathbf{a}&=\frac{1}{2}\left(\mathbf{q}_{b_{i} b_{k}}\left(\mathbf{a}^{b_{k}}-\mathbf{b}_{k}^{a}\right)+\mathbf{q}_{b_{i} b_{k+1}}\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right) \\
&=\frac{1}{2}\left(\mathbf{q}_{b_{i} b_{k}}\left(\mathbf{a}^{b_{k}}-\mathbf{b}_{k}^{a}\right)+\mathbf{q}_{b_{i} b_{k}} \otimes\left[\begin{array}{c}1 \\ \frac{1}{2} \boldsymbol{\omega} \delta t\end{array}\right]\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right)
\end{aligned}
$$

Here $\delta \mathbf{b}_{k}^{g}$ enters only through the $\frac{1}{2} \boldsymbol{\omega} \delta t$ term, so only the second summand of $\mathbf{a}$ needs to be differentiated. Its contribution to $\boldsymbol{\alpha}_{b_{i} b_{k+1}}$ carries the factor $\frac{1}{2} \cdot \frac{1}{2} \delta t^{2}=\frac{1}{4} \delta t^{2}$, and the derivation of $\mathbf{f}_{15}$ simplifies to:

$$
\begin{aligned}
\mathbf{f}_{15}&=\frac{\partial \boldsymbol{\alpha}_{b_{i} b_{k+1}}}{\partial \delta \mathbf{b}_{k}^{g}} \\
&=\frac{\partial\, \frac{1}{4} \mathbf{q}_{b_{i} b_{k}} \otimes\left[\begin{array}{c}1 \\ \frac{1}{2} \boldsymbol{\omega} \delta t\end{array}\right] \otimes\left[\begin{array}{c}1 \\ -\frac{1}{2} \delta \mathbf{b}_{k}^{g} \delta t\end{array}\right]\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{b}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\, \mathbf{R}_{b_{i} b_{k+1}} \exp \left(\left[-\delta \mathbf{b}_{k}^{g} \delta t\right]_{\times}\right)\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{b}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\, \mathbf{R}_{b_{i} b_{k+1}}\left(\mathbf{I}+\left[-\delta \mathbf{b}_{k}^{g} \delta t\right]_{\times}\right)\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{b}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\left(-\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}\right]_{\times}\left(-\delta \mathbf{b}_{k}^{g} \delta t\right)\right)}{\partial \delta \mathbf{b}_{k}^{g}} \\
&=-\frac{1}{4}\left(\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right]_{\times} \delta t^{2}\right)(-\delta t)
\end{aligned}
$$
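The closed form for $\mathbf{f}_{15}$ can be sanity-checked numerically by differentiating the bias-dependent term $\frac{1}{4}\mathbf{R}\exp([-\delta\mathbf{b}\,\delta t]_\times)\mathbf{v}\,\delta t^2$ with central differences. The sketch below is standalone and makes several assumptions: all names (`Vec3d`, `Rotate`, `AlphaTerm`, `MaxF15Error`, and so on) are hypothetical, the rotation $\mathbf{R}_{b_i b_{k+1}}$ is represented by a rotation vector `r`, and `v` stands for $\mathbf{a}^{b_{k+1}}-\mathbf{b}_k^a$.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3d = std::array<double, 3>;

Vec3d Cross(const Vec3d& a, const Vec3d& b) {
    return Vec3d{{a[1] * b[2] - a[2] * b[1],
                  a[2] * b[0] - a[0] * b[2],
                  a[0] * b[1] - a[1] * b[0]}};
}

// Apply exp([phi]x) to v with the Rodrigues formula.
Vec3d Rotate(const Vec3d& phi, const Vec3d& v) {
    const double theta = std::sqrt(phi[0] * phi[0] + phi[1] * phi[1] + phi[2] * phi[2]);
    const Vec3d pv = Cross(phi, v);
    if (theta < 1e-12) {  // first-order fallback for a vanishing angle
        return Vec3d{{v[0] + pv[0], v[1] + pv[1], v[2] + pv[2]}};
    }
    const Vec3d ppv = Cross(phi, pv);
    const double s = std::sin(theta) / theta;
    const double c = (1.0 - std::cos(theta)) / (theta * theta);
    return Vec3d{{v[0] + s * pv[0] + c * ppv[0],
                  v[1] + s * pv[1] + c * ppv[1],
                  v[2] + s * pv[2] + c * ppv[2]}};
}

// Bias-dependent part of alpha: 1/4 * R * exp([-db*dt]x) * v * dt^2.
Vec3d AlphaTerm(const Vec3d& r, const Vec3d& v, const Vec3d& db, double dt) {
    const Vec3d phi{{-db[0] * dt, -db[1] * dt, -db[2] * dt}};
    const Vec3d u = Rotate(r, Rotate(phi, v));
    const double k = 0.25 * dt * dt;
    return Vec3d{{k * u[0], k * u[1], k * u[2]}};
}

// Column `col` of the analytic f15 = -1/4 (R [v]x dt^2)(-dt).
Vec3d AnalyticF15Column(const Vec3d& r, const Vec3d& v, double dt, int col) {
    Vec3d e{{0.0, 0.0, 0.0}};
    e[col] = 1.0;
    const Vec3d u = Rotate(r, Cross(v, e));  // R [v]x e
    const double k = 0.25 * dt * dt * dt;    // (-1/4) * dt^2 * (-dt)
    return Vec3d{{k * u[0], k * u[1], k * u[2]}};
}

// Largest entry-wise gap between analytic and central-difference f15.
double MaxF15Error(const Vec3d& r, const Vec3d& v, double dt, double eps = 1e-6) {
    double err = 0.0;
    for (int col = 0; col < 3; ++col) {
        Vec3d plus{{0.0, 0.0, 0.0}}, minus{{0.0, 0.0, 0.0}};
        plus[col] = eps;
        minus[col] = -eps;
        const Vec3d fp = AlphaTerm(r, v, plus, dt);
        const Vec3d fm = AlphaTerm(r, v, minus, dt);
        const Vec3d fa = AnalyticF15Column(r, v, dt, col);
        for (int row = 0; row < 3; ++row) {
            const double numeric = (fp[row] - fm[row]) / (2.0 * eps);
            err = std::fmax(err, std::fabs(numeric - fa[row]));
        }
    }
    return err;
}
```

The same harness should cover $\mathbf{g}_{12}$ by replacing the perturbation $-\delta\mathbf{b}\,\delta t$ with $\frac{1}{2}\delta\mathbf{n}\,\delta t$ and the final factor $(-\delta t)$ with $\frac{1}{2}\delta t$.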
(2) Derivation of $\mathbf{g}_{12}$:
The derivation of $\mathbf{g}_{12}$ is analogous to that of $\mathbf{f}_{15}$.

$$
\boldsymbol{\alpha}_{b_{i} b_{k+1}}=\boldsymbol{\alpha}_{b_{i} b_{k}}+\boldsymbol{\beta}_{b_{i} b_{k}} \delta t+\frac{1}{2} \mathbf{a}\, \delta t^{2}
$$

where

$$
\begin{aligned}
\mathbf{a}&=\frac{1}{2}\left(\mathbf{q}_{b_{i} b_{k}}\left(\mathbf{a}^{b_{k}}-\mathbf{b}_{k}^{a}\right)+\mathbf{q}_{b_{i} b_{k+1}}\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right) \\
&=\frac{1}{2}\left(\mathbf{q}_{b_{i} b_{k}}\left(\mathbf{a}^{b_{k}}-\mathbf{b}_{k}^{a}\right)+\mathbf{q}_{b_{i} b_{k}} \otimes\left[\begin{array}{c}1 \\ \frac{1}{2} \boldsymbol{\omega} \delta t\end{array}\right]\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right)
\end{aligned}
$$

and

$$
\boldsymbol{\omega}=\frac{1}{2}\left(\left(\boldsymbol{\omega}^{b_{k}}+\mathbf{n}_{k}^{g}-\mathbf{b}_{k}^{g}\right)+\left(\boldsymbol{\omega}^{b_{k+1}}+\mathbf{n}_{k+1}^{g}-\mathbf{b}_{k}^{g}\right)\right)
$$

So $\mathbf{n}_{k}^{g}$ also enters only through the $\frac{1}{2} \boldsymbol{\omega} \delta t$ term, with weight $\frac{1}{2}$: a perturbation $\delta \mathbf{n}_{k}^{g}$ changes $\boldsymbol{\omega}$ by $\frac{1}{2} \delta \mathbf{n}_{k}^{g}$. Proceeding as before:

$$
\begin{aligned}
\mathbf{g}_{12}&=\frac{\partial \boldsymbol{\alpha}_{b_{i} b_{k+1}}}{\partial \mathbf{n}_{k}^{g}} \\
&=\frac{\partial\, \frac{1}{4} \mathbf{q}_{b_{i} b_{k}} \otimes\left[\begin{array}{c}1 \\ \frac{1}{2} \boldsymbol{\omega} \delta t\end{array}\right] \otimes\left[\begin{array}{c}1 \\ \frac{1}{2}\left(\frac{1}{2} \delta \mathbf{n}_{k}^{g}\right) \delta t\end{array}\right]\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{n}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\, \mathbf{R}_{b_{i} b_{k+1}} \exp \left(\left[\frac{1}{2} \delta \mathbf{n}_{k}^{g} \delta t\right]_{\times}\right)\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{n}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\, \mathbf{R}_{b_{i} b_{k+1}}\left(\mathbf{I}+\left[\frac{1}{2} \delta \mathbf{n}_{k}^{g} \delta t\right]_{\times}\right)\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}}{\partial \delta \mathbf{n}_{k}^{g}} \\
&=\frac{1}{4} \frac{\partial\left(-\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right) \delta t^{2}\right]_{\times}\left(\frac{1}{2} \delta \mathbf{n}_{k}^{g} \delta t\right)\right)}{\partial \delta \mathbf{n}_{k}^{g}} \\
&=-\frac{1}{4}\left(\mathbf{R}_{b_{i} b_{k+1}}\left[\left(\mathbf{a}^{b_{k+1}}-\mathbf{b}_{k}^{a}\right)\right]_{\times} \delta t^{2}\right)\left(\frac{1}{2} \delta t\right)
\end{aligned}
$$
Prove the formula:

$$
\Delta \mathbf{x}_{\mathrm{lm}}=-\sum_{j=1}^{n} \frac{\mathbf{v}_{j}^{\top} \mathbf{F}^{\prime \top}}{\lambda_{j}+\mu} \mathbf{v}_{j}
$$

The proof relies on the spectral decomposition of a normal matrix ($\mathbf{J}^{\top} \mathbf{J}$ is real symmetric, hence normal). The LM equation is:

$$
\left(\mathbf{J}^{\top} \mathbf{J}+\mu \mathbf{I}\right) \Delta \mathbf{x}_{\mathrm{lm}}=-\mathbf{J}^{\top} \mathbf{f}
$$

Write $\mathbf{F}^{\prime \top}=\mathbf{J}^{\top} \mathbf{f}$ and take the eigendecomposition $\mathbf{J}^{\top} \mathbf{J}=\mathbf{V} \mathbf{\Lambda} \mathbf{V}^{\top}$, where the columns $\mathbf{v}_{j}$ of $\mathbf{V}$ are orthonormal eigenvectors and $\mathbf{\Lambda}=\operatorname{diag}(\lambda_{1}, \ldots, \lambda_{n})$. Since $\mathbf{V} \mathbf{V}^{\top}=\mathbf{I}$, the term $\mu \mathbf{I}$ can be written as $\mathbf{V}(\mu \mathbf{I}) \mathbf{V}^{\top}$, which gives:

$$
\left(\mathbf{V} \mathbf{\Lambda} \mathbf{V}^{\top}+\mu \mathbf{I}\right) \Delta \mathbf{x}_{\mathrm{lm}}=-\mathbf{F}^{\prime \top}
$$

$$
\mathbf{V}\left(\mathbf{\Lambda}+\mu \mathbf{I}\right) \mathbf{V}^{\top} \Delta \mathbf{x}_{\mathrm{lm}}=-\mathbf{F}^{\prime \top}
$$

$$
\Delta \mathbf{x}_{\mathrm{lm}}=-\mathbf{V}\left(\mathbf{\Lambda}+\mu \mathbf{I}\right)^{-1} \mathbf{V}^{\top} \mathbf{F}^{\prime \top}
$$

By the spectral decomposition:

$$
\Delta \mathbf{x}_{\mathrm{lm}}=-\sum_{j=1}^{n} \frac{\mathbf{v}_{j} \mathbf{v}_{j}^{\top}}{\lambda_{j}+\mu} \mathbf{F}^{\prime \top}
$$

Each $\mathbf{v}_{j}^{\top} \mathbf{F}^{\prime \top}$ is a scalar, so by associativity of matrix multiplication it can be moved in front of $\mathbf{v}_{j}$, which yields the result directly:

$$
\Delta \mathbf{x}_{\mathrm{lm}}=-\sum_{j=1}^{n} \frac{\mathbf{v}_{j}^{\top} \mathbf{F}^{\prime \top}}{\lambda_{j}+\mu} \mathbf{v}_{j}
$$
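The identity can be checked numerically on a small case by solving the damped system directly and comparing with the spectral sum. The sketch below is restricted to a symmetric 2x2 matrix with a nonzero off-diagonal entry so that eigenpairs have a closed form; `SolveDirect` and `SolveSpectral` are hypothetical names, `A` plays the role of $\mathbf{J}^{\top}\mathbf{J}$ and `g` the role of $\mathbf{F}^{\prime\top}$.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Solve (A + mu*I) dx = -g by Cramer's rule, with A = [[a11, a12], [a12, a22]].
Vec2 SolveDirect(double a11, double a12, double a22, double mu, const Vec2& g) {
    const double m11 = a11 + mu;
    const double m22 = a22 + mu;
    const double det = m11 * m22 - a12 * a12;
    return Vec2{{(-g[0] * m22 + g[1] * a12) / det,
                 (-g[1] * m11 + g[0] * a12) / det}};
}

// Evaluate dx = -sum_j (v_j^T g) / (lambda_j + mu) * v_j.
// Assumes a12 != 0 so that (a12, lambda - a11) is a valid eigenvector.
Vec2 SolveSpectral(double a11, double a12, double a22, double mu, const Vec2& g) {
    const double mean = 0.5 * (a11 + a22);
    const double radius =
        std::sqrt(0.25 * (a11 - a22) * (a11 - a22) + a12 * a12);
    const double lambdas[2] = {mean - radius, mean + radius};
    Vec2 dx{{0.0, 0.0}};
    for (double lambda : lambdas) {
        const double norm = std::hypot(a12, lambda - a11);
        const Vec2 v{{a12 / norm, (lambda - a11) / norm}};  // unit eigenvector
        const double coeff = (v[0] * g[0] + v[1] * g[1]) / (lambda + mu);
        dx[0] -= coeff * v[0];
        dx[1] -= coeff * v[1];
    }
    return dx;
}
```

The spectral form also makes the role of µ visible: directions with small eigenvalues $\lambda_j$ are the ones most strongly damped by adding µ to the denominator.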