Code: https://github.com/TixiaoShan/LIO-SAM
Video: https://www.youtube.com/watch?v=A0H8CoORZJU
We propose a framework for tightly-coupled lidar inertial odometry via smoothing and mapping, LIO-SAM, that achieves highly accurate, real-time mobile robot trajectory estimation and map-building. LIO-SAM formulates lidar-inertial odometry atop a factor graph, allowing a multitude of relative and absolute measurements, including loop closures, to be incorporated from different sources as factors into the system. The estimated motion from inertial measurement unit (IMU) pre-integration de-skews point clouds and produces an initial guess for lidar odometry optimization. The obtained lidar odometry solution is used to estimate the bias of the IMU. To ensure high performance in real-time, we marginalize old lidar scans for pose optimization, rather than matching lidar scans to a global map. Scan-matching at a local scale instead of a global scale significantly improves the real-time performance of the system, as does the selective introduction of keyframes, and an efficient sliding window approach that registers a new keyframe to a fixed-size set of prior “sub-keyframes.” The proposed method is extensively evaluated on datasets gathered from three platforms over various scales and environments.
Note: "smoothing and mapping" here refers to optimization with the GTSAM library, and the "sub-keyframes" are the keyframes held in the sliding window.
State estimation, localization and mapping are fundamental prerequisites for a successful intelligent mobile robot, required for feedback control, obstacle avoidance, and planning, among many other capabilities. Using vision-based and lidar-based sensing, great efforts have been devoted to achieving high-performance real-time simultaneous localization and mapping (SLAM) that can support a mobile robot’s six degree-of-freedom state estimation. Vision-based methods typically use a monocular or stereo camera and triangulate features across successive images to determine the camera motion. Although vision-based methods are especially suitable for place recognition, their sensitivity to initialization, illumination, and range make them unreliable when they alone are used to support an autonomous navigation system. On the other hand, lidar-based methods are largely invariant to illumination change. Especially with the recent availability of long-range, high-resolution 3D lidar, such as the Velodyne VLS-128 and Ouster OS1-128, lidar becomes more suitable to directly capture the fine details of an environment in 3D space. Therefore, this paper focuses on lidar-based state estimation and mapping methods.
Many lidar-based state estimation and mapping methods have been proposed in the last two decades. Among them, the lidar odometry and mapping (LOAM) method proposed in [1] for low-drift and real-time state estimation and mapping is among the most widely used. LOAM, which uses a lidar and an inertial measurement unit (IMU), achieves state-of-the-art performance and has been ranked as the top lidar-based method since its release on the KITTI odometry benchmark site [2]. Despite its success, LOAM presents some limitations: by saving its data in a global voxel map, it is often difficult to perform loop closure detection and incorporate other absolute measurements, e.g., GPS, for pose correction. Its online optimization process becomes less efficient when this voxel map becomes dense in a feature-rich environment. LOAM also suffers from drift in large-scale tests, as it is a scan-matching based method at its core.
In this paper, we propose a framework for tightly-coupled lidar inertial odometry via smoothing and mapping, LIO-SAM, to address the aforementioned problems. We assume a nonlinear motion model for point cloud de-skew, estimating the sensor motion during a lidar scan using raw IMU measurements. In addition to de-skewing point clouds, the estimated motion serves as an initial guess for lidar odometry optimization. The obtained lidar odometry solution is then used to estimate the bias of the IMU in the factor graph. By introducing a global factor graph for robot trajectory estimation, we can efficiently perform sensor fusion using lidar and IMU measurements, incorporate place recognition among robot poses, and introduce absolute measurements, such as GPS positioning and compass heading, when they are available. This collection of factors from various sources is used for joint optimization of the graph. Additionally, we marginalize old lidar scans for pose optimization, rather than matching scans to a global map like LOAM. Scan-matching at a local scale instead of a global scale significantly improves the real-time performance of the system, as does the selective introduction of keyframes, and an efficient sliding window approach that registers a new keyframe to a fixed-size set of prior “sub-keyframes.” The main contributions of our work can be summarized as follows:
• A tightly-coupled lidar inertial odometry framework built atop a factor graph that is suitable for multi-sensor fusion and global optimization.
• An efficient, local sliding window-based scan-matching approach that enables real-time performance by registering selectively chosen new keyframes to a fixed-size set of prior sub-keyframes.
• The proposed framework is extensively validated with tests across various scales, vehicles, and environments.
We first define frames and notation that we use throughout the paper. We denote the world frame as W and the robot body frame as B. We also assume the IMU frame coincides with the robot body frame for convenience. The robot state x can be written as:
$$\mathbf{x}=\left[\mathbf{R}^{T}, \mathbf{p}^{T}, \mathbf{v}^{T}, \mathbf{b}^{T}\right]^{T} \tag{1}$$

Note: $\mathbf{x}$ is 15-dimensional: rotation (3), position (3), velocity (3), and IMU bias (6, gyroscope plus accelerometer).
where $\mathbf{R} \in SO(3)$ is the rotation matrix, $\mathbf{p} \in \mathbb{R}^{3}$ is the position vector, $\mathbf{v}$ is the speed, and $\mathbf{b}$ is the IMU bias. The transformation $\mathbf{T} \in SE(3)$ from $\mathbf{B}$ to $\mathbf{W}$ is represented as $\mathbf{T}=[\mathbf{R} \mid \mathbf{p}]$.
An overview of the proposed system is shown in Figure 1. The system receives sensor data from a 3D lidar, an IMU and optionally a GPS. We seek to estimate the state of the robot and its trajectory using the observations of these sensors. This state estimation problem can be formulated as a maximum a posteriori (MAP) problem. We use a factor graph to model this problem, as it is better suited to perform inference when compared with Bayes nets. With the assumption of a Gaussian noise model, the MAP inference for our problem is equivalent to solving a nonlinear least-squares problem [18]. Note that without loss of generality, the proposed system can also incorporate measurements from other sensors, such as elevation from an altimeter or heading from a compass.
Fig. 1: The system structure of LIO-SAM. The system receives input from a 3D lidar, an IMU and optionally a GPS. Four types of factors are introduced to construct the factor graph: (a) IMU preintegration factor, (b) lidar odometry factor, (c) GPS factor, and (d) loop closure factor. The generation of these factors is discussed in Section III.
We introduce four types of factors along with one variable type for factor graph construction. This variable, representing the robot’s state at a specific time, is attributed to the nodes of the graph. The four types of factors are: (a) IMU preintegration factors, (b) lidar odometry factors, (c) GPS factors, and (d) loop closure factors. A new robot state node x is added to the graph when the change in robot pose exceeds a user-defined threshold. The factor graph is optimized upon the insertion of a new node using incremental smoothing and mapping with the Bayes tree (iSAM2) [19]. The process for generating these factors is described in the following sections.
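For concreteness, here is a minimal sketch of this add-node/optimize loop using GTSAM's Python bindings (the smoothing-and-mapping library the paper builds on). The integer keys, noise sigmas, and the relative pose are illustrative placeholders, not LIO-SAM's actual values.

```python
import numpy as np
import gtsam

isam = gtsam.ISAM2()                      # incremental smoothing with the Bayes tree
graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Prior factor anchoring the first robot state node x0.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-2] * 6))
graph.add(gtsam.PriorFactorPose3(0, gtsam.Pose3(), prior_noise))
initial.insert(0, gtsam.Pose3())

# When the pose change exceeds the threshold, add a new node x1 connected
# to x0 by a relative-motion factor (stand-in for a lidar odometry factor).
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-3] * 6))
delta = gtsam.Pose3(gtsam.Rot3(), np.array([1.0, 0.0, 0.0]))  # placeholder motion
graph.add(gtsam.BetweenFactorPose3(0, 1, delta, odom_noise))
initial.insert(1, delta)

# iSAM2 re-linearizes only the variables affected by the new factors.
isam.update(graph, initial)
estimate = isam.calculateEstimate()
```

GPS and loop closure factors enter the same way: as additional factors on existing pose nodes, followed by another incremental `update`.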
The measurements of angular velocity and acceleration from an IMU are defined using Eqs. 2 and 3:
$$\hat{\boldsymbol{\omega}}_{t}=\boldsymbol{\omega}_{t}+\mathbf{b}_{t}^{\boldsymbol{\omega}}+\mathbf{n}_{t}^{\boldsymbol{\omega}} \tag{2}$$

$$\hat{\mathbf{a}}_{t}=\mathbf{R}_{t}^{\mathbf{BW}}\left(\mathbf{a}_{t}-\mathbf{g}\right)+\mathbf{b}_{t}^{\mathbf{a}}+\mathbf{n}_{t}^{\mathbf{a}} \tag{3}$$
where $\hat{\boldsymbol{\omega}}_t$ and $\hat{\mathbf{a}}_t$ are the raw IMU measurements in $\mathbf{B}$ at time $t$. $\hat{\boldsymbol{\omega}}_t$ and $\hat{\mathbf{a}}_t$ are affected by a slowly varying bias $\mathbf{b}_t$ and white noise $\mathbf{n}_t$. $\mathbf{R}_t^{\mathbf{BW}}$ is the rotation matrix from $\mathbf{W}$ to $\mathbf{B}$. $\mathbf{g}$ is the constant gravity vector in $\mathbf{W}$.
We can now use the measurements from the IMU to infer the motion of the robot. The velocity, position and rotation of the robot at time $t+\Delta t$ can be computed as follows:
$$\mathbf{v}_{t+\Delta t}=\mathbf{v}_{t}+\mathbf{g} \Delta t+\mathbf{R}_{t}\left(\hat{\mathbf{a}}_{t}-\mathbf{b}_{t}^{\mathbf{a}}-\mathbf{n}_{t}^{\mathbf{a}}\right) \Delta t \tag{4}$$

$$\mathbf{p}_{t+\Delta t}=\mathbf{p}_{t}+\mathbf{v}_{t} \Delta t+\frac{1}{2} \mathbf{g} \Delta t^{2}+\frac{1}{2} \mathbf{R}_{t}\left(\hat{\mathbf{a}}_{t}-\mathbf{b}_{t}^{\mathbf{a}}-\mathbf{n}_{t}^{\mathbf{a}}\right) \Delta t^{2} \tag{5}$$

$$\mathbf{R}_{t+\Delta t}=\mathbf{R}_{t} \exp \left(\left(\hat{\boldsymbol{\omega}}_{t}-\mathbf{b}_{t}^{\boldsymbol{\omega}}-\mathbf{n}_{t}^{\boldsymbol{\omega}}\right) \Delta t\right) \tag{6}$$
Note (comparison with VINS): like VINS, LIO-SAM also propagates the state in the world frame. (VINS takes the measured acceleration to be in the same direction as gravity and uses a +, while LIO-SAM assumes the opposite direction and uses a −; the sign does not really matter, it is a matter of convention.) VINS writes the propagation in integral form, but the expressions differ only in notation and have the same meaning as LIO-SAM's formulas; at most, one uses midpoint integration while the other uses Euler integration:
$$\mathbf{p}_{b_{k+1}}^{w}=\mathbf{p}_{b_{k}}^{w}+\mathbf{v}_{b_{k}}^{w} \Delta t_{k}+\iint_{t \in\left[t_{k}, t_{k+1}\right]}\left(\mathbf{R}_{t}^{w}\left(\hat{\mathbf{a}}_{t}-\mathbf{b}_{a_{t}}-\mathbf{n}_{a}\right)-\mathbf{g}^{w}\right) d t^{2}$$

$$\mathbf{v}_{b_{k+1}}^{w}=\mathbf{v}_{b_{k}}^{w}+\int_{t \in\left[t_{k}, t_{k+1}\right]}\left(\mathbf{R}_{t}^{w}\left(\hat{\mathbf{a}}_{t}-\mathbf{b}_{a_{t}}-\mathbf{n}_{a}\right)-\mathbf{g}^{w}\right) d t$$

$$\mathbf{q}_{b_{k+1}}^{w}=\mathbf{q}_{b_{k}}^{w} \otimes \int_{t \in\left[t_{k}, t_{k+1}\right]} \frac{1}{2} \boldsymbol{\Omega}\left(\hat{\boldsymbol{\omega}}_{t}-\mathbf{b}_{w_{t}}-\mathbf{n}_{w}\right) \mathbf{q}_{t}^{b_{k}} d t$$

where

$$\boldsymbol{\Omega}(\boldsymbol{\omega})=\left[\begin{array}{c}\boldsymbol{\omega} \\ 0\end{array}\right]_{R}=\left[\begin{array}{cc}-\lfloor\boldsymbol{\omega}\rfloor_{\times} & \boldsymbol{\omega} \\ -\boldsymbol{\omega}^{T} & 0\end{array}\right], \quad\lfloor\boldsymbol{\omega}\rfloor_{\times}=\left[\begin{array}{ccc}0 & -\omega_{z} & \omega_{y} \\ \omega_{z} & 0 & -\omega_{x} \\ -\omega_{y} & \omega_{x} & 0\end{array}\right]$$

and $[\,\cdot\,]_R$ denotes the right-multiplication matrix of the pure quaternion with imaginary part $\boldsymbol{\omega}$.
where $\mathbf{R}_{t}=\mathbf{R}_{t}^{\mathbf{WB}}=\left(\mathbf{R}_{t}^{\mathbf{BW}}\right)^{T}$. Here we assume that the angular velocity and the acceleration of $\mathbf{B}$ remain constant during the above integration.
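A minimal numpy sketch of this propagation (Eqs. 4-6) under that constant-measurement assumption; the names are illustrative, the noise terms are dropped, and the sign convention for gravity follows the remark in the note above.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]x, so that skew(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Rodrigues formula: exponential map from so(3) to SO(3)."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3) + skew(phi)
    K = skew(phi / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def imu_step(R, p, v, acc_hat, gyro_hat, ba, bg, g, dt):
    """One Euler step of Eqs. 4-6, holding the measurement constant over dt
    (noise dropped; g is the gravity vector expressed in W)."""
    acc_w = R @ (acc_hat - ba)                       # specific force rotated into W
    v_new = v + (g + acc_w) * dt                     # Eq. 4
    p_new = p + v * dt + 0.5 * (g + acc_w) * dt**2   # Eq. 5
    R_new = R @ so3_exp((gyro_hat - bg) * dt)        # Eq. 6
    return R_new, p_new, v_new
```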
We then apply the IMU preintegration method proposed in [20] to obtain the relative body motion between two timesteps. The preintegrated measurements $\Delta \mathbf{v}_{ij}$, $\Delta \mathbf{p}_{ij}$, and $\Delta \mathbf{R}_{ij}$ between time $i$ and time $j$ can be computed using:
$$\Delta \mathbf{v}_{i j}=\mathbf{R}_{i}^{T}\left(\mathbf{v}_{j}-\mathbf{v}_{i}-\mathbf{g} \Delta t_{i j}\right) \tag{7}$$

$$\Delta \mathbf{p}_{i j}=\mathbf{R}_{i}^{T}\left(\mathbf{p}_{j}-\mathbf{p}_{i}-\mathbf{v}_{i} \Delta t_{i j}-\frac{1}{2} \mathbf{g} \Delta t_{i j}^{2}\right) \tag{8}$$

$$\Delta \mathbf{R}_{i j}=\mathbf{R}_{i}^{T} \mathbf{R}_{j} \tag{9}$$
Due to space limitations, we refer the reader to the description from [20] for the detailed derivation of Eqs. 7-9. Besides its efficiency, applying IMU preintegration also naturally gives us one type of constraint for the factor graph: IMU preintegration factors. The IMU bias is jointly optimized alongside the lidar odometry factors in the graph.
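To make Eqs. 7-9 concrete, here is a numpy sketch of the state-side identities they express (noise dropped). A practical implementation, e.g. GTSAM's `PreintegratedImuMeasurements`, instead accumulates these quantities from the raw IMU samples between the two nodes and tracks their Jacobians with respect to the bias.

```python
import numpy as np

def preintegrated_terms(R_i, p_i, v_i, R_j, p_j, v_j, g, dt_ij):
    """Eqs. 7-9: relative body motion between times i and j, expressed in
    the body frame at time i (noise terms dropped)."""
    dR = R_i.T @ R_j                                             # Eq. 9
    dv = R_i.T @ (v_j - v_i - g * dt_ij)                         # Eq. 7
    dp = R_i.T @ (p_j - p_i - v_i * dt_ij - 0.5 * g * dt_ij**2)  # Eq. 8
    return dR, dv, dp
```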
When a new lidar scan arrives, we first perform feature extraction. Edge and planar features are extracted by evaluating the roughness of points over a local region. Points with a large roughness value are classified as edge features. Similarly, a planar feature is categorized by a small roughness value. We denote the extracted edge and planar features from a lidar scan at time $i$ as $\mathrm{F}_{i}^{e}$ and $\mathrm{F}_{i}^{p}$ respectively. All the features extracted at time $i$ compose a lidar frame $\mathbb{F}_{i}$, where $\mathbb{F}_{i}=\{\mathrm{F}_{i}^{e}, \mathrm{F}_{i}^{p}\}$. Note that a lidar frame $\mathbb{F}$ is represented in $\mathbf{B}$. A more detailed description of the feature extraction process can be found in [1], or [7] if a range image is used.
Using every lidar frame for computing and adding factors to the graph is computationally intractable, so we adopt the concept of keyframe selection, which is widely used in the visual SLAM field. Using a simple but effective heuristic approach, we select a lidar frame $\mathbb{F}_{i+1}$ as a keyframe when the change in robot pose exceeds a user-defined threshold when compared with the previous state $\mathbf{x}_{i}$. The newly saved keyframe, $\mathbb{F}_{i+1}$, is associated with a new robot state node, $\mathbf{x}_{i+1}$, in the factor graph. The lidar frames between two keyframes are discarded. Adding keyframes in this way not only achieves a balance between map density and memory consumption but also helps maintain a relatively sparse factor graph, which is suitable for real-time nonlinear optimization. In our work, the position and rotation change thresholds for adding a new keyframe are chosen to be $1\,\mathrm{m}$ and $10^{\circ}$.
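A sketch of this keyframe gate, assuming poses are 4x4 homogeneous matrices; the thresholds are the paper's values.

```python
import numpy as np

POS_THRESH = 1.0               # meters (paper setting)
ROT_THRESH = np.deg2rad(10.0)  # radians (paper setting)

def is_keyframe(T_prev, T_new):
    """Accept a new lidar frame as a keyframe when the pose change
    relative to the last keyframe exceeds either threshold."""
    dT = np.linalg.inv(T_prev) @ T_new
    dpos = np.linalg.norm(dT[:3, 3])
    # Rotation angle recovered from the trace of the relative rotation.
    cos_angle = np.clip((np.trace(dT[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return dpos > POS_THRESH or np.arccos(cos_angle) > ROT_THRESH
```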
Let us assume we wish to add a new state node $\mathbf{x}_{i+1}$ to the factor graph. The lidar keyframe that is associated with this state is $\mathbb{F}_{i+1}$. The generation of a lidar odometry factor is described in the following steps:
(1) Sub-keyframes for voxel map: We implement a sliding window approach to create a point cloud map containing a fixed number of recent lidar scans. Instead of optimizing the transformation between two consecutive lidar scans, we extract the $n$ most recent keyframes, which we call the sub-keyframes, for estimation. The set of sub-keyframes $\{\mathbb{F}_{i-n}, \ldots, \mathbb{F}_{i}\}$ is then transformed into frame $\mathbf{W}$ using the transformations $\{\mathbf{T}_{i-n}, \ldots, \mathbf{T}_{i}\}$ associated with them. The transformed sub-keyframes are merged together into a voxel map $\mathbf{M}_{i}$. Since we extract two types of features in the previous feature extraction step, $\mathbf{M}_{i}$ is composed of two sub-voxel maps, denoted $\mathbf{M}_{i}^{e}$, the edge feature voxel map, and $\mathbf{M}_{i}^{p}$, the planar feature voxel map. The lidar frames and voxel maps are related to each other as follows:
$$\begin{aligned} \mathbf{M}_{i} &=\left\{\mathbf{M}_{i}^{e}, \mathbf{M}_{i}^{p}\right\} \\ \text{where}: \mathbf{M}_{i}^{e} &={}^{\prime}\mathrm{F}_{i}^{e} \cup {}^{\prime}\mathrm{F}_{i-1}^{e} \cup \ldots \cup {}^{\prime}\mathrm{F}_{i-n}^{e} \\ \mathbf{M}_{i}^{p} &={}^{\prime}\mathrm{F}_{i}^{p} \cup {}^{\prime}\mathrm{F}_{i-1}^{p} \cup \ldots \cup {}^{\prime}\mathrm{F}_{i-n}^{p} \end{aligned}$$
Note: each map is the union of the corresponding features from the individual (transformed) sub-keyframes.
${}^{\prime}\mathrm{F}_{i}^{e}$ and ${}^{\prime}\mathrm{F}_{i}^{p}$ are the transformed edge and planar features in $\mathbf{W}$. $\mathbf{M}_{i}^{e}$ and $\mathbf{M}_{i}^{p}$ are then downsampled to eliminate the duplicated features that fall in the same voxel cell. In this paper, $n$ is chosen to be 25. The downsample resolutions for $\mathbf{M}_{i}^{e}$ and $\mathbf{M}_{i}^{p}$ are $0.2\,\mathrm{m}$ and $0.4\,\mathrm{m}$, respectively.
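A sketch of the sub-map assembly with a simple hash-based voxel filter; a real implementation would typically use a library voxel-grid filter (e.g. PCL's), and the helper names here are illustrative.

```python
import numpy as np

def transform_points(T, pts):
    """Apply a 4x4 transform T to an (N, 3) point array (B -> W)."""
    return pts @ T[:3, :3].T + T[:3, 3]

def voxel_downsample(pts, voxel):
    """Keep one point per occupied voxel cell (removes duplicated features)."""
    keys = np.floor(pts / voxel).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return pts[first_idx]

def build_voxel_map(feature_clouds, poses, voxel):
    """Merge the n transformed sub-keyframes into one voxel map."""
    clouds = [transform_points(T, f) for T, f in zip(poses, feature_clouds)]
    return voxel_downsample(np.vstack(clouds), voxel)

# Paper settings: n = 25 sub-keyframes; 0.2 m (edge) and 0.4 m (planar), e.g.:
# M_e = build_voxel_map(edge_features[-25:], keyframe_poses[-25:], voxel=0.2)
```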
(2) Scan-matching: We match a newly obtained lidar frame $\mathbb{F}_{i+1}$, which is also $\{\mathrm{F}_{i+1}^{e}, \mathrm{F}_{i+1}^{p}\}$, to $\mathbf{M}_{i}$ via scan-matching. Various scan-matching methods, such as [3] and [4], can be utilized for this purpose. Here we opt to use the method proposed in [1] due to its computational efficiency and robustness in various challenging environments.
Note: in the implementation, the correspondences are found via a k-d tree nearest-neighbor search.
We first transform $\{\mathrm{F}_{i+1}^{e}, \mathrm{F}_{i+1}^{p}\}$ from $\mathbf{B}$ to $\mathbf{W}$ and obtain $\{{}^{\prime}\mathrm{F}_{i+1}^{e}, {}^{\prime}\mathrm{F}_{i+1}^{p}\}$. This initial transformation is obtained by using the predicted robot motion, $\tilde{\mathbf{T}}_{i+1}$, from the IMU. For each feature in ${}^{\prime}\mathrm{F}_{i+1}^{e}$ or ${}^{\prime}\mathrm{F}_{i+1}^{p}$, we then find its edge or planar correspondence in $\mathbf{M}_{i}^{e}$ or $\mathbf{M}_{i}^{p}$. For the sake of brevity, the detailed procedures for finding these correspondences are omitted here, but are described thoroughly in [1].
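Following the note above, a sketch of the nearest-neighbor step with a k-d tree (scipy); `k` and `max_dist` are illustrative values, and forming the actual edge line or planar patch from the returned neighbors follows [1].

```python
import numpy as np
from scipy.spatial import cKDTree

def find_correspondences(features_w, map_points, k=5, max_dist=1.0):
    """For each transformed feature point, return the indices of its k
    nearest neighbors in the (edge or planar) voxel map."""
    tree = cKDTree(map_points)        # rebuilt whenever the sub-map changes
    dists, idx = tree.query(features_w, k=k)
    valid = dists[:, 0] < max_dist    # drop features with no nearby match
    return idx[valid], features_w[valid]
```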
(3) Relative transformation: The distance between a feature and its edge or planar patch correspondence can be computed using the following equations:
$$\mathbf{d}_{e_{k}}=\frac{\left|\left(\mathbf{p}_{i+1, k}^{e}-\mathbf{p}_{i, u}^{e}\right) \times\left(\mathbf{p}_{i+1, k}^{e}-\mathbf{p}_{i, v}^{e}\right)\right|}{\left|\mathbf{p}_{i, u}^{e}-\mathbf{p}_{i, v}^{e}\right|} \tag{10}$$

$$\mathbf{d}_{p_{k}}=\frac{\left|\left(\mathbf{p}_{i+1, k}^{p}-\mathbf{p}_{i, u}^{p}\right) \cdot\left(\left(\mathbf{p}_{i, u}^{p}-\mathbf{p}_{i, v}^{p}\right) \times\left(\mathbf{p}_{i, u}^{p}-\mathbf{p}_{i, w}^{p}\right)\right)\right|}{\left|\left(\mathbf{p}_{i, u}^{p}-\mathbf{p}_{i, v}^{p}\right) \times\left(\mathbf{p}_{i, u}^{p}-\mathbf{p}_{i, w}^{p}\right)\right|} \tag{11}$$

Note: Eq. 10 is the point-to-line distance and Eq. 11 is the point-to-plane distance.
where $k$, $u$, $v$, and $w$ are the feature indices in their corresponding sets. For an edge feature $\mathbf{p}_{i+1, k}^{e}$ in ${}^{\prime}\mathrm{F}_{i+1}^{e}$, $\mathbf{p}_{i, u}^{e}$ and $\mathbf{p}_{i, v}^{e}$ are the points that form the corresponding edge line in $\mathbf{M}_{i}^{e}$. For a planar feature $\mathbf{p}_{i+1, k}^{p}$ in ${}^{\prime}\mathrm{F}_{i+1}^{p}$, $\mathbf{p}_{i, u}^{p}$, $\mathbf{p}_{i, v}^{p}$, and $\mathbf{p}_{i, w}^{p}$ form the corresponding planar patch in $\mathbf{M}_{i}^{p}$. The Gauss-Newton method is then used to solve for the optimal transformation by minimizing:
$$\min _{\mathbf{T}_{i+1}}\left\{\sum_{\mathbf{p}_{i+1, k}^{e} \in {}^{\prime}\mathrm{F}_{i+1}^{e}} \mathbf{d}_{e_{k}}+\sum_{\mathbf{p}_{i+1, k}^{p} \in {}^{\prime}\mathrm{F}_{i+1}^{p}} \mathbf{d}_{p_{k}}\right\} \tag{12}$$
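The two residuals in Eqs. 10 and 11 are the standard point-to-line and point-to-plane distances; a direct numpy rendering, assuming the correspondence points have already been selected:

```python
import numpy as np

def edge_distance(p, pa, pb):
    """Eq. 10: distance from point p to the line through pa and pb."""
    return np.linalg.norm(np.cross(p - pa, p - pb)) / np.linalg.norm(pa - pb)

def plane_distance(p, pa, pb, pc):
    """Eq. 11: distance from point p to the plane through pa, pb, and pc."""
    n = np.cross(pb - pa, pc - pa)    # plane normal (unnormalized)
    return abs(np.dot(p - pa, n)) / np.linalg.norm(n)
```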
At last, we can obtain the relative transformation $\Delta \mathbf{T}_{i, i+1}$ between $\mathbf{x}_{i}$ and $\mathbf{x}_{i+1}$, which is the lidar odometry factor linking these two poses:
$$\Delta \mathbf{T}_{i, i+1}=\mathbf{T}_{i}^{T} \mathbf{T}_{i+1} \tag{13}$$

Note: this factor constrains the lidar poses of frames $i$ and $i+1$.
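With 4x4 homogeneous matrices, the factor itself is a one-liner (the paper's $\mathbf{T}_{i}^{T}$ notation denotes the pose inverse):

```python
import numpy as np

def lidar_odometry_factor(T_i, T_i1):
    """Eq. 13: relative pose between consecutive keyframe states,
    added to the graph as a between-factor on x_i and x_{i+1}."""
    return np.linalg.inv(T_i) @ T_i1
```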
We note that an alternative approach to obtain $\Delta \mathbf{T}_{i, i+1}$ is to transform the sub-keyframes into the frame of $\mathbf{x}_{i}$. In other words, we match $\mathbb{F}_{i+1}$ to the voxel map that is represented in the frame of $\mathbf{x}_{i}$. In this way, the real relative transformation $\Delta \mathbf{T}_{i, i+1}$ can be directly obtained. Because the transformed features ${}^{\prime}\mathrm{F}_{i}^{e}$ and ${}^{\prime}\mathrm{F}_{i}^{p}$ can be reused multiple times, we instead opt to use the approach described in Sec. III-C.1 for its computational efficiency.
Note: why project into the world frame instead of into the frame of $\mathbf{x}_{i}$? Suppose we want to relate frame 12 to frame 10: with relative projections we would have to chain frame 10 into frame 11 and then into frame 12, whereas with everything in the world frame each frame needs only a single transformation, performed once and then reused.
Though we can obtain reliable state estimation and mapping by utilizing only IMU preintegration and lidar odometry factors, the system still suffers from drift during long-duration navigation tasks. To solve this problem, we can introduce sensors that offer absolute measurements for eliminating drift. Such sensors include an altimeter, compass, and GPS. For the purposes of illustration here, we discuss GPS, as it is widely used in real-world navigation systems.
When we receive GPS measurements, we first transform them to the local Cartesian coordinate frame using the method proposed in [21]. Upon the addition of a new node to the factor graph, we then associate a new GPS factor with this node. If the GPS signal is not hardware-synchronized with the lidar frame, we interpolate among GPS measurements linearly based on the timestamp of the lidar frame.
Note: this conversion is needed because raw GPS measurements are latitude/longitude, which cannot be used directly.
We note that adding GPS factors constantly when GPS reception is available is not necessary because the drift of lidar inertial odometry grows very slowly. In practice, we only add a GPS factor when the estimated position covariance is larger than the received GPS position covariance.
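A sketch of both details, assuming the GPS fixes have already been converted to local Cartesian coordinates with timestamps; the function names and the trace-based covariance comparison are illustrative assumptions rather than the exact implementation.

```python
import numpy as np

def interpolate_gps(t_lidar, t0, xyz0, t1, xyz1):
    """Linearly interpolate two GPS fixes to the lidar frame timestamp."""
    alpha = (t_lidar - t0) / (t1 - t0)
    return (1.0 - alpha) * np.asarray(xyz0) + alpha * np.asarray(xyz1)

def should_add_gps_factor(est_pos_cov, gps_pos_cov):
    """Add a GPS factor only when the estimate is less certain than GPS."""
    return np.trace(est_pos_cov) > np.trace(gps_pos_cov)
```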
Thanks to the utilization of a factor graph, loop closures can also be seamlessly incorporated into the proposed system, as opposed to LOAM and LIOM. For the purposes of illustration, we describe and implement a naive but effective Euclidean distance-based loop closure detection approach. We also note that our proposed framework is compatible with other methods for loop closure detection, for example, [22] and [23], which generate a point cloud descriptor and use it for place recognition.
Note: the intuition is that if two frames are close in space but far apart in time, a loop closure can be attempted.
When a new state $\mathbf{x}_{i+1}$ is added to the factor graph, we first search the graph and find the prior states that are close to $\mathbf{x}_{i+1}$ in Euclidean space. As is shown in Fig. 1, for example, $\mathbf{x}_{3}$ is one of the returned candidates. We then try to match $\mathbb{F}_{i+1}$ to the sub-keyframes $\{\mathbb{F}_{3-m}, \ldots, \mathbb{F}_{3}, \ldots, \mathbb{F}_{3+m}\}$ using scan-matching. Note that $\mathbb{F}_{i+1}$ and the past sub-keyframes are first transformed into $\mathbf{W}$ before scan-matching. We obtain the relative transformation $\Delta \mathbf{T}_{3, i+1}$ and add it as a loop closure factor to the graph. Throughout this paper, we choose the index $m$ to be 12, and the search distance for loop closures is set to be $15\,\mathrm{m}$ from a new state $\mathbf{x}_{i+1}$.
Note: in the implementation, this matching is performed with ICP; with $m = 12$, the sub-map used for registration consists of 12 keyframes on each side of the candidate, i.e., 25 keyframes in total.
Note: scan-to-scan registration is necessarily less accurate than scan-to-map registration; see the LOAM paper for details.
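A sketch of the Euclidean candidate search; the `MIN_INDEX_GAP` guard is an illustrative stand-in for the "far apart in time" condition in the note above, and the subsequent ICP registration of the new keyframe against the candidate's 25-keyframe sub-map (both in $\mathbf{W}$) is omitted.

```python
import numpy as np
from scipy.spatial import cKDTree

SEARCH_RADIUS = 15.0   # meters (paper setting)
MIN_INDEX_GAP = 25     # illustrative: skip recent keyframes

def find_loop_candidate(keyframe_positions, new_idx):
    """Return the index of an old keyframe near the new one, or None."""
    n_old = max(new_idx - MIN_INDEX_GAP, 0)
    if n_old == 0:
        return None
    old = np.asarray(keyframe_positions[:n_old])
    dist, idx = cKDTree(old).query(keyframe_positions[new_idx])
    return int(idx) if dist < SEARCH_RADIUS else None
```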
In practice, we find adding loop closure factors is especially useful for correcting the drift in a robot’s altitude, when GPS is the only absolute sensor available. This is because the elevation measurement from GPS is very inaccurate, giving rise to altitude errors approaching $100\,\mathrm{m}$ in our tests in the absence of loop closures.
We have proposed LIO-SAM, a framework for tightly-coupled lidar inertial odometry via smoothing and mapping, for performing real-time state estimation and mapping in complex environments. By formulating lidar-inertial odometry atop a factor graph, LIO-SAM is especially suitable for multi-sensor fusion. Additional sensor measurements can easily be incorporated into the framework as new factors. Sensors that provide absolute measurements, such as a GPS, compass, or altimeter, can be used to eliminate the drift of lidar inertial odometry that accumulates over long durations, or in feature-poor environments. Place recognition can also be easily incorporated into the system. To improve the real-time performance of the system, we propose a sliding window approach that marginalizes old lidar frames for scan-matching. Keyframes are selectively added to the factor graph, and new keyframes are registered only to a fixed-size set of sub-keyframes when both lidar odometry and loop closure factors are generated. This scan-matching at a local scale rather than a global scale facilitates the real-time performance of the LIO-SAM framework. The proposed method is thoroughly evaluated on datasets gathered on three platforms across a variety of environments. The results show that LIO-SAM can achieve similar or better accuracy when compared with LOAM and LIOM. Future work involves testing the proposed system on unmanned aerial vehicles.