ORB-SLAM 学习

ORB-SLAM 学习笔记

1. 系统框架

(A) Tracking, Local Mapping and Loop Closing 线程

  • The tracking is in charge of localizing the camera with every frame and deciding when to insert a new keyframe. For each two successive frames, the optimisation is realised via Motion Only Bundle Adjustment. Furthermore, a local visible map is retrieved using the covisibility graph of keyframes that is maintained by the system.
  • Local Mapping 线程处理keyframes以及使用local BA 来优化相机pose,重要的是该线程中使用 an exigent point culling policy 来保证map points的质量。
  • Loop Closing 线程在每一keyframe来的时候检测有无loop closure的出现。

(B) Map Points, KeyFrames and their Selection

  • Map Points: 每个地图点 p i p_{i} pi 都存着以下信息。

    1. 其在世界坐标系中的坐标 X w , i \mathbf{X}_{w, i} Xw,i
    2. The viewing direction n i \mathbf{n}_{i} ni, which is the mean unit vector of all its viewing directions
    3. 具有代表性 的OBR Descriptor, which is the associated ORB descriptor whose hamming distance is minimum with respect to all other associated descriptors in the keyframes in which the point is observed
    4. 根据ORB feature 的 scale invariance 存储着该特征点的最大和最小可观测到的距离 d max ⁡ d_{\max } dmax, d min ⁡ d_{\min } dmin
  • Keyframe: 每个关键帧 K i \mathbf{K}_i Ki 都存着以下信息。

    1. 相对于世界坐标系的本关键帧的 transformation matrix T i w \mathbf{T}_{i w} Tiw
    2. 相机 intrinsic matrix
    3. 该关键帧检测到的所有ORB Feature Points 包含 associated or not to a map point

地图点和关键帧的选择和建立都是非常 generous 的,然后会有一个 exigent point culling policy 来剔除多余的关键帧,这保证了系统建图的 robustness。

(C) Covisibility Graph and Essential Graph

Covisibility Graph 被表示成一个 Undirected Weighted Graph,其中每一个 node 表示一个 keyframe, 每条edge表示了两个keyframes有至少15个共同可观测到的map point, edge 的 weight θ \theta θ 为共同可观测到的点的数量。

相比于 covisibility graph, essential graph 有同样的关键帧,但是较少的edge,这是因为如果在loop closure 里使用covisibility graph的话,整个地图还是太dense了,优化起来计算量比较大。地图系统不断的扩展关键帧的数量,生成covisibility graph, 同时地,系统也在生成essential graph,当一个新关键帧生成的时候,这个关键帧会被链接到和他共有最多共同观测点当关键帧上。 The Essential Graph 包含了 spanning tree, 还有covisibility graph edges的高共视度(with high covisibility θ m i n = 100 \theta_{min} = 100 θmin=100)的子集,还有loop closure 的edge,保证了相机poses形成了一个强健的网络。

下图是ORBSLAM论文1里面关于 Covisibility Graph and Essential Graph 的描述图。

ORB-SLAM 学习_第1张图片

(D) Bags of Words Place Recognition

The vocabulary is created offline with the ORB descriptors extracted from a large set of images.2


2. 自动地图初始化

3. Tracking 线程

Tracking 线程是ORBSLAM系统的主线程,我们从系统的启动来分析。

首先,系统通过System.h文件constructor初始化,且Tracking.h包含的constructor也一并初始化。

接下来,对于每一张输入的图片,执行以下函数,其中包含对于当前状态的判断,需不需要reset,是不是localisation only的模式,然后跳到Tracking.h文件里面。

cv::Mat System::TrackMonocular(const cv::Mat &im, const double &timestamp)
{
    if(mSensor!=MONOCULAR)
    {
        cerr << "ERROR: you called TrackMonocular but input sensor was not set to Monocular." << endl;
        exit(-1);
    }

    // Check mode change
    {
        unique_lock<mutex> lock(mMutexMode);
        if(mbActivateLocalizationMode)
        {
            mpLocalMapper->RequestStop();

            // Wait until Local Mapping has effectively stopped
            while(!mpLocalMapper->isStopped())
            {
                //usleep(1000);
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }

            mpTracker->InformOnlyTracking(true);// 定位时,只跟踪
            mbActivateLocalizationMode = false;// 防止重复执行
        }
        if(mbDeactivateLocalizationMode)
        {
            mpTracker->InformOnlyTracking(false);
            mpLocalMapper->Release();
            mbDeactivateLocalizationMode = false;// 防止重复执行
        }
    }

    // Check reset
    {
        unique_lock<mutex> lock(mMutexReset);
        if(mbReset)
        {
            mpTracker->Reset();
            mbReset = false;
        }
    }

    return mpTracker->GrabImageMonocular(im,timestamp);
}

接下来进入,mpTracker->GrabImageMonocular(im,timestamp); 在这个函数里面,注意mpTracker指的是我们之前初始化过的Tracking类。

cv::Mat Tracking::GrabImageMonocular(const cv::Mat &im, const double &timestamp)
{
    mImGray = im;

    // 步骤1:将RGB或RGBA图像转为灰度图像
    if(mImGray.channels()==3)
    {
        if(mbRGB)
            cvtColor(mImGray,mImGray,CV_RGB2GRAY);
        else
            cvtColor(mImGray,mImGray,CV_BGR2GRAY);
    }
    else if(mImGray.channels()==4)
    {
        if(mbRGB)
            cvtColor(mImGray,mImGray,CV_RGBA2GRAY);
        else
            cvtColor(mImGray,mImGray,CV_BGRA2GRAY);
    }

    // 步骤2:构造Frame
    if(mState==NOT_INITIALIZED || mState==NO_IMAGES_YET)// 没有成功初始化的前一个状态就是NO_IMAGES_YET
        mCurrentFrame = Frame(mImGray,timestamp,mpIniORBextractor,mpORBVocabulary,mK,mDistCoef,mbf,mThDepth);
    else
        mCurrentFrame = Frame(mImGray,timestamp,mpORBextractorLeft,mpORBVocabulary,mK,mDistCoef,mbf,mThDepth);

    // 步骤3:跟踪
    Track();

    return mCurrentFrame.mTcw.clone();
}

在这个函数里,我们主要做了以下几件事情:

  1. 将RGB图像转换成灰度图。
  2. 构造此张图像对应的Frame。
  3. Track();

那么我们先来看一下ORBSLAM是如何对每张输入图片构建Frame的。 **注意:**这里氛围初始化还是不是初始化的情况,初始化的时候mpIniORBextractor使用了两倍的特征点数量,代码如下,在Tracking.h的constructor定义。

    // tracking过程都会用到mpORBextractorLeft作为特征点提取器
    mpORBextractorLeft = new ORBextractor(nFeatures,fScaleFactor,nLevels,fIniThFAST,fMinThFAST);

    // 在单目初始化的时候,会用mpIniORBextractor来作为特征点提取器
    if(sensor==System::MONOCULAR)
        mpIniORBextractor = new ORBextractor(2*nFeatures,fScaleFactor,nLevels,fIniThFAST,fMinThFAST);

mK,mDistCoef 分别是intrinsic matrix 和 distortion 矫正参数。
mbf,mThDepth这两个参数在Monocular里面是用不到的,但是还是传递了进去。

回到上面的第2点,构造此图像对应的Frame。Frame的构造函数,也就是代码里的步骤二展示的部分,展开可以看到:

// 单目初始化
Frame::Frame(const cv::Mat &imGray, const double &timeStamp, ORBextractor* extractor,ORBVocabulary* voc, cv::Mat &K, cv::Mat &distCoef, const float &bf, const float &thDepth)
    :mpORBvocabulary(voc),mpORBextractorLeft(extractor),mpORBextractorRight(static_cast<ORBextractor*>(NULL)),
     mTimeStamp(timeStamp), mK(K.clone()),mDistCoef(distCoef.clone()), mbf(bf), mThDepth(thDepth)
{
    // Frame ID
    mnId=nNextId++;

    // Scale Level Info
    mnScaleLevels = mpORBextractorLeft->GetLevels();
    mfScaleFactor = mpORBextractorLeft->GetScaleFactor();
    mfLogScaleFactor = log(mfScaleFactor);
    mvScaleFactors = mpORBextractorLeft->GetScaleFactors();
    mvInvScaleFactors = mpORBextractorLeft->GetInverseScaleFactors();
    mvLevelSigma2 = mpORBextractorLeft->GetScaleSigmaSquares();
    mvInvLevelSigma2 = mpORBextractorLeft->GetInverseScaleSigmaSquares();

    // ORB extraction
    ExtractORB(0,imGray);

    N = mvKeys.size();

    if(mvKeys.empty())
        return;

    // 调用OpenCV的矫正函数矫正orb提取的特征点
    UndistortKeyPoints();

    // Set no stereo information
    mvuRight = vector<float>(N,-1);
    mvDepth = vector<float>(N,-1);

    mvpMapPoints = vector<MapPoint*>(N,static_cast<MapPoint*>(NULL));
    mvbOutlier = vector<bool>(N,false);

    // This is done only for the first Frame (or after a change in the calibration)
    if(mbInitialComputations)
    {
        ComputeImageBounds(imGray);

        mfGridElementWidthInv=static_cast<float>(FRAME_GRID_COLS)/static_cast<float>(mnMaxX-mnMinX);
        mfGridElementHeightInv=static_cast<float>(FRAME_GRID_ROWS)/static_cast<float>(mnMaxY-mnMinY);

        fx = K.at<float>(0,0);
        fy = K.at<float>(1,1);
        cx = K.at<float>(0,2);
        cy = K.at<float>(1,2);
        invfx = 1.0f/fx;
        invfy = 1.0f/fy;

        mbInitialComputations=false;
    }

    mb = mbf/fx;

    AssignFeaturesToGrid();
}

首先设定关于ORB feature的参数,然后提取特征点:

    // Extract ORB on the image. 0 for left image and 1 for right image.
    // 提取的关键点存放在mvKeys和mDescriptors中
    // ORB是直接调orbExtractor提取的
    void ExtractORB(int flag, const cv::Mat &im);

构建Frame这一步最重要的就是提取ORB feature了,然后我们就可以进行到第3步,Track();

4. Local Mapping 线程

5. Loop Closing 线程


  1. ORB-SLAM: a Versatile and Accurate Monocular SLAM System ↩︎

  2. Bags of Binary Words for Fast Place Recognition in Image Sequences ↩︎

你可能感兴趣的:(SLAM)