【泡泡机器人原创专栏】DBoW3 视觉词袋模型、视觉字典和图像数据库分析
浅谈回环检测中的词袋模型(bag of words)
开源词袋模型DBow3原理&源码(一)整体结构
class KeyFrame
{
public:
KeyFrame(Frame &F, Map* pMap, KeyFrameDatabase* pKFDB);
// Pose functions
void SetPose(const cv::Mat &Tcw);
cv::Mat GetPose();
cv::Mat GetPoseInverse();
cv::Mat GetCameraCenter();
cv::Mat GetStereoCenter();
cv::Mat GetRotation();
cv::Mat GetTranslation();
// Bag of Words Representation
//计算此关键帧的mBowVec,mFeatVec
void ComputeBoW();
// Covisibility graph functions
/**在共视图Covisibility graph中添加边,并调用UpdateBestCovisibles()更新essential graph
* @param pKF 具有共视关系的其他关键帧
* @param weight 和pKF共视的mappoint数量
*/
void AddConnection(KeyFrame* pKF, const int &weight);
void EraseConnection(KeyFrame* pKF);
//更新共视图Covisibility graph,essential graph和spanningtree,以及共视关系
void UpdateConnections();
//更新 mvpOrderedConnectedKeyFrames,mvOrderedWeights,也就是共视图covisibility
void UpdateBestCovisibles();
//返回此关键帧有共视关系的节点
std::set<KeyFrame *> GetConnectedKeyFrames();
//返回Covisibility graph中与此节点连接的节点(即关键帧)
std::vector<KeyFrame* > GetVectorCovisibleKeyFrames();
//返回Covisibility graph中与此节点连接的权值前N的节点
std::vector<KeyFrame*> GetBestCovisibilityKeyFrames(const int &N);
std::vector<KeyFrame*> GetCovisiblesByWeight(const int &w);
int GetWeight(KeyFrame* pKF);
// Spanning tree functions
//Spanning tree的节点为关键帧,共视程度最高的那个关键帧设置为节点在Spanning Tree中的父节点
void AddChild(KeyFrame* pKF);
void EraseChild(KeyFrame* pKF);
void ChangeParent(KeyFrame* pKF);
std::set<KeyFrame*> GetChilds();
KeyFrame* GetParent();
bool hasChild(KeyFrame* pKF);
// Loop Edges
//添加一个闭环检测帧
void AddLoopEdge(KeyFrame* pKF);
std::set<KeyFrame*> GetLoopEdges();
// MapPoint observation functions
/**向关键帧添加mappoint,让keyframe知道自己可以看到哪些mappoint
* @param pMP 添加的mappoint
* @param idx mappoint在此帧对应的特征点的序号
*/
void AddMapPoint(MapPoint* pMP, const size_t &idx);
void EraseMapPointMatch(const size_t &idx);
void EraseMapPointMatch(MapPoint* pMP);
void ReplaceMapPointMatch(const size_t &idx, MapPoint* pMP);
std::set<MapPoint*> GetMapPoints();
//外部接口,大小是mvKeys大小,表示mappoint和此帧特征点的联系。如果没有联系则为NULL
std::vector<MapPoint*> GetMapPointMatches();
//返回此keyframe可以看到的mappoint,minObs表示返回的mappoint能被共视的最小值
int TrackedMapPoints(const int &minObs);
MapPoint* GetMapPoint(const size_t &idx);
// KeyPoint functions
/**参考Frame的GetFeaturesInArea
* 找到在 以x, y为中心,边长为2r的方形内的特征点
* @param x 图像坐标u
* @param y 图像坐标v
* @param r 边长
* @return 满足条件的特征点的序号
*/
std::vector<size_t> GetFeaturesInArea(const float &x, const float &y, const float &r) const;
cv::Mat UnprojectStereo(int i);
// Image
bool IsInImage(const float &x, const float &y) const;
// Enable/Disable bad flag changes
// Avoid that a keyframe can be erased while it is being process by loop closing
void SetNotErase();
void SetErase();
// Set/check bad flag
void SetBadFlag();
bool isBad();
// Compute Scene Depth (q=2 median). Used in monocular.
//返回mappoint集合在此帧的深度的中位数
float ComputeSceneMedianDepth(const int q);
static bool weightComp( int a, int b){
return a>b;
}
static bool lId(KeyFrame* pKF1, KeyFrame* pKF2){
return pKF1->mnId<pKF2->mnId;
}
// The following variables are accesed from only 1 thread or never change (no mutex needed).
public:
//下一个keyframe的序号
static long unsigned int nNextId;
//当前keyframe的序号
long unsigned int mnId;
//生成这个keyframe的Frame id
const long unsigned int mnFrameId;
const double mTimeStamp;
// Grid (to speed up feature matching)
const int mnGridCols;
const int mnGridRows;
const float mfGridElementWidthInv;
const float mfGridElementHeightInv;
// Variables used by the tracking
//防止重复添加局部地图关键帧
long unsigned int mnTrackReferenceForFrame;
//在LocalMapping::SearchInNeighbors()中,标记此keyframe将要融合(fuse)的keyframe ID
long unsigned int mnFuseTargetForKF;
// Variables used by the local mapping
//local mapping的local BA在使用此关键帧的标记
long unsigned int mnBALocalForKF;
long unsigned int mnBAFixedForKF;
// Variables used by the keyframe database
//此为frame ID,标记是哪个keyframe查询过和此keyframe有相同的单词,且在essentialgraph中连接
long unsigned int mnLoopQuery;
//mnLoopQuery指向的Frame和此keyframe有多少共同的单词
int mnLoopWords;
//此为通过dbow计算的mnLoopQuery指向的Frame与此keyframe之间相似度的得分
float mLoopScore;
//此为frame ID,标记是哪个frame查询过和此keyframe有相同的单词
long unsigned int mnRelocQuery;
//mnRelocQuery指向的Frame和此keyframe有多少共同的单词
int mnRelocWords;
//当mnRelocWords大于阈值值时,就会被计算此值。
//此为通过dbow计算的mnRelocQuery指向的Frame与此keyframe之间相似度的得分
float mRelocScore;
// Variables used by loop closing
//global BA的结果
cv::Mat mTcwGBA;
cv::Mat mTcwBefGBA;
long unsigned int mnBAGlobalForKF;
// Calibration parameters
const float fx, fy, cx, cy, invfx, invfy, mbf, mb, mThDepth;
// Number of KeyPoints
const int N;
// KeyPoints, stereo coordinate and descriptors (all associated by an index)
const std::vector<cv::KeyPoint> mvKeys;
//
const std::vector<cv::KeyPoint> mvKeysUn;
const std::vector<float> mvuRight; // negative value for monocular points
const std::vector<float> mvDepth; // negative value for monocular points
const cv::Mat mDescriptors;
//BoW
//mBowVec本质是一个map
//对于某幅图像A,它的特征点可以对应多个单词,组成它的bow
DBoW2::BowVector mBowVec;
//mFeatVec是一个std::map >
//将此帧的特征点分配到mpORBVocabulary树各个结点,从而得到mFeatVec
//mFeatVec->first代表结点ID
//mFeatVec->second代表在mFeatVec->first结点的特征点序号的vector集合
DBoW2::FeatureVector mFeatVec;
// Pose relative to parent (this is computed when bad flag is activated)
cv::Mat mTcp;
// Scale
const int mnScaleLevels;
const float mfScaleFactor;
const float mfLogScaleFactor;
const std::vector<float> mvScaleFactors;
const std::vector<float> mvLevelSigma2;
const std::vector<float> mvInvLevelSigma2;
// Image bounds and calibration
const int mnMinX;
const int mnMinY;
const int mnMaxX;
const int mnMaxY;
const cv::Mat mK;
// The following variables need to be accessed trough a mutex to be thread safe.
protected:
// SE3 Pose and camera center
cv::Mat Tcw;
cv::Mat Twc;
//关键帧光心在世界坐标系中的坐标
cv::Mat Ow;
cv::Mat Cw; // Stereo middel point. Only for visualization
// MapPoints associated to keypoints
//此keyframe可以看到哪些mappoint
//大小是mvKeys大小,表示mappoint和此帧特征点的联系。如果没有联系则为NULL
std::vector<MapPoint*> mvpMapPoints;
// BoW
KeyFrameDatabase* mpKeyFrameDB;
ORBVocabulary* mpORBvocabulary;
// Grid over the image to speed up feature matching
//储存这各个窗格的特征点在mvKeysUn中的序号
std::vector< std::vector <std::vector<size_t> > > mGrid;
//与此关键帧其他关键帧的共视关系及其mappoint共视数量
std::map<KeyFrame*,int> mConnectedKeyFrameWeights;
//mvpOrderedConnectedKeyFrames和mvOrderedWeights共同组成了论文里的covisibility graph
//与此关键帧具有连接关系的关键帧,其顺序按照共视的mappoint数量递减排序
std::vector<KeyFrame*> mvpOrderedConnectedKeyFrames;
//mvpOrderedConnectedKeyFrames中共视的mappoint数量,也就是共视图covisibility graph权重,递减排列
std::vector<int> mvOrderedWeights;
// Spanning Tree and Loop Edges
//此keyframe
bool mbFirstConnection;
//此keyframe在Spanning Tree中的父节点
KeyFrame* mpParent;
//此keyframe在Spanning Tree中的子节点集合
std::set<KeyFrame*> mspChildrens;
std::set<KeyFrame*> mspLoopEdges;
// Bad flags
// Avoid that a keyframe can be erased while it is being process by loop closing
bool mbNotErase;
bool mbToBeErased;
bool mbBad;
float mHalfBaseline; // Only for visualization
Map* mpMap;
std::mutex mMutexPose;
std::mutex mMutexConnections;
std::mutex mMutexFeatures;
};
更新共视图Covisibility graph,essential graph和spanningtree,以及共视关系
void KeyFrame::UpdateConnections()
{
//和此关键帧有共视关系的关键帧及其可以共视的mappoint数量
map<KeyFrame*,int> KFcounter;
//此关键帧可以看到的mappoint
vector<MapPoint*> vpMP;
{
unique_lock<mutex> lockMPs(mMutexFeatures);
vpMP = mvpMapPoints;
}
//For all map points in keyframe check in which other keyframes are they seen
//Increase counter for those keyframes
// 计算每一个关键帧都有多少其他关键帧与它存在共视关系,结果放在KFcounter
//遍历此关键帧可以看到的mappoint
for(vector<MapPoint*>::iterator vit=vpMP.begin(), vend=vpMP.end(); vit!=vend; vit++)
{
MapPoint* pMP = *vit;
if(!pMP)
continue;
if(pMP->isBad())
continue;
//此mappoint可以被看到的所有关键帧
map<KeyFrame*,size_t> observations = pMP->GetObservations();
//遍历此mappoint可以被看到的所有关键帧
for(map<KeyFrame*,size_t>::iterator mit=observations.begin(), mend=observations.end(); mit!=mend; mit++)
{
if(mit->first->mnId==mnId)
continue;
KFcounter[mit->first]++;
}
}
// This should not happen
//没有其他关键帧和此关键帧有关系
if(KFcounter.empty())
return;
//If the counter is greater than threshold add connection
//In case no keyframe counter is over threshold add the one with maximum counter
//记录和其他关键帧共视mappoint数量最多的关键帧和数量
int nmax=0;
KeyFrame* pKFmax=NULL;
//判断两关键帧是否共生成covisibility graph的一条边
int th = 15;
//当共视mappoint点数量达到一定阈值th的情况下,在covisibility graph添加一条边
vector<pair<int,KeyFrame*> > vPairs;
vPairs.reserve(KFcounter.size());
for(map<KeyFrame*,int>::iterator mit=KFcounter.begin(), mend=KFcounter.end(); mit!=mend; mit++)
{
if(mit->second>nmax)
{
nmax=mit->second;
pKFmax=mit->first;
}
if(mit->second>=th)
{
vPairs.push_back(make_pair(mit->second,mit->first));
(mit->first)->AddConnection(this,mit->second);
}
}
if(vPairs.empty())
{
//如果共视数量达不到th阈值,那么就把最大的pKFmax添加进去
vPairs.push_back(make_pair(nmax,pKFmax));
pKFmax->AddConnection(this,nmax);
}
//根据视图covisibility graph权重由小到大排序
sort(vPairs.begin(),vPairs.end());
list<KeyFrame*> lKFs;
list<int> lWs;
//把权重大的放在前面,这样
for(size_t i=0; i<vPairs.size();i++)
{
lKFs.push_front(vPairs[i].second);
lWs.push_front(vPairs[i].first);
}
{
unique_lock<mutex> lockCon(mMutexConnections);
// mspConnectedKeyFrames = spConnectedKeyFrames;
// 更新共视图covisibility graph的连接
mConnectedKeyFrameWeights = KFcounter;
//更新covisibility graph连接
//mvpOrderedConnectedKeyFrames权重由大到小排列
mvpOrderedConnectedKeyFrames = vector<KeyFrame*>(lKFs.begin(),lKFs.end());
mvOrderedWeights = vector<int>(lWs.begin(), lWs.end());
//如果不是第一个关键帧,且此节点没有父节点
if(mbFirstConnection && mnId!=0)
{
//共视程度最高的那个关键帧设置为此节点在Spanning Tree中的父节点
mpParent = mvpOrderedConnectedKeyFrames.front();
//共视程度最高的那个关键帧在在Spanning Tree中的子节点设置为此关键帧
mpParent->AddChild(this);
mbFirstConnection = false;
}
}
}