深蓝学院-视觉SLAM课程-第5讲笔记

0. 内容

前面四讲都是基础内容,现在开始讲解一些具体的SLAM的内容。
视觉里程计有两种方法:特征点法和直接法,本次讲特征点法。

1. 特征点

深蓝学院-视觉SLAM课程-第5讲笔记_第1张图片

深蓝学院-视觉SLAM课程-第5讲笔记_第2张图片
实际中的描述子用二进制的比较多,好计算且内存小。

1. FAST关键点

特征点相关:以ORB为例
连续n(如取9个)个点的像素的亮度逗比改点像素的亮度大/小一个阈值(如20),这时认为是关键点,否则认为不是,则这种特征点就叫FAST-9,还有FAST-11用的也较多。
FAST的优势是提取的速度很快,缺点是不是很精确,容易扎堆,一般提取完之后要进行以下非最大值抑制,仅保留响应极大值的角点。
深蓝学院-视觉SLAM课程-第5讲笔记_第3张图片

2. BRIEF描述子


随机选择和按照一定的规则选择结果不会相差很大。

3. 特征匹配

暴力匹配、只匹配某些区域内的点、一些加速算法,如快速最近邻

2. 2D-2D对极几何

深蓝学院-视觉SLAM课程-第5讲笔记_第4张图片
所有的算法当中,2D-2D是最难的,因为知道的信息最少。
深蓝学院-视觉SLAM课程-第5讲笔记_第5张图片
注意用的旋转矩阵的方向,从哪个到哪个。

深蓝学院-视觉SLAM课程-第5讲笔记_第6张图片
2*2有4个可能的解,但是深度只能为正,所以最后只有一个正确的解。

深蓝学院-视觉SLAM课程-第5讲笔记_第7张图片
对八点法进行了一些讨论:
目的是对单目进行init,
确定尺度,纯旋转无法init问题,
尺度不确定性问题(t只代表方向,尺度t变大或者缩小时,观测点的相对位置不变),还可能出现尺度漂移,
以及估计的R,t不对时,使用RANSAC检测,重新估计(RANSAC思路:200对匹配点中取8对计算R,t,用剩下的192对去验证,如果大部分,如160对都符合,则R,t可能对,如果只有很少的符合(可能8对点中有误匹配的外点),如20对,则可能错,重新选8对,再计算R,t,总会找到合适的8对点)

还有一种情况:
当场景中的特征点都落在同一个平面上,那么两张图中的位置之间存在直接的线性关系,这个关系可以用矩阵来描述,这个矩阵就是单应矩阵(Homography),常用于无人机这种相机看到的场景提取出来的特征点都在同一平面。
深蓝学院-视觉SLAM课程-第5讲笔记_第8张图片
通过匹配出来的点算出H(据说是4组点)->R,t
实际中E和H这两种方式都会尝试一下,看哪种好用就用哪种。

深蓝学院-视觉SLAM课程-第5讲笔记_第9张图片

3. 3D-2D PnP

1. DLT(直接线性变换法)

当已知空间当中的投影点和空间中的深度信息时,用PnP方法求相机外参。
深蓝学院-视觉SLAM课程-第5讲笔记_第10张图片

深蓝学院-视觉SLAM课程-第5讲笔记_第11张图片
这里和算E,F,H矩阵挺像的,
方程组12个未知数,一对点提供两个方程,需要12/2=6对点。
转化为求解线性方程组。

3. P3P法

书上写了。
EPnP,UPnP有兴趣看,没兴趣会调用,知道是啥即可。

2. 使用最小化重投影误差的方法来求解PnP(BA方法)

深蓝学院-视觉SLAM课程-第5讲笔记_第12张图片
这部分数学很多。
深蓝学院-视觉SLAM课程-第5讲笔记_第13张图片

4. 3D-3D ICP

深蓝学院-视觉SLAM课程-第5讲笔记_第14张图片

深蓝学院-视觉SLAM课程-第5讲笔记_第15张图片

深蓝学院-视觉SLAM课程-第5讲笔记_第16张图片
深蓝学院-视觉SLAM课程-第5讲笔记_第17张图片

5. 三角测量(单目深度恢复)

对不同位置观测到的路标点进行特征匹配,使用几何关系求得路标点在两个左右相机坐标系下的深度值,就是三角测量。
教材上的代码为:使用了OpenCV的cv::triangulatePoints函数,传入匹配成功的特征点,通过三角测量计算出这些特征点在 I 1 I_1 I1相机系下的深度值(按理来说应该可以同时得到 I 1 I_1 I1 I 2 I_2 I2的深度值,无需借助估计的变换矩阵,这里应该只是为了验证三角化点与特征点的重投影关系)

void triangulation(
  const vector<KeyPoint> &keypoint_1,
  const vector<KeyPoint> &keypoint_2,
  const std::vector<DMatch> &matches,
  const Mat &R, const Mat &t,
  vector<Point3d> &points) {
  Mat T1 = (Mat_<float>(3, 4) <<
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0);
  Mat T2 = (Mat_<float>(3, 4) <<
    R.at<double>(0, 0), R.at<double>(0, 1), R.at<double>(0, 2), t.at<double>(0, 0),
    R.at<double>(1, 0), R.at<double>(1, 1), R.at<double>(1, 2), t.at<double>(1, 0),
    R.at<double>(2, 0), R.at<double>(2, 1), R.at<double>(2, 2), t.at<double>(2, 0)
  );

  Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
  vector<Point2f> pts_1, pts_2;
  for (DMatch m:matches) {
    // 将像素坐标转换至相机坐标
    pts_1.push_back(pixel2cam(keypoint_1[m.queryIdx].pt, K));
    pts_2.push_back(pixel2cam(keypoint_2[m.trainIdx].pt, K));
  }

  Mat pts_4d;
  cv::triangulatePoints(T1, T2, pts_1, pts_2, pts_4d);  //返回的是特征点三角测量之后的其次坐标,将其归一化为最后一维为1就可以取前三维为非其次坐标

  // 转换成非齐次坐标
  for (int i = 0; i < pts_4d.cols; i++)
  {
    Mat x = pts_4d.col(i);
    cout<<"x before coordinate:\n"<<x<<endl;
    x /= x.at<float>(3, 0); // 归一化
    cout<<"x after coordinate:\n"<<x<<endl;
    Point3d p(
      x.at<float>(0, 0),
      x.at<float>(1, 0),
      x.at<float>(2, 0)
    );
    points.push_back(p);
  }
}

三角化之后的结果是4N的矩阵,每一列都是41的齐次坐标,需要归一化取得非其次坐标(最后一维为1,见教材 P 46 P_{46} P46)。
深蓝学院-视觉SLAM课程-第5讲笔记_第18张图片

三角化点与特征点的重投影关系,对得到的I1的深度值使用变换T,得到了I2中的深度值

  //-- 验证三角化点与特征点的重投影关系
  Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
  Mat img1_plot = img_1.clone();
  Mat img2_plot = img_2.clone();
  for (int i = 0; i < matches.size(); i++)
  {
    // 第一个图
    float depth1 = points[i].z;
    cout << "depth: " << depth1 << endl;
    Point2d pt1_cam = pixel2cam(keypoints_1[matches[i].queryIdx].pt, K);
    cv::circle(img1_plot, keypoints_1[matches[i].queryIdx].pt, 2, get_color(depth1), 2);

    // 第二个图()
    Mat pt2_trans = R * (Mat_<double>(3, 1) << points[i].x, points[i].y, points[i].z) + t;
    float depth2 = pt2_trans.at<double>(2, 0);
    cv::circle(img2_plot, keypoints_2[matches[i].trainIdx].pt, 2, get_color(depth2), 2);
  }
  cv::imshow("img 1", img1_plot);
  cv::imshow("img 2", img2_plot);
  cv::waitKey();

使用重投影验证三角化结果,证明三角化整体是对的。
深蓝学院-视觉SLAM课程-第5讲笔记_第19张图片
但是三角测量存在以下问题:
深蓝学院-视觉SLAM课程-第5讲笔记_第20张图片
单目需要依靠平移来进行三角测量,当平移量小时,三角化的精度不够;平移量大时,被物体发生的变化可能较大,可能导致匹配失败。这就是三角化的矛盾,被称为“”视差(parallax)。可以计算每个特征点的位置和不确定性,不断观测,方差减小并收敛,即为深度滤波器(在建图部分的单目稠密重建讲到)。

6. 实践环节

feature_extraction
看opencv文档

1. 2D-2D结果,对极几何

-- Max dist : 94.000000 
-- Min dist : 4.000000 
一共找到了79组匹配点
fundamental_matrix is 
[4.544437503937326e-06, 0.0001333855576988952, -0.01798499246457619;
 -0.0001275657012959839, 2.266794804637672e-05, -0.01416678429258694;
 0.01814994639952877, 0.004146055871509035, 1]
essential_matrix is 
[0.01097677480088526, 0.2483720528258777, 0.03167429207291153;
 -0.2088833206039177, 0.02908423961781584, -0.6744658838357441;
 0.008286777636447118, 0.6614041624098427, 0.01676523772725936]
homography_matrix is 
[0.9261214281175537, -0.1445322024509824, 33.26921085699328;
 0.04535424464910424, 0.9386696693816731, 8.570979966717406;
 -1.006197561051869e-05, -3.008140280167741e-05, 0.9999999999999999]
R is 
[0.9969387384754708, -0.05155574188737422, 0.05878058527591362;
 0.05000441581290405, 0.998368531736214, 0.02756507279306545;
 -0.06010582439453526, -0.02454140006844053, 0.9978902793175882]
t is 
[-0.9350802885437915;
 -0.03514646275858852;
 0.3526890700495534]
t^R=
[-0.01552350379276751, -0.3512511256212342, -0.04479421342842761;
 0.2954056249512841, -0.04113132611923882, 0.9538388002772655;
 -0.0117192733223717, -0.9353667366911764, -0.02370962656974232]
epipolar constraint = [-0.0005415187881110395]
epipolar constraint = [-0.002158321995511289]
epipolar constraint = [0.0003344241044139079]
epipolar constraint = [-8.762782289159499e-06]
epipolar constraint = [0.000212132191399178]
epipolar constraint = [0.0001088766997727197]
epipolar constraint = [0.0004246621771667111]
epipolar constraint = [-0.003173128212208817]
epipolar constraint = [-3.716645801081497e-05]
epipolar constraint = [-0.001451851842241985]
epipolar constraint = [-0.0009607717979376318]
epipolar constraint = [-0.0008051993498427168]
epipolar constraint = [-0.001424813546545535]
epipolar constraint = [-0.0004339424745821371]
epipolar constraint = [-0.000478610965853582]
epipolar constraint = [-0.000196569283305803]
epipolar constraint = [0.001542309777196771]
epipolar constraint = [0.003107154828888181]
epipolar constraint = [0.0006774648890596965]
epipolar constraint = [-0.001176167495933316]
epipolar constraint = [-0.003987986032032258]
epipolar constraint = [-0.001255263863164498]
epipolar constraint = [-0.001553941958847024]
epipolar constraint = [0.002009914869739976]
epipolar constraint = [-0.0006068096307232096]
epipolar constraint = [-2.769881004420494e-05]
epipolar constraint = [0.001274573011402158]
epipolar constraint = [-0.004169986054360234]
epipolar constraint = [-0.001108104733689524]
epipolar constraint = [-0.0005154295520383712]
epipolar constraint = [-0.001776993208794389]
epipolar constraint = [6.677362292956124e-07]
epipolar constraint = [-0.001853977732928294]
epipolar constraint = [-0.0004823070333038748]
epipolar constraint = [0.0004740904480800279]
epipolar constraint = [-0.002961041748859255]
epipolar constraint = [-0.003347655248297339]
epipolar constraint = [-0.0001975680967504778]
epipolar constraint = [-0.002824643387150695]
epipolar constraint = [-0.0004311798170555103]
epipolar constraint = [0.001181203684013504]
epipolar constraint = [1.756266610475343e-07]
epipolar constraint = [-0.002137829859245668]
epipolar constraint = [0.001153415527396465]
epipolar constraint = [-0.002120357728713398]
epipolar constraint = [2.741529055910741e-06]
epipolar constraint = [0.000904458201820401]
epipolar constraint = [-0.002100484435898768]
epipolar constraint = [-0.0003517789219357401]
epipolar constraint = [-2.72046484559238e-05]
epipolar constraint = [-0.003784823353263345]
epipolar constraint = [-0.001588562608841583]
epipolar constraint = [-0.002928516011805465]
epipolar constraint = [-0.001016592266471106]
epipolar constraint = [0.0001241874577868896]
epipolar constraint = [-0.0009797104629406181]
epipolar constraint = [0.001457875224685545]
epipolar constraint = [-1.153738078288336e-05]
epipolar constraint = [-0.00273324742758773]
epipolar constraint = [0.00141572125781092]
epipolar constraint = [0.001790255871328722]
epipolar constraint = [-0.002547929671675515]
epipolar constraint = [-0.006257078861794559]
epipolar constraint = [-0.001874877415315751]
epipolar constraint = [-0.002104542912833518]
epipolar constraint = [1.39253152928176e-06]
epipolar constraint = [-0.004013502582090489]
epipolar constraint = [-0.004726241071896793]
epipolar constraint = [-0.001328682673648302]
epipolar constraint = [3.995925173527759e-07]
epipolar constraint = [-0.005080511723986339]
epipolar constraint = [-0.001977141850700581]
epipolar constraint = [-0.00166133426468739]
epipolar constraint = [0.004285762870626424]
epipolar constraint = [0.004087325194305855]
epipolar constraint = [0.0001482534285787533]
epipolar constraint = [-0.000962530113462208]
epipolar constraint = [-0.002341076011427135]
epipolar constraint = [-0.006138005035457875]

深蓝学院-视觉SLAM课程-第5讲笔记_第21张图片

2. 3D-2D结果,调用PnP函数

这里的t是精确的,是带尺度的。

wrk@MyPC:~/SLAM/DeepBlueCurriculum/VSLAM/ch5_特征点法视觉里程计/ch7_slambook2/cmake-build-debug$ ./pose_estimation_3d2d 1.png 2.png 1_depth.png 2_depth.png 
-- Max dist : 94.000000 
-- Min dist : 4.000000 
一共找到了79组匹配点
3d-2d pairs: 75
solve pnp in opencv cost time: 0.000188566 seconds.
R=
[0.9978745555297591, -0.05102729297915373, 0.04052883908410459;
 0.04983267620066928, 0.9983081506312504, 0.02995898472731967;
 -0.04198899628432293, -0.02787564805402974, 0.9987291286613218]
t=
[-0.1273034869580363;
 -0.01157187487421139;
 0.05667408337434332]
calling bundle adjustment by gauss newton
iteration 0 cost=40517.7576706
iteration 1 cost=410.547029116
iteration 2 cost=299.76468142
iteration 3 cost=299.763574327
pose by g-n: 
   0.997905909549  -0.0509194008562   0.0398874705187   -0.126782139096
   0.049818662505    0.998362315745   0.0281209417649 -0.00843949683874
 -0.0412540489424  -0.0260749135374    0.998808391199   0.0603493575229
                0                 0                 0                 1
solve pnp by gauss newton cost time: 7.4831e-05 seconds.
calling bundle adjustment by g2o
iteration= 0     chi2= 410.547029        time= 1.7213e-05        cumTime= 1.7213e-05     edges= 75       schur= 0
iteration= 1     chi2= 299.764681        time= 1.2173e-05        cumTime= 2.9386e-05     edges= 75       schur= 0
iteration= 2     chi2= 299.763574        time= 9.498e-06         cumTime= 3.8884e-05     edges= 75       schur= 0
iteration= 3     chi2= 299.763574        time= 8.225e-06         cumTime= 4.7109e-05     edges= 75       schur= 0
iteration= 4     chi2= 299.763574        time= 9.427e-06         cumTime= 5.6536e-05     edges= 75       schur= 0
iteration= 5     chi2= 299.763574        time= 8.055e-06         cumTime= 6.4591e-05     edges= 75       schur= 0
iteration= 6     chi2= 299.763574        time= 7.925e-06         cumTime= 7.2516e-05     edges= 75       schur= 0
iteration= 7     chi2= 299.763574        time= 8.016e-06         cumTime= 8.0532e-05     edges= 75       schur= 0
iteration= 8     chi2= 299.763574        time= 8.175e-06         cumTime= 8.8707e-05     edges= 75       schur= 0
iteration= 9     chi2= 299.763574        time= 8.737e-06         cumTime= 9.7444e-05     edges= 75       schur= 0
optimization costs time: 0.000296681 seconds.
pose estimated by g2o =
    0.99790590955  -0.0509194008911   0.0398874704367   -0.126782138956
  0.0498186625425    0.998362315744   0.0281209417542 -0.00843949681823
 -0.0412540488609  -0.0260749135293    0.998808391203   0.0603493574888
                0                 0                 0                 1
solve pnp by g2o cost time: 0.00041274 seconds.

3. 3D-3D,ICP算法

分别使用SVD方法和BA的方法(非线性优化)实现了一遍,本身这个问题是有解析解的,所以BA的方法不是必要的,但是可以看出结果是一样的,且优化了两轮就收敛了。

-- Max dist : 94.000000 
-- Min dist : 4.000000 
一共找到了79组匹配点
3d-3d pairs: 72
W=  10.871 -1.01948  2.54771
-2.16033  3.85307 -5.77742
 3.94738 -5.79979  9.62203
U=  0.558087  -0.829399 -0.0252034
 -0.428009  -0.313755   0.847565
  0.710878   0.462228   0.530093
V=  0.617887  -0.784771 -0.0484806
 -0.399894  -0.366747   0.839989
  0.676979   0.499631   0.540434
ICP via SVD results: 
R = [0.9969452349468715, 0.05983347698056557, -0.05020113095482046;
 -0.05932607657705309, 0.9981719679735133, 0.01153858699565957;
 0.05079975545906246, -0.008525103184062521, 0.9986724725659557]
t = [0.144159841091821;
 -0.06667849443812729;
 -0.03009747273569774]
R_inv = [0.9969452349468715, -0.05932607657705309, 0.05079975545906246;
 0.05983347698056557, 0.9981719679735133, -0.008525103184062521;
 -0.05020113095482046, 0.01153858699565957, 0.9986724725659557]
t_inv = [-0.1461462958593589;
 0.05767443542067568;
 0.03806388018483625]
calling bundle adjustment
iteration= 0     chi2= 1.816112  time= 3.2451e-05        cumTime= 3.2451e-05     edges= 72       schur= 0        lambda= 0.000758        levenbergIter= 1
iteration= 1     chi2= 1.815514  time= 2.684e-05         cumTime= 5.9291e-05     edges= 72       schur= 0        lambda= 0.000505        levenbergIter= 1
iteration= 2     chi2= 1.815514  time= 1.8555e-05        cumTime= 7.7846e-05     edges= 72       schur= 0        lambda= 0.000337        levenbergIter= 1
iteration= 3     chi2= 1.815514  time= 1.8244e-05        cumTime= 9.609e-05      edges= 72       schur= 0        lambda= 0.000225        levenbergIter= 1
iteration= 4     chi2= 1.815514  time= 1.7342e-05        cumTime= 0.000113432    edges= 72       schur= 0        lambda= 0.000150        levenbergIter= 1
iteration= 5     chi2= 1.815514  time= 1.5409e-05        cumTime= 0.000128841    edges= 72       schur= 0        lambda= 0.000100        levenbergIter= 1
iteration= 6     chi2= 1.815514  time= 1.5179e-05        cumTime= 0.00014402     edges= 72       schur= 0        lambda= 0.000067        levenbergIter= 1
iteration= 7     chi2= 1.815514  time= 2.5178e-05        cumTime= 0.000169198    edges= 72       schur= 0        lambda= 0.068117        levenbergIter= 4
optimization costs time: 0.000460198 seconds.

after optimization:
T=
  0.996945  0.0598335 -0.0502011    0.14416
-0.0593261   0.998172  0.0115386 -0.0666785
 0.0507998 -0.0085251   0.998672 -0.0300979
         0          0          0          1
p1 = [-0.0374123, -0.830816, 2.7448]
p2 = [-0.0111479, -0.746763, 2.7652]
(R*p2+t) = [-0.05045153727143986;
 -0.7795087351232993;
 2.737231009770682]

p1 = [-0.243698, -0.117719, 1.5848]
p2 = [-0.297211, -0.0956614, 1.6558]
(R*p2+t) = [-0.24099014946244;
 -0.125427050612953;
 1.609221205035477]

p1 = [-0.627753, 0.160186, 1.3396]
p2 = [-0.709645, 0.159033, 1.4212]
(R*p2+t) = [-0.6251478077232009;
 0.1505624178119533;
 1.351809862712792]

p1 = [-0.323443, 0.104873, 1.4266]
p2 = [-0.399079, 0.12047, 1.4838]
(R*p2+t) = [-0.3209806217783838;
 0.09436777718271656;
 1.430432194554881]

p1 = [-0.627221, 0.101454, 1.3116]
p2 = [-0.709709, 0.100216, 1.3998]
(R*p2+t) = [-0.6276559234155776;
 0.09161021993899245;
 1.330936372911853]

4. 三角测量

在使用对极几何得到相机的运动之后,还需要估计地图点的深度,需要用到三角测量(三角化)
深蓝学院-视觉SLAM课程-第5讲笔记_第22张图片

7. 小结

深蓝学院-视觉SLAM课程-第5讲笔记_第23张图片

你可能感兴趣的:(SLAM,slam)