Introduction to the KITTI Dataset

 

The KITTI dataset:

The 3D object detection benchmark contains 7481 training images (with point clouds) and 7518 test images (with point clouds).

Recording platform:

1 inertial navigation system (GPS/IMU): OXTS RT 3003
1 laser scanner: Velodyne HDL-64E
2 grayscale cameras, 1.4 megapixels: Point Grey Flea 2 (FL2-14S3M-C)
2 color cameras, 1.4 megapixels: Point Grey Flea 2 (FL2-14S3C-C)
4 varifocal lenses, 4-8 mm: Edmund Optics NT59-917

 

Directory structure:

├── data
│   ├── kitti
│   │   ├── ImageSets
│   │   ├── testing
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   ├── velodyne
│   │   ├── training
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   ├── label_2
│   │   │   ├── velodyne

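ImageSets: xxx.txt, where xxx is train, test, or valid; these files define the dataset split
calib: xx.txt, the calibration parameter files
image_2: image data (from camera 2, the left color camera)
velodyne: point cloud data
label_2: label files for 3D detection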
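
As a rough sketch of how this layout is typically consumed, the snippet below builds the per-sample file paths and reads one point cloud. It assumes (this is not stated above) that samples are named with six-digit zero-padded indices such as 000007, that images are stored as .png, and that each velodyne .bin file is a flat float32 array of (x, y, z, reflectance) tuples. The helper names sample_paths and load_velodyne are made up for illustration.

```python
from pathlib import Path

import numpy as np


def sample_paths(root, split, index):
    """Return the files belonging to one sample (hypothetical helper)."""
    name = f"{index:06d}"  # assumed naming: 000000, 000001, ...
    base = Path(root) / split
    paths = {
        "calib": base / "calib" / f"{name}.txt",
        "image_2": base / "image_2" / f"{name}.png",
        "velodyne": base / "velodyne" / f"{name}.bin",
    }
    if split == "training":  # the testing split has no label_2 directory
        paths["label_2"] = base / "label_2" / f"{name}.txt"
    return paths


def load_velodyne(path):
    """Read a point cloud as an (N, 4) array: x, y, z, reflectance (assumed layout)."""
    return np.fromfile(str(path), dtype=np.float32).reshape(-1, 4)
```

For example, sample_paths("data/kitti", "training", 7)["velodyne"] points at data/kitti/training/velodyne/000007.bin.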

label_2 annotation format:

Values    Name      Description
----------------------------------------------------------------------------
   1    type         Describes the type of object: 'Car', 'Van', 'Truck',
                     'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
                     'Misc' or 'DontCare'
                     (object class)
   1    truncated    Float from 0 (non-truncated) to 1 (truncated), where
                     truncated refers to the object leaving image boundaries
                     (how far the object extends beyond the image boundary)
   1    occluded     Integer (0,1,2,3) indicating occlusion state:
                     0 = fully visible, 1 = partly occluded
                     2 = largely occluded, 3 = unknown
                     (degree of occlusion)
   1    alpha        Observation angle of object, ranging [-pi..pi] 
   4    bbox         2D bounding box of object in the image (0-based index):
                     contains left, top, right, bottom pixel coordinates
   3    dimensions   3D object dimensions: height, width, length (in meters)
   3    location     3D object location x,y,z in camera coordinates (in meters)
   1    rotation_y   Rotation ry around Y-axis in camera coordinates [-pi..pi]
   1    score        Only for results: Float, indicating confidence in
                     detection, needed for p/r curves, higher is better.

Example:
Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57
Car: type
0.00: truncated
0: occluded
1.85: alpha
387.63 181.54 423.81 203.12: bbox
1.67 1.87 3.69: dimensions
-16.53 2.39 58.49: location
1.57: rotation_y
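
The field breakdown above can be sketched as a small parser. This is only an illustration: the KittiObject class and parse_label_line function are hypothetical names, and the optional 16th field (score) is read only when it is present, as in result files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KittiObject:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: List[float]        # left, top, right, bottom (pixels)
    dimensions: List[float]  # height, width, length (meters)
    location: List[float]    # x, y, z in camera coordinates (meters)
    rotation_y: float
    score: Optional[float] = None  # only present in result files


def parse_label_line(line):
    """Split one label_2 line into its named fields."""
    f = line.split()
    return KittiObject(
        type=f[0],
        truncated=float(f[1]),
        occluded=int(f[2]),
        alpha=float(f[3]),
        bbox=[float(v) for v in f[4:8]],
        dimensions=[float(v) for v in f[8:11]],
        location=[float(v) for v in f[11:14]],
        rotation_y=float(f[14]),
        score=float(f[15]) if len(f) > 15 else None,
    )


example = ("Car 0.00 0 1.85 387.63 181.54 423.81 203.12 "
           "1.67 1.87 3.69 -16.53 2.39 58.49 1.57")
obj = parse_label_line(example)
```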
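NOTE: 1. The DontCare class marks regions whose objects were not labeled, for example objects that were too far away for the laser scanner to reach.
2. alpha and rotation_y: a car facing along the X-axis of the camera coordinate system has rotation_y = 0 no matter where it lies in the X/Z plane (bird's-eye view), whereas alpha is 0 only when that object also lies on the Z-axis of the camera coordinate system.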
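
A relationship commonly used in KITTI tooling to connect the two angles is alpha = rotation_y - atan2(x, z), where (x, z) is the object location in camera coordinates. The sketch below assumes that convention (it is not stated in the table above), and the function name is hypothetical. Applied to the example label it gives roughly 1.85, which agrees with the alpha stored in that line.

```python
import math


def alpha_from_rotation_y(rotation_y, x, z):
    """Observation angle under the assumed convention alpha = ry - atan2(x, z)."""
    alpha = rotation_y - math.atan2(x, z)
    # Wrap the result back into [-pi, pi).
    return (alpha + math.pi) % (2.0 * math.pi) - math.pi


# Values taken from the example label: rotation_y = 1.57, location = (-16.53, 2.39, 58.49)
alpha_example = alpha_from_rotation_y(1.57, -16.53, 58.49)
```

For an object on the camera's Z-axis (x = 0, z > 0) the atan2 term vanishes, so alpha equals rotation_y, matching note 2.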

About the calib calibration files:

P0: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 0.000000000000e+00 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00  
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00  
P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03  
P3: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.395242000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.199936000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.729905000000e-03  
R0_rect: 9.999128000000e-01 1.009263000000e-02 -8.511932000000e-03 -1.012729000000e-02 9.999406000000e-01 -4.037671000000e-03 8.470675000000e-03 4.123522000000e-03 9.999556000000e-01  
Tr_velo_to_cam: 6.927964000000e-03 -9.999722000000e-01 -2.757829000000e-03 -2.457729000000e-02 -1.162982000000e-03 2.749836000000e-03 -9.999955000000e-01 -6.127237000000e-02 9.999753000000e-01 6.931141000000e-03 -1.143899000000e-03 -3.321029000000e-01  
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01  

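Explanation:
P0, P1, P2, and P3 correspond to the left grayscale camera, the right grayscale camera, the left color camera, and the right color camera, respectively.
The numbers after each name are that camera's projection matrix after rectification (it contains the camera intrinsics), of size 3×4: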
$$
P_{\text{rect}}^{(i)} = \left(\begin{array}{cccc} f_{u}^{(i)} & 0 & c_{u}^{(i)} & -f_{u}^{(i)} b_{x}^{(i)} \\ 0 & f_{v}^{(i)} & c_{v}^{(i)} & 0 \\ 0 & 0 & 1 & 0 \end{array}\right)
$$
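f_u and f_v are the camera's focal lengths
c_u and c_v are the principal point offsets
b_x^(i) is the baseline of camera i relative to camera 0 (offset along the x direction)
R0_rect is the rectifying rotation matrix of camera 0, of size 3×3
Tr_velo_to_cam is the Velodyne-to-camera transformation, a 3×4 matrix containing the rotation matrix R and the translation vector t
Tr_imu_to_velo is the IMU-to-Velodyne transformation, a 3×4 matrix containing the rotation matrix R and the translation vector t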
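
To make the matrix layout concrete, here is a minimal sketch that parses calibration text into NumPy arrays, assuming each line has the form "name: v1 v2 ..." with values stored row by row. The parse_calib function is a hypothetical helper, and CALIB_TEXT only repeats the P1 and R0_rect lines from the listing above. As a sanity check, the baseline b_x of camera 1 is recovered from the fourth column of P1 using the matrix form given above.

```python
import numpy as np

CALIB_TEXT = """\
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
R0_rect: 9.999128000000e-01 1.009263000000e-02 -8.511932000000e-03 -1.012729000000e-02 9.999406000000e-01 -4.037671000000e-03 8.470675000000e-03 4.123522000000e-03 9.999556000000e-01
"""


def parse_calib(text):
    """Map each entry name to a 3-row matrix (3x3 for 9 values, 3x4 for 12)."""
    calib = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, values = line.split(":", 1)
        numbers = [float(v) for v in values.split()]
        calib[key.strip()] = np.array(numbers).reshape(3, -1)
    return calib


calib = parse_calib(CALIB_TEXT)
P1 = calib["P1"]
# From the matrix form above: P[0, 3] = -f_u * b_x, so b_x = -P[0, 3] / f_u.
baseline_cam1 = -P1[0, 3] / P1[0, 0]  # roughly 0.54 (meters)
```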
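For example, to project a point from Velodyne coordinates into the left color image:
x = P2 * R0_rect * Tr_velo_to_cam * y
Here y is the point in Velodyne coordinates written as a homogeneous vector (x, y, z, 1), and x is its homogeneous image coordinate; dividing by the third component gives the pixel position. For the product to be defined, R0_rect and Tr_velo_to_cam are padded to 4×4 (a 1 on the last diagonal element, zeros elsewhere in the added row and column).
Step by step:
Tr_velo_to_cam * y transforms the Velodyne point y into the coordinate system of camera 0 (the reference camera)
R0_rect * Tr_velo_to_cam * y transforms the Velodyne point y into the coordinate system of camera 0 and then rectifies it
P2 * R0_rect * Tr_velo_to_cam * y transforms the Velodyne point y into the coordinate system of camera 0, rectifies it, and then projects it into the image of camera 2 (the left color camera)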

To project into the right color image:
x = P3 * R0_rect * Tr_velo_to_cam * y
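
The projection chain can be sketched in NumPy as follows. This is a minimal illustration, not an official implementation: the matrices are copied from the calibration listing above, while the helper names to_4x4 and project_velo_to_image are hypothetical. R0_rect and Tr_velo_to_cam are padded to 4×4 so the product is defined, and points with a non-positive homogeneous depth (behind the camera) are dropped.

```python
import numpy as np

P2 = np.array([7.070493e+02, 0.0, 6.040814e+02, 4.575831e+01,
               0.0, 7.070493e+02, 1.805066e+02, -3.454157e-01,
               0.0, 0.0, 1.0, 4.981016e-03]).reshape(3, 4)
R0_rect = np.array([9.999128e-01, 1.009263e-02, -8.511932e-03,
                    -1.012729e-02, 9.999406e-01, -4.037671e-03,
                    8.470675e-03, 4.123522e-03, 9.999556e-01]).reshape(3, 3)
Tr_velo_to_cam = np.array([6.927964e-03, -9.999722e-01, -2.757829e-03, -2.457729e-02,
                           -1.162982e-03, 2.749836e-03, -9.999955e-01, -6.127237e-02,
                           9.999753e-01, 6.931141e-03, -1.143899e-03, -3.321029e-01]).reshape(3, 4)


def to_4x4(m):
    """Embed a 3x3 or 3x4 matrix in a 4x4 identity (homogeneous padding)."""
    out = np.eye(4)
    out[:m.shape[0], :m.shape[1]] = m
    return out


def project_velo_to_image(points_velo, P, R0, Tr):
    """Project (N, 3+) Velodyne points to pixels; returns (uv, kept_mask)."""
    n = points_velo.shape[0]
    pts_h = np.hstack([points_velo[:, :3], np.ones((n, 1))])  # (N, 4) homogeneous
    proj = P @ to_4x4(R0) @ to_4x4(Tr)                        # (3, 4) combined matrix
    img_h = pts_h @ proj.T                                    # (N, 3) homogeneous pixels
    depth = img_h[:, 2]
    kept = depth > 0                                          # drop points behind the camera
    uv = img_h[kept, :2] / depth[kept, None]
    return uv, kept


# One point 10 m ahead of the sensor and one 10 m behind it.
points = np.array([[10.0, 0.0, 0.0], [-10.0, 0.0, 0.0]])
uv, kept = project_velo_to_image(points, P2, R0_rect, Tr_velo_to_cam)
```

Replacing P2 with P3 gives the projection into the right color image. In practice the resulting pixel coordinates would also be checked against the image width and height.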


 
