KITTI数据集数据初体验

  • KITTI简介
  • 数据采集平台
  • 激光数据
  • 参考博文

KITTI简介

KITTI数据集由德国卡尔斯鲁厄理工学院和丰田美国技术研究院联合创办,是目前国际上最大的自动驾驶场景下的计算机视觉算法评测数据集。该数据集用于评测立体图像(stereo),光流(optical flow),视觉测距(visual odometry),3D物体检测(object detection)和3D跟踪(tracking)等计算机视觉技术在车载环境下的性能。KITTI包含市区、乡村和高速公路等场景采集的真实图像数据,每张图像中最多达15辆车和30个行人,还有各种程度的遮挡与截断。整个数据集由389对立体图像和光流图,39.2 km视觉测距序列以及超过200k 3D标注物体的图像组成,以10Hz的频率采样及同步。

数据采集平台

KITTI数据采集平台包括2个灰度摄像机,2个彩色摄像机,一个Velodyne 3D激光雷达,4个光学镜头,以及1个GPS导航系统。
KITTI数据集数据初体验_第1张图片

KITTI数据集数据初体验_第2张图片

各设备坐标系、距离信息由上图可见。坐标系转换原理参见click。其实KITTI提供的数据中都包含三者的标定文件,不需人工转换。
这里写图片描述

激光数据

首先在官网KITTI下载 raw data development kit,其中的readme文件详细记录了你想知道的一切,数据采集装置,不同装置的数据格式,label等。
KITTI数据集数据初体验_第3张图片
激光数据是什么形式呢?激光照射到物体表面产生大量点数据,KITTI中的点数据包括四维x,y,z以及reflectance反射强度。Velodyne 3D激光产生点云数据,以.bin(二进制)文件保存。

Velodyne 3D laser scan data
===========================

The velodyne point clouds are stored in the folder 'velodyne_points'. To
save space, all scans have been stored as Nx4 float matrix into a binary
file using the following code:

  stream = fopen (dst_file.c_str(),"wb");
  fwrite(data,sizeof(float),4*num,stream);
  fclose(stream);

Here, data contains 4*num values, where the first 3 values correspond to
x,y and z, and the last value is the reflectance information. All scans
are stored row-aligned, meaning that the first 4 values correspond to the
first measurement. Since each scan might potentially have a different
number of points, this must be determined from the file size when reading
the file, where 1e6 is a good enough upper bound on the number of values:

  // allocate 4 MB buffer (only ~130*4*4 KB are needed)
  int32_t num = 1000000;
  float *data = (float*)malloc(num*sizeof(float));

  // pointers
  float *px = data+0;
  float *py = data+1;
  float *pz = data+2;
  float *pr = data+3;

  // load point cloud
  FILE *stream;
  stream = fopen (currFilenameBinary.c_str(),"rb");
  num = fread(data,sizeof(float),num,stream)/4;
  for (int32_t i=0; i<num; i++) {
    point_cloud.points.push_back(tPoint(*px,*py,*pz,*pr));
    px+=4; py+=4; pz+=4; pr+=4;
  }
  fclose(stream);

x,y and y are stored in metric (m) Velodyne coordinates.

IMPORTANT NOTE: Note that the velodyne scanner takes depth measurements
continuously while rotating around its vertical axis (in contrast to the cameras,
which are triggered at a certain point in time). This means that when computing
point clouds you have to 'untwist' the points linearly with respect to the velo-
dyne scanner location at the beginning and the end of the 360掳 sweep. The time-
stamps for the beginning and the end of the sweeps can be found in the time-
stamps file. The velodyne rotates in counter-clockwise direction.

Of course this 'untwisting' only works for non-dynamic environments.

The relationship between the camera triggers and the velodyne is the following:
We trigger the cameras when the velodyne is looking exactly forward (into the
direction of the cameras).

官方提供的激光数据为N*4的浮点数矩阵,raw data development kit中的matlab文件夹是官方提供matlab接口,主要是将激光数据与相机数据结合,在图像上投影。matlab接口详解及使用 最终可以将点云数据保存为pcd格式,然后用pcl进行相应处理。

参考博文

KITTI数据集简介与使用
专栏
3D Object检测数据集使用(1)
KITTI数据集中matlab接口说明及扩展
KITTI到rosbag
KITTI原始bin数据转pcd数据
github kitti-pcl

你可能感兴趣的:(计算机视觉)