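This post introduces several commonly used RGB-D datasets and gathers notes on downloading and parsing them in one place.
ScanNet contains 1,513 captured scans covering 21 object categories; 1,201 scans are used for training and 312 for testing.
The dataset has four benchmark tasks: 3D semantic segmentation, 3D instance segmentation, 2D semantic segmentation, and 2D instance segmentation.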
ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. More information can be found in our paper.
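Official website
Official GitHub
Requesting the dataset: send the completed ScanNet Terms of Use to [email protected]
Downloading the dataset
# -o: directory to save the files to
python download-scannet.py -o data
Because the 2D RGB-D frames are very large, the authors provide a smaller subset, scannet_frames_25k (about 25,000 frames, subsampled roughly every 100 frames from the full dataset). It can be fetched through the ScanNet download script and is 5.6 GB. There is also a benchmark evaluation set, scannet_frames_test. TODO: add more details.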
PREPROCESSED_FRAMES_FILE = ['scannet_frames_25k.zip', '5.6GB']
TEST_FRAMES_FILE = ['scannet_frames_test.zip', '610MB']
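As a rough illustration of the subsampling described above, the sketch below keeps every 100th frame index. The function name `subsample_indices` is hypothetical and not part of the ScanNet tooling; it only shows why 2.5 million views shrink to about 25,000 frames, not how the authors actually selected them.

```python
def subsample_indices(num_frames, step=100):
    """Return the indices of every `step`-th frame, starting at frame 0."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, num_frames, step))


if __name__ == "__main__":
    # 2.5 million views, one kept out of every 100.
    print(len(subsample_indices(2_500_000)))
```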
Download scannet_frames_25k:
python download-scannet.py -o data --preprocessed_frames
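This usually fails with urllib.error.HTTPError: HTTP Error 404: Not Found. My workaround was to copy the download URL (masked in the original screenshot) into a browser and download the file directly with the browser or with Xunlei. In my tests Xunlei could not download it, while the browser needed a proxy to reach the server but then downloaded at a good speed, around 8 MB/s.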
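The failure mode above can be handled explicitly in a small script. The following is a minimal sketch, assuming you already know the direct URL of the archive; `try_download` and its `opener` parameter are hypothetical helpers for illustration and are not part of download-scannet.py.

```python
import shutil
import urllib.error
import urllib.request


def try_download(url, out_path, opener=urllib.request.urlopen):
    """Download `url` to `out_path`; return False if the server answers with an HTTP error."""
    try:
        # The output file is only created once the request has succeeded.
        with opener(url) as response, open(out_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
        return True
    except urllib.error.HTTPError as err:
        # For example "HTTP Error 404: Not Found": report the URL so it can
        # be opened manually in a browser instead.
        print(f"HTTP {err.code} for {url}; try opening it in a browser.")
        return False
```

The `opener` argument exists only so the error path can be exercised without network access; in normal use it can be left at its default.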
<scanId>
|-- <scanId>.sens
    RGB-D sensor stream (*.sens): a compressed binary format containing the color, depth, camera pose, and other data for each frame. The RGB images are 1296×968 and the depth images are 640×480.
|-- <scanId>_vh_clean.ply
    High-quality reconstructed surface mesh file (.ply)
    (Updated if annotations were removed)
|-- <scanId>_vh_clean_2.ply
    (Updated if annotations were removed)
|-- <scanId>.aggregation.json, <scanId>_vh_clean.aggregation.json
    Annotation files for the surface-mesh segmentation (.aggregation.json): record the details of the object segmentation in the scene
    Updated aggregated instance-level semantic annotations on lo-res, hi-res meshes, respectively
|-- <scanId>_vh_clean_2.labels.ply
    Updated visualization of aggregated semantic segmentation; colored by nyu40 labels (see legend referenced above; ply property 'label' denotes the ScanNet label id)
|-- <scanId>_2d-label.zip
    Updated raw 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
    Raw 16-bit PNG label annotations, image size 1296×968, with ScanNet label ids
|-- <scanId>_2d-instance.zip
    Updated raw 2d projections of aggregated annotation instances as 8-bit pngs
    Raw 8-bit PNG instance annotations, image size 1296×968
|-- <scanId>_2d-label-filt.zip
    Updated filtered 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
    Filtered 16-bit PNG label annotations, image size 1296×968, with ScanNet label ids
|-- <scanId>_2d-instance-filt.zip
    Updated filtered 2d projections of aggregated annotation instances as 8-bit pngs
    Filtered 8-bit PNG instance annotations, image size 1296×968
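Since the listing mixes 8-bit and 16-bit PNGs, it can be useful to check the bit depth of an extracted file yourself. The sketch below reads it from the PNG IHDR chunk using only the standard library; `png_info` and `png_info_from_file` are hypothetical helper names, not part of the ScanNet code.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_info(header_bytes):
    """Parse width, height, bit depth and color type from the first 26 bytes of a PNG."""
    if header_bytes[:8] != PNG_SIGNATURE or header_bytes[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR payload starts at byte 16: width (4), height (4), bit depth (1), color type (1).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header_bytes[16:26])
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "color_type": color_type}


def png_info_from_file(path):
    """Read just the header of the PNG at `path` and parse it."""
    with open(path, "rb") as f:
        return png_info(f.read(26))
```

Calling `png_info_from_file` on a file from `<scanId>_2d-label.zip` should report a bit depth of 16 if the description above holds, and 8 for a file from `<scanId>_2d-instance.zip`.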
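I am not yet sure about the difference between label and instance; judging from the descriptions above, the label PNGs store a ScanNet label id per pixel, while the instance PNGs store an annotation instance id per pixel.
The 2D data contains N frames per scene (to avoid overlapping information between frames, typically one frame out of every 50 is kept). The 2D labels and instances are provided as .png image files. Color images are provided as 8-bit RGB .jpg files, and depth images as 16-bit .png files (divide by 1000 to get depth in meters). See reference 1 for details.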
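The depth conversion mentioned above is a single division. Here is a minimal sketch, assuming the 16-bit PNG has already been loaded into a NumPy array by an image library of your choice; `depth_to_meters` is a hypothetical helper name.

```python
import numpy as np


def depth_to_meters(depth_raw, depth_shift=1000.0):
    """Convert raw 16-bit depth values to float32 depth in meters."""
    return np.asarray(depth_raw, dtype=np.float32) / np.float32(depth_shift)


if __name__ == "__main__":
    raw = np.array([[0, 1000], [2500, 4000]], dtype=np.uint16)
    print(depth_to_meters(raw))
```

Converting to float before dividing matters: integer division on the raw uint16 values would discard everything below one meter.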
Parsing the 2D image data
Parsing code link
Install the imageio dependency:
pip install imageio==1.1
Imageio: 'freeimage-3.15.1-win64.dll' was not found on your computer; downloading it now.
See reference 2 for details.
Parse the image data. Python 2.7 is recommended; under Python 3 the script runs into a str-to-bytes conversion problem with struct.unpack.
python reader.py --filename scene0000_00.sens --output_path image
#python reader.py --filename [.sens file to export data from] --output_path [output directory to export data to]
#Options:
#--export_depth_images: export all depth frames as 16-bit pngs (depth shift 1000)
#--export_color_images: export all color frames as 8-bit rgb jpgs
#--export_poses: export all camera poses (4x4 matrix, camera to world)
#--export_intrinsics: export camera intrinsics (4x4 matrix)
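The Python 3 problem with struct.unpack mentioned earlier is probably of the following kind: with the `c` format, `struct.unpack` returns `bytes` objects in Python 3, so joining them with a `str` separator raises `TypeError`. The sketch below uses a made-up length-prefixed string rather than the real .sens layout, and `read_prefixed_string` is a hypothetical name; it only illustrates the bytes-aware way of writing such a read.

```python
import io
import struct


def read_prefixed_string(stream):
    """Read a little-endian uint64 length followed by that many single-byte characters."""
    (length,) = struct.unpack("<Q", stream.read(8))
    chars = struct.unpack("c" * length, stream.read(length))
    # In Python 3 every item in `chars` is a bytes object, so join with b""
    # and decode afterwards; "".join(chars) would raise TypeError.
    return b"".join(chars).decode("ascii")


if __name__ == "__main__":
    blob = struct.pack("<Q", 6) + b"sensor"
    print(read_prefixed_string(io.BytesIO(blob)))  # sensor
```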
To make the parsing progress visible, I suggest modifying SensorData.py to add a progress bar:
from tqdm import tqdm
# Replace line 71, `for i in range(num_frames):`, with:
for i in tqdm(range(num_frames), ncols=80):
# Lines 81 and 93 can be replaced in the same way with:
for f in tqdm(range(0, len(self.frames), frame_skip), ncols=80):
for f in tqdm(range(0, len(self.frames), frame_skip), ncols=80):
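If tqdm may not be installed on every machine that runs the parser, the import can be guarded so the loops still work without a progress bar. This is a minimal sketch; `export_frames` is a hypothetical stand-in for the export loops in SensorData.py, not the actual code.

```python
try:
    from tqdm import tqdm
except ImportError:
    def tqdm(iterable, **kwargs):
        """Fallback: iterate without a progress bar when tqdm is missing."""
        return iterable


def export_frames(num_frames, frame_skip=1):
    """Stand-in for an export loop; returns the frame indices it visited."""
    visited = []
    for f in tqdm(range(0, num_frames, frame_skip), ncols=80):
        visited.append(f)
    return visited
```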
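Parsing result:
Data layout:
The color images are the result of subsampling every 100 frames; depth, instance, label, and pose hold the corresponding depth maps, instance maps, label maps, and camera poses. TODO: intrinsics_color.txt and intrinsics_depth.txt are the camera matrices.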
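To show how the pose and intrinsics files fit together, the sketch below back-projects one depth pixel into the camera frame and then moves it into world coordinates. It rests on several assumptions: the text files hold 4 rows of whitespace-separated floats, the intrinsics follow the usual pinhole layout (fx and fy on the diagonal, cx and cy in the third column), and the pose is camera-to-world as stated in the reader options. All function names and the numbers in the example are hypothetical.

```python
def parse_matrix(text):
    """Parse whitespace-separated rows of floats into a list of row lists."""
    return [[float(v) for v in line.split()] for line in text.strip().splitlines()]


def backproject(u, v, depth_m, intrinsics):
    """Pixel (u, v) with depth in meters -> 3D point in the camera frame."""
    fx, fy = intrinsics[0][0], intrinsics[1][1]
    cx, cy = intrinsics[0][2], intrinsics[1][2]
    return [(u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m]


def camera_to_world(point, pose):
    """Apply a 4x4 camera-to-world matrix to a 3D point."""
    x, y, z = point
    return [row[0] * x + row[1] * y + row[2] * z + row[3] for row in pose[:3]]


if __name__ == "__main__":
    # Made-up intrinsics and pose, for illustration only.
    K = parse_matrix("500 0 320 0\n0 500 240 0\n0 0 1 0\n0 0 0 1")
    pose = parse_matrix("1 0 0 1\n0 1 0 2\n0 0 1 3\n0 0 0 1")
    p_cam = backproject(320, 240, 2.0, K)
    print(camera_to_world(p_cam, pose))
```

Note that color and depth images have different resolutions, so the depth intrinsics should be used when back-projecting depth pixels.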
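I have not started working on this part yet; see reference 3 for details.
Official split files
The SUN RGB-D dataset has four benchmark tasks: scene classification, semantic segmentation, room layout estimation, and 3D object detection.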
Download:http://rgbd.cs.princeton.edu/challenge.html
# see: http://rgbd.cs.princeton.edu/ in section Data and Annotation
DATASET_URL = 'http://rgbd.cs.princeton.edu/data/SUNRGBD.zip'
DATASET_TOOLBOX_URL = 'http://rgbd.cs.princeton.edu/data/SUNRGBDtoolbox.zip'
README.txt:
********************************************************************************
Data: Image depth and label data are in SUNRGBD.zip
image: rgb image
depth: depth image; to read the depth, see the code in SUNRGBDtoolbox/read3dPoints/.
extrinsics: the rotation matrix to align the point cloud with gravity
fullres: full resolution depth and rgb image
intrinsics.txt : sensor intrinsic
scene.txt : scene type
annotation2Dfinal : 2D segmentation
annotation3Dfinal : 3D bounding box
annotation3Dlayout : 3D room layout bounding box
*********************************************************************************
Label:
In SUNRGBDtoolbox/Metadata
SUNRGBDMeta.mat:
2D,3D bounding box ground truth and image information for each frame.
SUNRGBD2Dseg.mat:
2D segmentation ground truth.
The index in "SUNRGBD2Dseg(imageId).seglabelall"
mapping the name to "seglistall".
The index in "SUNRGBD2Dseg(imageId).seglabel"
are mapping the object name in "seg37list".
********************************************************************************
There are 37 categories in total:
wall,floor,cabinet,bed,chair,sofa,
table,door,window,bookshelf,picture,
counter,blinds,desk,shelves,curtain,
dresser,pillow,mirror,floor_mat,clothes,
ceiling,books,fridge,tv,paper,towel,
shower_curtain,box,whiteboard,person,
night_stand,toilet,sink,lamp,bathtub,bag
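The category list above can be turned into a lookup table for the values stored in the 37-class segmentation maps. This is a minimal sketch under the assumption that indices are 1-based (as is usual for MATLAB data) and that 0 marks unlabeled pixels; neither detail is stated in the README excerpt, and `label_name` is a hypothetical helper.

```python
SEG37_LIST = [
    "wall", "floor", "cabinet", "bed", "chair", "sofa",
    "table", "door", "window", "bookshelf", "picture",
    "counter", "blinds", "desk", "shelves", "curtain",
    "dresser", "pillow", "mirror", "floor_mat", "clothes",
    "ceiling", "books", "fridge", "tv", "paper", "towel",
    "shower_curtain", "box", "whiteboard", "person",
    "night_stand", "toilet", "sink", "lamp", "bathtub", "bag",
]


def label_name(index):
    """Map a label index to a class name; 0 is assumed to mean unlabeled."""
    if index == 0:
        return "void"
    if not 1 <= index <= len(SEG37_LIST):
        raise ValueError(f"label index out of range: {index}")
    # Assumed 1-based indexing into the 37-class list.
    return SEG37_LIST[index - 1]
```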
Partial parsing code:
for i, meta in tqdm(enumerate(SUNRGBDMeta)):
    meta_dir = '/'.join(meta.rgbpath.split('/')[:-2])
    real_dir = meta_dir.split('/n/fs/sun3d/data/SUNRGBD/')[1]
    depth_bfx_path = os.path.join(real_dir, 'depth_bfx/' + meta.depthname)
    rgb_path = os.path.join(real_dir, 'image/' + meta.rgbname)
    label_path = os.path.join(real_dir, 'label/label.npy')
    label_path_full = os.path.join(output_path, 'SUNRGBD', label_path)
    # save segmentation (label_path) as numpy array
    if not os.path.exists(label_path_full):
        os.makedirs(os.path.dirname(label_path_full), exist_ok=True)
        label = np.array(
            SUNRGBD2Dseg[seglabel[i][0]][:].transpose(1, 0)).\
            astype(np.uint8)
        np.save(label_path_full, label)
    if meta_dir in split_train:
        img_dir_train.append(os.path.join('SUNRGBD', rgb_path))
        depth_dir_train.append(os.path.join('SUNRGBD', depth_bfx_path))
        label_dir_train.append(os.path.join('SUNRGBD', label_path))
    else:
        img_dir_test.append(os.path.join('SUNRGBD', rgb_path))
        depth_dir_test.append(os.path.join('SUNRGBD', depth_bfx_path))
        label_dir_test.append(os.path.join('SUNRGBD', label_path))
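The path handling at the top of the loop above can be isolated into a small function, which makes it easier to see what it does. In this sketch, `relative_scene_dir` is a hypothetical helper and the example path is made up; only the absolute prefix is taken from the code above.

```python
SUNRGBD_PREFIX = "/n/fs/sun3d/data/SUNRGBD/"


def relative_scene_dir(rgbpath):
    """Drop the last two path components, then strip the absolute prefix."""
    # ".../<scene>/image/<file>.jpg" -> ".../<scene>"
    meta_dir = "/".join(rgbpath.split("/")[:-2])
    if not meta_dir.startswith(SUNRGBD_PREFIX):
        raise ValueError(f"unexpected path prefix: {rgbpath}")
    return meta_dir[len(SUNRGBD_PREFIX):]


if __name__ == "__main__":
    # Made-up path for illustration.
    print(relative_scene_dir("/n/fs/sun3d/data/SUNRGBD/kv1/example_scene/image/frame.jpg"))
```

Unlike the `split(...)[1]` form, this version raises a descriptive error when a path does not start with the expected prefix instead of failing with an IndexError.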
Translation of the SUN RGB-D dataset paper
See reference 5 for more information.
Download
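See reference 6 for details.
Parsing code
See the references below for details:
- About the ScanNet dataset
- OSError: Unable to download 'freeimage-3.15.1-win64.dll'. Perhaps there is a no internet connection?
- ScanNetV2 dataset explained, with selective download
- Overview of mainstream RGB-D datasets
- [Translation] SUN RGB-D dataset
- Reading the NYU Depth Dataset V2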