Intel RealSense D435: two ways to align the color and depth images

import pyrealsense2 as rs
import cv2 as cv
import numpy as np

pipeline = rs.pipeline()

cfg = rs.config()
cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
cfg.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Choose the alignment target (here: align depth to color, i.e. the color
# image stays as-is and the depth image is re-projected onto it)
align_to = rs.stream.color
# Alternative: align color to depth (the depth image stays as-is and the
# color image is re-projected onto it)
# align_to = rs.stream.depth

alignedFs = rs.align(align_to)

profile = pipeline.start(cfg)

try:
    while True:
        fs = pipeline.wait_for_frames()
        aligned_frames = alignedFs.process(fs)

        color_frame = aligned_frames.get_color_frame()
        depth_frame = aligned_frames.get_depth_frame()

        if not depth_frame or not color_frame:
            continue

        color_image = np.asanyarray(color_frame.get_data())
        depth_image = np.asanyarray(depth_frame.get_data())

        # D·C 191122: print the maximum value of depth_image
        # print(depth_image.max())
        # The maximum varies roughly between 10,000 and 60,000, i.e. farthest
        # distances of ten-odd meters up to about 60 m with the default 1 mm depth unit

        # D·C 191122: print the dtypes
        # print(depth_image.dtype)
        # uint16
        # print(color_image.dtype)
        # uint8

        # D·C 191122: print the shapes
        # print(color_image.shape)
        # (480, 640, 3)
        # print(depth_image.shape)
        # (480, 640)

        # D·C 191122: dtype and shape of cv.convertScaleAbs(depth_image, alpha=0.03):
        # print(cv.convertScaleAbs(depth_image, alpha=0.03).dtype)
        # uint8
        # print(cv.convertScaleAbs(depth_image, alpha=0.03).shape)
        # (480, 640)

        # D·C 191122: dtype and shape after cv.applyColorMap(..., cv.COLORMAP_JET):
        # print(cv.applyColorMap(cv.convertScaleAbs(depth_image, alpha=0.03), cv.COLORMAP_JET).dtype)
        # uint8
        # print(cv.applyColorMap(cv.convertScaleAbs(depth_image, alpha=0.03), cv.COLORMAP_JET).shape)
        # (480, 640, 3)

        # D·C 191122: print the maximum of cv.convertScaleAbs(depth_image, alpha=0.03)
        # print(cv.convertScaleAbs(depth_image, alpha=0.03).max())
        # The maximum observed is 255: convertScaleAbs saturates, so any value
        # whose scaled result exceeds 255 is clamped to 255. With alpha=0.03 the
        # largest distinguishable distance is 255 / 0.03 = 8500 mm = 8.5 m.

        # D·C 191122: changing alpha changes the contrast: with alpha=1 the image
        # is mostly red with little contrast, with alpha=0.001 it is mostly blue
        # with little contrast, and values around 1.x are no better
        depth_image = cv.applyColorMap(cv.convertScaleAbs(depth_image, alpha=0.03), cv.COLORMAP_JET)
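
        # Aside (a sketch, not from the original post): rather than hard-coding
        # alpha, one could derive it from a chosen maximum display range, e.g.
        # to visualize distances up to 5 m, assuming the default 1 mm depth unit:
        #     max_range_mm = 5000
        #     alpha = 255.0 / max_range_mm  # anything beyond 5 m saturates to 255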

        images = np.hstack((color_image, depth_image))

        # window = cv.namedWindow('window', cv.WINDOW_AUTOSIZE)

        cv.imshow('window', images)

        # exit the preview loop on Esc
        if cv.waitKey(1) & 0xFF == 27:
            break
finally:
    pipeline.stop()
    cv.destroyAllWindows()
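
A practical payoff of depth-to-color alignment (elaborated in the next section) is that any pixel (u, v) in the color image can be looked up directly in the aligned depth frame. Below is a minimal sketch under the same stream configuration as above; the probed pixel (320, 240) is an arbitrary choice. It reads the metric distance at the image center via get_distance() and also shows how to query the sensor's depth scale:

import pyrealsense2 as rs

pipeline = rs.pipeline()
cfg = rs.config()
cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
cfg.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
profile = pipeline.start(cfg)

# scale from raw z16 units to meters (typically 0.001 on a D435)
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

align = rs.align(rs.stream.color)
try:
    frames = align.process(pipeline.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    if depth_frame:
        # after alignment, depth pixel (320, 240) corresponds 1:1 to color
        # pixel (320, 240); get_distance() already applies the depth scale
        dist_m = depth_frame.get_distance(320, 240)
        print('distance at image center: %.3f m (depth scale: %g)' % (dist_m, depth_scale))
finally:
    pipeline.stop()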

How the alignment works

The term “alignment” in the SDK denotes synthetic image generation with the help of the depth map. In the process a new image is created by a triangulation method consisting of 2D => 3D => 2D transformations.
The transformation is akin to ray tracing: each pixel is first “fired” from 2D into 3D space, and then projected onto the 2D plane of the other sensor.
So by definition the aligned data is generated from depth in conjunction with the color/IR data; hence areas with no depth coverage show up as “holes” in the generated image.
This serves a specific purpose: it allows a 1:1 mapping of depth to color and vice versa. When a user selects an RGB pixel of interest, the SDK can provide the corresponding depth deterministically, with no false positives.
This is a critical feature in scenarios where depth is essentially used as a validation filter for other sensors: collision avoidance, segmentation, object recognition.
While sheer 2D image manipulation would similarly allow recalculating an image from a different point of view, it would also introduce fundamental flaws: it would preserve pixels for which there is no corresponding depth data, and it would introduce artifacts due to occlusion and the different perspective, undermining the declared purpose.
Ultimately there is more than one way to “align” images, depending on the use-case requirements, and the above explanation highlights the specific advantage (and utility) of the method implemented by the SDK.
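
To make the 2D => 3D => 2D path concrete, below is a minimal sketch of doing the per-pixel mapping by hand with the intrinsics/extrinsics helpers that pyrealsense2 exposes. This illustrates the idea rather than reproducing the SDK's internal implementation; the probed pixel (320, 240) and the stream configuration are assumptions carried over from the code above.

import pyrealsense2 as rs

pipeline = rs.pipeline()
cfg = rs.config()
cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
cfg.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(cfg)
try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()

    depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
    color_intrin = color_frame.profile.as_video_stream_profile().intrinsics
    depth_to_color = depth_frame.profile.get_extrinsics_to(color_frame.profile)

    u, v = 320, 240                     # an arbitrary depth pixel
    z = depth_frame.get_distance(u, v)  # its depth in meters (0 means a hole)
    if z > 0:
        # 2D -> 3D: "fire" the pixel into the depth camera's 3D space
        point = rs.rs2_deproject_pixel_to_point(depth_intrin, [u, v], z)
        # 3D -> 3D: move the point into the color camera's coordinate frame
        point = rs.rs2_transform_point_to_point(depth_to_color, point)
        # 3D -> 2D: project it onto the color sensor's image plane
        px = rs.rs2_project_point_to_pixel(color_intrin, point)
        print('depth pixel (%d, %d) -> color pixel (%.1f, %.1f)' % (u, v, px[0], px[1]))
finally:
    pipeline.stop()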


Reference: Align to Depth produces lossy color image #5030
