Geometric Vision in Practice: Camera Parameter Computation, ipyvolume-based 3D Visualization, and Perspective Projection

This article focuses on hands-on practice; for the theory, please start with other articles and references first.

Table of Contents

  • Preface
  • I. Environment Setup
    • 1. JupyterLab
    • 2. Installing packages
    • 3. Preparing the data
  • II. Code Implementation
    • 1. Camera calibration: computing the intrinsics and extrinsics
    • 2. 3D visualization of the chessboard and the camera centers
    • 3. Drawing each calibration image in its camera's viewport in 3D space
    • 4. Placing the bunny model in the scene and projecting it into the images
  • Summary


Preface

This exercise is a programming practice in geometric vision and a review of the computer vision course, meant to find and fill gaps in the knowledge it covers. It implements the computation of camera intrinsics and extrinsics and the 3D visualization of the calibration chessboard, and finally adds a new model to the scene to test the perspective projection.

The code runs in a JupyterLab environment, and the 3D visualization is built on ipyvolume.

The image data and source files can be downloaded here: geometric vision code + images.

I. Environment Setup

1. JupyterLab

JupyterLab can be opened from Anaconda Navigator or launched from a terminal.
[Figure 1]
As shown in the figure, click Launch to start it; the interface is similar to Jupyter Notebook.

2. Installing packages

To get the 3D rendering, the ipyvolume and matplotlib libraries need to be installed in advance.
As shown in the figure, search for ipyvolume in the base environment and install version 0.6.0a8
(older releases may fail to run, so use the most recent version available).
[Figure 2]
matplotlib is installed in the same way; the version used here is matplotlib==3.5.0.
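
Before running the notebooks, it is worth confirming that the packages import correctly. The following is a minimal check that can be run in a JupyterLab cell; it only prints the version attributes of the installed packages:

import ipyvolume
import matplotlib
import cv2
import numpy as np

# Print the installed versions to confirm the environment matches the one described above.
print('ipyvolume :', ipyvolume.__version__)
print('matplotlib:', matplotlib.__version__)
print('opencv    :', cv2.__version__)
print('numpy     :', np.__version__)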

3. Preparing the data

Prepare the photos of the chessboard calibration target in advance (the code below reads ./images/0.jpg through ./images/10.jpg), together with the flat chessboard texture image chessboard.jpg.
[Figure 3]
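
As a quick sanity check of the data layout, the sketch below verifies that every file the scripts expect is present (the paths mirror those used in the code that follows; adjust them if your folders differ):

import os

# Files read by the scripts below:
#   ./images/0.jpg ... ./images/10.jpg   -- photos of the calibration chessboard
#   ./images/chessboard.jpg              -- flat chessboard texture for the 3D plot
#   data/bunny_point_cloud.npy           -- bunny point cloud used in the last section
expected = [f'./images/{i}.jpg' for i in range(11)]
expected += ['./images/chessboard.jpg', 'data/bunny_point_cloud.npy']
missing = [path for path in expected if not os.path.exists(path)]
print('All data files found.' if not missing else f'Missing files: {missing}')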

II. Code Implementation

1. Camera calibration: computing the intrinsics and extrinsics

The code for camera_calibration.py is as follows:

import matplotlib.pyplot as plt
import ipyvolume as ipv
import numpy as np
import cv2

def calibrate_cameras() :
    """ Calibrates cameras from chessboard images. 
    
    Returns: 
        images (list[np.ndarray]): Images containing the chessboard.
        intrinsics (np.ndarray): An upper triangular 4x4 full-rank matrix containing camera intrinsics.
        distortions (np.ndarray): Radial distortion coefficients.
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """    
    
    # Read images
    images = list() 
    for i in range(11): 
        img = cv2.imread(f'./images/{i}.jpg')
        img = cv2.resize(img, None, fx=0.25, fy=0.25)
        images.append(img)     
                
    # chessboard corner search with opencv.
    shape = (6, 9) # The default opencv chessboard has 6 rows, 9 columns 
    object_points_all = [] # 3D world coordinates
    image_points_all = [] # 2D image coordinates
    flags = cv2.CALIB_CB_ADAPTIVE_THRESH + cv2.CALIB_CB_FAST_CHECK + cv2.CALIB_CB_NORMALIZE_IMAGE
    refinement_criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)    
    
    # For each image, store the object points and image points of the chessboard corners.
    images_filtered = list()    
    object_points = np.zeros((1, shape[0] * shape[1], 3), np.float32)
    object_points[0, :, :2] = np.mgrid[0:shape[0], 0:shape[1]].T.reshape(-1, 2)
    for idx, image in enumerate(images): 
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        succes, corners = cv2.findChessboardCorners(image=gray, 
                                                    patternSize=shape,
                                                    flags=flags) 
        if succes:
            images_filtered.append(image)
            corners = cv2.cornerSubPix(image=gray, 
                                       corners=corners, 
                                       winSize=(11,11), 
                                       zeroZone=(-1,-1), 
                                       criteria=refinement_criteria)
            object_points_all.append(object_points)
            image_points_all.append(corners)    
            
    images = images_filtered
    
    # Calibrate the cameras by using the 3D <-> 2D point correspondences.
    ret, intrinsics, distortions, rotation_vectors, translation_vectors = cv2.calibrateCamera(object_points_all, image_points_all, gray.shape[::-1], None, None)
        
    intrinsics = np.hstack([intrinsics, np.zeros((3, 1))])
    intrinsics = np.vstack([intrinsics, [[0, 0, 0, 1]]])

    # Convert chessboard object points to homogeneous coordinates for later use.
    object_points = object_points[0].reshape((-1, 3)).T
    object_points = np.vstack([object_points, np.ones((1, object_points.shape[1]))])
    
    return images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points

if __name__ == "__main__":

    # camera calibration
    images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points = calibrate_cameras()

    print(f'\n\nNumber of calibration images: {len(images)}')
    print(f'Number of calibration points: {object_points.shape[1]}')
    print(f'\n\nSample of images used')
    plt.figure(figsize=(20,20))

    # OpenCV loads images in BGR order; convert to RGB before displaying with matplotlib.
    _ = plt.imshow(cv2.hconcat([cv2.cvtColor(im, cv2.COLOR_BGR2RGB) for im in images[:5]]))

The result is shown in the figure:
[Figure 4]
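
The 4x4 intrinsics matrix returned above is just the usual 3x3 camera matrix K padded to full rank, so the focal lengths and principal point can be read off its top-left block. A small sketch, assuming the variables returned by calibrate_cameras() are still in scope:

# The top-left 3x3 block of the padded intrinsics is the camera matrix K.
K = intrinsics[:3, :3]
fx, fy = K[0, 0], K[1, 1]      # focal lengths in pixels (of the resized images)
cx, cy = K[0, 2], K[1, 2]      # principal point in pixels
print(f'fx = {fx:.1f}, fy = {fy:.1f}, principal point = ({cx:.1f}, {cy:.1f})')
print(f'distortion coefficients: {distortions.ravel()}')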

2. 3D visualization of the chessboard and the camera centers

The code for camera_center.py:

import matplotlib.pyplot as plt
import ipyvolume as ipv
import numpy as np
import cv2

cam_sphere_size = 1

#from camera_calibration import calibrate_cameras

def calibrate_cameras() :
    """ Calibrates cameras from chessboard images. 
    
    Returns: 
        images (list[np.ndarray]): Images containing the chessboard.
        intrinsics (np.ndarray): An upper triangular 4x4 full-rank matrix containing camera intrinsics.
        distortions (np.ndarray): Radial distortion coefficients.
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """    
    
    # Read images
    images = list() 
    for i in range(11): 
        img = cv2.imread(f'./images/{i}.jpg')
        img = cv2.resize(img, None, fx=0.25, fy=0.25)
        images.append(img)     
                
    # chessboard corner search with opencv.
    shape = (6, 9) # The default opencv chessboard has 6 rows, 9 columns 
    object_points_all = [] # 3D world coordinates
    image_points_all = [] # 2D image coordinates
    flags = cv2.CALIB_CB_ADAPTIVE_THRESH + cv2.CALIB_CB_FAST_CHECK + cv2.CALIB_CB_NORMALIZE_IMAGE
    refinement_criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)    
    
    # For each image, store the object points and image points of chessboard corners.
    images_filtered = list()    
    object_points = np.zeros((1, shape[0] * shape[1], 3), np.float32)
    object_points[0, :, :2] = np.mgrid[0:shape[0], 0:shape[1]].T.reshape(-1, 2)
    for idx, image in enumerate(images): 
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        succes, corners = cv2.findChessboardCorners(image=gray, 
                                                    patternSize=shape,
                                                    flags=flags) 
        if succes:
            images_filtered.append(image)
            corners = cv2.cornerSubPix(image=gray, 
                                       corners=corners, 
                                       winSize=(11,11), 
                                       zeroZone=(-1,-1), 
                                       criteria=refinement_criteria)
            object_points_all.append(object_points)
            image_points_all.append(corners)    
            
    images = images_filtered
    
    # Calibrate the cameras by using the 3D <-> 2D point correspondences.
    ret, intrinsics, distortions, rotation_vectors, translation_vectors = cv2.calibrateCamera(object_points_all, image_points_all, gray.shape[::-1], None, None)
        
    intrinsics = np.hstack([intrinsics, np.zeros((3, 1))])
    intrinsics = np.vstack([intrinsics, [[0, 0, 0, 1]]])

    # Convert chessboard object points to homogeneous coordinates for later use.
    object_points = object_points[0].reshape((-1, 3)).T
    object_points = np.vstack([object_points, np.ones((1, object_points.shape[1]))])
    
    return images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points


def extrinsics_from_calibration(rotation_vectors, translation_vectors):
    """ Calculates extrinsic matrices from calibration output. 
        
    Args: 
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
    Returns: 
        extrinsics (list[np.ndarray]): A list of camera extrinsic matrices.
            These matrices are 4x4 full-rank.
    """
    
    rotation_matrices = list() 
    for rot in rotation_vectors:
        rotation_matrices.append(cv2.Rodrigues(rot)[0]) 

    extrinsics = list()
    for rot, trans in zip(rotation_matrices, translation_vectors): 
        extrinsic = np.concatenate([rot, trans], axis=1)
        extrinsic = np.vstack([extrinsic, [[0,0,0,1]]]) 
        extrinsics.append(extrinsic)
    
    return extrinsics

def camera_centers_from_extrinsics(extrinsics):
    """ Calculates camera centers from extrinsic matrices. 
    
    Args: 
        extrinsics (list[np.ndarray]):  A list of camera extrinsic matrices.
    Returns: 
        camera_centers (list[np.ndarray]): Homogeneous coordinates of the camera centers in
            3D world coordinate frame.
    """
    camera_centers = list() 

    for extrinsic in extrinsics:
        rot = extrinsic[:3, :3]
        trans = extrinsic[:3, 3]
        # The camera center in world coordinates is C = -R^(-1) t
        center = np.dot(np.linalg.pinv(rot), -trans)
        center = np.append(center, 1)
        camera_centers.append(center)
    
    return camera_centers

def plot_camera_axis(cam_center, inv_extrinsic, vis_scale):
    """ Plots the x, y, and z basis vectors of the camera coordinate frame.
        This is achieved by  transforming the basis vectors into the world 
        coordinate frame and plotting them. 
    
    Args:
        cam_center (np.ndarray): Coordinates of camera centers in 
            3D world coordinate frame.
        inv_extrinsic (np.ndarray): The (pseudo)-inverse of camera extrinsic matrix.
        vis_scale (int): A visualization size multiplier to make cameras
            appear larger and easier to see.
    """
    x, y, z, _ = cam_center
    # Use the inverse extrinsic matrix to map the camera's x, y and z axis endpoints into world coordinates.
    
    x_arrow = inv_extrinsic @ (1 * vis_scale, 0, 0, 1)
    y_arrow = inv_extrinsic @ (0, 1 * vis_scale, 0, 1) 
    z_arrow =  inv_extrinsic @ (0, 0, 1 * vis_scale,1)
    
    ipv.plot([x, x_arrow[0]], [y, x_arrow[1]], [z, x_arrow[2]], color='red')
    ipv.plot([x, y_arrow[0]], [y, y_arrow[1]], [z, y_arrow[2]], color='blue')
    ipv.plot([x, z_arrow[0]], [y, z_arrow[1]], [z, z_arrow[2]], color='green')
    ipv.scatter(np.array([x]), np.array([y]), np.array([z]), size=cam_sphere_size, marker="sphere", color='blue')

def init_3d_plot():
    """ Initializes a ipyvolume 3d plot and centers the 
        world view around the center of the chessboard.
        
    Returns: 
        fig (ipyvolume.pylab.figure): A 3D plot. 
    """
    chessboard_x_center = 2.5
    chessboard_y_center = 4
    fig = ipv.pylab.figure(figsize=(15, 15), width=800)
    ipv.xlim(chessboard_x_center - 30, chessboard_x_center + 30)
    ipv.ylim(chessboard_y_center - 30, chessboard_y_center + 30)
    ipv.zlim(-50, 10)
    ipv.pylab.view(azimuth=40, elevation=-150)
    return fig

def plot_chessboard(object_points):
    """ Plots a 3D chessboard and highlights the 
        objects points with green spheres. 
        
    Args:
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """
    img = cv2.imread(f'./images/chessboard.jpg')
    img_height, img_width, _ = img.shape
    chessboard_rows, chessboard_cols = 7, 10
    xx, yy = np.meshgrid(np.linspace(0, chessboard_rows, img_height), 
                         np.linspace(0, chessboard_cols, img_width)) 
    zz = np.zeros_like(yy)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255
    # -1 is used as the start of the board images x and y coord, 
    # such that the first inner corner appear as coord (0, 0, 0) 
    ipv.plot_surface(xx-1, yy-1, zz, color=cv2.transpose(img))
    xs, ys, zs, _ = object_points
    ipv.scatter(xs, ys, zs, size=1, marker='sphere', color='lime')


if __name__ == "__main__":
    
    images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points = calibrate_cameras()        

    extrinsics = extrinsics_from_calibration(rotation_vectors, translation_vectors)
    camera_centers = camera_centers_from_extrinsics(extrinsics)

    print(f'Shape of an extrinsic matrix: {extrinsics[0].shape}')
    print(f'Shape of the intrinsic matrix: {intrinsics.shape}')
    print(f'Number of cameras in the scene: {len(camera_centers)}')

    init_3d_plot()

    # Determines the visual scale of the plotted cameras.
    vis_scale = 10

    for cam_center, extrinsic, image in zip(camera_centers, extrinsics, images): 
        height, width, _ = image.shape
        inv_extrinsic = np.linalg.inv(extrinsic)
        plot_camera_axis(cam_center, inv_extrinsic, vis_scale)
    
    plot_chessboard(object_points)

    ipv.show()

The result is shown below:
[Figure 5]
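
A camera center C computed in camera_centers_from_extrinsics should satisfy E @ C = (0, 0, 0, 1), since the center of projection is the origin of the camera coordinate frame. A short consistency check, assuming the variables from the script above are still in scope:

# Each extrinsic matrix should map its own camera center back to the camera-frame origin.
for extrinsic, center in zip(extrinsics, camera_centers):
    origin = extrinsic @ center            # expected to be approximately [0, 0, 0, 1]
    assert np.allclose(origin[:3], 0, atol=1e-6), origin
print('All camera centers are consistent with their extrinsic matrices.')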

3. Drawing each calibration image in its camera's viewport in 3D space

The code for camera_wireframe.py is as follows:

import matplotlib.pyplot as plt
import ipyvolume as ipv
import numpy as np
import cv2

cam_sphere_size = 1
# Visual dimension of the camera in the 3D plot. 

images = list() 
for i in range(11): 
    img = cv2.imread(f'./images/{i}.jpg')
    img = cv2.resize(img, None, fx=0.25, fy=0.25)
    images.append(img)  
     
height, width,_= images[0].shape
camera_aspect_ratio = width / height
# A length of 1 corresponds to the length of 1 chessboard cell.
# Set height of camera viewport to 1.
vis_cam_height = 1 
vis_cam_width = vis_cam_height * camera_aspect_ratio
wire_frame_depth = 1.2


def calibrate_cameras() :
    """ Calibrates cameras from chessboard images. 
    
    Returns: 
        images (list[np.ndarray]): Images containing the chessboard.
        intrinsics (np.ndarray): An upper triangular 4x4 full-rank matrix containing camera intrinsics.
        distortions (np.ndarray): Radial distortion coefficients.
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """    
    
    # Read images
    images = list() 
    for i in range(11): 
        img = cv2.imread(f'./images/{i}.jpg')
        img = cv2.resize(img, None, fx=0.25, fy=0.25)
        images.append(img)     
                
    # chessboard corner search with opencv.
    shape = (6, 9) # The default opencv chessboard has 6 rows, 9 columns 
    object_points_all = [] # 3D world coordinates
    image_points_all = [] # 2D image coordinates
    flags = cv2.CALIB_CB_ADAPTIVE_THRESH + cv2.CALIB_CB_FAST_CHECK + cv2.CALIB_CB_NORMALIZE_IMAGE
    refinement_criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)    
    
    # For each image, store the object points and image points of chessboard corners.
    images_filtered = list()    
    object_points = np.zeros((1, shape[0] * shape[1], 3), np.float32)
    object_points[0, :, :2] = np.mgrid[0:shape[0], 0:shape[1]].T.reshape(-1, 2)
    for idx, image in enumerate(images): 
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        succes, corners = cv2.findChessboardCorners(image=gray, 
                                                    patternSize=shape,
                                                    flags=flags) 
        if succes:
            images_filtered.append(image)
            corners = cv2.cornerSubPix(image=gray, 
                                       corners=corners, 
                                       winSize=(11,11), 
                                       zeroZone=(-1,-1), 
                                       criteria=refinement_criteria)
            object_points_all.append(object_points)
            image_points_all.append(corners)    
            
    images = images_filtered
    
    # Calibrate the cameras by using the 3D <-> 2D point correspondences.
    ret, intrinsics, distortions, rotation_vectors, translation_vectors = cv2.calibrateCamera(object_points_all, image_points_all, gray.shape[::-1], None, None)
        
    intrinsics = np.hstack([intrinsics, np.zeros((3, 1))])
    intrinsics = np.vstack([intrinsics, [[0, 0, 0, 1]]])

    # Convert chessboard object points to homogeneous coordinates for later use.
    object_points = object_points[0].reshape((-1, 3)).T
    object_points = np.vstack([object_points, np.ones((1, object_points.shape[1]))])
    
    return images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points


#from camera_center import extrinsics_from_calibration, camera_centers_from_extrinsics, init_3d_plot, plot_chessboard

def extrinsics_from_calibration(rotation_vectors, translation_vectors):
    """ Calculates extrinsic matrices from calibration output. 
        
    Args: 
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
    Returns: 
        extrinsics (list[np.ndarray]): A list of camera extrinsic matrices.
            These matrices are 4x4 full-rank.
    """
    
    rotation_matrices = list() 
    for rot in rotation_vectors:
        rotation_matrices.append(cv2.Rodrigues(rot)[0]) 

    extrinsics = list()
    for rot, trans in zip(rotation_matrices, translation_vectors): 
        extrinsic = np.concatenate([rot, trans], axis=1)
        extrinsic = np.vstack([extrinsic, [[0,0,0,1]]]) 
        extrinsics.append(extrinsic)
    
    return extrinsics

def camera_centers_from_extrinsics(extrinsics):
    """ Calculates camera centers from extrinsic matrices. 
    
    Args: 
        extrinsics (list[np.ndarray]):  A list of camera extrinsic matrices.
    Returns: 
        camera_centers (list[np.ndarray]): Homogeneous coordinates of the camera centers in
            3D world coordinate frame.
    """
    camera_centers = list() 

    for extrinsic in extrinsics:
        rot = extrinsic[:3, :3]
        trans = extrinsic[:3, 3]
        # The camera center in world coordinates is C = -R^(-1) t
        center = np.dot(np.linalg.pinv(rot), -trans)
        center = np.append(center, 1)
        camera_centers.append(center)
    
    return camera_centers

def init_3d_plot():
    """ Initializes a ipyvolume 3d plot and centers the 
        world view around the center of the chessboard.
        
    Returns: 
        fig (ipyvolume.pylab.figure): A 3D plot. 
    """
    chessboard_x_center = 2.5
    chessboard_y_center = 4
    fig = ipv.pylab.figure(figsize=(15, 15), width=800)
    ipv.xlim(chessboard_x_center - 30, chessboard_x_center + 30)
    ipv.ylim(chessboard_y_center - 30, chessboard_y_center + 30)
    ipv.zlim(-50, 10)
    ipv.pylab.view(azimuth=40, elevation=-150)
    return fig



def plot_camera_wireframe(cam_center, vis_scale, inv_extrinsic):
    """ Plots the wireframe of a camera's viewport.打印摄影机视口的线框。 """
    x, y, z, _ = cam_center 
    
    # Get left/right top/bottom wireframe coordinates
    # Use the inverse of the camera's extrinsic matrix to convert 
    # coordinates relative to the camera to world coordinates.
    lt = inv_extrinsic @ np.array((-vis_cam_width/2, -vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    rt = inv_extrinsic @ np.array((vis_cam_width/2,  -vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    lb = inv_extrinsic @ np.array((-vis_cam_width/2,  vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    rb = inv_extrinsic @ np.array((vis_cam_width/2,   vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale

    # Connect camera projective center to wireframe extremities
    ipv.plot([x, lt[0]], [y, lt[1]], [z, lt[2]], color='blue')
    ipv.plot([x, rt[0]], [y, rt[1]], [z, rt[2]], color='blue')
    ipv.plot([x, lb[0]], [y, lb[1]], [z, lb[2]], color='blue')
    ipv.plot([x, rb[0]], [y, rb[1]], [z, rb[2]], color='blue')
    
    # Connect wireframe corners with a rectangle
    ipv.plot([lt[0], rt[0]], [lt[1], rt[1]], [lt[2], rt[2]], color='blue')
    ipv.plot([rt[0], rb[0]], [rt[1], rb[1]], [rt[2], rb[2]], color='blue')
    ipv.plot([rb[0], lb[0]], [rb[1], lb[1]], [rb[2], lb[2]], color='blue')
    ipv.plot([lb[0], lt[0]], [lb[1], lt[1]], [lb[2], lt[2]], color='blue')
    ipv.scatter(np.array([x]), np.array([y]), np.array([z]), size=cam_sphere_size, marker="sphere", color='blue')

def plot_chessboard(object_points):
    """ Plots a 3D chessboard and highlights the 
        objects points with green spheres. 
        
    Args:
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """
    img = cv2.imread(f'./images/chessboard.jpg')
    img_height, img_width, _ = img.shape
    chessboard_rows, chessboard_cols = 7, 10
    xx, yy = np.meshgrid(np.linspace(0, chessboard_rows, img_height), 
                         np.linspace(0, chessboard_cols, img_width)) 
    zz = np.zeros_like(yy)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255
    # -1 is used as the start of the board images x and y coord, 
    # such that the first inner corner appear as coord (0, 0, 0) 
    ipv.plot_surface(xx-1, yy-1, zz, color=cv2.transpose(img))
    xs, ys, zs, _ = object_points
    ipv.scatter(xs, ys, zs, size=1, marker='sphere', color='lime')    
    
def plot_picture(image, inv_extrinsic, vis_scale):
    """ Plots a real-world image its respective 3D camera wireframe. 打印真实世界图像及其相应的3D相机线框。""" 
    image = image.copy()
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 
    image = cv2.resize(image, None, fx=0.1, fy=0.1) / 255
    img_height, img_width, _ = image.shape

    xx, yy = np.meshgrid(np.linspace(-vis_cam_width/2 * vis_scale,  vis_cam_width/2 * vis_scale,  img_width), 
                         np.linspace(-vis_cam_height/2 * vis_scale, vis_cam_height/2 * vis_scale, img_height))
    zz = np.ones_like(yy) * wire_frame_depth * vis_scale
    coords = np.stack([xx, yy, zz, np.ones_like(zz)]) 
    coords = coords.reshape(4, -1) 
     
    # Convert camera-relative coordinates to world coordinates.
    coords = inv_extrinsic @ coords
    xx, yy, zz, ones = coords.reshape(4, img_height, img_width) 
    ipv.plot_surface(xx, yy, zz, color=image)
    
    
def project_points_to_picture(image, object_points, intrinsics, extrinsic):
    """ Perspective projects points to an image and draws them green. """
    image = image.copy()
    # Perspective projection: multiply by the padded intrinsics and the extrinsic
    # matrix, then divide by the depth (third row) to get pixel coordinates.
    proj_matrix = np.dot(intrinsics, extrinsic)
    object_points = np.dot(proj_matrix, object_points)
    xs, ys, ones, disparity = object_points / object_points[2]
    
    for idx, (x, y) in enumerate(zip(xs, ys)):
        x = round(x)
        y = round(y)
        if (0 < y < image.shape[0] and
            0 < x < image.shape[1]):
            # Each point occupies a 20x20 pixel area in the image.
            image[y-10:y+10, x-10:x+10] = [0, 255, 0] 
    
    return image

if __name__ == "__main__":
    images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points = calibrate_cameras()    
    extrinsics = extrinsics_from_calibration(rotation_vectors, translation_vectors)
    camera_centers = camera_centers_from_extrinsics(extrinsics)

    init_3d_plot()
    
    plot_chessboard(object_points)    
    
    vis_scale = 10
   
    for cam_center, extrinsic, image in zip(camera_centers, extrinsics, images):
        image = project_points_to_picture(image, object_points, intrinsics, extrinsic)
        inv_extrinsic = np.linalg.pinv(extrinsic)

        # Plot cameras as viewport wireframes
        plot_camera_wireframe(cam_center, vis_scale, inv_extrinsic)

        # Plot images in the camera wireframes
        plot_picture(image, inv_extrinsic, vis_scale)
        
    ipv.show()  

The result is shown below:
[Figure 6]
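
project_points_to_picture implements the standard pinhole projection: a homogeneous world point is multiplied by the padded intrinsics and the extrinsic matrix, then divided by its depth. As a minimal worked example (assuming the variables from the script above are still in scope), the chessboard corner at the world origin can be projected into the first image like this:

# Project the chessboard corner at world coordinate (0, 0, 0) into camera 0.
X = np.array([0.0, 0.0, 0.0, 1.0])           # homogeneous world point
proj_matrix = intrinsics @ extrinsics[0]     # 4x4 projection matrix: padded K times E
x = proj_matrix @ X
u, v = x[0] / x[2], x[1] / x[2]              # perspective divide by the depth
print(f'Corner (0, 0, 0) projects to pixel ({u:.1f}, {v:.1f}) in image 0')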

4. Placing the bunny model in the scene and projecting it into the images

The code for plot_bunny.py:

import matplotlib.pyplot as plt
import ipyvolume as ipv
import numpy as np
import cv2

cam_sphere_size = 1
# Visual dimension of the camera in the 3D plot. 
images = list() 
for i in range(11): 
    img = cv2.imread(f'./images/{i}.jpg')
    img = cv2.resize(img, None, fx=0.25, fy=0.25)
    images.append(img)  
height, width,_= images[0].shape
camera_aspect_ratio = width / height
# A length of 1 corresponds to the length of 1 chessboard cell.
# This is because a chessboard points have been defined as such.
# Set height of camera viewport to 1.
vis_cam_height = 1 
vis_cam_width = vis_cam_height * camera_aspect_ratio
wire_frame_depth = 1.2

#from camera_calibration import calibrate_cameras
#from camera_center import extrinsics_from_calibration, camera_centers_from_extrinsics, init_3d_plot, plot_chessboard
#from camera_wireframe import plot_camera_wireframe, plot_picture, project_points_to_picture

def calibrate_cameras() :
    """ Calibrates cameras from chessboard images. 
    
    Returns: 
        images (list[np.ndarray]): Images containing the chessboard.
        intrinsics (np.ndarray): An upper triangular 4x4 full-rank matrix containing camera intrinsics.
        distortions (np.ndarray): Radial distortion coefficients.
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
        object_points (np.ndarray): A (4, 54) point array, representing the [x, y, z, w]
            of the 54 chessboard points (homogeneous coordinates).
    """    
    
    # Read images
    images = list() 
    for i in range(11): 
        img = cv2.imread(f'./images/{i}.jpg')
        img = cv2.resize(img, None, fx=0.25, fy=0.25)
        images.append(img)     
                
    # chessboard corner search with opencv.
    shape = (6, 9) # The default opencv chessboard has 6 rows, 9 columns 
    object_points_all = [] # 3D world coordinates
    image_points_all = [] # 2D image coordinates
    flags = cv2.CALIB_CB_ADAPTIVE_THRESH + cv2.CALIB_CB_FAST_CHECK + cv2.CALIB_CB_NORMALIZE_IMAGE
    refinement_criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)    
    
    # For each image, store the object points and image points of chessboard corners.
    images_filtered = list()    
    object_points = np.zeros((1, shape[0] * shape[1], 3), np.float32)
    object_points[0, :, :2] = np.mgrid[0:shape[0], 0:shape[1]].T.reshape(-1, 2)
    for idx, image in enumerate(images): 
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        succes, corners = cv2.findChessboardCorners(image=gray, 
                                                    patternSize=shape,
                                                    flags=flags) 
        if succes:
            images_filtered.append(image)
            corners = cv2.cornerSubPix(image=gray, 
                                       corners=corners, 
                                       winSize=(11,11), 
                                       zeroZone=(-1,-1), 
                                       criteria=refinement_criteria)
            object_points_all.append(object_points)
            image_points_all.append(corners)    
            
    images = images_filtered
    
    # Calibrate the cameras by using the 3D <-> 2D point correspondences.
    ret, intrinsics, distortions, rotation_vectors, translation_vectors = cv2.calibrateCamera(object_points_all, image_points_all, gray.shape[::-1], None, None)
        
    intrinsics = np.hstack([intrinsics, np.zeros((3, 1))])
    intrinsics = np.vstack([intrinsics, [[0, 0, 0, 1]]])

    # Convert chessboard object points to homogeneous coordinates for later use.
    object_points = object_points[0].reshape((-1, 3)).T
    object_points = np.vstack([object_points, np.ones((1, object_points.shape[1]))])
    
    return images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points


#from camera_center import extrinsics_from_calibration, camera_centers_from_extrinsics, init_3d_plot, plot_chessboard

def extrinsics_from_calibration(rotation_vectors, translation_vectors):
    """ Calculates extrinsic matrices from calibration output. 
        
    Args: 
        rotation_vectors (list[np.ndarray]): Rodrigues rotation vectors.
        translation_vectors (list[np.ndarray]): Translation vectors.
    Returns: 
        extrinsics (list[np.ndarray]): A list of camera extrinsic matrices.
            These matrices are 4x4 full-rank.
    """
    
    rotation_matrices = list() 
    for rot in rotation_vectors:
        rotation_matrices.append(cv2.Rodrigues(rot)[0]) 

    extrinsics = list()
    for rot, trans in zip(rotation_matrices, translation_vectors): 
        extrinsic = np.concatenate([rot, trans], axis=1)
        extrinsic = np.vstack([extrinsic, [[0,0,0,1]]]) 
        extrinsics.append(extrinsic)
    
    return extrinsics

def camera_centers_from_extrinsics(extrinsics):
    """ Calculates camera centers from extrinsic matrices. 
    
    Args: 
        extrinsics (list[np.ndarray]):  A list of camera extrinsic matrices.
    Returns: 
        camera_centers (list[np.ndarray]): Homogeneous coordinates of the camera centers in
            3D world coordinate frame.
    """
    camera_centers = list() 

    for extrinsic in extrinsics:
        rot = extrinsic[:3, :3]
        trans = extrinsic[:3, 3]
        # The camera center in world coordinates is C = -R^(-1) t
        center = np.dot(np.linalg.pinv(rot), -trans)
        center = np.append(center, 1)
        camera_centers.append(center)
    
    return camera_centers

def init_3d_plot():
    """ Initializes a ipyvolume 3d plot and centers the 
        world view around the center of the chessboard.
        
    Returns: 
        fig (ipyvolume.pylab.figure): A 3D plot. 
    """
    chessboard_x_center = 2.5
    chessboard_y_center = 4
    fig = ipv.pylab.figure(figsize=(15, 15), width=800)
    ipv.xlim(chessboard_x_center - 30, chessboard_x_center + 30)
    ipv.ylim(chessboard_y_center - 30, chessboard_y_center + 30)
    ipv.zlim(-50, 10)
    ipv.pylab.view(azimuth=40, elevation=-150)
    return fig

def plot_camera_wireframe(cam_center, vis_scale, inv_extrinsic):
    """ Plots the wireframe of a camera's viewport. """
    x, y, z, _ = cam_center 
    
    # Get left/right top/bottom wireframe coordinates
    # Use the inverse of the camera's extrinsic matrix to convert 
    # coordinates relative to the camera to world coordinates.
    lt = inv_extrinsic @ np.array((-vis_cam_width/2, -vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    rt = inv_extrinsic @ np.array((vis_cam_width/2,  -vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    lb = inv_extrinsic @ np.array((-vis_cam_width/2,  vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale
    rb = inv_extrinsic @ np.array((vis_cam_width/2,   vis_cam_height/2, wire_frame_depth, 1/vis_scale)) * vis_scale

    # Connect camera projective center to wireframe extremities
    ipv.plot([x, lt[0]], [y, lt[1]], [z, lt[2]], color='blue')
    ipv.plot([x, rt[0]], [y, rt[1]], [z, rt[2]], color='blue')
    ipv.plot([x, lb[0]], [y, lb[1]], [z, lb[2]], color='blue')
    ipv.plot([x, rb[0]], [y, rb[1]], [z, rb[2]], color='blue')
    
    # Connect wireframe corners with a rectangle
    ipv.plot([lt[0], rt[0]], [lt[1], rt[1]], [lt[2], rt[2]], color='blue')
    ipv.plot([rt[0], rb[0]], [rt[1], rb[1]], [rt[2], rb[2]], color='blue')
    ipv.plot([rb[0], lb[0]], [rb[1], lb[1]], [rb[2], lb[2]], color='blue')
    ipv.plot([lb[0], lt[0]], [lb[1], lt[1]], [lb[2], lt[2]], color='blue')
    ipv.scatter(np.array([x]), np.array([y]), np.array([z]), size=cam_sphere_size, marker="sphere", color='blue')

def plot_picture(image, inv_extrinsic, vis_scale):
    """ Plots a real-world image its respective 3D camera wireframe. """ 
    image = image.copy()
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 
    image = cv2.resize(image, None, fx=0.1, fy=0.1) / 255
    img_height, img_width, _ = image.shape

    xx, yy = np.meshgrid(np.linspace(-vis_cam_width/2 * vis_scale,  vis_cam_width/2 * vis_scale,  img_width), 
                         np.linspace(-vis_cam_height/2 * vis_scale, vis_cam_height/2 * vis_scale, img_height))
    zz = np.ones_like(yy) * wire_frame_depth * vis_scale
    coords = np.stack([xx, yy, zz, np.ones_like(zz)]) 
    coords = coords.reshape(4, -1) 
     
    # Convert camera-relative coordinates to world coordinates.
    coords = inv_extrinsic @ coords
    xx, yy, zz, ones = coords.reshape(4, img_height, img_width) 
    ipv.plot_surface(xx, yy, zz, color=image)
    
    
def project_points_to_picture(image, object_points, intrinsics, extrinsic):
    """ Perspective projects points to an image and draws them green. """
    image = image.copy()
    # Perspective projection: multiply by the padded intrinsics and the extrinsic
    # matrix, then divide by the depth (third row) to get pixel coordinates.
    proj_matrix = np.dot(intrinsics, extrinsic)
    object_points = np.dot(proj_matrix, object_points)
    xs, ys, ones, disparity = object_points / object_points[2]
    
    for idx, (x, y) in enumerate(zip(xs, ys)):
        x = round(x)
        y = round(y)
        if (0 < y < image.shape[0] and
            0 < x < image.shape[1]):
            # Each point occupies a 20x20 pixel area in the image.
            image[y-10:y+10, x-10:x+10] = [0, 255, 0] 
    
    return image

def plot_bunny():
    """Plots the Stanford bunny pointcloud and returns its points. 
    
    Returns: 
        bunny_coords (np.ndarray): A 4xN homogeneous point cloud array.
    """
    bunny_coords = np.load(open('data/bunny_point_cloud.npy', 'rb'))
    b_xs, b_ys, b_zs = bunny_coords[:3]
    bunny_point_size = 0.5
    ipv.scatter(b_xs, b_ys, b_zs, size=bunny_point_size, marker="sphere", color='lime')
    return bunny_coords

if __name__ == "__main__":
    images, intrinsics, distortions, rotation_vectors, translation_vectors, object_points = calibrate_cameras()    
    extrinsics = extrinsics_from_calibration(rotation_vectors, translation_vectors)
    camera_centers = camera_centers_from_extrinsics(extrinsics)

    init_3d_plot()
    bunny_coords = plot_bunny() 
    
    vis_scale=10
    for cam_center, extrinsic, image in zip(camera_centers, extrinsics, images):
        image = project_points_to_picture(image, object_points, intrinsics, extrinsic)
        # Project the bunny point cloud into the same image
        image = project_points_to_picture(image, bunny_coords, intrinsics, extrinsic)
        inv_extrinsic = np.linalg.pinv(extrinsic)
        plot_camera_wireframe(cam_center, vis_scale, inv_extrinsic)
        plot_picture(image, inv_extrinsic, vis_scale)
  
    ipv.show()      

The result is shown below:
[Figure 7]
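
Since the bunny is just a 4xN array of homogeneous points, it can be repositioned in world coordinates with a 4x4 transform before projection, without touching the projection code. The sketch below shifts it by a few chessboard cells and projects it into the first image; the offsets are arbitrary values chosen purely for illustration:

# Translate the bunny by 3 cells along x and 2 cells along y, then project it into image 0.
shift = np.eye(4)
shift[:3, 3] = [3, 2, 0]                      # illustrative offsets in chessboard-cell units
moved_bunny = shift @ bunny_coords            # still a 4xN homogeneous point cloud
image0 = project_points_to_picture(images[0], moved_bunny, intrinsics, extrinsics[0])
plt.imshow(cv2.cvtColor(image0, cv2.COLOR_BGR2RGB))   # preview the augmented image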

Summary

That concludes this practice session. If you have any questions, feel free to follow me and send a private message!
