《Python计算机视觉编程》一书中关于增强现实茶壶显示的程序

大家好,我是第一次写CSDN博客,也是刚开始学习用Python进行计算机视觉编程,有很多不懂和不足的地方,希望大家多包涵。以下纯粹是我个人的一些实际操作经历。
在《Python计算机视觉编程》一书中,有关于增强现实AR的一部分编程程序。尤其是一段在相片上显示一个虚拟茶壶的程序,在网上(包括CSDN)有很多程序。这里我个人觉得这2篇文章算不错的,给出链接。第一个是https://blog.csdn.net/Dujing2019/article/details/91691202,第二个是https://blog.csdn.net/titansm/article/details/89057184。但直接使用这些程序,不一定能得到它们说的同样的效果和结果(至少我不是),所以我贴出自己的代码,以供大家参考。

1、一些基本的我就不说了,例如软件和包的安装等,大家参考以上两个链接和其他文章,这里我说点和其它文章不一样的地方。对了,我的环境是Win7 64位,装了anaconda3和python3.6。
2、首先是以上两个链接中都会提到的sift.process_image等等函数,你会发现文章中根本没细说这写函数的内容是什么。其实这些函数在《Python计算机视觉编程》一书中有详细提及,只是以上链接文章没提罢了,这里我把所有相关的程序待会在下面都列出来。
3、sift.process_image函数是将.bmp文件转变成.sift文件。那么如何转变呢?这里你需要下载一个vlfeat-0.9.20-bin.tar.gz压缩包(注意必须是0.9.20版本,0.9.21版本没用),解压缩后将其中/bin/win32中的sift.exe、vl.dll、vl.lib拷贝到和你这个AR python程序同一个文件夹下。很奇怪我是64位版本为什么用win32中的文件吧?其实我也奇怪,但我试过了,用win64根本不行,只有win32可以,我也不知道为什么。
4、另外,你在安装完PyOpenGL后,需要将OpenGL库文件(由于我安装了anaconda3,因此我的OpenGL库文件在Continuum\anaconda3\Lib\site-packages\OpenGL)下的DLLS文件夹中的部分东西删除,只保留gle_AUTHORS、gle_COPYING、gle_COPYING.src、gle64.vc14.dll、glut64.vc14.dll这六个文件即可。否则在编译函数draw_teapot时会跳出一个莫名其妙的错误。
5、这一点非常关键,我找遍了好多文章,最后终于在第二个链接中找到了,也就是width和height的值。一定不能用大多数文章里使用的1000和747,要用你自己那个.bmp的像素。比如我的.bmp文件的像素是553和369(可以点击右键–>属性–>详细信息看到该文件的像素)。
另外关于这一点,我做了若干次尝试:当width,height = 553,369时,效果OK;当width,height = 553,300时,效果也OK;当width不是553时,哪怕是554或552,出来的背景效果都是极度扭曲的。
还有,width和height的值不能太大,否则编译时就会报错,什么内存读取溢出或错误OSError: exception: access violation reading 0x0000000008FC7000。网上很多人说这是64位版本问题,算是一个bug,具体我也搞不清楚。所以你在生成.bmp图片时,就要记住图片像素值不能太大。
好,接下来我就贴代码了。

首先生成一个空文件sift.py,将它和其它AR python程序放在同一个文件夹。sift.py里的程序是:

from PIL import Image
from numpy import *
from pylab import *
import os

def process_image(imagename, resultname, params='--edge-thresh 10 --peak-thresh 5'):
    if imagename[-3:] != 'pgm':
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'
    cmmd = str("sift " + imagename + " --output=" + resultname + " " + params)
    os.system(cmmd)
    print(str("processed " + imagename + " to " + resultname))

def read_features_from_file(filename):
    f = loadtxt(filename)
    return (f[:, :4], f[:, 4:])
    
def match(desc1,desc2):
    desc1 = array([d/linalg.norm(d) for d in desc1])
    desc2 = array([d/linalg.norm(d) for d in desc2])
    dist_ratio = 0.6
    desc1_size = desc1.shape
    matchscores = zeros((desc1_size[0],1),'int')
    desc2t = desc2.T
    for i in range(desc1_size[0]):
        dotprods = dot(desc1[i,:],desc2t)
        dotprods = 0.9999*dotprods
        indx = argsort(arccos(dotprods))
        if arccos(dotprods)[indx[0]] < dist_ratio*arccos(dotprods)[indx[1]]:
            matchscores[i] = int(indx[0])
    return matchscores

def match_twosided(desc1,desc2):
    matches_12 = match(desc1, desc2)
    matches_21 = match(desc2, desc1)
    ndx_12 = matches_12.nonzero()[0]
    for n in ndx_12:
        if matches_21[matches_12[n]] != n:
            matches_12[n] = 0
    return matches_12

然后生成一个空文件homography.py,将它和其它AR python程序放在同一个文件夹。homography.py里的程序是:

import os
from PIL import Image
from numpy import *
from pylab import *
from scipy import ndimage
import ransac

def make_homog(points):
	return vstack((points,ones((1,points.shape[1]))))

class RansacModel(object):
	def __init__(self,debug=False):
		self.debug = debug
	def fit(self,data):
		data = data.T
		fp = data[:3,:4]
		tp = data[3:,:4]
		return H_from_points(fp,tp)
	def get_error(self,data,H):
		data = data.T
		fp = data[:3]
		tp = data[3:]
		fp_transformed = dot(H,fp)
		for i in range(3):
			fp_transformed[i] /= fp_transformed[2]
		return sqrt(sum((tp-fp_transformed)**2,axis=0))

def H_from_ransac(fp,tp,model,maxiter=1000,match_threshold=10):
	data = vstack((fp,tp))
	H,ransac_data = ransac.ransac(data.T,model,4,maxiter,match_threshold,10,return_all=True)
	return H,ransac_data['inliers']

再生成一个空文件ransac.py,将它和其它AR python程序放在同一个文件夹。ransac.py里的程序是(ransac.py其实网上能下载,这里我把代码都贴出来吧):

import numpy
import scipy # use numpy if scipy unavailable
import scipy.linalg # use numpy if scipy unavailable

## Copyright (c) 2004-2007, Andrew D. Straw. All rights reserved.

## Redistribution and use in source and binary forms, with or without
## modification, are permitted provided that the following conditions are
## met:

##     * Redistributions of source code must retain the above copyright
##       notice, this list of conditions and the following disclaimer.

##     * Redistributions in binary form must reproduce the above
##       copyright notice, this list of conditions and the following
##       disclaimer in the documentation and/or other materials provided
##       with the distribution.

##     * Neither the name of the Andrew D. Straw nor the names of its
##       contributors may be used to endorse or promote products derived
##       from this software without specific prior written permission.

## THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
## "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
## LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
## A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
## OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
## SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
## LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
## DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
## THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
## (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
## OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

def ransac(data,model,n,k,t,d,debug=False,return_all=False):
    """fit model parameters to data using the RANSAC algorithm
    
This implementation written from pseudocode found at
http://en.wikipedia.org/w/index.php?title=RANSAC&oldid=116358182

{{{
Given:
    data - a set of observed data points
    model - a model that can be fitted to data points
    n - the minimum number of data values required to fit the model
    k - the maximum number of iterations allowed in the algorithm
    t - a threshold value for determining when a data point fits a model
    d - the number of close data values required to assert that a model fits well to data
Return:
    bestfit - model parameters which best fit the data (or nil if no good model is found)
iterations = 0
bestfit = nil
besterr = something really large
while iterations < k {
    maybeinliers = n randomly selected values from data
    maybemodel = model parameters fitted to maybeinliers
    alsoinliers = empty set
    for every point in data not in maybeinliers {
        if point fits maybemodel with an error smaller than t
             add point to alsoinliers
    }
    if the number of elements in alsoinliers is > d {
        % this implies that we may have found a good model
        % now test how good it is
        bettermodel = model parameters fitted to all points in maybeinliers and alsoinliers
        thiserr = a measure of how well model fits these points
        if thiserr < besterr {
            bestfit = bettermodel
            besterr = thiserr
        }
    }
    increment iterations
}
return bestfit
}}}
"""
    iterations = 0
    bestfit = None
    besterr = numpy.inf
    best_inlier_idxs = None
    while iterations < k:
        maybe_idxs, test_idxs = random_partition(n,data.shape[0])
        maybeinliers = data[maybe_idxs,:]
        test_points = data[test_idxs]
        maybemodel = model.fit(maybeinliers)
        test_err = model.get_error( test_points, maybemodel)
        also_idxs = test_idxs[test_err < t] # select indices of rows with accepted points
        alsoinliers = data[also_idxs,:]
        if debug:
            '''
            print 'test_err.min()',test_err.min()
            print 'test_err.max()',test_err.max()
            print 'numpy.mean(test_err)',numpy.mean(test_err)
            print 'iteration %d:len(alsoinliers) = %d'%(
                iterations,len(alsoinliers))
            '''
        if len(alsoinliers) > d:
            betterdata = numpy.concatenate( (maybeinliers, alsoinliers) )
            bettermodel = model.fit(betterdata)
            better_errs = model.get_error( betterdata, bettermodel)
            thiserr = numpy.mean( better_errs )
            if thiserr < besterr:
                bestfit = bettermodel
                besterr = thiserr
                best_inlier_idxs = numpy.concatenate( (maybe_idxs, also_idxs) )
        iterations+=1
    if bestfit is None:
        raise ValueError("did not meet fit acceptance criteria")
    if return_all:
        return bestfit, {'inliers':best_inlier_idxs}
    else:
        return bestfit

def random_partition(n,n_data):
    """return n random rows of data (and also the other len(data)-n rows)"""
    all_idxs = numpy.arange( n_data )
    numpy.random.shuffle(all_idxs)
    idxs1 = all_idxs[:n]
    idxs2 = all_idxs[n:]
    return idxs1, idxs2

class LinearLeastSquaresModel:
    """linear system solved using linear least squares

    This class serves as an example that fulfills the model interface
    needed by the ransac() function.
    
    """
    def __init__(self,input_columns,output_columns,debug=False):
        self.input_columns = input_columns
        self.output_columns = output_columns
        self.debug = debug
    def fit(self, data):
        A = numpy.vstack([data[:,i] for i in self.input_columns]).T
        B = numpy.vstack([data[:,i] for i in self.output_columns]).T
        x,resids,rank,s = scipy.linalg.lstsq(A,B)
        return x
    def get_error( self, data, model):
        A = numpy.vstack([data[:,i] for i in self.input_columns]).T
        B = numpy.vstack([data[:,i] for i in self.output_columns]).T
        B_fit = scipy.dot(A,model)
        err_per_point = numpy.sum((B-B_fit)**2,axis=1) # sum squared error per row
        return err_per_point
        
def test():
    # generate perfect input data

    n_samples = 500
    n_inputs = 1
    n_outputs = 1
    A_exact = 20*numpy.random.random((n_samples,n_inputs) )
    perfect_fit = 60*numpy.random.normal(size=(n_inputs,n_outputs) ) # the model
    B_exact = scipy.dot(A_exact,perfect_fit)
    assert B_exact.shape == (n_samples,n_outputs)

    # add a little gaussian noise (linear least squares alone should handle this well)
    A_noisy = A_exact + numpy.random.normal(size=A_exact.shape )
    B_noisy = B_exact + numpy.random.normal(size=B_exact.shape )

    if 1:
        # add some outliers
        n_outliers = 100
        all_idxs = numpy.arange( A_noisy.shape[0] )
        numpy.random.shuffle(all_idxs)
        outlier_idxs = all_idxs[:n_outliers]
        non_outlier_idxs = all_idxs[n_outliers:]
        A_noisy[outlier_idxs] =  20*numpy.random.random((n_outliers,n_inputs) )
        B_noisy[outlier_idxs] = 50*numpy.random.normal(size=(n_outliers,n_outputs) )

    # setup model

    all_data = numpy.hstack( (A_noisy,B_noisy) )
    input_columns = range(n_inputs) # the first columns of the array
    output_columns = [n_inputs+i for i in range(n_outputs)] # the last columns of the array
    debug = False
    model = LinearLeastSquaresModel(input_columns,output_columns,debug=debug)

    linear_fit,resids,rank,s = scipy.linalg.lstsq(all_data[:,input_columns],
                                                  all_data[:,output_columns])

    # run RANSAC algorithm
    ransac_fit, ransac_data = ransac(all_data,model,
                                     50, 1000, 7e3, 300, # misc. parameters
                                     debug=debug,return_all=True)
    if 1:
        import pylab

        sort_idxs = numpy.argsort(A_exact[:,0])
        A_col0_sorted = A_exact[sort_idxs] # maintain as rank-2 array

        if 1:
            pylab.plot( A_noisy[:,0], B_noisy[:,0], 'k.', label='data' )
            pylab.plot( A_noisy[ransac_data['inliers'],0], B_noisy[ransac_data['inliers'],0], 'bx', label='RANSAC data' )
        else:
            pylab.plot( A_noisy[non_outlier_idxs,0], B_noisy[non_outlier_idxs,0], 'k.', label='noisy data' )
            pylab.plot( A_noisy[outlier_idxs,0], B_noisy[outlier_idxs,0], 'r.', label='outlier data' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,ransac_fit)[:,0],
                    label='RANSAC fit' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,perfect_fit)[:,0],
                    label='exact system' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,linear_fit)[:,0],
                    label='linear fit' )
        pylab.legend()
        pylab.show()

if __name__=='__main__':
    test()

接下来是主程序了。生成一个空文件AR_Test.py,将它和其它AR python程序放在同一个文件夹。AR_Test.py里的程序是(这里基本就是直接复制黏贴链接一和链接二了):

from pylab import *
from OpenGL.GL import *
from OpenGL.GLU import *
from OpenGL.GLUT import *
import pygame, pygame.image
from pygame.locals import *
from numpy import *
import homography,camera,sift
import math
from PIL import Image


def cube_points(c, wid):
    p = []
    # bottom
    p.append([c[0]-wid, c[1]-wid, c[2]-wid])
    p.append([c[0]-wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]-wid, c[2]-wid])
    p.append([c[0]-wid, c[1]-wid, c[2]-wid]) #same as first to close plot
    # top
    p.append([c[0]-wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]-wid, c[2]+wid]) #same as first to close plot

    # vertical sides
    p.append([c[0]-wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]-wid])

    return array(p).T

def my_calibration(sz):
    row, col = sz
    fx = 2555*col/2592
    fy = 2586*row/1936
    K = diag([fx, fy, 1])
    K[0, 2] = 0.5*col
    K[1, 2] = 0.5*row
    return K

def set_projection_from_camera(K):  # 获取视图
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()
    fx = K[0, 0]
    fy = K[1, 1]
    fovy = 2 * math.atan(0.5 * height / fy) * 180 / math.pi
    aspect = (width * fy) / (height * fx)
    # 定义近和远的剪裁平面
    near = 0.1
    far = 100.0
    # 设定透视
    gluPerspective(fovy, aspect, near, far)
    glViewport(0, 0, width, height)


def set_modelview_from_camera(Rt):  # 获取矩阵
    glMatrixMode(GL_MODELVIEW)
    glLoadIdentity()
    # 围绕x轴将茶壶旋转90度,使z轴向上
    Rx = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])
    # 获得旋转的最佳逼近
    R = Rt[:, :3]
    U, S, V = np.linalg.svd(R)
    R = np.dot(U, V)
    R[0, :] = -R[0, :]  # 改变x轴的符号
    # 获得平移量
    t = Rt[:, 3]
    # 获得4*4的的模拟视图矩阵
    M = np.eye(4)
    M[:3, :3] = np.dot(R, Rx)
    M[:3, 3] = t
    # 转置并压平以获取列序数值
    M = M.T
    m = M.flatten()
    # 将模拟视图矩阵替换成新的矩阵
    glLoadMatrixf(m)


def draw_background(imname):
    # 载入背景图像
    bg_image = pygame.image.load(imname).convert()
    bg_data = pygame.image.tostring(bg_image, 'RGBX', 1)  # 将图像转为字符串描述
    glMatrixMode(GL_MODELVIEW)  # 将当前矩阵指定为投影矩阵
    glLoadIdentity()  # 把矩阵设为单位矩阵

    #glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)  # 清楚颜色、深度缓冲
    glEnable(GL_TEXTURE_2D)  # 纹理映射
    glBindTexture(GL_TEXTURE_2D, glGenTextures(1))
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, bg_data)
    #glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST)
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST)

    # 绑定纹理
    glBegin(GL_QUADS)
    glTexCoord2f(0.0, 0.0);glVertex3f(-1.0, -1.0, -1.0)
    glTexCoord2f(1.0, 0.0);glVertex3f(1.0, -1.0, -1.0)
    glTexCoord2f(1.0, 1.0);glVertex3f(1.0, 1.0, -1.0)
    glTexCoord2f(0.0, 1.0);glVertex3f(-1.0, 1.0, -1.0)
    glEnd()
    glDeleteTextures(1)  # 清除纹理

def draw_teapot(size):  # 红色茶壶
    glEnable(GL_LIGHTING)
    glEnable(GL_LIGHT0)
    glEnable(GL_DEPTH_TEST)
    glClear(GL_DEPTH_BUFFER_BIT)
    # 绘制红色茶壶
    glMaterialfv(GL_FRONT, GL_AMBIENT, [0, 0, 0, 0])
    glMaterialfv(GL_FRONT, GL_DIFFUSE, [0.5, 0.0, 0.0, 0.0])
    glMaterialfv(GL_FRONT, GL_SPECULAR, [0.7, 0.6, 0.6, 0.0])
    glMaterialf(GL_FRONT, GL_SHININESS, 0.25 * 128.0)
    glutSolidTeapot(size)

def drawFunc(size):  # 白色茶壶
    glRotatef(0.5, 5, 5, 0)  # (角度,x,y,z)
    glutWireTeapot(size)
    # 刷新显示
    glFlush()

def load_and_draw_model(filename):
    glEnable(GL_LIGHTING)
    glEnable(GL_LIGHT0)
    glEnable(GL_DEPTH_TEST)
    glClear(GL_DEPTH_BUFFER_BIT)

    glMaterialfv(GL_FRONT,GL_AMBIENT,[0,0,0,0])
    glMaterialfv(GL_FRONT, GL_DIFFUSE, [0.5, 0.75, 1.0, 0.0])
    glMaterialf(GL_FRONT, GL_SHININESS, 0.25*128.0)

    import objloader
    obj = objloader.OBJ(filename,swapyz=True)
    glCallList(obj.gl_list)

#这里一定要是.bmp图片的分辨率,不要迷信网上的1000,747
width,height = 553,369
def setup():
    pygame.init()
    pygame.display.set_mode((width,height),OPENGL | DOUBLEBUF)
    pygame.display.set_caption("OpenGL AR demo")




# 计算特征,先对着一本书竖着排一张book0.jpg,再对着这本书横着拍一张book1.jpg
sift.process_image('book0.jpg', 'im0.sift')
l0, d0 = sift.read_features_from_file('im0.sift')
sift.process_image('book1.jpg', 'im1.sift')
l1, d1 = sift.read_features_from_file('im1.sift')

# 匹配特征,计算单应性矩阵
matches = sift.match_twosided(d0, d1)
ndx = matches.nonzero()[0]
fp = homography.make_homog(l0[ndx, :2].T)
ndx2 = [int(matches[i]) for i in ndx]
tp = homography.make_homog(l1[ndx2, :2].T)

model = homography.RansacModel()
H, inliers = homography.H_from_ransac(fp, tp, model)

# 计算照相机标定矩阵
K = my_calibration((height, width))
# 位于边长为0.2,z=0平面上的三维点
box = cube_points([0, 0, 0.1], 0.1)

# 投影第一幅图下个上底部的正方形
cam1 = camera.Camera(hstack((K, dot(K, array([[0], [0], [-1]])))))
# 底部正方形上的点
box_cam1 = cam1.project(homography.make_homog(box[:, :5]))

# 使用H将点变换到第二幅图像中
box_trans = homography.normalize(dot(H, box_cam1))

# 从cam1和H中计算第二个照相机矩阵
cam2 = camera.Camera(dot(H, cam1.P))
A = dot(linalg.inv(K), cam2.P[:, :3])
A = array([A[:, 0], A[:, 1], cross(A[:, 0], A[:, 1])]).T
cam2.P[:, :3] = dot(K, A)
# 使用第二个照相机矩阵投影
box_cam2 = cam2.project(homography.make_homog(box))



#这一段程序是在book图上画一个AR立方体
point = array([1,1,0,1]).T
print(homography.normalize(dot(dot(H,cam1.P),point)))
print(cam2.project(point))
im0 = array(Image.open('book0.jpg'))
im1 = array(Image.open('book1.jpg'))
figure()
imshow(im0)
plot(box_cam1[0,:],box_cam1[1,:],linewidth=3)
figure()
imshow(im1)
plot(box_trans[0,:],box_trans[1,:],linewidth=3)
figure()
imshow(im1)
plot(box_cam2[0,:],box_cam2[1,:],linewidth=3)
show()


Rt = dot(linalg.inv(K),cam2.P)
setup()
draw_background("book1.bmp")
set_projection_from_camera(K)
set_modelview_from_camera(Rt)
draw_teapot(0.1)
#drawFunc(0.1)

pygame.display.flip()
while True:
    for event in pygame.event.get():
        if event.type==pygame.QUIT:
            sys.exit()

好,到这里基本结束了。第一次写CSDN博客,不足之处还望大家海涵。希望对你有帮助!

你可能感兴趣的:(Python,计算机视觉编程)