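Hi everyone. This is my first CSDN blog post, and I have only just started learning computer vision programming with Python, so there is a lot I don't understand yet; please bear with me. What follows is purely my own hands-on experience.
The book Programming Computer Vision with Python (《Python计算机视觉编程》) has a section on augmented reality (AR), including a program that renders a virtual teapot on top of a photo. There are many versions of this program online (including on CSDN). I think these two articles are among the better ones: the first is https://blog.csdn.net/Dujing2019/article/details/91691202 and the second is https://blog.csdn.net/titansm/article/details/89057184. However, running their code as-is does not necessarily give the results they describe (it did not for me), so I am posting my own code for reference.
1. I'll skip the basics, such as installing the software and packages; see the two links above and other articles for those. Here I'll cover what differs from other write-ups. My environment is 64-bit Windows 7 with anaconda3 and Python 3.6.
2. Both linked articles use functions such as sift.process_image, but neither explains what these functions contain. They are covered in detail in the book; the linked articles simply leave them out. I list all of the related code further down.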
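3. sift.process_image converts an image file (for example a .bmp) into a .sift feature file. To do this you need to download the vlfeat-0.9.20-bin.tar.gz archive (it has to be version 0.9.20; 0.9.21 did not work for me), extract it, and copy sift.exe, vl.dll and vl.lib from /bin/win32 into the same folder as your AR Python programs. You may wonder why I use the win32 files on a 64-bit system. So do I, but in my tests the win64 files did not work at all and only the win32 ones did; I don't know why.
4. After installing PyOpenGL, you also need to delete some of the files in the DLLS folder of the OpenGL package (I installed anaconda3, so mine is under Continuum\anaconda3\Lib\site-packages\OpenGL), keeping only the following files: gle_AUTHORS, gle_COPYING, gle_COPYING.src, gle64.vc14.dll and glut64.vc14.dll. Otherwise, calling draw_teapot raises an obscure error.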
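5. This point is critical. I searched through many articles and finally found it in the second link: the values of width and height. Do not use the 1000 and 747 that most articles use; use the pixel dimensions of your own .bmp. My .bmp is 553 x 369 pixels (right-click the file -> Properties -> Details to see the dimensions).
I also ran a few experiments on this: with width, height = 553, 369 the result is fine; with width, height = 553, 300 it is also fine; but whenever width is not 553, even 554 or 552, the background comes out badly distorted.
Also, width and height must not be too large, or the program fails at run time with a memory read error such as OSError: exception: access violation reading 0x0000000008FC7000. Many people online say this is a problem with the 64-bit build, a kind of bug; I can't say for sure. So when you create the .bmp, keep its pixel dimensions modest.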
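A likely explanation for point 5 (my reading, not something I have verified inside the OpenGL driver): glTexImage2D reads width*height pixels row by row from the flat buffer that pygame.image.tostring returns. If width is off by even one pixel, every row starts at the wrong offset and the image shears; if width*height is larger than the real image, the read runs past the end of the buffer, which could be what triggers the access violation. The sketch below reads the real dimensions from the BMP header so they don't have to be typed by hand. The helper names bmp_size and rgbx_buffer_matches are my own, and the header layout assumes the common 40-byte BITMAPINFOHEADER format.

```python
import struct


def bmp_size(path):
    """Return (width, height) in pixels, read from a BMP file header.

    Assumes the common BITMAPINFOHEADER layout, where width and height
    are stored at byte offsets 18 and 22 as little-endian 32-bit integers.
    """
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:2] != b"BM":
        raise ValueError("not a BMP file: %s" % path)
    width, height = struct.unpack("<ii", header[18:26])
    # A negative height only means the rows are stored top-down.
    return width, abs(height)


def rgbx_buffer_matches(data_len, width, height):
    """True if a 4-bytes-per-pixel buffer holds exactly width*height pixels."""
    return data_len == width * height * 4
```

In AR_Test.py this could replace the hard-coded numbers, e.g. width, height = bmp_size('book1.bmp'), and rgbx_buffer_matches(len(bg_data), width, height) could be checked before calling glTexImage2D.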
OK, now for the code.
First create an empty file sift.py in the same folder as the other AR Python programs. Its contents are:
from PIL import Image
from numpy import *
from pylab import *
import os


def process_image(imagename, resultname, params='--edge-thresh 10 --peak-thresh 5'):
    """Run the sift binary on an image and save the features to resultname."""
    if imagename[-3:] != 'pgm':
        # sift reads .pgm files, so convert to a grayscale temporary file first
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'
    cmmd = str("sift " + imagename + " --output=" + resultname + " " + params)
    os.system(cmmd)
    print(str("processed " + imagename + " to " + resultname))


def read_features_from_file(filename):
    """Read a .sift file and return (locations, descriptors)."""
    f = loadtxt(filename)
    return (f[:, :4], f[:, 4:])


def match(desc1,desc2):
    """For each descriptor in desc1, find its match in desc2 (0 means no match)."""
    desc1 = array([d/linalg.norm(d) for d in desc1])
    desc2 = array([d/linalg.norm(d) for d in desc2])
    dist_ratio = 0.6
    desc1_size = desc1.shape
    matchscores = zeros((desc1_size[0],1),'int')
    desc2t = desc2.T
    for i in range(desc1_size[0]):
        dotprods = dot(desc1[i,:],desc2t)
        dotprods = 0.9999*dotprods
        # sort candidates by angle (inverse cosine of the dot product)
        indx = argsort(arccos(dotprods))
        # accept only if the best angle is clearly smaller than the second best
        if arccos(dotprods)[indx[0]] < dist_ratio*arccos(dotprods)[indx[1]]:
            matchscores[i] = int(indx[0])
    return matchscores


def match_twosided(desc1,desc2):
    """Two-sided version of match(): keep only matches that agree in both directions."""
    matches_12 = match(desc1, desc2)
    matches_21 = match(desc2, desc1)
    ndx_12 = matches_12.nonzero()[0]
    for n in ndx_12:
        if matches_21[matches_12[n]] != n:
            matches_12[n] = 0
    return matches_12
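process_image above calls os.system and never checks whether sift.exe actually ran, so a missing binary only shows up later when the .sift file cannot be read. Here is a minimal sketch of a stricter variant using subprocess. The function names build_sift_command and run_sift and the sift_exe parameter are my own; the command-line options are the same ones used in the listing above.

```python
import os
import subprocess


def build_sift_command(imagename, resultname,
                       params="--edge-thresh 10 --peak-thresh 5",
                       sift_exe="sift"):
    """Build the argument list for the sift binary (same options as process_image)."""
    return [sift_exe, imagename, "--output=" + resultname] + params.split()


def run_sift(imagename, resultname, **kwargs):
    """Run sift and fail loudly if it did not produce the output file.

    imagename is assumed to already be a .pgm file (see process_image).
    """
    cmd = build_sift_command(imagename, resultname, **kwargs)
    # check=True raises CalledProcessError on a non-zero exit code;
    # a missing executable raises FileNotFoundError.
    subprocess.run(cmd, check=True)
    if not os.path.exists(resultname):
        raise RuntimeError("sift produced no output file: " + resultname)
    return resultname
```

Passing the arguments as a list also avoids quoting problems when a file name contains spaces.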
Next create an empty file homography.py in the same folder as the other AR Python programs. Its contents are:
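import os
from PIL import Image
from numpy import *
from pylab import *
from scipy import ndimage
import ransac

# Note: H_from_points() and normalize() are called by this code and by AR_Test.py
# but are not part of this listing; they belong to the book's homography module.


def make_homog(points):
    """Convert a set of points (dim*n array) to homogeneous coordinates."""
    return vstack((points,ones((1,points.shape[1]))))


class RansacModel(object):
    """Homography model with the fit/get_error interface that ransac.py expects."""

    def __init__(self,debug=False):
        self.debug = debug

    def fit(self,data):
        """Fit a homography to the four selected correspondences."""
        data = data.T
        fp = data[:3,:4]  # source points
        tp = data[3:,:4]  # target points
        return H_from_points(fp,tp)

    def get_error(self,data,H):
        """Apply H to all correspondences and return the distance error per point."""
        data = data.T
        fp = data[:3]
        tp = data[3:]
        fp_transformed = dot(H,fp)
        # normalize the homogeneous coordinates
        for i in range(3):
            fp_transformed[i] /= fp_transformed[2]
        return sqrt(sum((tp-fp_transformed)**2,axis=0))


def H_from_ransac(fp,tp,model,maxiter=1000,match_threshold=10):
    """Robustly estimate a homography from point correspondences using RANSAC."""
    data = vstack((fp,tp))
    H,ransac_data = ransac.ransac(data.T,model,4,maxiter,match_threshold,10,return_all=True)
    return H,ransac_data['inliers']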
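The listing above calls H_from_points(), and AR_Test.py calls homography.normalize(), but neither function appears in it. The book's versions should be used if you have them. As a stand-in, here is my own minimal sketch using the standard direct linear transform (DLT) with point conditioning. It assumes the points are 3*n arrays in homogeneous coordinates with last row 1 (which is what make_homog produces), and it uses the explicit np. prefix instead of the star imports. The helper _conditioner is hypothetical.

```python
import numpy as np


def normalize(points):
    """Divide homogeneous points (one per column) by their last coordinate."""
    points = np.asarray(points, dtype=float)
    return points / points[-1]


def _conditioner(p):
    """Matrix that moves points to zero mean and roughly unit spread."""
    m = p[:2].mean(axis=1)
    s = p[:2].std(axis=1).max() + 1e-9
    C = np.diag([1.0 / s, 1.0 / s, 1.0])
    C[0, 2] = -m[0] / s
    C[1, 2] = -m[1] / s
    return C


def H_from_points(fp, tp):
    """Estimate H such that tp ~ H * fp, from 4 or more correspondences (DLT)."""
    if fp.shape != tp.shape:
        raise RuntimeError("number of points do not match")
    C1, C2 = _conditioner(fp), _conditioner(tp)
    fpn, tpn = np.dot(C1, fp), np.dot(C2, tp)
    n = fp.shape[1]
    A = np.zeros((2 * n, 9))
    for i in range(n):
        x, y = fpn[0, i], fpn[1, i]
        u, v = tpn[0, i], tpn[1, i]
        A[2 * i] = [-x, -y, -1, 0, 0, 0, u * x, u * y, u]
        A[2 * i + 1] = [0, 0, 0, -x, -y, -1, v * x, v * y, v]
    # The solution is the right singular vector for the smallest singular value.
    U, S, V = np.linalg.svd(A)
    H = V[8].reshape((3, 3))
    # Undo the conditioning and fix the scale.
    H = np.dot(np.linalg.inv(C2), np.dot(H, C1))
    return H / H[2, 2]
```

If these are pasted into homography.py, RansacModel.fit and the normalize call in AR_Test.py should find them. Note that this normalize returns a new array rather than modifying its argument in place.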
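Then create an empty file ransac.py in the same folder as the other AR Python programs. Its contents are below (ransac.py can actually be downloaded online, but I'll paste the whole thing here):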
import numpy
import scipy # use numpy if scipy unavailable
import scipy.linalg # use numpy if scipy unavailable
## Copyright (c) 2004-2007, Andrew D. Straw. All rights reserved.
## Redistribution and use in source and binary forms, with or without
## modification, are permitted provided that the following conditions are
## met:
## * Redistributions of source code must retain the above copyright
## notice, this list of conditions and the following disclaimer.
## * Redistributions in binary form must reproduce the above
## copyright notice, this list of conditions and the following
## disclaimer in the documentation and/or other materials provided
## with the distribution.
## * Neither the name of the Andrew D. Straw nor the names of its
## contributors may be used to endorse or promote products derived
## from this software without specific prior written permission.
## THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
## "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
## LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
## A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
## OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
## SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
## LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
## DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
## THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
## (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
## OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
def ransac(data,model,n,k,t,d,debug=False,return_all=False):
    """fit model parameters to data using the RANSAC algorithm

This implementation written from pseudocode found at
http://en.wikipedia.org/w/index.php?title=RANSAC&oldid=116358182

{{{
Given:
    data - a set of observed data points
    model - a model that can be fitted to data points
    n - the minimum number of data values required to fit the model
    k - the maximum number of iterations allowed in the algorithm
    t - a threshold value for determining when a data point fits a model
    d - the number of close data values required to assert that a model fits well to data
Return:
    bestfit - model parameters which best fit the data (or nil if no good model is found)
iterations = 0
bestfit = nil
besterr = something really large
while iterations < k {
    maybeinliers = n randomly selected values from data
    maybemodel = model parameters fitted to maybeinliers
    alsoinliers = empty set
    for every point in data not in maybeinliers {
        if point fits maybemodel with an error smaller than t
             add point to alsoinliers
    }
    if the number of elements in alsoinliers is > d {
        % this implies that we may have found a good model
        % now test how good it is
        bettermodel = model parameters fitted to all points in maybeinliers and alsoinliers
        thiserr = a measure of how well model fits these points
        if thiserr < besterr {
            bestfit = bettermodel
            besterr = thiserr
        }
    }
    increment iterations
}
return bestfit
}}}
"""
    iterations = 0
    bestfit = None
    besterr = numpy.inf
    best_inlier_idxs = None
    while iterations < k:
        maybe_idxs, test_idxs = random_partition(n,data.shape[0])
        maybeinliers = data[maybe_idxs,:]
        test_points = data[test_idxs]
        maybemodel = model.fit(maybeinliers)
        test_err = model.get_error( test_points, maybemodel)
        also_idxs = test_idxs[test_err < t] # select indices of rows with accepted points
        alsoinliers = data[also_idxs,:]
        if debug:
            # Python 3 print calls (the original used Python 2 print statements)
            print('test_err.min()',test_err.min())
            print('test_err.max()',test_err.max())
            print('numpy.mean(test_err)',numpy.mean(test_err))
            print('iteration %d:len(alsoinliers) = %d'%(
                iterations,len(alsoinliers)))
        if len(alsoinliers) > d:
            betterdata = numpy.concatenate( (maybeinliers, alsoinliers) )
            bettermodel = model.fit(betterdata)
            better_errs = model.get_error( betterdata, bettermodel)
            thiserr = numpy.mean( better_errs )
            if thiserr < besterr:
                bestfit = bettermodel
                besterr = thiserr
                best_inlier_idxs = numpy.concatenate( (maybe_idxs, also_idxs) )
        iterations+=1
    if bestfit is None:
        raise ValueError("did not meet fit acceptance criteria")
    if return_all:
        return bestfit, {'inliers':best_inlier_idxs}
    else:
        return bestfit


def random_partition(n,n_data):
    """return n random rows of data (and also the other len(data)-n rows)"""
    all_idxs = numpy.arange( n_data )
    numpy.random.shuffle(all_idxs)
    idxs1 = all_idxs[:n]
    idxs2 = all_idxs[n:]
    return idxs1, idxs2
class LinearLeastSquaresModel:
    """linear system solved using linear least squares

    This class serves as an example that fulfills the model interface
    needed by the ransac() function.
    """
    def __init__(self,input_columns,output_columns,debug=False):
        self.input_columns = input_columns
        self.output_columns = output_columns
        self.debug = debug

    def fit(self, data):
        A = numpy.vstack([data[:,i] for i in self.input_columns]).T
        B = numpy.vstack([data[:,i] for i in self.output_columns]).T
        x,resids,rank,s = scipy.linalg.lstsq(A,B)
        return x

    def get_error( self, data, model):
        A = numpy.vstack([data[:,i] for i in self.input_columns]).T
        B = numpy.vstack([data[:,i] for i in self.output_columns]).T
        # numpy.dot is used here because scipy.dot is not available in newer SciPy releases
        B_fit = numpy.dot(A,model)
        err_per_point = numpy.sum((B-B_fit)**2,axis=1) # sum squared error per row
        return err_per_point


def test():
    # generate perfect input data
    n_samples = 500
    n_inputs = 1
    n_outputs = 1
    A_exact = 20*numpy.random.random((n_samples,n_inputs) )
    perfect_fit = 60*numpy.random.normal(size=(n_inputs,n_outputs) ) # the model
    B_exact = numpy.dot(A_exact,perfect_fit)
    assert B_exact.shape == (n_samples,n_outputs)
    # add a little gaussian noise (linear least squares alone should handle this well)
    A_noisy = A_exact + numpy.random.normal(size=A_exact.shape )
    B_noisy = B_exact + numpy.random.normal(size=B_exact.shape )
    if 1:
        # add some outliers
        n_outliers = 100
        all_idxs = numpy.arange( A_noisy.shape[0] )
        numpy.random.shuffle(all_idxs)
        outlier_idxs = all_idxs[:n_outliers]
        non_outlier_idxs = all_idxs[n_outliers:]
        A_noisy[outlier_idxs] = 20*numpy.random.random((n_outliers,n_inputs) )
        B_noisy[outlier_idxs] = 50*numpy.random.normal(size=(n_outliers,n_outputs) )
    # setup model
    all_data = numpy.hstack( (A_noisy,B_noisy) )
    input_columns = list(range(n_inputs)) # the first columns of the array
    output_columns = [n_inputs+i for i in range(n_outputs)] # the last columns of the array
    debug = False
    model = LinearLeastSquaresModel(input_columns,output_columns,debug=debug)
    linear_fit,resids,rank,s = scipy.linalg.lstsq(all_data[:,input_columns],
                                                  all_data[:,output_columns])
    # run RANSAC algorithm
    ransac_fit, ransac_data = ransac(all_data,model,
                                     50, 1000, 7e3, 300, # misc. parameters
                                     debug=debug,return_all=True)
    if 1:
        import pylab
        sort_idxs = numpy.argsort(A_exact[:,0])
        A_col0_sorted = A_exact[sort_idxs] # maintain as rank-2 array
        if 1:
            pylab.plot( A_noisy[:,0], B_noisy[:,0], 'k.', label='data' )
            pylab.plot( A_noisy[ransac_data['inliers'],0], B_noisy[ransac_data['inliers'],0], 'bx', label='RANSAC data' )
        else:
            pylab.plot( A_noisy[non_outlier_idxs,0], B_noisy[non_outlier_idxs,0], 'k.', label='noisy data' )
            pylab.plot( A_noisy[outlier_idxs,0], B_noisy[outlier_idxs,0], 'r.', label='outlier data' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,ransac_fit)[:,0],
                    label='RANSAC fit' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,perfect_fit)[:,0],
                    label='exact system' )
        pylab.plot( A_col0_sorted[:,0],
                    numpy.dot(A_col0_sorted,linear_fit)[:,0],
                    label='linear fit' )
        pylab.legend()
        pylab.show()


if __name__=='__main__':
    test()
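The k argument of ransac() (maxiter=1000 in H_from_ransac) caps the number of random samples. A common rule of thumb, not something taken from the book or the linked articles, estimates how many samples are needed so that, with probability p, at least one sample of n points contains only inliers: k = log(1 - p) / log(1 - w^n), where w is the fraction of inliers among the matches. The function name ransac_iterations below is my own.

```python
import math


def ransac_iterations(p_success, inlier_ratio, n):
    """Number of RANSAC samples needed so that, with probability p_success,
    at least one sample of n points consists of inliers only."""
    if not (0.0 < p_success < 1.0 and 0.0 < inlier_ratio < 1.0):
        raise ValueError("p_success and inlier_ratio must lie strictly between 0 and 1")
    p_all_inliers = inlier_ratio ** n  # chance that one sample is outlier-free
    return int(math.ceil(math.log(1.0 - p_success) / math.log(1.0 - p_all_inliers)))
```

For the homography case n is 4. With half of the SIFT matches correct, ransac_iterations(0.99, 0.5, 4) gives 72, so 1000 iterations is plenty; if only 20% of the matches are correct, the estimate exceeds 1000 and maxiter may need to be raised.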
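Now for the main program. Create an empty file AR_Test.py in the same folder as the other AR Python programs. Its contents are below (this is essentially copied and pasted from the first and second links):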
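from pylab import *
from OpenGL.GL import *
from OpenGL.GLU import *
from OpenGL.GLUT import *
import pygame, pygame.image
from pygame.locals import *
from numpy import *
# camera is the book's camera module (Camera class); it is not listed in this post
import homography,camera,sift
import math
import sys  # needed for sys.exit() in the event loop at the end
from PIL import Image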
def cube_points(c, wid):
    p = []
    # bottom
    p.append([c[0]-wid, c[1]-wid, c[2]-wid])
    p.append([c[0]-wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]-wid, c[2]-wid])
    p.append([c[0]-wid, c[1]-wid, c[2]-wid]) #same as first to close plot
    # top
    p.append([c[0]-wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]-wid, c[2]+wid]) #same as first to close plot
    # vertical sides
    p.append([c[0]-wid, c[1]-wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]+wid])
    p.append([c[0]-wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]-wid])
    p.append([c[0]+wid, c[1]+wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]+wid])
    p.append([c[0]+wid, c[1]-wid, c[2]-wid])
    return array(p).T


def my_calibration(sz):
    row, col = sz
    fx = 2555*col/2592
    fy = 2586*row/1936
    K = diag([fx, fy, 1])
    K[0, 2] = 0.5*col
    K[1, 2] = 0.5*row
    return K
def set_projection_from_camera(K):  # set the projection from the calibration matrix
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()
    fx = K[0, 0]
    fy = K[1, 1]
    fovy = 2 * math.atan(0.5 * height / fy) * 180 / math.pi
    aspect = (width * fy) / (height * fx)
    # define the near and far clipping planes
    near = 0.1
    far = 100.0
    # set the perspective
    gluPerspective(fovy, aspect, near, far)
    glViewport(0, 0, width, height)


def set_modelview_from_camera(Rt):  # set the model-view matrix from the camera pose
    glMatrixMode(GL_MODELVIEW)
    glLoadIdentity()
    # rotate the teapot 90 degrees around the x-axis so that z points up
    Rx = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])
    # set the rotation to its best approximation
    R = Rt[:, :3]
    U, S, V = np.linalg.svd(R)
    R = np.dot(U, V)
    R[0, :] = -R[0, :]  # change the sign of the x-axis
    # set the translation
    t = Rt[:, 3]
    # set up the 4x4 model-view matrix
    M = np.eye(4)
    M[:3, :3] = np.dot(R, Rx)
    M[:3, 3] = t
    # transpose and flatten to get column-major order
    M = M.T
    m = M.flatten()
    # replace the model-view matrix with the new matrix
    glLoadMatrixf(m)
def draw_background(imname):
    # load the background image
    bg_image = pygame.image.load(imname).convert()
    bg_data = pygame.image.tostring(bg_image, 'RGBX', 1)  # convert the image to a byte string
    glMatrixMode(GL_MODELVIEW)  # select the model-view matrix
    glLoadIdentity()  # reset it to the identity matrix
    #glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)  # clear the color and depth buffers
    glEnable(GL_TEXTURE_2D)  # enable texture mapping
    glBindTexture(GL_TEXTURE_2D, glGenTextures(1))
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, bg_data)
    #glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST)
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST)
    # draw a textured quad that fills the whole window
    glBegin(GL_QUADS)
    glTexCoord2f(0.0, 0.0);glVertex3f(-1.0, -1.0, -1.0)
    glTexCoord2f(1.0, 0.0);glVertex3f(1.0, -1.0, -1.0)
    glTexCoord2f(1.0, 1.0);glVertex3f(1.0, 1.0, -1.0)
    glTexCoord2f(0.0, 1.0);glVertex3f(-1.0, 1.0, -1.0)
    glEnd()
    glDeleteTextures(1)  # clear the texture


def draw_teapot(size):  # red teapot
    glEnable(GL_LIGHTING)
    glEnable(GL_LIGHT0)
    glEnable(GL_DEPTH_TEST)
    glClear(GL_DEPTH_BUFFER_BIT)
    # draw a red teapot
    glMaterialfv(GL_FRONT, GL_AMBIENT, [0, 0, 0, 0])
    glMaterialfv(GL_FRONT, GL_DIFFUSE, [0.5, 0.0, 0.0, 0.0])
    glMaterialfv(GL_FRONT, GL_SPECULAR, [0.7, 0.6, 0.6, 0.0])
    glMaterialf(GL_FRONT, GL_SHININESS, 0.25 * 128.0)
    glutSolidTeapot(size)


def drawFunc(size):  # white wireframe teapot
    glRotatef(0.5, 5, 5, 0)  # (angle, x, y, z)
    glutWireTeapot(size)
    # flush the display
    glFlush()


def load_and_draw_model(filename):
    glEnable(GL_LIGHTING)
    glEnable(GL_LIGHT0)
    glEnable(GL_DEPTH_TEST)
    glClear(GL_DEPTH_BUFFER_BIT)
    glMaterialfv(GL_FRONT,GL_AMBIENT,[0,0,0,0])
    glMaterialfv(GL_FRONT, GL_DIFFUSE, [0.5, 0.75, 1.0, 0.0])
    glMaterialf(GL_FRONT, GL_SHININESS, 0.25*128.0)
    import objloader
    obj = objloader.OBJ(filename,swapyz=True)
    glCallList(obj.gl_list)
# this must be the resolution of your own .bmp; don't blindly copy the 1000, 747 used online
width,height = 553,369


def setup():
    pygame.init()
    pygame.display.set_mode((width,height),OPENGL | DOUBLEBUF)
    pygame.display.set_caption("OpenGL AR demo")


# compute features; first photograph a book in portrait orientation (book0.jpg),
# then photograph the same book in landscape orientation (book1.jpg)
sift.process_image('book0.jpg', 'im0.sift')
l0, d0 = sift.read_features_from_file('im0.sift')
sift.process_image('book1.jpg', 'im1.sift')
l1, d1 = sift.read_features_from_file('im1.sift')
# match features and estimate the homography
matches = sift.match_twosided(d0, d1)
ndx = matches.nonzero()[0]
fp = homography.make_homog(l0[ndx, :2].T)
ndx2 = [int(matches[i]) for i in ndx]
tp = homography.make_homog(l1[ndx2, :2].T)
model = homography.RansacModel()
H, inliers = homography.H_from_ransac(fp, tp, model)
# compute the camera calibration matrix
K = my_calibration((height, width))
# 3D points of a cube with side length 0.2, resting on the plane z=0
box = cube_points([0, 0, 0.1], 0.1)
# project the bottom square into the first image
cam1 = camera.Camera(hstack((K, dot(K, array([[0], [0], [-1]])))))
# the first five points are the bottom square
box_cam1 = cam1.project(homography.make_homog(box[:, :5]))
# use H to transfer the points to the second image
box_trans = homography.normalize(dot(H, box_cam1))
# compute the second camera matrix from cam1 and H
cam2 = camera.Camera(dot(H, cam1.P))
A = dot(linalg.inv(K), cam2.P[:, :3])
A = array([A[:, 0], A[:, 1], cross(A[:, 0], A[:, 1])]).T
cam2.P[:, :3] = dot(K, A)
# project with the second camera
box_cam2 = cam2.project(homography.make_homog(box))
# this part draws the AR cube on the book images
point = array([1,1,0,1]).T
print(homography.normalize(dot(dot(H,cam1.P),point)))
print(cam2.project(point))
im0 = array(Image.open('book0.jpg'))
im1 = array(Image.open('book1.jpg'))
figure()
imshow(im0)
plot(box_cam1[0,:],box_cam1[1,:],linewidth=3)
figure()
imshow(im1)
plot(box_trans[0,:],box_trans[1,:],linewidth=3)
figure()
imshow(im1)
plot(box_cam2[0,:],box_cam2[1,:],linewidth=3)
show()
Rt = dot(linalg.inv(K),cam2.P)
setup()
draw_background("book1.bmp")
set_projection_from_camera(K)
set_modelview_from_camera(Rt)
draw_teapot(0.1)
#drawFunc(0.1)
pygame.display.flip()
while True:
    for event in pygame.event.get():
        if event.type==pygame.QUIT:
            sys.exit()
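AR_Test.py imports a camera module and uses camera.Camera(P), the .project() method and the .P attribute, but camera.py is not listed in this post. The book's version should be used if you have it. As a stand-in, here is my own minimal sketch of a pinhole camera class that covers only what the script above needs; the book's class probably has more methods.

```python
import numpy as np


class Camera(object):
    """Minimal pinhole camera holding a 3x4 projection matrix P."""

    def __init__(self, P):
        # Stored as a float array so slices such as cam.P[:, :3] can be assigned to.
        self.P = np.asarray(P, dtype=float)

    def project(self, X):
        """Project homogeneous 3D points X (4*n array, or a single 4-vector)
        and normalize the result so that the last coordinate is 1."""
        x = np.dot(self.P, X)
        return x / x[2]
```

Saved as camera.py next to the other files, this should be enough for the cam1/cam2 steps in AR_Test.py.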
OK, that's basically it. This is my first CSDN blog post, so please forgive any shortcomings. I hope it helps!