Transformations and projections




Spherical projections (stereographic and cylindrical)

Written by  Paul Bourke
EEG data courtesy of Dr Per Line
December 1996, Updated December 1999

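Stereographic projection (representing a 3D object on a 2D plane)


To visualise the stereographic projection, imagine a transparent sphere sitting on a plane. Call the point where the sphere touches the plane the south pole, and place a light source at the north pole P1, the point diametrically opposite the south pole. Each ray of light passing through a point P2 on the sphere strikes the plane at some point P; this point P is the stereographic projection of P2 onto the plane.

To derive the coordinate transformation, assume the sphere is centered at the origin (0,0,0) and has radius r, the projection plane is z = -r, and the light source is at (0,0,r). Refer to the Schlegel diagram below:

The parametric equation of the line through the light source P1 = (0,0,r) and the point P2 = (x,y,z) on the sphere is:

P = P1 + mu (P2 - P1) . . . . . . (1)
We want the point where this line intersects the plane, namely (Px, Py, -r). Substituting the z components into (1) gives:
-r = r + mu (z - r)
mu = 2r / (r - z)

Substituting this value of mu back into (1) gives the stereographic projection of any point (x,y,z) on the sphere:

P = P1 +  (P2 - P1) * 2r / (r - z)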


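  • The south pole is at the center of the projection.

  • Lines of latitude project to concentric circles centered on (0,0,-r).

  • Lines of longitude project to rays from the point (0,0,-r).

  • There is almost no distortion near the south pole.

  • The equator projects to a circle of radius 2r.

  • Distortion increases towards the north pole, where it becomes infinite.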
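
The derivation above translates directly into a few lines of code. The following is a minimal sketch in Python; the function name `stereographic` and the default radius are illustrative choices, not part of the original text.

```python
def stereographic(x, y, z, r=1.0):
    """Project a point (x, y, z) on a sphere of radius r centered at the
    origin, from the north pole (0, 0, r), onto the plane z = -r."""
    if z >= r:
        # The north pole itself has no finite projection.
        raise ValueError("the north pole cannot be projected")
    mu = 2.0 * r / (r - z)  # from -r = r + mu (z - r)
    # P = P1 + mu (P2 - P1), with P1 = (0, 0, r) and P2 = (x, y, z)
    return (mu * x, mu * y, r + mu * (z - r))
```

For a unit sphere, `stereographic(0.0, 0.0, -1.0)` leaves the south pole where it is, and `stereographic(1.0, 0.0, 0.0)` sends a point on the equator to a distance 2r from the center of the projection.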








Let a point on the sphere be given in polar coordinates as P(r, alpha, beta), with r = 1:


x = constant * alpha
y = constant * tan(beta)

Mercator projection


Spherical or Equirectangular projection
Mercator projection

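The following equations convert normalised image coordinates (x,y), each ranging from -1 to 1, into longitude and latitude:

longitude = x * pi
latitude = 2 * atan(exp(2 * pi * y)) - pi / 2

The inverse, from longitude and latitude to normalised image coordinates:

x = longitude / pi
y = ln((1 + sin(latitude))/(1 - sin(latitude))) / (4 pi)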
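
As a check on the Mercator relationships, the following Python sketch implements both directions. The forward formula for y is the one given in the text; the latitude formula used here is obtained by solving that forward formula for latitude, which is an assumption of this sketch. The function names are illustrative.

```python
import math

def lonlat_to_mercator(longitude, latitude):
    """Longitude/latitude (radians) to normalised Mercator (x, y).
    Not defined at the poles, where sin(latitude) = +/-1."""
    s = math.sin(latitude)
    x = longitude / math.pi
    y = math.log((1.0 + s) / (1.0 - s)) / (4.0 * math.pi)
    return x, y

def mercator_to_lonlat(x, y):
    """Normalised Mercator (x, y) back to longitude/latitude (radians).
    Assumption: latitude is the algebraic inverse of the forward formula."""
    longitude = x * math.pi
    latitude = 2.0 * math.atan(math.exp(2.0 * math.pi * y)) - math.pi / 2.0
    return longitude, latitude
```

Converting a point forward and then back should return the original longitude and latitude to within rounding error.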



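Direct polar, also known as spherical or equirectangular projection

While not strictly a projection, a common way of representing a spherical surface in rectangular form is to use the polar angles directly as the horizontal and vertical coordinates. Since longitude varies over (0,2pi) and latitude only over (-pi/2,pi/2), such polar maps are normally presented with a 2:1 ratio of width to height. The most noticeable distortion in these maps is the horizontal stretching that occurs as one approaches the poles from the equator; it culminates in each pole (a single point) being stretched into a horizontal line as wide as the map.


Although such maps are rarely used in cartography, they are very popular in computer graphics because this is the standard way of texture mapping a sphere, hence the popularity of world maps like the one above.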

Hammer-Aitoff map projection

Conversion to/from longitude/latitude

Written by  Paul Bourke
April 2005

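The Aitoff map projection (attributed to David Aitoff circa 1889) is one of the azimuthal projections, essentially an azimuthal equidistant projection in which the longitude values are halved before projecting and the resulting map is then stretched horizontally by a factor of two, giving an ellipse whose horizontal extent is twice its vertical extent. In a standard azimuthal projection, distances from the center of the map are preserved. In the Aitoff projection this holds only along the horizontal and vertical axes; elsewhere distances are distorted. The Hammer-Aitoff projection is a modification of the Aitoff projection that preserves equal area across the whole map.

Converting longitude and latitude to Hammer-Aitoff coordinates (x,y):

With longitude ranging from -pi to pi and latitude from -pi/2 to pi/2:

z^2 = 1 + cos(latitude) cos(longitude/2)

x = cos(latitude) sin(longitude/2) / z

y = sin(latitude) / z

(x,y) are normalised coordinates, each ranging from -1 to 1.

Converting Hammer-Aitoff coordinates (x,y) to longitude and latitude:

z^2 = 1 - x^2/2 - y^2/2

longitude = 2 atan(sqrt(2) x z / (2 z^2 - 1))

latitude = asin(sqrt(2) y z)

A point (x,y) lies within the Hammer-Aitoff map when x * longitude >= 0.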
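
The two Hammer-Aitoff conversions can be sketched in Python as below. Two choices here are assumptions of the sketch rather than part of the text: `atan2` is used in place of a plain arctangent so that the boundary of the map does not cause a division by zero, and the inside test is written as x^2 + y^2 <= 1, which follows from the forward formulas since x^2 + y^2 = 1 - cos(latitude) cos(longitude/2). The function names are illustrative.

```python
import math

def lonlat_to_hammer(longitude, latitude):
    """Longitude in [-pi, pi], latitude in [-pi/2, pi/2] (radians)
    to normalised Hammer-Aitoff coordinates (x, y)."""
    z = math.sqrt(1.0 + math.cos(latitude) * math.cos(longitude / 2.0))
    x = math.cos(latitude) * math.sin(longitude / 2.0) / z
    y = math.sin(latitude) / z
    return x, y

def hammer_to_lonlat(x, y):
    """Normalised (x, y) to (longitude, latitude) in radians,
    or None when the point lies outside the map."""
    if x * x + y * y > 1.0:
        return None
    z2 = 1.0 - x * x / 2.0 - y * y / 2.0
    z = math.sqrt(z2)
    longitude = 2.0 * math.atan2(math.sqrt(2.0) * x * z, 2.0 * z2 - 1.0)
    # Clamp guards against rounding slightly outside [-1, 1].
    s = max(-1.0, min(1.0, math.sqrt(2.0) * y * z))
    latitude = math.asin(s)
    return longitude, latitude
```

When filling an output image pixel by pixel, a `None` result marks a pixel that should be left as background.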

Example: converting longitude/latitude (left) to Hammer-Aitoff coordinates (right)

Grid test pattern, eg: spherical panoramic map

Resulting Hammer-Aitoff projection

Example: converting Hammer-Aitoff coordinates (left) to longitude/latitude (right)

Cosmic microwave background

Spherical projection


Written by  Paul Bourke
January 1987


P = ( x , y ) -> P' = ( x' , y' )


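A translation (shift) moves a point by Tx in the x direction and Ty in the y direction:

x' = x + Tx
y' = y + Ty


Scaling about the origin by a factor Sx in the x direction and Sy in the y direction:

x' = Sx x
y' = Sy y

If Sx and Sy are not equal, the result is stretched along the axis with the larger scale factor.
To scale about a particular point, first translate that point to the origin, scale, and then translate back. For example, to scale about the point (x0,y0):

x' = x0 + Sx ( x - x0 )
y' = y0 + Sy ( y - y0 )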



Rotation about the origin by an angle A, clockwise:

x' = x cos(A) + y sin(A)
y' = y cos(A) - x sin(A)




Reflection about the x axis:

x' = x
y' = - y

Reflection about the y axis:

x' = - x
y' = y



Shear along the x axis by SHx:

x' = x + SHx y
y' = y

Shear along the y axis by SHy:

x' = x
y' = y + SHy x
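
The planar transformations in this section are each a one-line function. The Python sketch below collects them; the function names are illustrative, and the shear is written in the standard form x' = x + SHx y (and y' = y + SHy x), which is the form assumed here.

```python
import math

def translate(p, tx, ty):
    return (p[0] + tx, p[1] + ty)

def scale(p, sx, sy, about=(0.0, 0.0)):
    """Scale about the point `about` (the origin by default)."""
    x0, y0 = about
    return (x0 + sx * (p[0] - x0), y0 + sy * (p[1] - y0))

def rotate(p, a):
    """Rotate about the origin by angle a (radians), clockwise."""
    x, y = p
    return (x * math.cos(a) + y * math.sin(a),
            y * math.cos(a) - x * math.sin(a))

def reflect_x(p):
    """Mirror about the x axis."""
    return (p[0], -p[1])

def reflect_y(p):
    """Mirror about the y axis."""
    return (-p[0], p[1])

def shear_x(p, shx):
    """Shear along the x axis (standard form, assumed here)."""
    return (p[0] + shx * p[1], p[1])

def shear_y(p, shy):
    """Shear along the y axis (standard form, assumed here)."""
    return (p[0], p[1] + shy * p[0])
```

For example, `rotate((1.0, 0.0), math.pi / 2)` moves the point on the positive x axis down to the negative y axis, which is what a clockwise quarter turn should do.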


Written by  Paul Bourke
June 1996







Written by  Paul Bourke
June 2000

by R.D. Kriz (2006).

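Rotations of the coordinate axes are frequently needed when converting between coordinate systems. A right-handed Cartesian coordinate system is used here (y axis pointing forward, x axis to the right, z axis upward). For convenience, rotations about the different axes are given different names: rotation about the z axis is called direction, rotation about the y axis is called roll, and rotation about the x axis is called pitch. Furthermore, positive rotation is taken to be clockwise (looking from the positive end of the axis towards the origin) and negative rotation counterclockwise. Readers using other conventions can adapt what follows accordingly.


Rotation about the x axis by an angle tx (pitch):

1 0 0
0 cos(tx) sin(tx)
0 -sin(tx) cos(tx)

Rotation about the y axis by an angle ty (roll):

cos(ty) 0 -sin(ty)
0 1 0
sin(ty) 0 cos(ty)

Rotation about the z axis by an angle tz (direction):

cos(tz) sin(tz) 0
-sin(tz) cos(tz) 0
0 0 1

The order in which these rotation matrices are applied is very important. Calling the matrices above Rx(t), Ry(t), and Rz(t), applying them in the order Rz(t) Rx(t) Ry(t) gives a different result from applying them in the order Rx(t) Ry(t) Rz(t). Only one order is discussed below; the other combinations follow in the same way and are left to the reader. The order considered is: roll first (rotation about the y axis), then pitch (rotation about the x axis), and finally direction (rotation about the z axis). This is probably also the most common order in games and flight simulators.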

Rz(tz) Rx(tx) Ry(ty) =

 cos(tz) cos(ty) + sin(tz) sin(tx) sin(ty)     sin(tz) cos(tx)    -cos(tz) sin(ty) + sin(tz) sin(tx) cos(ty)
-sin(tz) cos(ty) + cos(tz) sin(tx) sin(ty)     cos(tz) cos(tx)     sin(tz) sin(ty) + cos(tz) sin(tx) cos(ty)
 cos(tx) sin(ty)                              -sin(tx)             cos(tx) cos(ty)

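Another common requirement is to derive the three Euler angles for a given new coordinate system. If the new coordinate system has orthonormal axis vectors X, Y, Z (mutually perpendicular unit vectors), then the matrix B that transforms from the coordinate system (1,0,0), (0,1,0), (0,0,1) to the new coordinate system is: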

Xx Yx Zx
Xy Yy Zy
Xz Yz Zz

Comparing B with the matrix for Rz(tz) Rx(tx) Ry(ty) above, element by element:

Yz = -sin(tx)

tx = asin(-Yz)


cos(tx) (sin(ty), cos(ty)) = (Xz, Zz)

ty = atan2(Xz, Zz)


cos(tx) (sin(tz), cos(tz)) = (Yx, Yy)

tz = atan2(Yx, Yy)


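  • The atan2() function used above takes two arguments so that it can return the angle in the correct quadrant, unlike the single-argument atan() function.

  • There is more than one solution for tx, ty, and tz. In other words, the same coordinate system transformation can be produced by several different combinations of Euler angles.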
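
The relationship between the combined rotation matrix and the Euler angles can be checked numerically. The Python sketch below builds Rz(tz) Rx(tx) Ry(ty) element by element and then recovers the angles from a matrix whose columns are X, Y, Z. The function names are illustrative, and the recovery assumes cos(tx) > 0 (that is, tx strictly between -pi/2 and pi/2); at cos(tx) = 0 the angles ty and tz are no longer separately determined.

```python
import math

def rotation_matrix(tx, ty, tz):
    """Rz(tz) Rx(tx) Ry(ty) as a list of three rows (angles in radians)."""
    cx, sx = math.cos(tx), math.sin(tx)
    cy, sy = math.cos(ty), math.sin(ty)
    cz, sz = math.cos(tz), math.sin(tz)
    return [
        [cz * cy + sz * sx * sy, sz * cx, -cz * sy + sz * sx * cy],
        [-sz * cy + cz * sx * sy, cz * cx, sz * sy + cz * sx * cy],
        [cx * sy, -sx, cx * cy],
    ]

def euler_angles(b):
    """Recover (tx, ty, tz) from B, whose rows are
    [Xx Yx Zx], [Xy Yy Zy], [Xz Yz Zz]."""
    xz, yz, zz = b[2][0], b[2][1], b[2][2]
    yx, yy = b[0][1], b[1][1]
    tx = math.asin(max(-1.0, min(1.0, -yz)))  # Yz = -sin(tx)
    ty = math.atan2(xz, zz)                   # cos(tx) (sin(ty), cos(ty))
    tz = math.atan2(yx, yy)                   # cos(tx) (sin(tz), cos(tz))
    return tx, ty, tz
```

Feeding the output of `rotation_matrix` into `euler_angles` should return the original three angles, provided the pitch stays away from +/-90 degrees.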


Written by  Paul Bourke
May 2001

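Computer-based modelling and rendering software seems to fall into two camps: those that use a left-handed coordinate system and those that use a right-handed one. For example, OpenGL uses a right-handed coordinate system while PovRay uses a left-handed one. This document describes how to convert model coordinates and camera attributes between the two systems. The two coordinate systems are shown below; they differ in the sense of the cross product, given by the so-called left-hand rule and right-hand rule.


There are two ways to convert between left-handed and right-handed coordinate systems so that the rendered result is the same. The first is to negate the x value of every vertex in the model and in the camera settings. The second is to leave all coordinates unchanged but flip the rendered image horizontally afterwards. In what follows, the symbols p, d, and u denote the camera position, the view direction, and the up vector respectively.

Method 1 - negate the x coordinates

Method 2 - flip the rendered image horizontally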





Written by  Paul Bourke
December 1994







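Oblique projections are distinguished by two angles (alpha and beta).



  • Cavalier projection: tan(alpha) = 1, or alpha = 45 degrees

  • Cabinet projection: tan(alpha) = 2, or alpha = 63.4 degrees

In both of these oblique projections the angle beta is normally 45 or 30 degrees. The general coordinate transformation for an oblique projection is shown at right:


  • Only the Xp and Yp equations are needed to derive 2D plane coordinates from 3D coordinates. The redundant third equation, Zp, illustrates how an oblique projection is equivalent to a shear along the z axis followed by a parallel orthographic projection onto the x-y projection plane.

  • The x and y values are each shifted by an amount proportional to z; for example, x becomes x + z * (cos(beta) / tan(alpha)). As a result, angles, distances, and parallel lines in any plane of constant z (parallel to the projection plane) are projected accurately.

Cavalier projection, beta = 45
Cavalier projection, beta = 30
Cabinet projection, beta = 45
Cabinet projection, beta = 30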
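
A small Python sketch of an oblique projection follows, with the two angles named alpha and beta (names assumed for this sketch). The x shift, z cos(beta) / tan(alpha), is the one quoted in the text; the matching y shift, z sin(beta) / tan(alpha), is an assumption made here by symmetry. The function name is illustrative.

```python
import math

def oblique_project(x, y, z, alpha_deg, beta_deg):
    """Project (x, y, z) onto the x-y plane with an oblique projection.
    alpha sets how much receding lines are shortened, beta sets their
    direction on the page. Angles are in degrees."""
    alpha = math.radians(alpha_deg)
    beta = math.radians(beta_deg)
    k = 1.0 / math.tan(alpha)            # 1 for cavalier, 0.5 for cabinet
    xp = x + z * k * math.cos(beta)
    yp = y + z * k * math.sin(beta)      # assumed form of the y shift
    return xp, yp
```

With alpha = 45 degrees a unit length along z keeps its length on the page, and with tan(alpha) = 2 it is drawn at half length, which is the usual distinction between the cavalier and cabinet cases.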


Written by  Paul Bourke
November 1989

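The mathematics and figures below come from a project to correct the distortion in photographs used to measure the area of flat pieces of land. The photographs were taken from a variety of angles to the ground, so they had to be straightened before the corresponding areas could be calculated (multiplying by the scale gives the true area). The same technique can of course also be used to deliberately distort a rectangular area.

The usual method (Cartesian coordinates) specifies a point in two-dimensional space uniquely by two coordinates. For the unit square below, the two coordinates of a point P are called mu and delta; they are the relative distances of P along the horizontal and vertical edges of the square.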







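Substituting the x and y coordinates of the points P0, P1, P2, and P3 into the equation above (call it equation 1) gives the following (call them equations 2 and 3):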




Anamorphic Projections

Written by  Paul Bourke
January 1991

Source: glues.h and glues.c.

Anamorphism is a Macintosh utility which takes a line drawing as a PICT file and performs various nonlinear distortions upon it. The distortions available have been chosen from those which have been used historically by artists (and forgers).

Each of the different distortions will be illustrated by using the following simple diagram.

For the following examples an additional grid will be placed over the image to further illustrate the nature of the distortion. Each type of distortion has controls associated with it; these are indicated by black "blobs" at the current position of the control points. To vary these parameters simply click and drag the control points.



  • The only PICT drawing primitives which can be used are line segments.

  • Since the distortions are nonlinear, the distorted points along a line segment do not lie in a straight line between the distorted end points of the line segment. Thus each line is split into a number of line segments in order to approximate the generally curved nature of the distorted lines. The result of this is distorted drawings with a much larger number of line segments.

Reflective balls in the main street of Adelaide, Australia.


Mappings in the Complex Plane

Written by  Paul Bourke
July 1997

The following illustrates the general form of various mappings in the complex plane. The mappings are applied to part of a unit disk centered at the origin as shown on the left hand side. The circle is filled with rays from the origin and arcs centered about the origin. A series of coloured rays further illustrate the mapping orientation.

z^2 + z


1 / (z + 1)


(z - 1) / (z + 1)


(z^2 - 1) / (z^2 + 1)


(z - a) / (z + b)


(z^2 + z - 1) / (z^2 + z + 1)


(z^2 + z + 1) / (z + 1)
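
To experiment with mappings like these, one can sample points along rays of the unit disk and push them through a function of a complex variable. The Python sketch below does that for three of the mappings listed; the function names, the sampling densities, and the choice to report a pole as `None` are all assumptions of this sketch.

```python
import cmath
import math

def disk_points(n_rays=8, n_steps=5):
    """Sample points along rays from the origin of the unit disk."""
    points = []
    for k in range(n_rays):
        angle = 2.0 * math.pi * k / n_rays
        for s in range(1, n_steps + 1):
            points.append(cmath.rect(s / n_steps, angle))
    return points

def apply_mapping(f, points):
    """Apply f to each point; a point exactly on a pole maps to None."""
    mapped = []
    for z in points:
        try:
            mapped.append(f(z))
        except ZeroDivisionError:
            mapped.append(None)
    return mapped

MAPPINGS = {
    "z^2 + z": lambda z: z * z + z,
    "1 / (z + 1)": lambda z: 1 / (z + 1),
    "(z - 1) / (z + 1)": lambda z: (z - 1) / (z + 1),
}
```

Plotting `apply_mapping(MAPPINGS[name], disk_points())` for each name gives a rough picture of how the rays and arcs of the disk are bent. Points very close to a pole, such as z = -1 for the last two mappings, produce very large values and are best clipped before drawing.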

World to Screen Projection Transformation

Written by  Paul Bourke
December 1994

The representation by computer of 3 dimensional forms is normally restricted to the projection onto a plane, namely the 2 dimensional computer screen or hardcopy device. The following is a procedure that transforms points in 3 dimensional space to screen coordinates given a particular coordinate system, camera and projection plane models. This discussion describes the mathematics required for a perspective projection including clipping to the projection pyramid with a front and back cutting plane. It assumes the projection plane to be perpendicular to the view direction vector and thus it does not allow for oblique projections.

Included in the appendices is source code (written in the C programming language) implementing all the processes described.

Coordinate system

In what follows a so-called right-handed coordinate system is used: it has the positive x axis to the right, the positive z axis upward, and the positive y axis forward (into the screen or page).

Conversion between this and other coordinate systems simply involves the swapping and/or negation of the appropriate coordinates.

Camera model

The camera is fundamentally defined by its position (from), a point along the positive view direction vector (to), a vector defining "up" (up), and a horizontal and vertical aperture (angleh, anglev).

These parameters are illustrated in the following figure. 

One obvious restriction is that the view direction must not be collinear with the up vector. In practical implementations, including the one given in the appendices, the up vector need not be a unit vector.

Other somewhat artificial variables in the camera model used here are front and back clipping planes, a perspective/oblique projection flag, and a multiplicative zoom factor. The clipping planes are defined as positive distances along the view direction vector, in other words they are perpendicular to the view direction vector. As expected all geometry before the front plane and beyond the back plane is not visible. All geometry which crosses these planes is clipped to the appropriate plane. Thus geometry visible to a camera as described here lies within a truncated pyramid.

Screen model

The projection plane (computer screen or hardcopy device) can be defined in many ways. Here the central point, width and height are used. The following will further assume the unfortunate convention, common in computer graphics practice, that the positive vertical axis is downward. The coordinates of the projection space will be referred to as (h,v).

Note that normally in computer windowing systems the window area is defined as a rectangle between two points (left,top) and (right,bottom). Transforming this description into the definition used here is trivial, namely

horizontal center = (left + right) / 2
vertical center = (top + bottom) / 2
width = right - left
height = bottom - top

The units need not be specified, although they are generally pixels; it is assumed that there are drawing routines in the same units. It is also assumed that the computer screen has a 1:1 aspect ratio, at least as far as the drawing routines are concerned.

A relationship could be made between the ratio of the horizontal and vertical camera aperture and the horizontal and vertical ratio of the display area. Here it will be assumed that the display area (eg: window) has the same proportions as the ratio of the camera aperture. In practice this simply means that when the camera aperture is modified, the window size is also modified so as to retain the correct proportions.

The procedure for determining where a 3D point in world coordinates would appear on the screen is as follows: 

Transforming a line segment involves determining which piece, if any, of the line segment intersects the view volume. The logic is shown below. 


Two separate clipping processes occur. The first is clipping to the front and back clipping planes and is done after transforming to eye coordinates. The second is clipping to the view pyramid and is performed after transforming to normalised coordinates, at which point it is necessary to clip 2D line segments to a square of width and height 2 centered at the origin.
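
The stages described above (world to eye coordinates, front and back clipping, normalised coordinates, then screen coordinates) can be sketched for a single point as follows. This is a simplified Python illustration, not the transform.c implementation: the function and parameter names are assumptions, the apertures are taken to be full angles in degrees, the zoom factor is omitted, and a point is simply rejected rather than clipped.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def world_to_screen(p, cam_from, cam_to, cam_up, angleh, anglev,
                    center, width, height, front=0.1, back=1000.0):
    """Return screen coordinates (h, v) of world point p, or None if the
    point is outside the view volume. v increases downward."""
    # Camera basis: view direction, right vector, corrected up vector.
    view = _unit(_sub(cam_to, cam_from))
    right = _unit(_cross(view, cam_up))
    up = _cross(right, view)
    # Eye coordinates: ey is the distance along the view direction.
    rel = _sub(p, cam_from)
    ex, ey, ez = _dot(rel, right), _dot(rel, view), _dot(rel, up)
    if ey < front or ey > back:
        return None
    # Normalised coordinates: the view pyramid maps to [-1, 1] x [-1, 1].
    nx = ex / (ey * math.tan(math.radians(angleh) / 2.0))
    nz = ez / (ey * math.tan(math.radians(anglev) / 2.0))
    if abs(nx) > 1.0 or abs(nz) > 1.0:
        return None
    # Screen coordinates, with the vertical axis pointing down.
    return (center[0] + nx * width / 2.0, center[1] - nz * height / 2.0)
```

For a camera at the origin looking along the positive y axis with z up and 90 degree apertures, a point straight ahead lands on the center of the window, and a point above the view direction lands above the center (smaller v).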

Source code
transform.c, transform.h.

Cubic to Cylindrical conversion

Written by  Paul Bourke
November 2003, Updated May 2006

The following discusses the transformation of a cubic environment map (90 degree perspective projections onto the face of a cube) into a cylindrical panoramic image. The driver for this was the creation of cylindrical panoramic images from rendering software that didn't explicitly support panoramic creation. The software was scripted to create the 6 cubic images and this utility created the panoramic.

Usage: cube2cyl [options] filemask
filemask can contain %c which will substituted with each of [l,r,t,d,b,f]
For example: "blah_%c.tga" or "%c_something.tga"
  -a n   sets antialiasing level, default = 2
  -v n   vertical aperture, default = 90
  -w n   sets the output image width, default = 3 * cube image width
  -c     enable top and bottom cap texture generation, default = off
  -s     split cylinder into 4 pieces, default = off
File name conventions

The names of the cubic maps are assumed to contain the letters 'f', 'l', 'r', 't', 'b', 'd' that indicate the face (front,left,right,top,back,down). The file mask needs to contain "%c" which specifies the view.

So for example the following cubic maps would be specified as %c_starmap.tga,
l_starmap.tga, r_starmap.tga, f_starmap.tga,
t_starmap.tga, b_starmap.tga, d_starmap.tga.

Test pattern
Note the orientation convention for the cube faces.

cube2cyl -a 3 -v 90
Note that in this special case the top of the face should coincide with the top of the cylindrical panoramic.

cube2cyl -a 3 -v 120

cube2cyl -a 3 -v 150


The mapping for each pixel in the destination cylindrical panoramic from the correct location on the appropriate cubic face is relatively straightforward. The horizontal axis maps linearly onto the angle about the cylinder. The vertical axis of the panoramic image maps onto the vertical axis of the cylinder by a tan relationship.

In particular, if (i,j) is the pixel index of the panoramic normalised to (-1,+1) then the direction vector is given as follows.
x = cos(i pi)
y = j tan(v/2)
z = sin(i pi)

This direction vector is then used within the texture mapped cubic geometry. The face of the cube it intersects needs to be found and then the pixel the ray passes through is determined (intersection of the direction vector with the plane of the face). Critical to obtaining good quality results is antialiasing, in this implementation a straightforward constant weighted supersampling is used.
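
The two steps just described, forming the direction vector and finding the cube face it strikes, might look like the following in Python. This is a sketch, not the cube2cyl source: the function names are illustrative, faces are identified only by their dominant axis and sign (which letter each corresponds to depends on the orientation convention), and antialiasing is left out.

```python
import math

def panorama_direction(i, j, v_deg):
    """Direction vector for normalised panoramic coordinates (i, j),
    each in (-1, +1), with vertical aperture v_deg in degrees."""
    v = math.radians(v_deg)
    return (math.cos(i * math.pi),
            j * math.tan(v / 2.0),
            math.sin(i * math.pi))

def cube_face_hit(d):
    """Intersect direction d with a cube spanning [-1, 1] on each axis.
    Returns (axis, sign, u, w): the axis (0, 1 or 2) and side (+1 or -1)
    of the face that is hit, and the remaining two coordinates of the
    intersection point, each in [-1, 1]."""
    axis = max(range(3), key=lambda k: abs(d[k]))
    m = abs(d[axis])
    scaled = [c / m for c in d]   # scale so the dominant component is +/-1
    sign = 1 if d[axis] > 0 else -1
    u, w = [scaled[k] for k in range(3) if k != axis]
    return axis, sign, u, w
```

The pair (u, w) is then converted to a pixel position on the chosen face image. Repeating this for several sub-pixel positions and averaging the results gives the supersampling antialiasing mentioned above.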

Example 1
Cubic map

Cylindrical panoramic (90 degrees)

Example 2
Cubic map

Cylindrical panoramic (90 degrees)


  • The vertical aperture must be greater than 0 and less than 180 degrees. The current implementation limits it to be between 1 and 179 degrees.

  • While not a requirement, the current implementation retains the height to width ratio of the output image to the same as the vertical aperture to horizontal aperture.

  • Typically antialiasing levels of 3 are more than enough.
Addendum: Top and bottom caps

One can equally form the image textures for a top and bottom cap. The following is an example of such caps, in this case the vertical field of view is 90 degrees so the cylinder is cubic (diameter of 2 and height of 2 units). It should be noted that a relatively high degree of tessellation is required for the cylindrical mesh if the linear approximations of the edges are not to create seam artifacts.



The aspect of the cylinder height to the width for an undistorted cylindrical view and for the two caps to match is tan(verticalFOV/2).

Converting to and from 6 cubic environment maps and a spherical map

Written by  Paul Bourke
February 2002, updated May 2006

There are two common methods of representing environment maps, cubic and spherical. In cubic maps the virtual camera is surrounded by a cube the 6 faces of which have an appropriate texture map. These texture maps are often created by imaging the scene with six 90 degree fov cameras giving a left, front, right, back, top, and bottom texture. In a spherical map the camera is surrounded by a sphere with a single spherically distorted texture. This document describes how to convert 6 cubic maps into a single spherical map.


As an illustrative example the following 6 images are the textures placed on the cubic environment, they are arranged as an unfolded cube. Below that is the spherical texture map that would give the same appearance if applied as a texture to a sphere about the camera.

    Maps courtesy of
Ben Syverson


The conversion process involves two main stages. The goal is to determine the best estimate of the colour at each pixel in the final spherical image given the 6 cubic texture images. The first stage is to calculate the polar coordinates corresponding to each pixel in the spherical image. The second stage is to use the polar coordinates to form a vector and find which face and which pixel on that face the vector (ray) strikes. In reality this process is repeated a number of times at slightly different positions in each pixel in the spherical image and an average is used in order to avoid aliasing effects.

If the coordinates of the spherical image are (i,j) and the image has width "w" and height "h" then the normalised coordinates (x,y) each ranging from -1 to 1 are given by:

x = 2 i / w - 1 
y = 2 j / h - 1 
or y = 1 - 2 j / h depending on the position of pixel 0

The polar coordinates theta and phi are derived from the normalised coordinates (x,y) below. theta ranges from -pi to pi and phi ranges from -pi/2 (south pole) to pi/2 (north pole). Note there are two vertical relationships in common use, linear and spherical. In the former phi is linearly related to y, in the latter there is a sine relationship.

theta = x pi 
phi = y pi / 2 
or phi = asin(y) for spherical vertical distortion

The polar coordinates (theta,phi) are turned into a unit vector (view ray from the camera) as below. This assumes a right hand coordinate system, x to the right, y upwards, and z out of the page. The front view of the cubic map is looking from the origin along the positive z axis.

x = cos(phi) cos(theta) 
y = sin(phi) 
z = cos(phi) sin(theta)

The intersection of this ray is now found with the faces of the cube. Once the intersection point is found the coordinate on the square face specifies the corresponding pixel and therefore colour associated with the ray.
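
The first stage, from a pixel of the spherical image to a view ray, can be written compactly. The Python sketch below follows the equations above; the function name is illustrative, and it assumes pixel row 0 is at the top of the image, so the y = 1 - 2 j / h variant is used.

```python
import math

def pixel_to_ray(i, j, w, h, sine_vertical=False):
    """Unit view vector for pixel (i, j) of a w by h spherical image.
    Assumes row 0 is the top of the image."""
    x = 2.0 * i / w - 1.0
    y = 1.0 - 2.0 * j / h
    theta = x * math.pi
    # Linear vertical relationship, or the sine (asin) variant.
    phi = math.asin(y) if sine_vertical else y * math.pi / 2.0
    return (math.cos(phi) * math.cos(theta),
            math.sin(phi),
            math.cos(phi) * math.sin(theta))
```

To reduce aliasing, the same function can be called with fractional values of i and j spread across each pixel, and the colours found for the resulting rays averaged.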

Mapping geometry

Photographic example

Usage: cube2sphere [options] filemask
filemask should contain %c which will substituted with each of [l,r,t,d,b,f]
For example: "blah_%c.tga" or "%c_something.tga"
   -w n     sets the output image width, default = 4*inwidth
  -w1 n     sub image position 1, default: 0
  -w2 n     sub image position 2, default: width
   -h n     sets the output image height, default = width/2
   -a n     sets antialiasing level, default = 1 (none)
   -s       use sine correction for vertical axis

Convert spherical projections to cylindrical projection

Written by  Paul Bourke
February 2010

The following utility was written to convert spherical projections into cylindrical projections. Of course only a slice of the spherical projection is used. The original reason for developing this was to convert video content from the LadyBug-3 camera (spherical projection) to a suitable image for a 360 cylindrical display. This is a command line utility and as such is straightforward to script to convert sequences of frames that make up a movie.

sph2pan [options] sphericalimagename
-t n       set max theta on vertical axis of panoramic, 0...90 (default: 45)
-a n       set antialias level, 1 upwards, 2 or 3 typical (default: 2)
-w n       width of the spherical image
-r n       horizontal rotation angle (default: 0)
-v n       vertical rotation angle (default: 0)
-f         flip insideout (default: off)

Sample spherical projection (from the LadyBug-3)

Click for original image (5400x2700 pixels).

As with all such image transformations, one considers a point in the destination image, noting that the point may be a sub pixel (required for supersampling antialiasing). This point corresponds to a vector in 3D space, which is then used to determine the corresponding point in the input image. Options such as rotations and flips correspond to operations on the 3D vector.
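
A sketch of that destination-to-source mapping is given below in Python. It is not the sph2pan source: the function and parameter names are hypothetical, the cylinder is assumed to have unit radius with its top edge at an elevation of max_theta_deg, and only a horizontal rotation is shown (applied as a rotation of the 3D vector about the vertical axis).

```python
import math

def panorama_to_spherical(u, v, max_theta_deg=45.0, rotate_deg=0.0):
    """Map normalised cylindrical image coordinates (u, v), each in
    [-1, 1], to normalised spherical image coordinates, also in [-1, 1]
    (horizontal = longitude / pi, vertical = latitude / (pi / 2))."""
    # Point on a unit-radius cylinder; rotating about the vertical axis
    # is the same as adding to the angle around the cylinder.
    angle = u * math.pi + math.radians(rotate_deg)
    height = v * math.tan(math.radians(max_theta_deg))
    x, y, z = math.cos(angle), height, math.sin(angle)
    # Convert the 3D vector to polar angles in the spherical image.
    longitude = math.atan2(z, x)
    latitude = math.atan2(y, math.hypot(x, z))
    return longitude / math.pi, latitude / (math.pi / 2.0)
```

The returned pair is scaled to pixel coordinates of the input image and sampled there. With the default settings the top edge of the panoramic (v = 1) reads from halfway between the equator and the pole of the spherical image.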

Sample derived cylindrical projections

The following show example transformations illustrating some of the more important options. In particular the ability to specify the vertical field of view and the vertical offset for the cylindrical image.

sph2pan -w 800 -v -7.5 dervish_sph.tga

sph2pan -w 800 -v -7.5 -t 60 dervish_sph.tga

sph2pan -w 800 -v -7.5 -t 60 -r 90 dervish_sph.tga

Image courtesy of Sarah Kenderdine and Jeffrey Shaw.

