I've been thinking recently about the exact pipeline OpenGL uses to transform the 3D world into screen space.
OpenSceneGraph 3.0 Beginner's Guide (pp. 164-165) explains it quite clearly, so rather than paraphrase I'll quote the original text:
When drawing a point, a line, or a complex polygon in the 3D world, our final goal is
to display it on the screen. That is, the 3D object that we are going to represent will be
converted to a set of pixels in a 2D window. In this process, three major matrices are used to
determine the transformations between different coordinate systems. These are often called
the model, view, and projection matrices.
The model matrix is used to describe the specific location of an object in the world. It
transforms vertices from an object's local coordinate system into the world coordinate
system. Both coordinate systems are right-handed.
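As a concrete illustration (my own sketch, not from the book), a model matrix could be composed with OSG's matrix classes, which use the same row-vector (v * M) convention as the book's formulas; the scale, rotation, and translation values here are arbitrary:

    #include <osg/Matrixd>
    #include <osg/Math>
    #include <osg/Vec3d>
    #include <iostream>

    int main()
    {
        // Compose a model matrix. With row vectors (v * M) the leftmost
        // factor is applied to the vertex first: scale, rotate, translate.
        osg::Matrixd model =
            osg::Matrixd::scale(2.0, 2.0, 2.0) *
            osg::Matrixd::rotate(osg::DegreesToRadians(45.0), osg::Vec3d(0.0, 0.0, 1.0)) *
            osg::Matrixd::translate(10.0, 0.0, 0.0);

        osg::Vec3d vLocal(1.0, 0.0, 0.0);   // vertex in the object's local system
        osg::Vec3d vWorld = vLocal * model; // the same vertex in world coordinates
        std::cout << vWorld.x() << " " << vWorld.y() << " " << vWorld.z() << std::endl;
        return 0;
    }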
The next step is to transform the entire world into view space, by using the view matrix.
Suppose we have a camera placed at a certain position in the world; the inverse of the
camera's transformation matrix is actually used as the view matrix. In the right-handed view
coordinate system, OpenGL defines that the camera is always located at the origin (0, 0, 0),
and facing towards the negative Z axis. Hence, we can represent the world on our camera's
screen.
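A small sketch of this (mine, assuming osg::Matrixd::lookAt(), which builds the view matrix, i.e. the inverse of the camera's placement transform, directly):

    #include <osg/Matrixd>
    #include <osg/Vec3d>
    #include <iostream>

    int main()
    {
        // A camera at (0, -10, 0) looking at the origin, with +Z as up.
        osg::Vec3d eye(0.0, -10.0, 0.0), center(0.0, 0.0, 0.0), up(0.0, 0.0, 1.0);
        osg::Matrixd view = osg::Matrixd::lookAt(eye, center, up);

        // The eye lands at the view-space origin, and the point the camera
        // looks at lands on the negative Z axis, as the book describes.
        osg::Vec3d e = eye * view;     // ~(0, 0, 0)
        osg::Vec3d c = center * view;  // ~(0, 0, -10)
        std::cout << e.x() << "," << e.y() << "," << e.z() << "  "
                  << c.x() << "," << c.y() << "," << c.z() << std::endl;
        return 0;
    }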
Note that there is no separate model matrix or view matrix in OpenGL. Instead, it defines
a single model-view matrix, the product of the two, which transforms directly from the
object's local space to view space. Thus, to transform the vertex V in local space to Ve in
view space, we have:
Ve = V * modelViewMatrix
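In code, under the same row-vector convention, the combined matrix is simply model * view (a minimal sketch of mine):

    #include <osg/Matrixd>
    #include <osg/Vec3d>
    #include <iostream>

    int main()
    {
        osg::Matrixd model = osg::Matrixd::translate(10.0, 0.0, 0.0);
        osg::Matrixd view  = osg::Matrixd::lookAt(osg::Vec3d(0.0, -10.0, 0.0),
                                                  osg::Vec3d(0.0, 0.0, 0.0),
                                                  osg::Vec3d(0.0, 0.0, 1.0));

        // modelViewMatrix = model * view, so Ve = (V * model) * view.
        osg::Matrixd modelView = model * view;
        osg::Vec3d v(0.0, 0.0, 0.0);   // vertex in local space
        osg::Vec3d ve = v * modelView; // the same vertex in view space
        std::cout << ve.x() << " " << ve.y() << " " << ve.z() << std::endl;
        return 0;
    }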
The next important step is to determine how 3D objects are projected onto the screen
(perspective or orthographic), and to define the frustum within which objects get rendered.
The projection matrix specifies this frustum in the view (eye) coordinate system with six
clipping planes: the left, right, bottom, top, near, and far planes. The OpenGL utility
library also provides the gluPerspective() function, which defines the frustum with
camera-lens-style field-of-view parameters.
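For example (my own sketch using OSG's equivalents of glFrustum()/gluPerspective(); all parameter values are arbitrary):

    #include <osg/Matrixd>
    #include <osg/Vec3d>
    #include <iostream>

    int main()
    {
        // gluPerspective()-style parameters: vertical field of view in
        // degrees, aspect ratio, near and far clipping plane distances.
        osg::Matrixd proj = osg::Matrixd::perspective(45.0, 4.0 / 3.0, 1.0, 100.0);

        // The same frustum could be given by its six planes instead:
        // osg::Matrixd::frustum(left, right, bottom, top, zNear, zFar);

        // A view-space point in front of the camera ends up inside the
        // [-1, +1] normalized device coordinate cube (w-divide included).
        osg::Vec3d ndc = osg::Vec3d(0.0, 0.0, -10.0) * proj;
        std::cout << ndc.x() << " " << ndc.y() << " " << ndc.z() << std::endl;
        return 0;
    }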
The resulting coordinate system (called the normalized device coordinate system) ranges
from -1 to +1 along each axis, and is now left-handed. As a final step, we map all resulting
data onto the viewport (the window), defining the rectangle into which the final image is
drawn as well as the z value of the window coordinates. After that, the 3D scene is rendered
to a rectangular area on your 2D screen. Finally, the screen coordinate Vs can represent the
local vertex V in the 3D world by using the so-called MVPW matrix, that is:
Vs = V * modelViewMatrix * projectionMatrix * windowMatrix
Vs is still a 3D vector: it represents a 2D pixel location plus a depth value.
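Putting the whole forward pipeline together (again my own sketch; it assumes osg::Viewport::computeWindowMatrix(), which builds the window matrix from a viewport rectangle):

    #include <osg/Matrixd>
    #include <osg/Vec3d>
    #include <osg/Viewport>
    #include <osg/ref_ptr>
    #include <iostream>

    int main()
    {
        // Model is identity here, so modelView is just the view matrix.
        osg::Matrixd modelView = osg::Matrixd::lookAt(osg::Vec3d(0.0, -10.0, 0.0),
                                                      osg::Vec3d(0.0, 0.0, 0.0),
                                                      osg::Vec3d(0.0, 0.0, 1.0));
        osg::Matrixd proj = osg::Matrixd::perspective(45.0, 800.0 / 600.0, 1.0, 100.0);

        // The window matrix maps NDC [-1, +1] to pixels and depth to [0, 1].
        osg::ref_ptr<osg::Viewport> vp = new osg::Viewport(0, 0, 800, 600);
        osg::Matrixd mvpw = modelView * proj * vp->computeWindowMatrix();

        // Vs = V * modelViewMatrix * projectionMatrix * windowMatrix:
        osg::Vec3d vs = osg::Vec3d(0.0, 0.0, 0.0) * mvpw; // the world origin
        std::cout << "pixel (" << vs.x() << ", " << vs.y()
                  << "), depth " << vs.z() << std::endl;
        return 0;
    }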
By reversing this mapping process, we can recover a line in 3D space from a 2D screen point
(Xs, Ys). This works because the 2D point can be treated as two 3D points: one on the near
clipping plane (Zs = 0), and the other on the far plane (Zs = 1).
The inverse of the MVPW matrix is used to perform this "unproject" operation:
V0 = (Xs, Ys, 0) * invMVPW
V1 = (Xs, Ys, 1) * invMVPW
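The same unproject in code (my sketch, with the same assumed OSG classes; the pixel picked here is the center of an 800x600 viewport):

    #include <osg/Matrixd>
    #include <osg/Vec3d>
    #include <osg/Viewport>
    #include <osg/ref_ptr>
    #include <iostream>

    int main()
    {
        osg::Matrixd modelView = osg::Matrixd::lookAt(osg::Vec3d(0.0, -10.0, 0.0),
                                                      osg::Vec3d(0.0, 0.0, 0.0),
                                                      osg::Vec3d(0.0, 0.0, 1.0));
        osg::Matrixd proj = osg::Matrixd::perspective(45.0, 800.0 / 600.0, 1.0, 100.0);
        osg::ref_ptr<osg::Viewport> vp = new osg::Viewport(0, 0, 800, 600);

        // Invert the full local-to-window transform.
        osg::Matrixd invMVPW =
            osg::Matrixd::inverse(modelView * proj * vp->computeWindowMatrix());

        // The clicked pixel becomes two points, on the near (Zs = 0) and far
        // (Zs = 1) planes; the line through them is the picking ray.
        double xs = 400.0, ys = 300.0;
        osg::Vec3d v0 = osg::Vec3d(xs, ys, 0.0) * invMVPW; // ~(0, -9, 0)
        osg::Vec3d v1 = osg::Vec3d(xs, ys, 1.0) * invMVPW; // ~(0, 90, 0)
        std::cout << v0.x() << "," << v0.y() << "," << v0.z() << "  "
                  << v1.x() << "," << v1.y() << "," << v1.z() << std::endl;
        return 0;
    }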
To summarize: the model matrix's job is to apply translation, rotation, scaling, and similar
transformations to the objects in the scene, so that each one is placed correctly in a single,
shared world coordinate system.
gluLookAt() defines the camera space, and the view matrix transforms objects from world
coordinates into that camera space; after this step every object's position is expressed in
camera coordinates. For the underlying math, see http://schabby.de/view-matrix/
gluPerspective() and similar calls then define a view frustum within camera space, and the
projection matrix maps whatever lies inside the frustum into normalized device coordinates.
For details, see http://www.cnblogs.com/microsoftxiao/archive/2006/03/27/360295.html
Finally, the viewport transformation maps the contents of the normalized coordinate system
onto screen space.
This is only a rough outline, meant to organize the overall flow; the intermediate steps,
such as perspective division and clipping, still need closer study.
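As a first step toward that, here is a small sketch of the perspective division (mine, assuming standard OpenGL clip-space conventions): using a homogeneous Vec4 keeps the w component visible, making explicit the divide that the Vec3 operators above perform implicitly.

    #include <osg/Matrixd>
    #include <osg/Vec4d>
    #include <iostream>

    int main()
    {
        osg::Matrixd proj = osg::Matrixd::perspective(45.0, 4.0 / 3.0, 1.0, 100.0);

        // After the projection matrix we are in clip space, where clipping
        // is done against -w <= x, y, z <= w.
        osg::Vec4d clip = osg::Vec4d(0.0, 0.0, -10.0, 1.0) * proj;

        // Dividing by w is the perspective division; the result is in
        // normalized device coordinates.
        osg::Vec4d ndc = clip / clip.w();
        std::cout << "w = " << clip.w() << ", ndc = ("
                  << ndc.x() << ", " << ndc.y() << ", " << ndc.z() << ")" << std::endl;
        return 0;
    }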