Abstract

In recent years, both Williams’ original Z-buffer shadow mapping algorithm [Williams 1978] and Crow’s shadow volumes [Crow 1977] have seen many variations, additions and enhancements, greatly increasing the visual quality and efficiency of renderings that use these techniques. Additionally, the fast evolution of commodity graphics hardware allows for a nearly complete mapping of these algorithms to such devices (depending on the GPU’s capabilities), resulting in real-time display rates, as seen in the real-time version of Pixar’s Luxo Jr. and its use of hardware shadow maps. In this article, we describe the major contributions since Williams’ and Crow’s original publications in 1978 and 1977 respectively, briefly present both the shadow mapping algorithm (which operates in image space) and the shadow volume algorithm, present more sophisticated approaches to shadow mapping that are better suited to high-quality off-line renderers, and describe the aliasing problems inherent in all shadow algorithms that operate in image space, along with proposed solutions. Finally, we describe recent extensions to the existing algorithms, such as the perspective shadow maps presented by Stamminger and Drettakis at SIGGRAPH 2002 [Stamminger and Drettakis 2002] and the robust stenciled shadow volumes of Everitt and Kilgard [Everitt and Kilgard 2002].

Overview
In this article, we attempt to bridge this gap and present to the reader a comprehensive overview of recently developed techniques, enhancements and additions to the basic shadow mapping and shadow volume algorithms. To keep within the scope of this article, we only briefly describe the highly sophisticated off-line variants of shadow mapping, but list relevant literature for further reading. The descriptions of the shadow mapping and shadow volume techniques presented in this report may likewise not be detailed enough for direct implementation; therefore, all necessary literature is referenced. An invaluable resource for developers is NVIDIA’s developer web site at http://developer.nvidia.com. Given this information, the reader should be up to date with the current state of the art in real-time shadow rendering.

Alternatives to (Real-Time) Shadow Mapping and Shadow Volumes

Projected Planar Shadows

During the research phase that inspired this article, other alternatives to shadow mapping and shadow volumes were surveyed. Of the possible alternatives, projected planar shadows seems to be the most promising algorithm. Blinn [Blinn 1988] invented the original technique, which only allows shadows to be cast onto planar surfaces. Haines [Haines 2001] extended the planar shadow projection technique to soft shadows, although with the same limitations as Blinn’s basic projection algorithm. Kilgard [Kilgard 1999] shows how to utilize the stencil buffer to improve the visual quality of projected shadows. The idea: given a plane P: n·x + d = 0 and a point light source l, construct a projection matrix M that projects each vertex v onto the plane P. The point p, the projection of vertex v onto the plane P along the ray from l through v, can be described as

    p = l − ((n·l + d) / (n·(v − l))) (v − l)

This can be converted into a 4×4 projection matrix M that satisfies Mv = p [Haines and Moeller 1999]. Care must be taken to avoid anti-shadows, e.g. when the light source lies between the occluder and the shadow plane. Other subtleties and solutions (such as utilizing the stencil buffer to handle overlapping projected shadow polygons) can be found in [Haines and Moeller 1999] and [Kilgard 1999].

Off-line Shadow Mapping

Stamminger and Drettakis [Stamminger and Drettakis 2002] describe a number of very sophisticated shadow mapping algorithms in their paper on perspective shadow maps. They note that almost all of these algorithms involve multiple rendering passes (as they generally use more than one shadow map) and more involved data structures, and thus do not map to hardware, or do not map easily. Nevertheless, for completeness, we list the relevant literature in the references section, specifically for soft shadows [Agrawala et al. 2000], adaptive shadow maps [Fernando et al. 2001], deep shadow maps [Lokovic and Veach 2000] and solar shadows [Tadamura et al. 2001].

The Shadow Volume Algorithm

The shadow volume algorithm is a geometry-based shadow algorithm which requires connectivity information for the polygonal meshes in the scene in order to efficiently compute the silhouette of each shadow-casting object (each occluder). It is also a per-pixel algorithm, which performs an in-shadow test for each rendered fragment. This operation can be accelerated using graphics hardware (the stencil buffer), as we will describe later.
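The silhouette computation amounts to finding the mesh edges at which a triangle facing the light meets a triangle facing away from it (the potential silhouette edges, PSE, used in the pseudocode below). A minimal C++ sketch of this facing test is given here, assuming a hypothetical edge list that stores the two triangles adjacent to each edge; the actual data layout will differ between implementations:

    #include <vector>

    struct Vec3 { float x, y, z; };

    static float dot(const Vec3& a, const Vec3& b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    struct Triangle { Vec3 normal; Vec3 centroid; };
    struct Edge     { int tri0, tri1; };  // indices of the two triangles sharing this edge

    // An edge belongs to the potential silhouette with respect to a point light at
    // lightPos when exactly one of its two adjacent triangles faces the light.
    bool isSilhouetteEdge(const Edge& e, const std::vector<Triangle>& tris, const Vec3& lightPos)
    {
        const Triangle& a = tris[e.tri0];
        const Triangle& b = tris[e.tri1];
        Vec3 toLightA = { lightPos.x - a.centroid.x, lightPos.y - a.centroid.y, lightPos.z - a.centroid.z };
        Vec3 toLightB = { lightPos.x - b.centroid.x, lightPos.y - b.centroid.y, lightPos.z - b.centroid.z };
        bool facesLightA = dot(a.normal, toLightA) > 0.0f;
        bool facesLightB = dot(b.normal, toLightB) > 0.0f;
        return facesLightA != facesLightB;
    }

The shadow quads are then formed by extruding each such edge away from the light (to infinity in the robust variant discussed later).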
In pseudocode, the algorithm reads as follows:

    procedure SHADOWVOLUMERENDERING
        for all rasterized fragments do
            draw fragment with ambient and emissive lighting
            update the Z-buffer
        end for
        COMPUTEFRAGMENTSINSHADOW
        for all rasterized fragments do
            if not INSHADOW(fragment) then
                draw fragment with diffuse and specular lighting
            end if
        end for

Obviously, the key issue here is determining whether a rendered fragment is in shadow or not. This procedure maps easily to any graphics hardware with a stencil buffer. After rendering each fragment with emissive and ambient lighting, the following must be inserted:

    procedure COMPUTEFRAGMENTSINSHADOW (Z-pass)
        for all shadow casting objects do
            compute potential silhouette edges (PSE) of the polygonal model
            compute the shadow volume polygons (shadow quads) from the light source(s) and the PSE
        end for
        for all front facing shadow quads from viewpoint do
            if Z-buffer test passes then
                increment stencil buffer value
            end if
        end for
        for all back facing shadow quads from viewpoint do
            if Z-buffer test passes then
                decrement stencil buffer value
            end if
        end for
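On hardware with a stencil buffer, the Z-pass counting above can be expressed with standard OpenGL 1.x state. The following sketch assumes the depth buffer has already been filled by the ambient pass and that the shadow quads are drawn by a hypothetical drawShadowQuads() helper; it is an illustration of the stencil setup, not a complete renderer:

    // Ambient/emissive pass has already filled the depth buffer.
    glEnable(GL_STENCIL_TEST);
    glClearStencil(0);
    glClear(GL_STENCIL_BUFFER_BIT);

    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);  // do not touch the color buffer
    glDepthMask(GL_FALSE);                                // keep the depth buffer intact
    glStencilFunc(GL_ALWAYS, 0, ~0u);
    glEnable(GL_CULL_FACE);

    // Front facing shadow quads: increment on depth-test pass (Z-pass).
    glCullFace(GL_BACK);
    glStencilOp(GL_KEEP, GL_KEEP, GL_INCR);
    drawShadowQuads();

    // Back facing shadow quads: decrement on depth-test pass.
    glCullFace(GL_FRONT);
    glStencilOp(GL_KEEP, GL_KEEP, GL_DECR);
    drawShadowQuads();

    // Lighting pass: draw only where the stencil value is zero (not in shadow).
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);
    glStencilFunc(GL_EQUAL, 0, ~0u);
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);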
To evaluate the boolean function INSHADOW (used in SHADOWVOLUMERENDERING) for each fragment, we simply query the stencil buffer value: if the value stored in the stencil buffer after COMPUTEFRAGMENTSINSHADOW is greater than zero, the fragment is in shadow and must not be drawn in the second rendering pass. To understand the procedure geometrically, see figure 2, taken from [Akenine-Moeller and Assarsson 2002]. When the shadow query is computed using the stencil buffer as described above, the algorithm is usually called stenciled shadow volumes; it was first proposed in [Heidmann 1991] and further refined in [Kilgard 1999].

The algorithm as described suffers from a number of drawbacks which make it impractical for certain situations if special cases are not treated correctly. First, the algorithm only works if the viewpoint is outside of shadow; otherwise the stencil counting is inverted. This can be remedied by testing for this special case, inverting the shadow test, and also initializing the stencil buffer to 2^(N-1) (where N is the number of stencil bits), since the stencil buffer holds only unsigned values and decrementing from a zero value would again result in incorrect shadows. A more elegant solution is proposed in [Everitt and Kilgard 2002], partially inspired by an insight of John Carmack, lead programmer of id Software [Carmack 2000]. Instead of incrementing for front facing shadow quads and decrementing for back facing shadow quads when the Z-buffer test passes, the process is modified to count from infinity towards the viewpoint instead of from the viewpoint, the so-called Z-fail version:

    procedure COMPUTEFRAGMENTSINSHADOW (Z-fail)
        for all shadow casting objects do
            compute potential silhouette edges (PSE) of the polygonal model
            compute the shadow volume polygons (shadow quads) from the light source(s) and the PSE
        end for
        for all front facing shadow quads from viewpoint do
            if Z-buffer test fails then
                decrement stencil buffer value
            end if
        end for
        for all back facing shadow quads from viewpoint do
            if Z-buffer test fails then
                increment stencil buffer value
            end if
        end for
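Relative to the Z-pass sketch above, only the stencil operations change: the counting moves from the depth-pass case to the depth-fail case, and the roles of front and back faces are swapped. If the EXT_stencil_wrap extension is available, the wrapping increment/decrement operations avoid saturation at the ends of the stencil range; a sketch, again using the hypothetical drawShadowQuads() helper:

    // Front facing shadow quads: decrement when the depth test fails (Z-fail).
    glCullFace(GL_BACK);
    glStencilOp(GL_KEEP, GL_DECR_WRAP_EXT, GL_KEEP);
    drawShadowQuads();

    // Back facing shadow quads: increment when the depth test fails.
    glCullFace(GL_FRONT);
    glStencilOp(GL_KEEP, GL_INCR_WRAP_EXT, GL_KEEP);
    drawShadowQuads();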
The two formulations (Z-pass and Z-fail) count the same value, yet the Z-fail version does not suffer from the problem mentioned above: the viewer being in shadow is no longer a special case. Another problem related to stenciled shadow volumes has also been solved in [Everitt and Kilgard 2002], namely the necessity to compute shadow volume caps to prevent clipping errors, as the per-pixel computation is carried out in already clipped, post-perspective-transform space (clip space). By using the Z-fail method and setting the far clipping plane to infinity, we only need to cap off the shadow volume by adding the occluder polygons facing the light source to the shadow volume, and possibly projecting these onto the near clip plane should they lie in front of it. By setting the far plane to infinity we can guarantee that the shadow volume will never be clipped away by the far clipping plane. This entire technique is named robust stenciled shadow volumes and is fully described in [Everitt and Kilgard 2002]. Good example code and comprehensible presentations can be obtained from http://developer.nvidia.com.

Up to this point, the described stenciled shadow volume algorithm produces only hard shadows and is thus only suited to ideal point light sources, without the penumbrae typical of area light sources. One possible variation is to recompute the shadow volumes for a number of point light samples near the original point light source and accumulate the resulting shadows. This is computationally expensive and also gives visually displeasing results, see [Kilgard 1999]. Akenine-Möller and Assarsson [Akenine-Moeller and Assarsson 2002] have recently presented a viable solution that computes soft shadows for existing shadow volumes using penumbra wedges, see figure 3 (taken from [Akenine-Moeller and Assarsson 2002]). The procedure is quite intricate and therefore outside the scope of this report; see the paper for a detailed description.

Overall, stenciled shadow volumes produce visually impressive results at high frame rates, as could be seen at this year's Electronic Entertainment Expo in the form of the newest Doom 3 demo (see figure 1). The drawbacks inherent to the method are due both to its geometric nature and to its per-pixel operation. Because a potential silhouette must be computed for every shadow-casting object in the scene, the scene complexity has a direct influence on the performance of the algorithm (a property that shadow maps do not have, as we will see later). Also, rendering the shadow quads to the stencil buffer can consume tremendous amounts of stencil fill rate. Kilgard therefore proposes the use of more effective shadow volume culling schemes [Everitt and Kilgard 2002].
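As noted above, the robust Z-fail variant relies on a projection matrix whose far plane lies at infinity. Taking the limit of the standard OpenGL frustum matrix as the far distance goes to infinity gives the matrix sketched below, built from the usual left/right/bottom/top/near parameters (see [Everitt and Kilgard 2002] for the derivation and the depth-precision implications); the function name is a placeholder:

    // Infinite far plane perspective matrix (limit of glFrustum as far -> infinity),
    // written in OpenGL's column-major order so it can be loaded with glLoadMatrixf.
    void buildInfiniteFrustum(float l, float r, float b, float t, float n, float m[16])
    {
        m[0] = 2.0f * n / (r - l); m[4] = 0.0f;               m[8]  = (r + l) / (r - l); m[12] = 0.0f;
        m[1] = 0.0f;               m[5] = 2.0f * n / (t - b); m[9]  = (t + b) / (t - b); m[13] = 0.0f;
        m[2] = 0.0f;               m[6] = 0.0f;               m[10] = -1.0f;             m[14] = -2.0f * n;
        m[3] = 0.0f;               m[7] = 0.0f;               m[11] = -1.0f;             m[15] = 0.0f;
    }

The result would be loaded in place of the normal projection matrix, e.g. with glMatrixMode(GL_PROJECTION) followed by glLoadMatrixf(m).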
The Shadow Mapping Algorithm

Shadow mapping is a purely image-space algorithm, which means that no knowledge of the scene's geometry is required to carry out the necessary computations. Because it uses discrete sampling, it must deal with various aliasing artifacts, the major drawback of the technique. These drawbacks can be partially overcome, as we will describe below. Like the shadow volume algorithm, this algorithm performs a per-pixel (per-fragment) in-shadow test to determine whether a pixel receives a diffuse and/or specular contribution or not. The two-pass algorithm in pseudocode:

    procedure SHADOWMAPPING
        render the depth buffer (Z-buffer) from the light's point of view, resulting in a shadow map (depth map)
        render the scene from the eye's point of view
        for all rasterized fragments do
            determine the fragment's xyz position relative to the light,
            i.e. transform the fragment's xyz into the light's coordinate system
            A = depth map(x, y)
            B = z-value of the fragment's position in light space
            if A < B then
                fragment is shadowed
            else
                fragment is lit
            end if
        end for
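The per-fragment comparison is very compact; the sketch below is a plain C++ illustration of it, assuming the fragment position has already been transformed into the light's clip space and mapped to the [0,1] range (the structure and names are placeholders, not part of any particular API). The small bias parameter anticipates the self-shadowing problem discussed next:

    #include <vector>

    struct ShadowMap {
        int width, height;
        std::vector<float> depth;   // depth rendered from the light's point of view, in [0,1]

        float sample(int x, int y) const { return depth[y * width + x]; }
    };

    // lightX, lightY: fragment position in the shadow map's [0,1] texture space.
    // lightDepth: the fragment's depth as seen from the light (B in the pseudocode).
    bool isShadowed(const ShadowMap& map, float lightX, float lightY, float lightDepth, float bias)
    {
        int x = static_cast<int>(lightX * (map.width  - 1) + 0.5f);
        int y = static_cast<int>(lightY * (map.height - 1) + 0.5f);
        float storedDepth = map.sample(x, y);      // A in the pseudocode
        return storedDepth < lightDepth - bias;    // A < B means the fragment is shadowed
    }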
A comprehensive visualization of the algorithm applied to a demo scene (taken from the NVIDIA demo), together with an accompanying description, is shown in figure 5. The first problem this algorithm must deal with is erroneous self-shadowing: when transforming a point on a surface from the eye's point of view into the light's coordinate system, A and B in the above algorithm should ideally be equal. Yet due to Z-buffer quantization, it is very likely that A ≠ B and the transformed point will fall slightly above or below the surface [Williams 1978]. To cure this imperfection, a bias value is subtracted to ensure that false self-shadowing is removed from the scene. Williams simply subtracts a constant bias from the Z-values of points after they have been transformed into light space, which may move the shadow boundary slightly. Kilgard [Kilgard 2001] uses OpenGL's glPolygonOffset to push the depth values in the shadow map back and to compensate for the slope of each polygon (greater slope means more offset). Stamminger and Drettakis [Stamminger and Drettakis 2002] again use a constant bias subtracted from the perspective shadow map, because their depth map is constructed after the perspective transform and contains non-uniformly scaled objects. In general, the amount of bias necessary decreases with higher shadow map precision; when in doubt, the tendency should be towards a slightly higher bias setting (figure 6).
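The slope-proportional bias can be expressed with the standard glPolygonOffset call while the depth map is rendered; a minimal sketch, with example values only (renderSceneDepthOnly() is a hypothetical helper that draws the occluders with color writes disabled):

    // While rendering the scene from the light's point of view into the depth map:
    glEnable(GL_POLYGON_OFFSET_FILL);
    // factor scales with the polygon's depth slope; units adds a constant offset
    // in multiples of the smallest resolvable depth difference.
    glPolygonOffset(1.1f, 4.0f);   // example values only; tune per scene and depth precision
    renderSceneDepthOnly();
    glDisable(GL_POLYGON_OFFSET_FILL);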
To perform the depth test, the depth buffer must be read back into a texture and accessed with the transformed coordinates of each rasterized fragment. The fragment's light-space position can be generated using eye-linear texture coordinate generation, which relies on projective texturing, see [Heidrich 1999], [Segal et al. 1992] and [Heckbert 1986]. The entire procedure of copying the Z-buffer to a texture, transforming the fragment into the light's xyz coordinates, accessing the depth texture and performing the shadow test can be mapped to hardware using existing OpenGL extensions (and the appropriate hardware supporting these extensions, of course). A detailed description is beyond the scope of this report; an excellent treatment of the details can be found in [Kilgard 2001].

As mentioned above, aliasing artifacts can reduce shadow quality significantly. These artifacts (figure 8, right image, and figure 10, top right image) are due to shadow map undersampling, which means that a single shadow map texel maps to more than one frame buffer pixel. An elegant formalization reads as follows [Stamminger and Drettakis 2002] (see figure 7): a pixel of size ds × ds in the shadow map maps to an area in the final image of approximately the size

    d ≈ ds (rs / ri) (cos β / cos α)

where rs and ri are the distances of the surface point from the light and from the eye, and α and β are the angles between the surface normal and the directions of the shadow map beam and the image beam, respectively. Undersampling appears when d is larger than the image pixel size di. Shadow map aliasing can thus be split into two independent parts: perspective aliasing, governed by the term ds rs / ri, and projection aliasing, governed by cos β / cos α.
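A rough sketch of the fixed-function setup described in [Kilgard 2001]: the generated texture coordinate for each fragment should end up as bias * lightProjection * lightView * inverse(eyeView) applied to the fragment's eye-space position, where the bias matrix maps clip-space [-1,1] to the [0,1] texture range. The snippet below shows only the texture matrix and the eye-linear texgen state; the matrix arrays are placeholders, and the exact extension set used for the hardware depth comparison (e.g. ARB_depth_texture / ARB_shadow or their SGIX predecessors) depends on the target hardware:

    // Maps clip space [-1,1] to the [0,1] texture range (column-major, for glLoadMatrixf).
    static const float biasMatrix[16] = {
        0.5f, 0.0f, 0.0f, 0.0f,
        0.0f, 0.5f, 0.0f, 0.0f,
        0.0f, 0.0f, 0.5f, 0.0f,
        0.5f, 0.5f, 0.5f, 1.0f,
    };

    // Texture matrix: bias * lightProjection * lightView * inverse(eyeView).
    glMatrixMode(GL_TEXTURE);
    glLoadMatrixf(biasMatrix);
    glMultMatrixf(lightProjectionMatrix);   // placeholder arrays holding the light's
    glMultMatrixf(lightViewMatrix);         // projection and viewing matrices
    glMultMatrixf(inverseEyeViewMatrix);    // placeholder: inverse of the eye's viewing matrix
    glMatrixMode(GL_MODELVIEW);

    // Eye-linear texgen with identity planes (specified while the modelview matrix is
    // the identity) feeds each fragment's eye-space position through the texture matrix.
    static const float sPlane[4] = { 1.0f, 0.0f, 0.0f, 0.0f };
    static const float tPlane[4] = { 0.0f, 1.0f, 0.0f, 0.0f };
    static const float rPlane[4] = { 0.0f, 0.0f, 1.0f, 0.0f };
    static const float qPlane[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    glTexGeni(GL_S, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);  glTexGenfv(GL_S, GL_EYE_PLANE, sPlane);
    glTexGeni(GL_T, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);  glTexGenfv(GL_T, GL_EYE_PLANE, tPlane);
    glTexGeni(GL_R, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);  glTexGenfv(GL_R, GL_EYE_PLANE, rPlane);
    glTexGeni(GL_Q, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);  glTexGenfv(GL_Q, GL_EYE_PLANE, qPlane);
    glEnable(GL_TEXTURE_GEN_S); glEnable(GL_TEXTURE_GEN_T);
    glEnable(GL_TEXTURE_GEN_R); glEnable(GL_TEXTURE_GEN_Q);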
One solution to the aliasing problem, percentage closer filtering, was proposed by Reeves et al. [Reeves et al. 1987]. In general, depth map values cannot simply be blended, as this can lead to pixels being wrongly classified as in shadow [Kilgard 2001]. Percentage closer filtering instead averages the boolean comparison results within the extent of a filter kernel: if, for example, we operate with a 3x3 kernel around the computed fragment, as shown in figure 9, with the displayed results, then the pixel is said to be 55% in shadow (for results see figure 8, left image).

Recently (at SIGGRAPH 2002), Stamminger and Drettakis presented another solution that greatly reduces perspective aliasing: perspective shadow maps [Stamminger and Drettakis 2002]. The basic idea is to perform the shadow map computation and the shadow test in normalized device coordinate space after the perspective transformation (clip space). The goal is to keep the fraction rs/ri close to constant. In post-perspective space the final image is an orthogonal view onto the unit cube, so perspective aliasing due to distance from the eye is avoided. For a more in-depth analysis and the special cases see [Stamminger and Drettakis 2002]. The basic principle and results of the two methods are shown in figure 10 (taken from [Stamminger and Drettakis 2002]). It should be noted, though, that the example depicted in figure 10 is a case for which the perspective shadow mapping technique produces excellent results; cases in which the perspective shadow map converges to the standard uniform shadow map can easily be constructed.

Just as for the shadow volume algorithm, soft shadow extensions to the shadow map algorithm have been suggested, many of which are also suited to real-time rendering. These algorithms are not necessarily physically correct, but produce aesthetically pleasing results, as pointed out by Brabec and Seidel in their paper on single sample soft shadows [Brabec and Seidel 2002]. Another extension to linear light sources is described in [Heidrich et al. 2000].
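A software sketch of percentage closer filtering with a 3x3 kernel, reusing the hypothetical ShadowMap structure from the earlier sketch: each neighbouring texel contributes a binary comparison result, and the average is the fraction of the fragment considered to be in shadow.

    // Percentage closer filtering: average the binary depth comparisons of the
    // 3x3 texel neighbourhood instead of filtering the depth values themselves.
    float percentageInShadow(const ShadowMap& map, float lightX, float lightY,
                             float lightDepth, float bias)
    {
        int cx = static_cast<int>(lightX * (map.width  - 1) + 0.5f);
        int cy = static_cast<int>(lightY * (map.height - 1) + 0.5f);

        int shadowed = 0, samples = 0;
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                int x = cx + dx, y = cy + dy;
                if (x < 0 || y < 0 || x >= map.width || y >= map.height)
                    continue;                       // skip samples outside the map
                ++samples;
                if (map.sample(x, y) < lightDepth - bias)
                    ++shadowed;                     // this neighbour says "in shadow"
            }
        }
        return samples > 0 ? static_cast<float>(shadowed) / samples : 0.0f;
    }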
A Hybrid Approach

One more shadow algorithm which deserves mention in this report is McCool's clever idea of shadow volume reconstruction from depth maps [McCool 2000]. This algorithm is a hybrid of the shadow map and shadow volume algorithms and does not require a polygonal representation of the scene. Instead of finding the silhouette edges via a dot product per model edge (shadow volumes), a depth map of the scene from the light's point of view is acquired (shadow map), from which the silhouette edges are extracted using computer vision techniques. From these edges the shadow volumes are constructed. McCool also describes a method which uses only a one-bit stencil buffer, toggling the stencil value during shadow volume rendering. For a detailed description, see [McCool 2000] and figure 11.
In a recent e-mail exchange (from August 3rd, 2002), John Carmack of id Software was asked why stenciled shadow volumes were chosen over shadow maps for Doom 3's rendering architecture, even though stenciled shadow volumes force the application to use lower-polygon models and prevent the use of certain special effects such as displacement mapping. Here is his reply:

"Shadow buffers make good looking demos with controlled circumstances, but when you start using them for a 'real' application, you find that you need absolutely massive resolution to get acceptable results for omnidirectional lights, and a lot of the artifacts need to be tweaked on a per-light basis. While it is possible to do shadow buffers on GF1/Radeon class hardware, without percentage closer filtering they look wretched. If we were targeting only the newest hardware, shadow buffers would have a better shot, but even then, they have more drawbacks than are commonly appreciated."

Carmack is known to do extensive research before making such decisions, and id Software's engines are generally accepted as state of the art in video game technology, so this is a valid view of the trade-offs between shadow maps and shadow volumes. It is not clear whether McCool's hybrid approach was ever considered for Doom 3's engine. One of the drawbacks of shadow maps Carmack mentions is that, to achieve good visual quality, Pixar's RenderMan architecture uses 2k or 4k shadow maps focused on a very narrow field of view, assuming that anything projected outside the shadow map is not in shadow. The problem of the narrow field of view is treated in [Brabec et al. 2002], where Brabec uses Heidrich's omnidirectional dual-paraboloid parametrization for the shadow map; depending on the resolution, this can naturally lead to an unwanted increase in aliasing artifacts. In any case, both algorithms (and the hybrid) come with their fair share of trade-offs, and a decision must be made depending on the nature of the application to be rendered. A general statement in favor of one approach cannot be made at this point, yet it is quite clear (in the opinion of the author of this report) that an educated choice can be made which leads to very convincing results (figure 12).

Conclusion

The fast development of computer graphics hardware in the past few years has made it possible to map the described algorithms almost completely onto hardware accelerators. Shadow volumes would be much faster if the silhouette computation could be carried out in hardware as well. Newer GPUs and the more flexible vertex and fragment programs introduced with DirectX 9 and the newest OpenGL specifications will make such computations possible. Shadow volume extrusion can already be mapped to hardware; a demo of this technique using a vertex shader is available on the NVIDIA developer website. It remains to be seen whether the existing techniques will be extended to more realistic shadow display, specifically real-time soft shadows (e.g. [Akenine-Moeller and Assarsson 2002]), or whether hardware development will inspire completely original ideas.

Further Reading
References