GPU Gems 3
Chapter 13. Volumetric Light Scattering as a Post-Process
Kenny Mitchell
Electronic Arts
In this chapter, we present a simple post-process method that produces the effect of volumetric light scattering due to shadows in the atmosphere. We improve an existing analytic model of daylight scattering to include the effect of volumetric occlusion, and we present its implementation in a pixel shader. The demo, which is included on the DVD accompanying this book, shows the technique can be applied to any animating image of arbitrary scene complexity. A screenshot from the demo appears in Figure 13-1.
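This chapter presents a post-process implementation of volumetric light scattering. The method extends an analytic daylight scattering model with volumetric occlusion and can be implemented in a pixel shader.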
Figure 13-1 Volumetric Light Scattering on a Highly Animated Scene in Real Time
13.1 Introduction
In the real world, we rarely see things in a vacuum, where nothing exists between an object and its observer. In real-time rendering, the effect of participating media on light transport is often subject to low-complexity homogeneous assumptions. This is due to the intractable nature of the radiative transport equation (Jensen and Christensen 1998), accounting for emission, absorption, and scattering, in a complex interactive animated environment. In this chapter, we consider the effect of volumetric shadows in the atmosphere on light scattering, and we show how this effect can be computed in real time with a GPU pixel shader post-process applied to one or more image light sources.
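The real world contains no true vacuum. In real-time rendering, however, the medium that light passes through is usually treated as simple and homogeneous, because solving the radiative transport equation with emission, scattering, and absorption is intractable in an interactive animated scene. This chapter presents a real-time post-process that computes volumetric light scattering with shadows for one or more light sources.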
13.10 References
Dobashi, Y., T. Yamamoto, and T. Nishita. 2002. "Interactive Rendering of Atmospheric Scattering Effects Using Graphics Hardware." Graphics Hardware.
Hoffman, N., and K. Mitchell. 2002. "Methods for Dynamic, Photorealistic Terrain Lighting." In Game Programming Gems 3, edited by D. Treglia, pp. 433–443. Charles River Media.
Hoffman, N., and A. Preetham. 2003. "Real-Time Light-Atmosphere Interactions for Outdoor Scenes." In Graphics Programming Methods, edited by Jeff Lander, pp. 337–352. Charles River Media.
James, R. 2003. "True Volumetric Shadows." In Graphics Programming Methods, edited by Jeff Lander, pp. 353–366. Charles River Media.
Jensen, H. W., and P. H. Christensen. 1998. "Efficient Simulation of Light Transport in Scenes with Participating Media Using Photon Maps." In Proceedings of SIGGRAPH 98, pp. 311–320.
Karras, T. 1997. Drain by Vista. Abduction'97. Available online at http://www.pouet.net/prod.php?which=418&page=0.
Max, N. 1986. "Atmospheric Illumination and Shadows." In Computer Graphics (Proceedings of SIGGRAPH 86) 20(4), pp. 117–124.
Mech, R. 2001. "Hardware-Accelerated Real-Time Rendering of Gaseous Phenomena." Journal of Graphics Tools 6(3), pp. 1–16.
Mitchell, J. 2004. "Light Shafts: Rendering Shadows in Participating Media." Presentation at Game Developers Conference 2004.
Nishita, T., Y. Miyawaki, and E. Nakamae. 1987. "A Shading Model for Atmospheric Scattering Considering Luminous Intensity Distribution of Light Sources." In Computer Graphics (Proceedings of SIGGRAPH 87) 21(4), pp. 303–310.
13.2 Crepuscular Rays
Under the right conditions, when a space contains a sufficiently dense mixture of light scattering media such as gas molecules and aerosols, light occluding objects will cast volumes of shadow and appear to create rays of light radiating from the light source. These phenomena are variously known as crepuscular rays, sunbeams, sunbursts, star flare, god rays, or light shafts. In sunlight, such volumes are effectively parallel but appear to spread out from the sun in perspective.
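Under the right conditions, when sunlight is blocked by an object inside a space filled with sufficiently dense scattering particles (gas molecules and aerosols), beams of light appear to radiate from the sun's position past the object's edges. The phenomenon goes by many names: crepuscular rays, sunbeams, sunbursts, star flare, god rays, and light shafts. The rays are in fact parallel, but under perspective projection they appear to fan out radially from the sun.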
Rendering crepuscular rays was first tackled in non-real-time rendering using a modified shadow volume algorithm (Max 1986) and shortly after that, an approach was developed for multiple light sources (Nishita et al. 1987). This topic was revisited in real-time rendering, using a slice-based volume-rendering technique (Dobashi et al. 2002) and more recently applied using hardware shadow maps (Mitchell 2004). However, slice-based volume-rendering methods can exhibit sampling artifacts, demand high fill rate, and require extra scene setup. While a shadow-map method increases efficiency, here it also has the slice-based detractors and requires further video memory resources and rendering synchronization. Another real-time method, based on the work of Radomir Mech (2001), uses polygonal volumes (James 2003), in which overlapping volumes are accumulated using frame-buffer blending with depth peeling. A similar method (James 2004) removes the need for depth peeling using accumulated volume thickness. In our approach, we apply a per-pixel post-processing operation that requires no preprocessing or other scene setup, and which allows for detailed light shafts in animating scenes of arbitrary complexity.
This paragraph surveys earlier approaches to rendering light shafts.
In previous work (Hoffman and Preetham 2003), a GPU shader for light scattering in homogeneous media is implemented. We extend this with a post-processing step to account for volumetric shadows. The basic manifestation of this post-process can be traced to an image-processing operation, radial blur, which appears in many CG demo productions (Karras 1997). Although such demos used software rasterization to apply a post-processing effect, we use hardware-accelerated shader post-processing to permit more sophisticated sampling based on an analytic model of daylight.
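Hoffman and Preetham's 2003 light scattering implementation considers only scattering in a homogeneous medium. This chapter extends that model with a post-process that accounts for volumetric shadows. The post-process can be seen as a form of radial blur, an image operation common in CG demo productions. Those demos applied it with software rasterization; here, hardware-accelerated shaders permit more sophisticated sampling based on the analytic daylight model.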
13.3 Volumetric Light Scattering
To calculate the illumination at each pixel, we must account for scattering from the light source to that pixel and whether or not the scattering media is occluded. In the case of sunlight, we begin with our analytic model of daylight scattering (Hoffman and Preetham 2003). Recall the following:
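To compute the illumination at each pixel, we must account for scattering between the light source and that pixel, and for whether the scattering medium is occluded. For sunlight, the starting point is Hoffman and Preetham's daylight scattering model:
$$L(s,\theta) = L_0\, e^{-\beta_{ex} s} + \frac{1}{\beta_{ex}}\, E_{sun}\, \beta_{sc}(\theta) \left(1 - e^{-\beta_{ex} s}\right) \qquad (1)$$
where $s$ is the distance traveled through the media, $\theta$ is the angle between the ray and the sun, and $L_0$ is the light leaving the point of emission. $E_{sun}$ is the source illumination from the sun, $\beta_{ex}$ is the extinction constant composed of light absorption and out-scattering properties, and $\beta_{sc}$ is the angular scattering term composed of Rayleigh and Mie scattering properties. The important aspect of this equation is that the first term calculates the amount of light absorbed from the point of emission to the viewpoint and the second term calculates the additive amount due to light scattering into the path of the view ray. As in Hoffman and Mitchell 2002, the effect due to occluding matter such as clouds, buildings, and other objects is modeled here simply as an attenuation of the source illumination,
In this formula, $s$ is the distance the light travels through the medium, $\theta$ is the angle between the view ray and the sun direction, $E_{sun}$ is the sun's illumination, $\beta_{ex}$ is the extinction constant covering absorption and out-scattering, and $\beta_{sc}$ is the angular scattering term covering Rayleigh and Mie scattering (closely related to what is usually called the phase function). The first term is the object's light that reaches the eye after absorption and out-scattering; the second term is the sunlight that reaches the eye through in-scattering (this term looks a little odd at first). The effect of occluders such as clouds and buildings is modeled as an attenuation of the source illumination (Hoffman and Mitchell 2002):
$$L(s,\theta,\phi) = \left(1 - D(\phi)\right) L(s,\theta) \qquad (2)$$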
where $D(\phi)$ is the combined attenuated sun-occluding objects' opacity for the view location $\phi$.
Here $D(\phi)$ is the fraction by which occlusion reduces the illumination for the view location $\phi$.
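As a rough illustration, the daylight model and its occlusion attenuation can be evaluated on the CPU. The sketch below assumes the Hoffman and Preetham form, in which the result is L0 * exp(-beta_ex * s) plus (E_sun * beta_sc(theta) / beta_ex) * (1 - exp(-beta_ex * s)), followed by the (1 - D) attenuation. The function names, the single-channel treatment, and the numbers in the usage lines are illustrative assumptions, and beta_sc(theta) is passed in already evaluated rather than derived from Rayleigh and Mie terms.

```python
import math


def daylight_scattering(l0, s, e_sun, beta_ex, beta_sc_theta):
    """Evaluate the assumed daylight model for one channel.

    l0            -- light leaving the point of emission
    s             -- distance traveled through the medium
    e_sun         -- source illumination from the sun
    beta_ex       -- extinction constant (absorption + out-scattering)
    beta_sc_theta -- angular scattering term, already evaluated at theta
    """
    extinction = math.exp(-beta_ex * s)
    # First term: light that survives absorption and out-scattering.
    attenuated = l0 * extinction
    # Second term: sunlight scattered into the view ray along the path.
    inscattered = (e_sun * beta_sc_theta / beta_ex) * (1.0 - extinction)
    return attenuated + inscattered


def occluded_scattering(l, d_phi):
    """Attenuate the scattering result by the occluders' opacity D."""
    return (1.0 - d_phi) * l


# Hypothetical values: a long path converges to e_sun * beta_sc / beta_ex.
far = daylight_scattering(0.5, 1000.0, 10.0, 0.1, 0.02)
half_occluded = occluded_scattering(far, 0.5)
```

With zero distance the function returns the emitted light unchanged, and with a very long path the first term vanishes, which matches the roles the text assigns to the two terms.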
This consideration introduces the complication of determining the occlusion of the light source for every point in the image. In screen space, we don't have full volumetric information to determine occlusion. However, we can estimate the probability of occlusion at each pixel by summing samples along a ray to the light source in image space. The proportion of samples that hit the emissive region versus those that strike occluders gives us the desired percentage of occlusion, $D(\phi)$. This estimate works best where the emissive region is brighter than the occluding objects. In Section 13.5, we describe methods for dealing with scenes in which this contrast is not present.
This introduces the problem of determining occlusion for every point in the image. We cannot know the exact occlusion along each pixel's scattering path, but we can estimate it statistically: among the samples taken along the screen-space ray from the pixel to the light source, the proportion that land on occluders rather than on the emissive region gives the estimate of $D$. The estimate works best when the emissive region is clearly brighter than the occluders; the case of weak contrast is handled later.
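The occlusion estimate can be pictured as counting samples along the ray. The following is a minimal sketch, assuming the image has already been classified into a boolean mask (True for the emissive region, False for occluders) and that coordinates are integer texel positions; the function name and the nearest-texel lookup are illustrative choices, not part of the chapter.

```python
def estimate_occlusion(emissive_mask, pixel, light, num_samples):
    """Return the fraction of ray samples that land on occluders.

    emissive_mask[y][x] is True where the image shows the emissive
    region (for example sky) and False where it shows an occluder.
    pixel and light are (x, y) texel positions.
    """
    px, py = pixel
    lx, ly = light
    height = len(emissive_mask)
    width = len(emissive_mask[0])
    occluded = 0
    for i in range(1, num_samples + 1):
        t = i / num_samples
        # Step from the pixel toward the light and clamp to the image.
        x = min(max(int(round(px + (lx - px) * t)), 0), width - 1)
        y = min(max(int(round(py + (ly - py) * t)), 0), height - 1)
        if not emissive_mask[y][x]:
            occluded += 1
    return occluded / num_samples


# One row: two occluder texels on the left, sky on the right.
mask = [[False, False, True, True, True]]
d = estimate_occlusion(mask, (0, 0), (4, 0), 4)
```

In this example one of the four samples lands on an occluder, so the estimate of D is 0.25.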
If we divide the sample illumination by the number of samples, n, the post-process simply resolves to an additive sampling of the image:
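Dividing the sample illumination by the number of samples gives the following additive sampling formula:
$$L(s,\theta,\phi) = \sum_{i=0}^{n} \frac{L(s_i,\theta_i)}{n} \qquad (3)$$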
13.3.1 Controlling the Summation
In addition, we introduce attenuation coefficients to parameterize control of the summation:
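We also introduce attenuation coefficients that control the summation:
$$L(s,\theta,\phi) = exposure \times \sum_{i=0}^{n} decay^{i} \times weight \times \frac{L(s_i,\theta_i)}{n} \qquad (4)$$
where exposure controls the overall intensity of the post-process, weight controls the intensity of each sample, and $decay^i$ (with decay in the range [0, 1]) dissipates each sample's contribution as the ray progresses away from the light source. This exponential decay factor practically allows each light shaft to fall off smoothly away from the light source.
Here exposure controls the overall intensity, weight controls the intensity of each sample, and $decay^i$ reduces each sample's contribution as the ray moves away from the light source, so that each light shaft fades out smoothly.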
The exposure and weight factors are simply scale factors. Increasing either of these increases the overall brightness of the result. In the demo, the sample weight is adjusted with fine-grain control and the exposure is adjusted with coarse-grain control.
Exposure and weight are both plain scale factors for overall brightness. In the demo, weight serves as the fine adjustment and exposure as the coarse adjustment.
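To see how the three coefficients interact, the controlled summation can be written out for a plain list of sample values. This sketch assumes the form exposure * sum(decay^i * weight * sample_i / n); the function name and the sample values are made up for illustration.

```python
def scattering_sum(samples, exposure, weight, decay):
    """Evaluate exposure * sum(decay**i * weight * sample_i / n)."""
    n = len(samples)
    total = 0.0
    illumination_decay = 1.0  # decay**0 for the first sample
    for sample in samples:
        total += illumination_decay * weight * sample / n
        # Each later sample is farther from the start of the ray.
        illumination_decay *= decay
    return exposure * total


# Four fully lit samples: decay = 1 gives the plain average,
# decay = 0.5 lets later samples contribute progressively less.
flat = scattering_sum([1.0, 1.0, 1.0, 1.0], 1.0, 1.0, 1.0)
faded = scattering_sum([1.0, 1.0, 1.0, 1.0], 1.0, 1.0, 0.5)
```

Because exposure and weight both multiply the whole sum, doubling either one doubles the result; only decay changes the relative contribution of individual samples.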
Because samples are derived purely from the source image, semitransparent objects are handled with no additional effort. Multiple light sources can be applied through successive additive screen-space passes for each ray-casting light source. Although in this explanation we have used our analytic daylight model, in fact, any image source may be used.
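Because the samples come straight from the rendered source image, semitransparent objects need no extra work. Multiple light sources are handled by running one additive screen-space pass per light. Although the derivation uses the analytic daylight model, any image source may be used.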
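For the sun's screen-space location and each screen-space image location, we implement the summation with successive samples from the source image at regular intervals along the ray vector between the two: the difference between the image location and the sun location, divided by the product of n and density. Here we introduce density to permit control over the separation between samples for cases in which we wish to reduce the overall number of sample iterations while retaining a sufficiently alias-free sampling density. If we increase the density factor, we decrease the separation between samples, resulting in brighter light shafts covering a shorter range.
The implementation sums samples taken at regular intervals along the screen-space line between the sun and the pixel. The density parameter controls the spacing between samples, which lets us reduce the number of iterations while keeping the sampling dense enough to avoid aliasing. Increasing density shrinks the spacing, which yields brighter shafts over a shorter range.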
In Figure 13-2, no samples from view location $\phi_1$ are occluded, resulting in maximum scattering illumination under regular evaluation of $L(s,\theta)$. At $\phi_2$, a proportion of samples along the ray hit the building, and so less scattering illumination is accumulated. By summing over cast rays for each pixel in the image, we generate volumes containing occluded light scattering.
In the figure below, the first view location sees fully unoccluded in-scattering, so every sample along the ray contributes. At the second view location, part of the ray is blocked by the building, so only the unoccluded samples contribute and less scattering accumulates.
Figure 13-2 Ray Casting in Screen Space
We may reduce the bandwidth requirements by downsampling the source image. With filtering, this reduces sampling artifacts and consequently introduces a local scattering contribution by neighborhood sampling due to the filter kernel. In the demo, a basic bilinear filter is sufficient.
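To reduce bandwidth, the source image can be downsampled. Filtering then reduces sampling artifacts and, as a side effect, adds a local scattering contribution from neighboring pixels. The demo uses a basic bilinear filter, which is sufficient.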
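For readers who want to see what the bilinear filter contributes, here is a small CPU sketch of a bilinear fetch and a half-resolution downsample built on it. It assumes a grayscale image stored as a list of rows, texel centers at half-integer positions, and clamp-to-edge addressing; on a GPU the texture unit does this work, and all names here are illustrative.

```python
def bilinear_sample(image, u, v):
    """Bilinearly sample a grayscale image at texture coords in [0, 1]."""
    height = len(image)
    width = len(image[0])
    # Convert to texel space (centers at integer + 0.5), clamped to edges.
    x = min(max(u * width - 0.5, 0.0), width - 1.0)
    y = min(max(v * height - 0.5, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1.0 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1.0 - fx) + image[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy


def downsample_half(image):
    """Halve the resolution with one bilinear tap per 2x2 block."""
    height = len(image) // 2
    width = len(image[0]) // 2
    return [[bilinear_sample(image, (x + 0.5) / width, (y + 0.5) / height)
             for x in range(width)]
            for y in range(height)]


small = downsample_half([[0.0, 1.0], [1.0, 2.0]])
```

Each output texel is the average of a 2x2 block, which is the neighborhood contribution the text refers to.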
13.4 The Post-Process Pixel Shader
The core of this technique is the post-process pixel shader, given in Listing 13-1, which implements the simple summation of Equation 4.
The core of the technique is the pixel shader that implements the summation of Equation 4.
Given the initial image, sample coordinates are generated along a ray cast from the pixel location to the screen-space light position. The light position in screen space is computed by the standard world-view-project transform and is scaled and biased to obtain coordinates in the range [-1, 1]. Successive samples $L(s_i, \theta_i)$ in the summation of Equation 4 are scaled by both the weight constant and the exponential decay attenuation coefficients for the purpose of parameterizing control of the effect. The separation between samples may be adjusted through density, and as a final control factor, the resulting combined color is scaled by a constant attenuation coefficient, exposure.
In outline: transform the light position into screen space with the world-view-projection transform, then scale and bias it into the [-1, 1] range; generate sample coordinates along the ray from each pixel to that position; scale each sample $L(s_i, \theta_i)$ by weight and the exponential decay factor as in Equation 4; and finally scale the sum by exposure, a constant exposed to artists as a control for the final intensity.
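The chapter does not show how the screen-space light position is obtained, so the following is only a sketch of one common way to do it. It assumes a 4x4 row-major matrix applied to column vectors, a perspective divide that yields normalized device coordinates in [-1, 1], and an optional remap to [0, 1] texture coordinates with v pointing down. The matrix convention, the remap, and every name below are assumptions rather than details taken from the demo.

```python
def project_to_ndc(world_pos, world_view_proj):
    """Project a world-space point to normalized device coordinates.

    Returns (x, y) in [-1, 1] for points inside the view, or None when
    the point is on or behind the camera plane (w <= 0).
    """
    x, y, z = world_pos
    point = (x, y, z, 1.0)
    clip = [sum(world_view_proj[r][c] * point[c] for c in range(4))
            for r in range(4)]
    w = clip[3]
    if w <= 0.0:
        return None
    return (clip[0] / w, clip[1] / w)


def ndc_to_texcoord(ndc):
    """Scale and bias NDC to [0, 1] texture space (assumed: v grows downward)."""
    return (ndc[0] * 0.5 + 0.5, 0.5 - ndc[1] * 0.5)


identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
ndc = project_to_ndc((0.5, -0.5, 0.0), identity)
uv = ndc_to_texcoord(ndc)
```

Returning None for a light behind the camera leaves the caller free to skip or invert the effect; how the demo handles that case is not covered here.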
Listing 13-1. Post-Process Shader Implementation of Additive Sampling
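// Assumed declarations: the printed listing omits them. The names come from
// the shader body; the NUM_SAMPLES value is illustrative only.
#define NUM_SAMPLES 100
sampler2D frameSampler;   // source image
float4 ScreenLightPos;    // light position in screen space
float Density;            // scales the separation between samples
float Weight;             // intensity of each sample
float Decay;              // per-step falloff, in [0, 1]
float Exposure;           // overall intensity

float4 main(float2 texCoord : TEXCOORD0) : COLOR0
{
    // Calculate vector from pixel to light source in screen space.
    half2 deltaTexCoord = (texCoord - ScreenLightPos.xy);
    // Divide by number of samples and scale by control factor.
    deltaTexCoord *= 1.0f / NUM_SAMPLES * Density;
    // Store initial sample.
    half3 color = tex2D(frameSampler, texCoord).rgb;
    // Set up illumination decay factor.
    half illuminationDecay = 1.0f;
    // Evaluate summation from Equation 4 for NUM_SAMPLES iterations.
    for (int i = 0; i < NUM_SAMPLES; i++)
    {
        // Step sample location along ray.
        texCoord -= deltaTexCoord;
        // Retrieve sample at new location ("sample" itself is a reserved
        // word in newer HLSL compilers, hence the longer name).
        half3 sampleColor = tex2D(frameSampler, texCoord).rgb;
        // Apply sample attenuation scale/decay factors.
        sampleColor *= illuminationDecay * Weight;
        // Accumulate combined color.
        color += sampleColor;
        // Update exponential decay factor.
        illuminationDecay *= Decay;
    }
    // Output final color with a further scale control factor.
    return float4(color * Exposure, 1);
}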
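For experimenting without a GPU, the shader's loop can be ported to the CPU. The sketch below mirrors the listing's arithmetic step for step on a grayscale image; the nearest-texel, clamp-to-edge fetch and the function names are assumptions, since the listing does not specify sampler state.

```python
def sample_clamped(image, u, v):
    """Nearest-texel fetch with clamp-to-edge addressing (assumed)."""
    height = len(image)
    width = len(image[0])
    x = min(max(int(u * width), 0), width - 1)
    y = min(max(int(v * height), 0), height - 1)
    return image[y][x]


def light_shaft_pixel(image, tex_coord, light_pos, num_samples,
                      density, weight, decay, exposure):
    """CPU port of the shader body for a single pixel."""
    u, v = tex_coord
    # Vector from pixel to light, divided by sample count, scaled by density.
    du = (u - light_pos[0]) / num_samples * density
    dv = (v - light_pos[1]) / num_samples * density
    color = sample_clamped(image, u, v)  # initial sample
    illumination_decay = 1.0
    for _ in range(num_samples):
        # Step toward the light and accumulate the attenuated sample.
        u -= du
        v -= dv
        color += sample_clamped(image, u, v) * illumination_decay * weight
        illumination_decay *= decay
    return color * exposure


# One row of four texels with a bright texel at the light position.
row = [[0.0, 0.0, 0.0, 1.0]]
lit = light_shaft_pixel(row, (0.125, 0.5), (0.875, 0.5), 3, 1.0, 1.0, 1.0, 1.0)
```

As in the listing, the step is multiplied by the density value, so in this port a density below 1 makes the marched segment stop short of the light position.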
13.5 Screen-Space Occlusion Methods
As stated, sampling in screen space is not a pure occlusion sampling. Undesirable streaks may occur due to surface texture variations. Fortunately, we can use the following measures to deal with these undesirable effects.
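As noted, screen-space sampling is not true occlusion sampling. Variations in surface texture can produce unwanted streaks, because brightness contrast in the image is what decides whether a pixel counts as emissive (sky). The following methods deal with the problem.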
13.5.1 The Occlusion Pre-Pass Method
If we render the occluding objects black and untextured into the source frame buffer, image processing to generate light rays is performed on this image. Then the occluding scene objects are rendered with regular shading, and the post-processing result is additively blended into the scene. This approach goes hand in hand with the common technique of rendering an unshaded depth pre-pass to limit the depth complexity of fully shaded pixels. Figure 13-3 shows the steps involved.
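First render the occluding objects black and untextured into the source frame buffer and run the light-shaft post-process on that image; then render the scene with regular shading and additively blend the post-process result into it. This pairs naturally with an unshaded depth pre-pass, which can also be used to limit the cost of fully shaded pixels. Why render the occluders without color? Because doing so maximizes the contrast between lit and occluded regions and avoids misclassifying pixels.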
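The order of operations in the pre-pass method can be sketched on the CPU as follows. The sketch assumes grayscale images stored as lists of rows, a boolean occluder mask, and a frame buffer that clamps at 1.0; the light-shaft post-process is passed in as a function, and all names are illustrative.

```python
def occlusion_prepass(source, occluder_mask):
    """Copy the source image with every occluder texel forced to black."""
    return [[0.0 if occluder_mask[y][x] else source[y][x]
             for x in range(len(source[0]))]
            for y in range(len(source))]


def additive_blend(scene, shafts):
    """Add the light-shaft image onto the shaded scene (assumed clamp at 1.0)."""
    return [[min(scene[y][x] + shafts[y][x], 1.0)
             for x in range(len(scene[0]))]
            for y in range(len(scene))]


def render_with_prepass(emissive, occluder_mask, shaded_scene, post_process):
    """Pre-pass, post-process, then blend over the normally shaded scene."""
    occlusion_image = occlusion_prepass(emissive, occluder_mask)
    shafts = post_process(occlusion_image)
    return additive_blend(shaded_scene, shafts)


# Identity post-process stands in for the real light-shaft pass.
result = render_with_prepass([[1.0, 1.0]], [[True, False]],
                             [[0.3, 0.2]], lambda image: image)
```

Because the occluders are black in the image fed to the post-process, their surface texture cannot leak into the shafts.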
Figure 13-3 The Effect of the Occlusion Pre-Pass
13.5.2 The Occlusion Stencil Method
On earlier graphics hardware the same results can be achieved without a pre-pass by using a stencil buffer or alpha buffer. The primary emissive elements of the image (such as the sky) are rendered as normal while simultaneously setting a stencil bit. Then the occluding scene objects are rendered with no stencil bit. When it comes to applying the post-process, only those samples with the stencil bit set contribute to the additive blend.
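On earlier hardware, a stencil buffer or alpha buffer makes the pre-pass unnecessary. Set a stencil bit while rendering the emissive pixels (such as the sky), and render the occluding objects without it. During the light-shaft post-process, only samples with the stencil bit set contribute to the additive blend. (In effect, the scattering from sky pixels is summed over the samples along the ray, so the longer the occluded stretch, the weaker the scattering.)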
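A CPU analogue of the stencil test is a masked summation: samples are fetched as usual, but only those whose stencil bit is set are added. The sketch assumes integer texel coordinates for the samples and a boolean stencil array; the function name is illustrative.

```python
def stencil_masked_sum(image, stencil, sample_coords, weight, decay):
    """Accumulate only the samples whose stencil bit is set."""
    total = 0.0
    illumination_decay = 1.0
    for x, y in sample_coords:
        if stencil[y][x]:
            total += image[y][x] * illumination_decay * weight
        # Decay advances for every step, whether or not it contributed.
        illumination_decay *= decay
    return total


image = [[0.8, 0.5, 1.0]]
stencil = [[False, True, True]]  # first texel belongs to an occluder
value = stencil_masked_sum(image, stencil, [(0, 0), (1, 0), (2, 0)], 1.0, 0.5)
```

The bright occluder texel at the start of the ray contributes nothing, even though its color is not black.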
13.5.3 The Occlusion Contrast Method
Equally though, this problem may be managed by reducing texture contrast through the texture's content, fog, aerial perspective, or light adaption as the intensity of the effect increases when facing the light source. Anything that reduces the illumination frequency and contrast of the occluding objects diminishes streaking artifacts.
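In short, anything that lowers the illumination frequency (how rapidly brightness varies across a surface) and the contrast of the occluding objects reduces streaking artifacts in the light shafts.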
13.6 Caveats
Although compelling results can be achieved, this method is not without limitations. Light shafts from background objects can appear in front of foreground objects, when dealing with relatively near light sources, as shown in Figure 13-4. In a full radiative transfer solution, the foreground object would correctly obscure the background shaft. One reason this is less noticeable than expected is that it can be perceived as the manifestation of a camera lens effect, in which light scattering occurs in a layer in front of the scene. This artifact can also be reduced in the presence of high-frequency textured objects.
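The method gives convincing light shafts, but it is an approximation and has limits. As the figure below shows, it cannot tell near objects from far ones, so shafts cast by background objects can show through in front of foreground buildings, which is physically wrong. The problem is less noticeable when scene textures are high frequency, that is, when colors vary a lot.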
Figure 13-4 Dealing with One Limitation
As occluding objects cross the image's boundary, the shafts will flicker, because they are beyond the range of visible samples. This artifact may be reduced by rendering an extended region around the screen to increase the range of addressable samples.
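When an occluder crosses the edge of the screen, its shafts flicker, because the occluder's samples outside the screen fall beyond the range of visible samples. Rendering an extended region around the screen reduces the artifact by making those samples addressable.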
Finally, when facing perpendicular to the source, the light's screen-space location can tend toward infinity and therefore lead to large separation between samples. This can be diminished by clamping the screen-space location to an appropriate guard-band region. Alternatively, the effect can be faded toward the perpendicular and is further decreased when using an occlusion method.
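Finally, when the view direction is perpendicular to the light direction, the light's screen-space position tends toward infinity, so the samples become widely separated. Clamping the screen-space position to a suitable guard-band region keeps it at a finite distance. Alternatively, the effect can be faded out as the view turns toward the perpendicular.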
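Both remedies for the perpendicular case are easy to sketch. The following assumes screen-space coordinates in [0, 1], a guard band expressed as a margin around that range, and unit-length direction vectors; the linear fade based on the dot product is only one possible choice, and the names and values are hypothetical.

```python
def clamp_to_guard_band(light_pos, margin):
    """Clamp a screen-space position to [-margin, 1 + margin] on both axes."""
    low, high = -margin, 1.0 + margin
    return (min(max(light_pos[0], low), high),
            min(max(light_pos[1], low), high))


def facing_fade(view_dir, light_dir):
    """Fade factor: 1 when looking at the light, 0 at or past perpendicular.

    view_dir is the camera's forward direction and light_dir points from
    the camera toward the light; both are assumed to be unit vectors.
    """
    dot = sum(a * b for a, b in zip(view_dir, light_dir))
    return min(max(dot, 0.0), 1.0)


clamped = clamp_to_guard_band((5.0, -3.0), 0.5)
fade = facing_fade((0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
```

The fade factor could multiply the exposure value so that the shafts vanish before the sample spacing grows too large.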
13.7 The Demo
The demo on this book's DVD uses Shader Model 3.0 to apply the post-process, because the number of texture samples needed exceeds the limits of Shader Model 2.0. However, the effect has been implemented almost as efficiently with earlier graphics hardware by using additive frame-buffer blending over multiple passes with a stencil occlusion method, as shown in Figure 13-5.
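The demo uses Shader Model 3.0 because the number of texture samples exceeds the Shader Model 2.0 limits. The effect has nonetheless been implemented almost as efficiently on earlier hardware, using multiple additive frame-buffer passes with the stencil occlusion method.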
Figure 13-5 Crepuscular Rays with Multiple Additive Frame-Buffer Passes on Fixed-Function Hardware
13.8 Extensions
Sampling may occur at a lower resolution to reduce texture bandwidth requirements. A further enhancement is to vary the sample pattern with stochastic sampling, thus reducing regular pattern artifacts where sampling density is reduced.
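Sampling at a lower resolution reduces texture bandwidth. In addition, replacing the regular sample pattern with stochastic sampling reduces the regular pattern artifacts that appear when sampling density is low.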
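One simple form of stochastic sampling is to shift the whole sample pattern by a random fraction of one step per pixel. The chapter does not say which scheme it has in mind, so the sketch below is just one possibility; the function names and the use of a seeded generator are assumptions.

```python
import random


def jittered_sample_offsets(num_samples, rng):
    """Return ray parameters t, one per evenly sized stratum of [0, 1).

    A single random jitter shifts every sample by the same fraction of a
    step, so the spacing stays regular but its phase varies per call.
    """
    jitter = rng.random()
    return [(i + jitter) / num_samples for i in range(num_samples)]


def sample_positions(pixel, light, offsets):
    """Place samples along the ray from pixel (t = 0) to light (t = 1)."""
    return [(pixel[0] + (light[0] - pixel[0]) * t,
             pixel[1] + (light[1] - pixel[1]) * t)
            for t in offsets]


rng = random.Random(1)  # seeded only to make the example repeatable
offsets = jittered_sample_offsets(4, rng)
positions = sample_positions((0.0, 0.0), (1.0, 2.0), offsets)
```

Calling the first function with a different random state for each pixel trades the regular banding for noise, which is usually less objectionable.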
Our method performs the post-process in a single-pass execution of a shader. In one multipass approach, the pixel shader summation may be performed so that the results of concentric rectangular bands emanating from the light source are accumulated into successive outer bands, where $L(s,\theta,\phi) = L_{i-1}(s,\theta,\phi) + L_i(s,\theta,\phi)$. While this may not be best suited to current hardware design, this approach represents the minimum sampling and computation required.
The method in this chapter needs only a single pass. A multipass variant is also possible, in which the shader summation becomes an accumulation across passes: $L(s,\theta,\phi) = L_{i-1}(s,\theta,\phi) + L_i(s,\theta,\phi)$. The results of concentric rectangular bands around the light source are accumulated into successive outer bands, so each band holds the scattering summed from the light source out to that band. This variant needs the fewest samples and the least computation.
Creating a balance between light shaft intensity and avoiding oversaturation requires adjustments of the attenuation coefficients. An analytic formula that performs light adaption to a consistent image tone balance may yield an automatic method for obtaining a consistently perceived image. For example, perhaps we can evaluate a combination of average, minimum, and maximum illumination across the image and then apply a corrective color ramp, thus avoiding excessive image bloom or gloominess.
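Balancing light shaft intensity against oversaturation requires tuning the attenuation coefficients. An analytic formula could automate this, for example by combining the average, minimum, and maximum illumination of the image and applying a corrective ramp so that the scene is neither overexposed nor too dark.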
13.9 Summary
We have shown a simple post-process method that produces the effect of volumetric light scattering due to shadows in the atmosphere. We have expanded an existing analytic model of daylight scattering to include the contribution of volumetric occlusion, and we have described its implementation in a pixel shader. The demo shows that this is a practical technique that can be applied to any animating image of arbitrary scene complexity.
This chapter has presented a simple method for rendering light shafts.