About Cubemaps

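A cubemap can be used to simulate environment reflections: the environment is rendered into the cubemap ahead of time, which avoids the cost of computing real-time reflections while the game is running, and the visual result is very good. This works especially well in outdoor scenes, for example a car body reflecting its surroundings. In indoor scenes, however, an ordinary cubemap reflection usually produces odd-looking results.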

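_Demo1_
_Tianxia mobile game screenshot 1_
_Tianxia mobile game screenshot 2_

In _Demo1_ the reflection on the marble floor is wrong. In _Tianxia mobile game screenshot 1_ the reflection of the pillars is misaligned. In _Tianxia mobile game screenshot 2_ the reflection of the throne is clearly offset from the model and does not appear at the correct angle. This cannot be avoided with the usual way of computing the cubemap reflection ray: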

// Compute the reflection vector from the view vector and the normal
float3 reflDir = reflect(viewDir, normal);
// Sample the cubemap with the reflection vector
fixed4 col = texCUBE(_EnvMap, reflDir);
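
For readers who want to check the math outside a shader, here is a minimal Python sketch of what the `reflect` intrinsic computes, R = I - 2 (N·I) N, assuming the normal has unit length. The helper names `dot` and `reflect` below are illustrative Python functions introduced for this example, not shader API.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def reflect(incident, normal):
    """Mirror `incident` about a unit-length `normal`: I - 2 (N.I) N."""
    d = dot(normal, incident)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


# A ray travelling down and to the right bounces off a floor facing +Y.
bounced = reflect((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
```

Note that the result depends only on directions. The position of the shaded point never enters the computation, which is why this lookup behaves as if the environment were infinitely far away.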


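We usually think of a cubemap as an infinitely large cube surrounding the reflecting object, and the results above are consistent with that assumption. Below we use a different method to compute the reflection vector.
Reference 1 mentions that point B and point C may be two distinct points. In practice I found that the best results usually come from B and C coinciding, although an offset between B and C can be used for fine-tuning.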

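In the figure, R is the reflection vector computed with the method described above. The ray along R intersects the imaginary cubemap bounding box at P, and the vector from B, the point where the cubemap snapshot was captured, to P is the new reflection vector NewR. The key step is finding the intersection point P, which amounts to finding the intersection of a ray and a plane.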

[Image 1: ray-plane intersection formula]
The above is the formula for computing the intersection of a ray and a plane.
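When the denominator in the formula is 0, the ray and the plane do not intersect. I do not know what shaders do on different devices when a number is divided by 0. In general this case is very rare at ordinary viewing angles, and even when it happens it affects only a single pixel. I have not seen any side effects from it in my tests, so I am ignoring it for now.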
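
Since the formula image is not reproduced in this text, the standard ray-plane intersection can be written out as follows. This is a reconstruction under assumptions: $R_o$ and $R_d$ (ray origin and ray direction) are names introduced here, while $P_o$ (a point on the plane) and $P_N$ (the plane normal) follow the symbols used later in the text.

```latex
P(t) = R_o + t\,R_d, \qquad (P - P_o)\cdot P_N = 0
\quad\Longrightarrow\quad
t = \frac{(P_o - R_o)\cdot P_N}{R_d \cdot P_N}
```

The denominator $R_d \cdot P_N$ is zero exactly when the ray is parallel to the plane, which is the no-intersection case discussed above.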

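First, a premise that all the following calculations rely on: the imaginary cubemap bounding box (the thin black outer rectangle in the figure above) is an AABB. AABB stands for Axis Aligned Bounding Box. The important part is "axis aligned": every edge of the box is either parallel or perpendicular to each of the three orthogonal X, Y and Z axes. A cubemap has six faces. If those faces were arbitrarily oriented, the ray-plane tests would cost a lot more, but with an AABB this cost is greatly reduced.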



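The three faces that face the positive directions of the three basis axes (A, B, C) can have their ray intersections computed together, and the other three faces (D, E, F) are computed together as well. This is because face A has P_N = (1,0,0), face B has P_N = (0,1,0) and face C has P_N = (0,0,1), and likewise for the other three faces. The P_o term in the formula behaves similarly. That is why the intersections with faces A, B and C can be computed at the same time. Here is the shader code: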

float3 viewDir = IN.worldPos - _WorldSpaceCameraPos;
float3 worldNormal = IN.worldNormal;
float3 reflectDir = reflect (viewDir, worldNormal);
// The reflection vector
reflectDir = normalize(reflectDir);

// _BoxPosition is the center of the imaginary cubemap bounding box
// _BoxSize is the size of the imaginary cubemap bounding box
half3 boxStart = _BoxPosition - _BoxSize * 0.5;
half3 firstPlaneIntersect = (boxStart + _BoxSize - IN.worldPos) / reflectDir;
half3 secondPlaneIntersect = (boxStart - IN.worldPos) / reflectDir;
// The lines above give six t values; comparing them yields the t value at the intersection point P
half3 furthestPlane = max(firstPlaneIntersect, secondPlaneIntersect);
// The nearest of the three candidates is a single scalar distance
half intersectDistance = min(min(furthestPlane.x, furthestPlane.y), furthestPlane.z);
// Position of the intersection point P
half3 intersectPosition = IN.worldPos + reflectDir * intersectDistance;
// Sample the cubemap with the new reflection vector
fixed4 reflcol = texCUBElod(_CubeMap, float4(intersectPosition - _BoxPosition, _Roughness));
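
To make the slab test easier to step through, here is a rough CPU-side Python sketch of the same idea. It assumes the shaded point lies inside the box and the reflection direction is not the zero vector. The function name `parallax_corrected_dir` is hypothetical, and skipping axes whose direction component is exactly zero is a choice made for this sketch (the shader simply divides).

```python
def parallax_corrected_dir(world_pos, refl_dir, box_center, box_size):
    """Return the cubemap lookup direction for a reflection ray inside an AABB."""
    box_min = [c - 0.5 * s for c, s in zip(box_center, box_size)]
    box_max = [c + 0.5 * s for c, s in zip(box_center, box_size)]
    t_exit = float("inf")
    for p, d, lo, hi in zip(world_pos, refl_dir, box_min, box_max):
        if d == 0.0:
            continue  # ray is parallel to this pair of faces
        # Of the two faces on this axis, the larger t is the one in front of the ray.
        t_axis = max((hi - p) / d, (lo - p) / d)
        # The ray leaves the box at the nearest of the per-axis exits.
        t_exit = min(t_exit, t_axis)
    hit = [p + d * t_exit for p, d in zip(world_pos, refl_dir)]
    # Lookup direction: from the box center (capture point) to the hit point.
    return [h - c for h, c in zip(hit, box_center)]


# A point on the floor of a 10x10x10 room, reflecting straight up.
lookup = parallax_corrected_dir(
    (2.0, -5.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)
)
```

In this example the uncorrected direction would be (0, 1, 0), while the corrected lookup tilts toward +X because the shaded point is off-center.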


_Demo2_
_Demo3_
_Demo2_ fixes the error seen in _Demo1_.
_Demo3_ adds a perturbed normal and uses the texture's mipmaps to produce a roughness effect.

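As mentioned above, this calculation relies on the box being an AABB. What if it is not? The answer is simple: transform the values into the space where the box is an AABB and then do the calculation. Here is the shader code.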


// World-space position of the object, read from the translation part of _Object2World
float3 wpos = float3(_Object2World[0].w, _Object2World[1].w, _Object2World[2].w);
// The wpos terms cancel out, leaving IN.worldPos - _WorldSpaceCameraPos
float3 viewDir = IN.worldPos - wpos - (_WorldSpaceCameraPos - wpos);
float3 worldNorm = IN.worldNormal;
float3 reflectDir = reflect (viewDir, worldNorm);
reflectDir = normalize(reflectDir);

// Bring the reflection ray and the shaded position into local space, where the box is axis-aligned
float3 RayLS = normalize(mul(reflectDir, (float3x3)_Object2World));
float3 PositionLS = mul((float3x3)_World2Object, IN.worldPos - wpos);
// The box faces sit at +/- _BoxSize in local space
float3 Unitary = _BoxSize;
// Same slab test as in the AABB case
float3 FirstPlaneIntersect = (Unitary - PositionLS) / RayLS;
float3 SecondPlaneIntersect = (-Unitary - PositionLS) / RayLS;
float3 FurthestPlane = max(FirstPlaneIntersect, SecondPlaneIntersect);
float Distance = min(FurthestPlane.x, min(FurthestPlane.y, FurthestPlane.z));
// Despite the WS suffix, this point is computed from the local-space values above
float3 IntersectPositionWS = PositionLS + RayLS * Distance;
float3 ReflDirectionWS = IntersectPositionWS - _BoxPosition;
fixed4 reflcol = texCUBElod(_CubeMap, float4(float3(ReflDirectionWS.x,ReflDirectionWS.y,-ReflDirectionWS.z), _Roughness));
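
The "transform into AABB space" step can be sketched on the CPU as follows. This is a simplified illustration, not a port of the listing: it assumes the box orientation is a pure rotation (so the inverse is the transpose), and, unlike the listing, it transforms the hit point back to world space before subtracting the capture position. All names (`obb_corrected_dir`, `rot_y`, `half_extent`, `capture_pos`) are hypothetical.

```python
import math


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]


def rot_y(degrees):
    """Local-to-world rotation about the Y axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def obb_corrected_dir(world_pos, refl_dir, box_center, box_rot, half_extent, capture_pos):
    inv_rot = transpose(box_rot)  # inverse of a pure rotation
    pos_ls = mat_vec(inv_rot, [p - c for p, c in zip(world_pos, box_center)])
    ray_ls = mat_vec(inv_rot, refl_dir)
    t_exit = float("inf")
    for p, d, h in zip(pos_ls, ray_ls, half_extent):
        if abs(d) < 1e-12:
            continue  # treat the ray as parallel to this pair of faces
        t_exit = min(t_exit, max((h - p) / d, (-h - p) / d))
    hit_ls = [p + d * t_exit for p, d in zip(pos_ls, ray_ls)]
    hit_ws = [a + b for a, b in zip(mat_vec(box_rot, hit_ls), box_center)]
    return [h - c for h, c in zip(hit_ws, capture_pos)]


# A box with half-extents (4, 1, 2), rotated 90 degrees about Y, so its long
# local X axis points along world -Z.
direction = obb_corrected_dir(
    (0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0),
    rot_y(90.0), (4.0, 1.0, 2.0), (0.0, 0.0, 0.0),
)
```

Here the ray leaves the rotated box after 4 units along world -Z. Treating the same box as unrotated would wrongly place the wall at 2 units.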



This method also makes many interesting effects possible, such as the following:

_Demo4_
_Demo5_
[Image 2]

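In _Demo4_ the objects behind the window are actual models. In _Demo5_ the window appears to have objects behind it, but they are simulated with the method described above. The result looks very good and saves a large amount of interior geometry. The scenery outside the window is also preserved thanks to the cubemap.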


Here is a screenshot from Overwatch, which presumably uses the same technique.



Reference 1: https://seblagarde.wordpress.com/2012/09/29/image-based-lighting-approaches-and-parallax-corrected-cubemap/

Reference 2: https://simonschreibt.de/gat/windows-ac-row-ininite/


Image-based Lighting approaches and parallax-corrected cubemap

Version : 1.28 – Living blog – First version was 2 December 2011
This post replaces and updates a previous post named “Tips, tricks and guidelines for specular cubemap” with information contained in the Siggraph 2012 talk “Local Image-based Lighting With Parallax-corrected Cubemap” available here.

Image-based lighting (IBL) is common in games today to simulate ambient lighting, and it fits perfectly with physically based rendering (PBR). The cubemap parameterization is the most widely used because of its hardware efficiency, and this post focuses on ambient specular lighting with cubemaps. There are many different uses of specular cubemaps in games. This post describes most of them and includes the description of a new approach to simulate ambient specular lighting with local image-based lighting. Even though most of this post is dedicated to the tool setup for this new approach, the tools can easily be reused in other contexts. The post first describes different IBL strategies for ambient specular lighting, then gives some tips on saving specular cubemap memory. Next, it details an algorithm to gather the nearest cubemaps from a point of interest (POI), calculate each cubemap’s contribution and efficiently blend these cubemaps on the GPU. It finishes with several algorithms to parallax-correct a cubemap and their application to the local IBL with parallax-corrected cubemap approach. I use the term specular cubemap to refer to both the classic specular cubemap and the prefiltered mipmapped radiance environment map. As always, any feedback or comments are welcome.

IBL strategies for ambient specular lighting

In this section I will discuss several strategies of cubemap usage for ambient specular lighting. The choice of the right method to use depends on the game context and engine architecture. For clarity, I need to introduce some definitions:

I will divide cubemaps into two categories:
– Infinite cubemaps: These cubemaps are used as a representation of infinitely distant lighting; they have no location. They can be generated with the game engine or authored by hand. They are perfect for representing low-frequency lighting such as outdoor lighting (i.e. the light is rather smooth across the level).
– Local cubemaps: These cubemaps have a location and represent finite environment lighting. They are mostly generated with the game engine from a sample location in the level. The generated lighting is only right at the location where the cubemap was generated; all other locations must be approximated. Moreover, as a cubemap represents an infinite box by definition, there are parallax issues (reflected objects are not at the right position) that require tricks to compensate. They are used for middle- and high-frequency lighting such as indoor lighting. The number of local cubemaps required to match the lighting conditions of a scene increases with the lighting complexity (i.e. if you have a lot of different lights affecting a scene, you need to sample the lighting at several locations to be able to simulate the original lighting conditions).

And as we often need to blend multiple cubemaps, I will define different cubemap blending methods:
– Sampling K cubemaps in the main shader and doing a weighted sum. Expensive.
– Blending cubemaps on the CPU and using the resulting cubemap in the shader. The cost depends on the resolution, and double-buffered resources are required to avoid GPU stalls.
– Blending cubemaps on the GPU and using the resulting cubemap in the shader. Fast.
– Only with a deferred or light-prepass engine: apply K cubemaps by weighted additive blending. Each cubemap bounding volume is rendered to the screen and the normal and roughness from the G-Buffer are used to sample the cubemap.

In all strategies described below, I won’t talk about visual fidelity but rather about problems and advantages.

Object based cubemap
Each object is linked to a local cubemap. Objects take the nearest cubemap placed in the level and use it as specular ambient light source. This is the way adopted by Half Life 2 [1] for their world specular lighting.
Background objects are linked to their nearest cubemaps offline and dynamic objects perform dynamic queries at runtime. Cubemaps can have a range so that they do not affect objects outside their boundaries; they can affect background objects only, dynamic objects only, or both.

The main problem with object based cubemap is lighting seams between adjacent objects using different cubemaps.
Here is a screenshot (click for full res)

[Image 3]

On this screenshot you can see cubemaps linked offline to several objects (the red lines). Lighting seams occur on the ground when parts of the ground are assigned to different cubemaps. This is obvious in this screenshot because the lighting is high frequency. Another problem is that you need to split a mesh into several parts when it is shared by two different environments, which is common for objects like walls and floors. This adds production constraints on meshes. Note that the nearest cubemap is not always the right choice; visibility should be tested in some cases.

Some advice on cubemap placement with this method is given here [2]:

If a cubemap is intended for NPCs or the player, the env_cubemap should be placed at head-height above the ground. This way, the cubemap will most accurately represent the world from the perspective of a standing creature.
If a cubemap is intended for static world geometry, the env_cubemap should be a fair distance away from all brush surfaces.
A different cubemap should be taken in each area of distinct of visual contrast. A hallway with bright yellow light will need its own env_cubemap, especially if it is next to a room with low blue light. Without two env_cubemap entities, reflections and specular highlights will seem incorrect on entities and world geometry in one of the areas.
(…)
Because surfaces must approximate their surroundings via cubemaps, using too many cubemaps in a small area can cause noticeable visual discontinuities when moving around. For areas of high reflectivity, it is generally more correct to place one cubemap in the center of the surface and no more. This avoids seams or popping as the view changes.

The other problem is that you need to track which cubemap affects each dynamic object. As dynamic objects can swap their cubemap, there is popping. Popping can be reduced by blending the K nearest cubemaps of the dynamic object. This is a rather expensive solution: blending the K cubemaps in the shader adds a lot of instructions and texture fetches. And even blending the K nearest cubemaps will not prevent the popping induced by switching the smallest contributing cubemap when more than K cubemaps are present.

Zone based cubemap
An infinite cubemap is assigned to each zone of the level. When the camera enters a zone, the infinite cubemap of that zone is applied to all objects. Killzone 2/3, Call of Duty: Black Ops and Prey 2, for example, use this method [3][7][19].
This method is simple to implement and fits any engine architecture. There are no lighting seams between objects as the same cubemap is used for all objects, but cubemaps need to be blended between zones to avoid popping. Cubemaps can be blended efficiently on the CPU/GPU or in a deferred way with the 2 cubemaps processed at the same time.

Global and local cubemap no overlapping
All objects use a global (infinite in this case) cubemap by default. Some local cubemaps are placed in the level with non-overlapping ranges. When the camera or the player enters the range of a local cubemap, all objects use this local cubemap instead of the global cubemap.
This method is similar in spirit to the zone based cubemap except that “zone” does not have the same definition. Cubemaps can be blended efficiently on the CPU/GPU or in a deferred way with the 2 cubemaps processed at the same time. Frostbite engine (Battlefield 2) uses only a global cubemap to represent the skylight [4].
I am not sure which method is used to represent outdoor (global)/indoor (local) lighting in the Frostbite 2 engine (Battlefield 3) [5], but they use sky visibility to blend between the indoor cubemap and the outdoor cubemap, which looks similar to this strategy.

Global and local cubemap overlapping
Global (infinite in this case) and local cubemaps affect only objects in their range (defined by artists). Objects can be affected by several overlapping cubemaps, such as a global and a local cubemap.
This strategy can only be implemented efficiently with a deferred or light-prepass rendering architecture.
This is the method used by Cryengine 3 (Crysis 2) [6]. It is simple to implement in a deferred context, where the cubemaps are blended in a deferred way. There can be lighting seams at the boundaries of the cubemaps.

Point of interest (POI) based cubemap (Siggraph 2012 talk)
Several local cubemaps are placed in the level. The number of local cubemaps depends on the lighting complexity. The K nearest cubemaps of the POI are then blended and the result is applied to all objects.

The POI depends on the context: it can be the camera, the player or any character in the scene. It could also be a dummy location animated by a track for in-game cinematics.
I developed this method for a forward renderer with complex scene lighting, as I could not rely on the Global/Local overlapping method due to engine architecture constraints. This approach was presented in the Siggraph 2012 talk “Local Image-based Lighting with parallax corrected cubemap”.
With this method there are no seams at all, as the K cubemaps are always blended, so only one cubemap is ever applied to objects. As with dynamic objects in the object based method, popping can appear when switching the smallest contributing cubemap if more than K cubemaps are present (although artists should be able to avoid it). The two main problems of this approach are the parallax issue caused by local cubemaps, which can be even worse when blending multiple cubemaps, and the inaccurate lighting on distant objects: lighting accuracy decreases with distance from the POI because the lighting environment is approximated at the POI.

[Image 4] [Image 5]

The parallax artifacts are fixed with the method described at the end of this post, and the inaccurate lighting on distant objects is reduced by using one of the following methods (which can also be used for the Global and local cubemap no overlapping strategy):

  • Use ambient lighting information at the object position (like lightmaps or spherical harmonics) and mix it with the current cubemap contribution based on distance from the POI (similar techniques are used to save memory and are described in the section “Reusing available cubemap” later in this post).
  • Attenuate the specular contribution with the distance from the POI. For almost mirror-like objects (like pure metal in the context of PBR), this can be a problem as the diffuse color will be black. In this case an option should be provided to disable the attenuation for the object and fall back to the method above. Note that this can be combined with a LOD technique and help with performance.
  • Override the cubemap (See below).

Common to several strategies

With all strategies, objects should be able to force the use of a predefined cubemap. This allows setting a high-resolution cubemap on mirror objects [8], fixing visual artifacts on mirror objects that are always distant, or any other similar need.

For methods based on zones (Zone based and Global and local cubemap no overlapping), the blending of cubemaps could be time-based rather than zone distance-based.

Added note:
In the context of PBR:
– With a light-prepass renderer, deferred cubemaps will not have access to the specular color term required by the Fresnel equation.
– With a forward renderer,  cubemaps are rendered at the same time as objects and can access all material attributes.
– With a deferred renderer, cubemaps can be rendered forward at G-Buffer time (with emissive stored in the G-buffer) or deferred later, and can access all material attributes stored in the G-buffer.
– With deferred cubemaps, we cannot access the ddx/ddy of the texcoord, so we cannot perform the mipmap optimization that consists of taking the lesser LOD between the manual and the hardware LOD.

It is possible to add some dynamic responses to lighting environment by switching on/off cubemaps. Several cubemaps must be generated under different lighting condition, and they can be switched based on context, or even blended.

Specular cubemaps memory footprint

Having a lot of cubemaps requires a lot of storage. Unlike an irradiance cubemap, which can be very low resolution (16x16x6 or 8x8x6), a specular cubemap is of rather middle resolution (64x64x6 or 128x128x6). Higher resolutions (256x256x6 or 512x512x6) may even be required for mirror objects. This section discusses some methods to help with the memory footprint of cubemaps.

Texture compression and streaming
The most obvious advice when talking about saving memory with textures is to implement a streaming textures system and choose an aggressive texture compression format.

Streaming cubemaps is a more complicated problem than it first seems. The simplest strategy is to have infinite/local cubemaps linked to a zone and load them with the zone. Streaming only the high-resolution mipmaps of a cubemap when required supposes that you know when you will need them, which is not trivial, and it requires knowledge of the hardware cubemap layout.

As for the compression format, the cubemap should be HDR (if you generate the cubemap in the game engine, remember to switch off post-processing; you may also need to avoid applying cubemaps on objects while capturing a new cubemap). As we are talking about games of the DX9/XBOX360/PS3 generation, we must fit in an RGBA 8-bit format at most.
There are a few candidates: YCoCg [24], LogLuv, Luv, RGBE, RGBM of any kind [9][11][12] or Cryengine 3 RGBK (like RGBM) [6]. All are stored as DXT5 and require extra instructions. As I need a lot of cubemaps of medium quality with as few extra instructions as possible, I chose the aggressive HDR format described in [10]. A good alternative would be to let the artists choose the compression format and to default automatically to the compression format which best fits the required range (like RGBE for a very large range). With the aggressive format, the HDR cubemap is compressed to an sRGB DXT1 texture, which is compact (half the cost of any DXT5 RGBM) and requires very few instructions to decompress.
The method is to compute the maximum value over the R, G and B channels of all texels of the cubemap, with texel values clamped to a threshold, divide each texel by this maximum value, and then store the result in sRGB DXT1:

// Sample for R channel but same code for other.
// MaxPixelColor is the Max of the 3 RGB channels in the whole cubemap
StorageColor.R = Clamp(SrcPtr[0], 0.0f, MAX_HDR_CUBEMAP_INTENSITY);
StorageColor.R = MaxPixelColor > 0.0f ? StorageColor.R / MaxPixelColor : 0.0f;
StorageColor.R = ConvertTosRGB(StorageColor.R);

Compression quality depends on the cubemap content and on MAX_HDR_CUBEMAP_INTENSITY. I chose 8 for MAX_HDR_CUBEMAP_INTENSITY, which effectively gives MDR rather than HDR. The result is good enough for my needs. As a simple comparison: a 1024×1024 DXT1 texture takes about the same memory as ten 128x128x6 DXT1 cubemaps.
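
The memory comparison can be checked with a little arithmetic. The sketch below relies on the fact that DXT1 stores each 4x4 texel block in 8 bytes; it counts only the top mip level of each texture, which is a simplifying assumption, and the helper name `dxt1_bytes` is made up for this example.

```python
def dxt1_bytes(width, height):
    """Size in bytes of one DXT1 surface: 8 bytes per 4x4 texel block."""
    blocks_x = max(1, (width + 3) // 4)
    blocks_y = max(1, (height + 3) // 4)
    return blocks_x * blocks_y * 8


texture_2d = dxt1_bytes(1024, 1024)   # one 1024x1024 texture, top mip only
cubemap = 6 * dxt1_bytes(128, 128)    # six 128x128 faces, top mip only
cubemaps_per_texture = texture_2d / cubemap
```

Under these assumptions the 2D texture takes 512 KB and each cubemap 48 KB, so a little over ten cubemaps fit in the same space.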

In the shader, you just have to do a multiply to recover the original range. The multiply is done in linear space after the hardware sRGB decompression:

half3 CubeColor = texCUBElod(CubeTexture, half4(CubeDir, MipmapIndex)).rgb;
half3 FinalCubeColor  = CubeColor * MaxPixelColor;
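
The whole encode/decode round trip can be sketched on the CPU as follows. This is only an approximation of the scheme described above: 8-bit quantization stands in for the actual DXT1 block compression, which is not modeled, and the function names are illustrative. The sRGB transfer functions are the standard piecewise ones.

```python
MAX_HDR_CUBEMAP_INTENSITY = 8.0


def linear_to_srgb(c):
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055


def srgb_to_linear(c):
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def encode(texels):
    """Clamp, divide by the global maximum, then store as 8-bit sRGB values."""
    clamped = [[min(max(ch, 0.0), MAX_HDR_CUBEMAP_INTENSITY) for ch in t] for t in texels]
    max_pixel_color = max(ch for t in clamped for ch in t)
    stored = []
    for t in clamped:
        if max_pixel_color > 0.0:
            stored.append([round(255 * linear_to_srgb(ch / max_pixel_color)) for ch in t])
        else:
            stored.append([0 for _ in t])
    return stored, max_pixel_color


def decode(stored, max_pixel_color):
    """What the shader does: sRGB decode (hardware), then one multiply."""
    return [[srgb_to_linear(b / 255.0) * max_pixel_color for b in t] for t in stored]


texels = [[0.5, 0.25, 0.1], [12.0, 1.0, 0.0]]   # second texel exceeds the clamp
stored, max_pixel_color = encode(texels)
decoded = decode(stored, max_pixel_color)
```

The value 12.0 comes back as 8.0 because of the clamp, which is the "MDR rather than HDR" trade-off mentioned above, while values below the clamp are recovered up to quantization error.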

Advice: You can choose to compress each channel with its own channel maximum instead of using the maximum of the three channels. In this case, care must be taken about the difference between the channel maxima (i.e. your three maxima are very different values like R:8, G:1, B:1). When the difference between the maxima is high, compressing the channels individually causes a perceptual hue error. Here is an example of a lightprobe generated in a scene with a very bright red light causing the maximum color to be R:8, G:1.5, B:1.5. The lightprobe should be grey everywhere except for the light (not shown here):

Here the compression of the red channel by 8 introduces more error in the decompression than in the two other channels, which causes a red shift (compare with the image below). If you instead choose the maximum of the three values for the compression (i.e. 8), you get this:

[Image 6]

This is perceptually closer to the right result because all channels are compressed with the same value and introduce the same error, so there is no hue shift; however, the decompression is less accurate. To sum up, compressing with three individual channel maxima is more accurate and better conserves luminance, but causes a hue shift when the maxima differ greatly; compressing with the maximum of the three channels is less accurate but does not produce a hue shift. It is best to leave the choice to the artists.

Reusing available cubemap

The best way to save memory is to reuse cubemaps. The exact content of a cubemap matters less than its average color and intensity [10]. So we want to modify an original cubemap to match the lighting environment at a given location. These modifications allow reusing a cubemap at several places instead of generating a new cubemap for each new location. There are several ways to do that:

– “Normalizing” cubemap as described in [10]

“normalizing” the environment map (dividing it by its average value) in a pre-process, and then multiplying it by the diffuse ambient in the shader. The diffuse ambient value used should be averaged over all normal directions, in the case of spherical harmonics this would be the 0-order SH coefficient.

– Multiply by ambient lighting ratio:
All cubemaps must be generated in engine. At generation time, get the diffuse ambient (as defined above). When reusing the cubemap at a new location, get the new diffuse ambient, then modulate the cubemap by the ratio (“new diffuse ambient” / “old diffuse ambient”).

– Mixing based on gloss:
This way is described in the comments of this post by Brian Karis: with a low gloss value, use the color of the lightmap * intensity of the cubemap. With a high gloss value, it is the reverse: intensity of the lightmap * color of the cubemap. Both cases use a normalized cubemap and modulate.

– Desaturate the cubemap :
Convert the cubemap to grey-scale luminance, then recolorize with the diffuse ambient (average color of the ambient lighting). Note that with this method, you need only one channel to store the cubemap. Battlefield 3 [5] uses this kind of approach. They store several black and white cubemaps in one RGBA8 cube texture (one for glass, one for weapons, etc.) and they modulate with the ambient lighting given by their dynamic radiosity system.

– Color correct the cubemap [3]:
Apply any color correction which allows matching the new location. Color correction can take the form of curves (highlight, midtone, shadow), simple color modulation, desaturation, etc.
Killzone 2 [15] can specify a color multiplier and a desaturation factor per zone to apply to the cubemap sample (and they can also multiply by an SH sample in a way similar to the previous methods).

The diffuse ambient is often taken from an SH lightprobe or dynamic radiosity, but it can also be taken per pixel from the lightmaps.
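
Two of the reuse methods listed above, normalization and modulation by the ambient ratio, can be sketched in a few lines. This is an illustration under assumptions: the average is taken per channel as a plain mean over texels (ignoring the solid-angle weighting a real cubemap average would need), and all function names are hypothetical.

```python
def average_color(texels):
    """Plain per-channel mean over all texels."""
    n = len(texels)
    return [sum(t[i] for t in texels) / n for i in range(3)]


def normalize_cubemap(texels):
    """Pre-process: divide every texel by the cubemap's average value."""
    avg = average_color(texels)
    return [[t[i] / avg[i] if avg[i] > 0.0 else 0.0 for i in range(3)] for t in texels]


def relight(normalized_texel, diffuse_ambient):
    """Shader side: multiply the normalized sample by the local diffuse ambient."""
    return [c * a for c, a in zip(normalized_texel, diffuse_ambient)]


def modulate_by_ratio(texel, old_ambient, new_ambient):
    """Scale a sample by (new diffuse ambient / old diffuse ambient)."""
    return [c * (n / o) if o > 0.0 else 0.0 for c, n, o in zip(texel, new_ambient, old_ambient)]


texels = [[1.0, 2.0, 4.0], [3.0, 2.0, 0.0]]
normalized = normalize_cubemap(texels)
relit = relight(normalized[0], [0.2, 0.4, 0.6])
scaled = modulate_by_ratio([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [1.0, 0.25, 0.5])
```

Both variants keep the detail of the original cubemap and only change its overall color and intensity, which is why they break down when the new location has very different colored lights.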

All these ways have similar drawbacks. They all fail with complex scene lighting consisting of many differently colored lights. Of course, the failure is not severe and will probably not be noticed by players, but the only way to be fully accurate is to generate more cubemaps. These ways could also be used to minimize artifacts with the POI-based approach.

Local cubemaps blending weights calculation

Several strategies based on local cubemaps, like the POI-based cubemap method, require retrieving the K nearest local cubemaps of the POI and blending them together. The problem of getting the K nearest cubemaps is similar to the SH irradiance volume problem; [13][20][21] give examples of interpolation methods for solving it. However, for specular cubemaps, as you cannot have that many lightprobes in your level, simpler algorithms are preferable. I use an octree for this. The blending calculation can be trickier if you want to avoid popping, and this section describes one method I developed for local cubemaps (meaning cubemaps with a location).

The algorithm works with an influence volume associated with each cubemap, stored in an octree. This allows gathering the cubemaps close to the POI efficiently. If the current POI is outside an influence volume, that cubemap is not taken into account. This implies that for smooth transitions, influence volumes must overlap. We provide artists with box and sphere influence volumes. Using only spheres may cause large overlaps that we want to avoid; think about several corridors in a U shape on different floors.

The figure shows a top view of a toy level with 3 overlapping cubemap volumes in green: one square and two spheres. The red circle represents an inner range. When the POI is in the inner range, it gets 100% contribution from the cubemap. If no inner range is defined, this is the center of the influence volume. The inner range was an artist request.
In order to not have any lighting pop, we define a set of rules that our algorithm must follow, including the artists’ request. The cubemap has 100% influence at the boundary of its inner range and 0% at the boundary of its outer range. So the influence volume weights follow a pattern similar to distance fields, but inverted and normalized.

We also add the rule that a small volume inside another bigger volume must contribute more to the final result but should respect previous constraints. This allows artists to increase accuracy in a localized region.

In the following we define the variable “NDF” (normalized distance field) as 0 at the inner range boundary and 1 at the outer range boundary, and < 0 inside the inner range. The algorithm starts by gathering all the influence volumes that contain the POI position. For each influence volume it calculates the volume influence weight (the NDF value). The selected influence volumes are then sorted from most to least influential. Then, over the selected influence volumes, we calculate the sum of the volume influence weights and the sum of their inverses. These sums are used to get two results. The first result enforces the rule that at the boundary we have 0% contribution and the second enforces the rule that at the center we have 100% contribution, whatever the number of overlapping primitives. We then modulate the two results. To finish, all blend factors are normalized. Here is the pseudo-code:

Box::GetInfluenceWeights()
{
    // Transform from world space to local box space (without scaling, so we can test against the box extent)
    Vector4 LocalPosition = InfluenceVolume.WorldToLocal(SourcePosition);
    // Work in the upper left corner of the box.
    Vector LocalDir = Vector(Abs(LocalPosition.X), Abs(LocalPosition.Y), Abs(LocalPosition.Z));
    LocalDir = (LocalDir - BoxInnerRange) / (BoxOuterRange - BoxInnerRange);
    // Take max of all axis
    NDF = LocalDir.GetMax();
}

Sphere::GetInfluenceWeights()
{
    Vector SphereCenter            = InfluenceVolume->GetCenter();
    Vector Direction               = SourcePosition - SphereCenter;
    const float DistanceSquared    = (Direction).SizeSquared();
    NDF = (Direction.Size() - InnerRange) / (OuterRange - InnerRange);
}

void GetBlendMapFactor(int Num, CubemapInfluenceVolume* InfluenceVolume, float* BlendFactor)
{
    // First compute the sum of NDF and the sum of inverse NDF (1 - NDF) used to normalize the values
    float SumNDF            = 0.0f;
    float InvSumNDF         = 0.0f;
    float SumBlendFactor    = 0.0f;
    // The algorithm is as follows:
    // Each primitive has a normalized distance function which is 0 at the center and 1 at the boundary.
    // When blending multiple primitives, we want the following constraints to be respected:
    // A - 100% (full weight) at the center of a primitive, whatever the number of overlapping primitives
    // B - 0% (zero weight) at the boundary of a primitive, whatever the number of overlapping primitives
    // For this we compute two weights and modulate them.
    // Weight0 is computed from NDF and allows respecting constraint B.
    // Weight1 is computed from the inverse NDF, which is (1 - NDF), and allows respecting constraint A.
    // What enforces the constraints is the special case of 0, which multiplied by any other value is 0.
    // For Weight0, the 0 enforces that the boundary is always at 0%, but the center will not always be 100%.
    // For Weight1, the 0 enforces that the center is always at 100%, but the boundary will not always be 0%.
    // Modulating Weight0 and Weight1 and then renormalizing respects A and B at the same time.
    // The in-between is not linear but gives a pleasant result.
    // In practice the algorithm fails to avoid popping when leaving the inner range of a primitive
    // which is included in at least 2 other primitives.
    // As this is a rare case, we live with it.
    for (int i = 0; i < Num; ++i)
    {
        SumNDF       += InfluenceVolume(i).NDF;
        InvSumNDF    += (1.0f - InfluenceVolume(i).NDF);
    }

    // Weight0 = normalized NDF, inverted to have 1 at center, 0 at boundary.
    // And as we invert, we need to divide by Num-1 to stay normalized (else sum is > 1).
    // Respects constraint B.
    // Weight1 = normalized inverted NDF, so we have 1 at center, 0 at boundary,
    // and respects constraint A.
    for (int i = 0; i < Num; ++i)
    {
        BlendFactor[i] = (1.0f - (InfluenceVolume(i).NDF / SumNDF)) / (Num - 1);
        BlendFactor[i] *= ((1.0f - InfluenceVolume(i).NDF) / InvSumNDF);
        SumBlendFactor += BlendFactor[i];
    }

    // Normalize BlendFactor
    if (SumBlendFactor == 0.0f) // Possible with custom weight
    {
        SumBlendFactor = 1.0f;
    }

    float ConstVal = 1.0f / SumBlendFactor;
    for (int i = 0; i < Num; ++i)
    {
        BlendFactor[i] *= ConstVal;
    }
}

// Main code
for (int i = 0; i < NumPrimitive; ++i)
 {
     if (In inner range)
         EarlyOut;

     if (In outer range)
        SelectedInfluenceVolumes.Add(InfluenceVolumes.GetInfluenceWeights(LocationPOI));
 }

SelectedInfluenceVolumes.Sort();
GetBlendMapFactor(SelectedInfluenceVolumes.Num(), SelectedInfluenceVolumes, outBlendFactor);
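
For experimenting with the weighting outside the engine, the pseudo-code above can be transcribed into runnable Python roughly as follows. This is a sketch, not the original implementation: it assumes every NDF passed in lies in (0, 1] (volumes whose inner range contains the POI are handled by the early out above), and the single-volume case and the guard on the inverse sum are additions made here, since the pseudo-code divides by `Num - 1`.

```python
import math


def box_ndf(local_pos, inner, outer):
    """NDF of a box: largest per-axis normalized distance (0 at inner, 1 at outer)."""
    return max((abs(p) - i) / (o - i) for p, i, o in zip(local_pos, inner, outer))


def sphere_ndf(pos, center, inner, outer):
    """NDF of a sphere: normalized distance between the inner and outer radius."""
    return (math.dist(pos, center) - inner) / (outer - inner)


def blend_factors(ndfs):
    """Normalized blend factor per influence volume, from their NDF values."""
    num = len(ndfs)
    if num == 1:
        return [1.0]  # assumption: a lone volume takes the full weight
    sum_ndf = sum(ndfs)
    inv_sum_ndf = sum(1.0 - n for n in ndfs)
    factors = []
    for n in ndfs:
        weight0 = (1.0 - n / sum_ndf) / (num - 1)
        weight1 = (1.0 - n) / inv_sum_ndf if inv_sum_ndf > 0.0 else 0.0
        factors.append(weight0 * weight1)
    total = sum(factors)
    if total == 0.0:
        total = 1.0
    return [f / total for f in factors]


near_and_far = blend_factors([0.25, 0.75])
on_boundary = blend_factors([0.5, 1.0])
```

With NDF values 0.25 and 0.75 the nearer volume gets 90% of the weight, and a volume whose NDF is exactly 1 (POI on its outer boundary) contributes nothing, which matches rule B.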

Here are results for different situations. Each influence volume is represented by a color (2 circles: red and green, and a box: blue). The weights are represented by the blending of the respective colors of each influence volume. Pure red means 100% contribution and 50% red means 50% contribution from the red influence volume. Inner ranges are represented by a small influence volume with a white outline inside the bigger influence volume of the same color. Click for high res:

The image above highlights that there is no popping in common situations and that we fully respect the rules we set. The algorithm even provides a pleasant transition. It does, however, fail in some stressful conditions that we rarely met. In case you want to play with it, here is a link to the RenderMonkey project I used to generate this image: Siggraph_NormalizedDistance (this is a .pdf file as wordpress does not support .zip files; right click, select save the link as, then rename the file to .rfx after download).

There is also a video showing the influence volume edition and associated debugging tools here : Local Image-based Lighting With Parallax-corrected Cubemap : Influence volume .

Video: https://www.youtube.com/embed/f8_oEg2s8dM

 

Added note:

Other more complex methods like Delaunay triangulation [13][21] or Voronoi diagrams could be used, but I found this one simple and efficient.
Killzone 2, for example, used a simple distance-based interpolation [14] for their SH lightprobe interpolation.
Thanks to my co-worker Antoine Zanuttini for providing me with the base RenderMonkey project I used to generate the picture.

Efficient GPU cubemaps blending

The POI-based cubemap method, in the context of a forward renderer, requires an efficient way of blending multiple cubemaps (including mipmaps). When dealing with multiple cubemaps, it can be costly to blend them inside the object’s shader (and even more so with parallax-corrected cubemaps, as explained at the end of this post). Instead, we chose to blend the cubemaps on the CPU or GPU in a separate step before the main rendering.

In this section, I will describe a pseudo DX9 implementation that efficiently blends multiple cubemaps on the GPU. To stay close to a real use case, the code works with the HDR cubemap format I use, so the performance reported is tied to this particular format.

The goal is to generate a new cubemap from K weighted cubemaps. For this method, all cubemaps are required to be the same size (128x128x6 here). The algorithm is simple:

 For each face
    For each mipmap of the resulting cubemap
        Setup the current mipmap face as the current rendertarget
        Render the weighted sum of current mipmap of the K cubemaps

I will not detail the DX9 creation of the cubemap. You need to create a cube texture and save a surface pointer for each mipmap of each face of the cubemap. The following pseudo implementation describes the main blending loop:

const int CubemapSize    = 128;
int         SizeX        = CubemapSize;
const int NumMipmap      = 7;

for (int FaceIdx = 0; FaceIdx < CubeFace_MAX; FaceIdx++)
{    
    // The mipmaps follow each other in memory, so use this order
    for (int MipmapIndex = 0; MipmapIndex < NumMipmap; ++MipmapIndex)
    {
        SizeX = CubemapSize >> MipmapIndex;
        Direct3DDevice->SetRenderTarget(BlendCubemapTextureCube->CubeFaceSurfacesRHI[FaceIdx * NumMipmap + MipmapIndex]);
        // No alpha blending, no depth tests or writes, no stencil tests or writes, no backface culling.
        Direct3DDevice->SetRenderState(...);
        // Write to the render target as sRGB for the aggressive HDR format in place
        Direct3DDevice->SetRenderState(D3DRS_SRGBWRITEENABLE, TRUE);
        SetBlendCubemapShader(FaceIdx, MipmapIndex);
        DrawQuad(0, 0, SizeX, SizeX);
    }
}
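
As a sanity check on the indexing used in the loop above, here is a minimal CPU-side sketch in C++. It is not the DX9 code: `BlendTarget` and `EnumerateBlendTargets` are hypothetical names introduced for illustration. It only lists, in the same face-major order, the mip size and the flat surface index (`FaceIdx * NumMipmap + MipmapIndex`) that each iteration would bind.

```cpp
#include <cstddef>
#include <vector>

// One render target of the blending loop (hypothetical helper type).
struct BlendTarget {
    int face;                  // cube face index, 0..numFaces-1
    int mip;                   // mipmap level, 0..numMips-1
    int size;                  // edge length in texels of this mip level
    std::size_t surfaceIndex;  // flat index: face * numMips + mip
};

// Enumerates the targets in the order used by the loop above:
// all mip levels of face 0, then all mip levels of face 1, and so on.
std::vector<BlendTarget> EnumerateBlendTargets(int cubemapSize, int numMips,
                                               int numFaces = 6) {
    std::vector<BlendTarget> targets;
    targets.reserve(static_cast<std::size_t>(numFaces) * numMips);
    for (int face = 0; face < numFaces; ++face) {
        for (int mip = 0; mip < numMips; ++mip) {
            BlendTarget t;
            t.face = face;
            t.mip = mip;
            t.size = cubemapSize >> mip;  // each level halves the edge length
            t.surfaceIndex = static_cast<std::size_t>(face) * numMips + mip;
            targets.push_back(t);
        }
    }
    return targets;
}
```

With the values from the post (128 texels, 7 mip levels), this gives 42 targets, and the smallest level bound per face is 2x2.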

The cubemap render target is RGBA8 for performance, so we need to compress the result. I use the hardware sRGB conversion both to read from the source HDR cubemaps and to write to the resulting HDR cubemap, which saves some instructions. I generate one shader per cubemap face for efficiency (each has a different define for FACEIDX, see the shader code below). SetBlendCubemapShader in the code above selects the shader for the current face and sets the current mipmap index. The shader is in charge of sampling the matching texel of each of the K cubemaps and blending them. Here is the code:

#define FACE_POS_X 0
#define FACE_NEG_X 1
#define FACE_POS_Y 2
#define FACE_NEG_Y 3
#define FACE_POS_Z 4
#define FACE_NEG_Z 5

void BlendCubemapVertexMain(
    in float4 InPosition : POSITION,
    out float4 OutPosition : POSITION
    )
{
    OutPosition = InPosition;
}

half4    VPosScaleBias;
half     MipmapIndex;

samplerCUBE BlendCubeTexture1;
samplerCUBE BlendCubeTexture2;
samplerCUBE BlendCubeTexture3;
samplerCUBE BlendCubeTexture4;

half    MaxPixelColor1;
half    MaxPixelColor2;
half    MaxPixelColor3;
half    MaxPixelColor4;

// Get direction from cube texel for a given face. x and y are in the [-1, 1] range.
half3 GetCubeDir(half x, half y )
{
    // Set direction according to face.
    // Note : No need to normalize as we sample in a cubemap
#if FACEIDX == FACE_POS_X
    return half3(1.0, -y, -x);
#elif FACEIDX == FACE_NEG_X
    return half3(-1.0, -y, x);
#elif FACEIDX == FACE_POS_Y
    return half3(x, 1.0, y);
#elif FACEIDX == FACE_NEG_Y
    return half3(x, -1.0, -y);
#elif FACEIDX == FACE_POS_Z
    return half3(x, -y, 1.0);
#elif FACEIDX == FACE_NEG_Z
    return half3(-x, -y, -1.0);
#endif
}

void BlendCubemapPixelMain(
    float2 ScreenPosition: VPOS,
    out half4 OutColor : COLOR0
    )
{
    float2 xy        = VPosScaleBias.xy * ScreenPosition.xy + VPosScaleBias.zw;
    half3 CubeDir    = GetCubeDir(xy.x, xy.y);

    // The hardware converts sRGB to linear on fetch; then we apply the multiplier
    half3 CubeColor1 = texCUBElod(BlendCubeTexture1, half4(CubeDir, MipmapIndex)).rgb;
    half3 CubeColor  = CubeColor1 * MaxPixelColor1;

    half3 CubeColor2 = texCUBElod(BlendCubeTexture2, half4(CubeDir, MipmapIndex)).rgb;
    CubeColor        += CubeColor2 * MaxPixelColor2;

    half3 CubeColor3 = texCUBElod(BlendCubeTexture3, half4(CubeDir, MipmapIndex)).rgb;
    CubeColor        += CubeColor3 * MaxPixelColor3;

    half3 CubeColor4 = texCUBElod(BlendCubeTexture4, half4(CubeDir, MipmapIndex)).rgb;
    CubeColor        += CubeColor4 * MaxPixelColor4;

    OutColor         = half4(CubeColor / 8, 1.0f); // the conversion to sRGB is done when writing to the render target
}

FACEIDX is a define set to the current face to blend; each different value means a different shader.
MaxPixelColorX are the constants used to decompress the HDR cubemaps after the hardware has decoded the sRGB value. Note the divide by 8 (which is the constant MAX_HDR_CUBEMAP_INTENSITY) at the end of the shader. To save instructions I just divide by the maximum range allowed, and the result is stored in RGBA8 (but you can use the max of all MaxPixelColor values for more accuracy). So when using the result of the blending in another shader, its MaxPixelColor must be set to 8.
MipmapIndex is the mipmap level currently being blended.
VPosScaleBias holds four values that transform the VPOS register into the [-1..1] interval required by the GetCubeDir() function. This could be done in the shader, but for efficiency the transform is baked into VPosScaleBias. The VPOS register and GetCubeDir() together give the direction used to sample the cubemaps being blended. Here is the code for the VPosScaleBias setup:

// We need to transform VPOS in [-1, 1] range on X and Y axis.
float InvResolution = 1.0f / float(128 >> MipmapIndex); // 128 is the size of the cubemap
// transform from [0..res-1] to [- (1 - 1 / res) .. (1 - 1 / res)]
// VPos register is 0 left, RT_SizeX right and 0 Top, RT_sizeY bottom
// (UsePixelCenterOffset is 0.5 for DX9, 0 else)
Vector4 ScaleBias = Vector4(2.0f * InvResolution, 2.0f * InvResolution, -1.0f + (UsePixelCenterOffset * 2.0f * InvResolution), -1.0f + (UsePixelCenterOffset * 2.0f * InvResolution));
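
To make the mapping concrete, here is a small CPU-side C++ sketch of the same arithmetic: it builds the scale/bias values, applies them to an integer pixel coordinate, and feeds the result to a copy of the face mapping from the shader. The names `ScaleBias`, `MakeVPosScaleBias` and `TexelDirection` are hypothetical helpers for illustration, and the pixel center offset is passed in as a parameter rather than taken from any engine setting.

```cpp
struct Vec3 { float x, y, z; };
struct ScaleBias { float scaleX, scaleY, biasX, biasY; };

enum CubeFace {
    FACE_POS_X = 0, FACE_NEG_X, FACE_POS_Y, FACE_NEG_Y, FACE_POS_Z, FACE_NEG_Z
};

// Same formula as the VPosScaleBias setup above.
ScaleBias MakeVPosScaleBias(int cubemapSize, int mipIndex, float pixelCenterOffset) {
    const float invResolution = 1.0f / static_cast<float>(cubemapSize >> mipIndex);
    const float scale = 2.0f * invResolution;
    const float bias = -1.0f + pixelCenterOffset * scale;
    ScaleBias sb = {scale, scale, bias, bias};
    return sb;
}

// Same face mapping as GetCubeDir() in the shader; x and y are in [-1, 1].
Vec3 GetCubeDir(CubeFace face, float x, float y) {
    Vec3 d = {0.0f, 0.0f, 0.0f};
    switch (face) {
        case FACE_POS_X: d.x =  1.0f; d.y = -y;    d.z = -x;    break;
        case FACE_NEG_X: d.x = -1.0f; d.y = -y;    d.z =  x;    break;
        case FACE_POS_Y: d.x =  x;    d.y =  1.0f; d.z =  y;    break;
        case FACE_NEG_Y: d.x =  x;    d.y = -1.0f; d.z = -y;    break;
        case FACE_POS_Z: d.x =  x;    d.y = -y;    d.z =  1.0f; break;
        case FACE_NEG_Z: d.x = -x;    d.y = -y;    d.z = -1.0f; break;
    }
    return d;
}

// Direction (not normalized) for the pixel (px, py) of a given face.
Vec3 TexelDirection(CubeFace face, int px, int py, ScaleBias sb) {
    const float x = sb.scaleX * static_cast<float>(px) + sb.biasX;
    const float y = sb.scaleY * static_cast<float>(py) + sb.biasY;
    return GetCubeDir(face, x, y);
}
```

For a 4x4 mip level with a 0.5 pixel center offset, pixel 0 maps to -0.75 and pixel 3 maps to 0.75, which matches the [-(1 - 1/res) .. (1 - 1/res)] range stated in the comment above.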

Here is a table showing the performance you can get with this method on PS3 and XBOX360. It requires knowledge of the hardware layout on PS3 that I can’t detail here.

An alternative to this code is to render an infinite box from the 6 views of the resulting cubemap, centered on the origin (0, 0, 0). Each cubemap is additively blended with its corresponding weight. Example code:

#define HALF_WORLD_MAX 262144.0
Matrix CubeLocalToWorld = ScaleMatrix(HALF_WORLD_MAX);
SetShader(...)
SetAdditiveBlending()

for each view of the blended cubemap
{
    for each mipmap
    {
        for each cubemap to blend
        {
            SetEnvMapScale(EnvMapScale * Blendweights);
            SetMipmapIndex(...);
            SetViewProjectionMatrix(CalcCubeFaceViewMatrix() * ProjectionMatrix);
            DrawUnitBox(...)
        }
    }
}

// In the shader

float4    MipmapIndex;
float4x4  LocalToWorld;
float4x4  ViewProjectionMatrix;
samplerCUBE BlendCubeTexture;
float3    EnvMapScale;

void BlendCubemapVertexMain(
    in float4 InPosition : POSITION,
    out float4 OutPosition : POSITION,
    out float3 UVW : TEXCOORD0
    )
{
    float4 WorldPos = mul( LocalToWorld, InPosition);
    float4 ScreenPos = mul( ViewProjectionMatrix, WorldPos);

    OutPosition = ScreenPos;
    UVW = WorldPos.xyz;
}

void BlendCubemapPixelMain(
    in float3 UVW : TEXCOORD0,
    out float4 OutColor : COLOR0
    )
{
    float3 CubeSample = texCUBElod(BlendCubeTexture, float4(UVW, MipmapIndex.x)).xyz;
    OutColor = half4(EnvMapScale.xyz * CubeSample / 8.0f, 1.0f);
}

The result is the same. Although this version looks simpler, the performance for one cubemap is the same, and with several cubemaps the previous method is faster on PS3. So it depends on the context/platform.

Parallax correction for local cubemaps

As said in the first section, a cubemap represents an infinite box by definition. When it is used as a local cubemap, there is a parallax issue (reflected objects are not at the right position). This section explains some techniques that can be used to correct the parallax issue of local cubemaps to better match the placement of the reflected objects.

What all parallax-correction techniques have in common is that they define an approximation of the geometry surrounding the local cubemap (we will call this the geometry proxy). The simpler the approximation, the more efficient the algorithm, at the price of accuracy. Examples of geometry proxies are a sphere volume [16], a box volume [17] or a cube depth buffer [22]. Here is an example of a cubemap with an associated box volume in white.

In the shader, we intersect the reflection vector with the geometry proxy. This intersection is then used to correct the original reflection vector to a new direction. As the intersection must be performed in the shader, you can see how performance is linked to the choice of geometry proxy.

Here is an example of the algorithm with a simple AABB volume:

The hatched line is the reflecting ground and the yellow shape is the environment geometry. A cubemap has been generated at position C. A camera is looking at the ground. The view vector reflected about the surface normal, R, is normally used to sample the cubemap. Artists define an approximation of the geometry surrounding the cubemap using a box volume. This is the black rectangle in the figure. It should be noted that the box center doesn’t need to match the cubemap center. We then find P, the intersection between vector R and the box volume. We use vector CP as a new reflection vector R’ to sample the cubemap.

Here is the code, this is a simple box intersection with some simplifications:

float3 DirectionWS = PositionWS - CameraWS;
float3 ReflDirectionWS = reflect(DirectionWS, NormalWS);

// Following is the parallax-correction code
// Find the ray intersection with box plane
float3 FirstPlaneIntersect = (BoxMax - PositionWS) / ReflDirectionWS;
float3 SecondPlaneIntersect = (BoxMin - PositionWS) / ReflDirectionWS;
// Get the furthest of these intersections along the ray
// (OK because x/0 gives +inf and -x/0 gives -inf)
float3 FurthestPlane = max(FirstPlaneIntersect, SecondPlaneIntersect);
// Find the closest far intersection
float Distance = min(min(FurthestPlane.x, FurthestPlane.y), FurthestPlane.z);

// Get the intersection position
float3 IntersectPositionWS = PositionWS + ReflDirectionWS * Distance;
// Get corrected reflection
ReflDirectionWS = IntersectPositionWS - CubemapPositionWS;
// End parallax-correction code

return texCUBE(envMap, ReflDirectionWS);
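
The AABB correction above can be checked on the CPU with a small C++ sketch. This is an illustration under stated assumptions, not the shader itself: `Vec3`, `FarSlab` and `ParallaxCorrectAABB` are hypothetical names, the shaded position is assumed to lie inside the box, and a zero direction component is handled explicitly by returning +infinity instead of relying on floating-point division by zero.

```cpp
#include <algorithm>
#include <limits>

struct Vec3 { float x, y, z; };

Vec3 Add(Vec3 a, Vec3 b) { Vec3 r = {a.x + b.x, a.y + b.y, a.z + b.z}; return r; }
Vec3 Sub(Vec3 a, Vec3 b) { Vec3 r = {a.x - b.x, a.y - b.y, a.z - b.z}; return r; }
Vec3 Scale(Vec3 a, float s) { Vec3 r = {a.x * s, a.y * s, a.z * s}; return r; }
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Same convention as HLSL reflect(): i points toward the surface, n is unit length.
Vec3 Reflect(Vec3 i, Vec3 n) { return Sub(i, Scale(n, 2.0f * Dot(n, i))); }

// Ray parameter at which the ray leaves the slab [lo, hi] on one axis.
// With a zero direction component the ray never leaves that slab.
float FarSlab(float origin, float dir, float lo, float hi) {
    if (dir == 0.0f) return std::numeric_limits<float>::infinity();
    return std::max((hi - origin) / dir, (lo - origin) / dir);
}

// Returns the corrected reflection vector (not normalized).
// Assumes positionWS is inside the box [boxMin, boxMax].
Vec3 ParallaxCorrectAABB(Vec3 positionWS, Vec3 reflDirWS,
                         Vec3 boxMin, Vec3 boxMax, Vec3 cubemapPosWS) {
    const float tx = FarSlab(positionWS.x, reflDirWS.x, boxMin.x, boxMax.x);
    const float ty = FarSlab(positionWS.y, reflDirWS.y, boxMin.y, boxMax.y);
    const float tz = FarSlab(positionWS.z, reflDirWS.z, boxMin.z, boxMax.z);
    const float distance = std::min(std::min(tx, ty), tz);  // closest far plane
    const Vec3 intersectWS = Add(positionWS, Scale(reflDirWS, distance));
    return Sub(intersectWS, cubemapPosWS);
}
```

For example, a point at the origin of a box spanning [-4, 4] on each axis, reflecting along (1, 1, 0), hits the box at (4, 4, 0); with the cubemap captured at (0, 1, 0) the corrected lookup vector is (4, 3, 0).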

An AABB volume is rather restrictive and it is better to use an OBB volume instead. Here is a sample implementation with an OBB volume:

float3 DirectionWS = normalize(PositionWS - CameraWS);
float3 ReflDirectionWS = reflect(DirectionWS, NormalWS);

// Intersection with OBB: convert to unit box space
// Transform into local unit parallax cube space (scaled and rotated)
float3 RayLS = mul((float3x3)WorldToLocal, ReflDirectionWS);
float3 PositionLS = mul(WorldToLocal, float4(PositionWS, 1.0f)).xyz;

float3 Unitary = float3(1.0f, 1.0f, 1.0f);
float3 FirstPlaneIntersect  = (Unitary - PositionLS) / RayLS;
float3 SecondPlaneIntersect = (-Unitary - PositionLS) / RayLS;
float3 FurthestPlane = max(FirstPlaneIntersect, SecondPlaneIntersect);
float Distance = min(FurthestPlane.x, min(FurthestPlane.y, FurthestPlane.z));

// Use Distance in WS directly to recover intersection
float3 IntersectPositionWS = PositionWS + ReflDirectionWS * Distance;
ReflDirectionWS = IntersectPositionWS - CubemapPositionWS;

return texCUBE(envMap, ReflDirectionWS);
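
The OBB variant can be sketched on the CPU in the same way. The C++ below is an illustration only: `Affine`, `TransformDir`, `TransformPoint` and `ParallaxCorrectOBB` are hypothetical names, the world-to-local transform is assumed to map the OBB onto the unit box [-1, 1] on each axis, and the shaded position is assumed to be inside the box. It also shows why the distance found in local space can be reused in world space: the transform is affine, so a point at ray parameter t in world space maps to the point at the same t on the transformed ray.

```cpp
#include <algorithm>
#include <limits>

struct Vec3 { float x, y, z; };

// p_local = m * p_world + t (maps the OBB onto the unit box [-1, 1]^3).
struct Affine {
    float m[3][3];
    Vec3 t;
};

Vec3 TransformDir(const Affine& a, Vec3 v) {
    Vec3 r = {a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
              a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
              a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z};
    return r;
}

Vec3 TransformPoint(const Affine& a, Vec3 p) {
    const Vec3 d = TransformDir(a, p);
    Vec3 r = {d.x + a.t.x, d.y + a.t.y, d.z + a.t.z};
    return r;
}

// Ray parameter at which the ray leaves the unit slab [-1, 1] on one axis.
float FarUnitSlab(float origin, float dir) {
    if (dir == 0.0f) return std::numeric_limits<float>::infinity();
    return std::max((1.0f - origin) / dir, (-1.0f - origin) / dir);
}

// Returns the corrected reflection vector in world space (not normalized).
Vec3 ParallaxCorrectOBB(Vec3 positionWS, Vec3 reflDirWS,
                        const Affine& worldToLocal, Vec3 cubemapPosWS) {
    const Vec3 rayLS = TransformDir(worldToLocal, reflDirWS);
    const Vec3 posLS = TransformPoint(worldToLocal, positionWS);
    const float distance = std::min(FarUnitSlab(posLS.x, rayLS.x),
                           std::min(FarUnitSlab(posLS.y, rayLS.y),
                                    FarUnitSlab(posLS.z, rayLS.z)));
    // The ray parameter is shared by the world-space and local-space rays.
    Vec3 r = {positionWS.x + reflDirWS.x * distance - cubemapPosWS.x,
              positionWS.y + reflDirWS.y * distance - cubemapPosWS.y,
              positionWS.z + reflDirWS.z * distance - cubemapPosWS.z};
    return r;
}
```

As a usage example, a box centered at (10, 0, 0) with half extents (2, 4, 2) and no rotation uses a diagonal matrix (0.5, 0.25, 0.5) and translation (-5, 0, 0). A ray going straight up from (11, 0, 0) then hits the top face at (11, 4, 0).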

For each shaded pixel, we transform the position and the reflection vector into unit box space to perform the intersection. Note the optimization when going back to world space: Distance is a ray parameter, which the affine transform preserves, so it is applied directly to the world-space ray to get the final corrected vector. Example result:

For an example with a sphere refer to [16].

Cheaper tricks exist to get a parallax effect, although they require tuning. One has been described in [18] [23], is also used in [8], and seems to originate from the Advanced RenderMan book published in 2000.

It consists of adding a scaled version of the vector from the cubemap position C to the point Pp on the object being drawn. The implementation is simple:

ReflDirectionWS= EnvMapOffset * (PositionWS - CubemapPositionWS) + ReflDirectionWS;

No need to normalize as we fetch from a cubemap. EnvMapOffset is an artist-tunable value which depends on the object size, the size of the environment, etc. Brennan in [23] uses (1 / radius), where radius is the radius of the sphere geometry proxy. Here is a sample of a tuned parameter. The first image is the default, the second is with EnvMapOffset tuned:


In the same spirit, Killzone 2/3 use an offset when sampling the cubemap in order to change the “size” of the cubemap displayed inside an object [3].
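
The EnvMapOffset trick from this section is a single multiply-add per component, which the following C++ sketch spells out. `OffsetReflection` and `OffsetFromProxyRadius` are hypothetical names for illustration; the second one simply encodes the (1 / radius) choice attributed to Brennan above.

```cpp
struct Vec3 { float x, y, z; };

// R' = EnvMapOffset * (PositionWS - CubemapPositionWS) + R
// The result is not normalized, which is fine for a cubemap lookup.
Vec3 OffsetReflection(Vec3 reflDirWS, Vec3 positionWS, Vec3 cubemapPosWS,
                      float envMapOffset) {
    Vec3 r = {envMapOffset * (positionWS.x - cubemapPosWS.x) + reflDirWS.x,
              envMapOffset * (positionWS.y - cubemapPosWS.y) + reflDirWS.y,
              envMapOffset * (positionWS.z - cubemapPosWS.z) + reflDirWS.z};
    return r;
}

// Offset derived from the radius of a sphere geometry proxy.
float OffsetFromProxyRadius(float radius) { return 1.0f / radius; }
```

With an offset of 0 the reflection vector is unchanged, so the value can be tuned up from 0 until the reflection lines up with the environment.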

Efficient GPU local parallax-corrected cubemap blending

I will now apply the results of the two previous sections to my POI-based approach. This is what I call the Local IBL approach with parallax-corrected cubemap. This approach was developed by me and my co-worker Antoine Zanuttini.
As detailed in the “Efficient GPU cubemaps blending” section, we have a dedicated step for blending the cubemaps. We want to parallax-correct the cubemaps while blending them. However, in this mixing step we don’t have access to the pixel position. To solve this, we made the following observation:

The left figure represents our previous case of a simple box as the geometry proxy for the parallax correction. In the middle figure, we add the reflected camera and observe that its view vector V’ matches the reflected view vector R: both V’ and R hit the same point P. This point P can be used like before to get a new reflection vector R’ to sample a cubemap. The right figure is the result of applying this reasoning to each view direction of a cubemap. We are now able to parallax-correct the whole cubemap without requiring any pixel position. We just require a reflection plane, which substitutes for the pixel position. However, this restricts our parallax-correction approach to planar objects.
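
The observation above can be verified numerically. The C++ sketch below is an illustration with hypothetical helper names (`MirrorPoint`, `FarBoxHit`): it mirrors the camera about the reflection plane, then intersects the box once from the surface point along R and once from the reflected camera along V’. For a planar mirror both rays should reach the same point P, provided the reflected camera stays inside the box.

```cpp
#include <algorithm>
#include <limits>

struct Vec3 { float x, y, z; };

Vec3 Add(Vec3 a, Vec3 b) { Vec3 r = {a.x + b.x, a.y + b.y, a.z + b.z}; return r; }
Vec3 Sub(Vec3 a, Vec3 b) { Vec3 r = {a.x - b.x, a.y - b.y, a.z - b.z}; return r; }
Vec3 Scale(Vec3 a, float s) { Vec3 r = {a.x * s, a.y * s, a.z * s}; return r; }
float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Same convention as HLSL reflect(); n must be unit length.
Vec3 Reflect(Vec3 i, Vec3 n) { return Sub(i, Scale(n, 2.0f * Dot(n, i))); }

// Mirror a point about the plane through planePoint with unit normal n.
Vec3 MirrorPoint(Vec3 p, Vec3 planePoint, Vec3 n) {
    return Sub(p, Scale(n, 2.0f * Dot(Sub(p, planePoint), n)));
}

float FarSlab(float origin, float dir, float lo, float hi) {
    if (dir == 0.0f) return std::numeric_limits<float>::infinity();
    return std::max((hi - origin) / dir, (lo - origin) / dir);
}

// Point where a ray starting inside the box leaves it.
Vec3 FarBoxHit(Vec3 origin, Vec3 dir, Vec3 boxMin, Vec3 boxMax) {
    const float t = std::min(FarSlab(origin.x, dir.x, boxMin.x, boxMax.x),
                    std::min(FarSlab(origin.y, dir.y, boxMin.y, boxMax.y),
                             FarSlab(origin.z, dir.z, boxMin.z, boxMax.z)));
    return Add(origin, Scale(dir, t));
}
```

For a camera at (-2, 2, 0) looking at the point (0, 0, 0) on the plane y = 0, inside a box spanning [-4, 4] on each axis, the reflected camera is at (-2, -2, 0) and both rays leave the box at (4, 4, 0).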

Here is an example of the code for the mixing step, which also handles the parallax correction with a box geometry proxy as we have just seen. The code only handles one face of the blended cubemap with only one cubemap (handling several cubemaps is identical to the “Efficient GPU cubemaps blending” section):

float4    VPosScaleBias;
float     MipmapIndex;
float4x4  WorldToLocal;
float3    ReflCameraWS;
float3    CubemapPositionWS;
samplerCUBE BlendCubeTexture;
float3    EnvMapScale;

void BlendCubemapPixelMain(
 float2 ScreenPosition: VPOS,
 out half4 OutColor : COLOR0
 )
{
 float2 xy        = VPosScaleBias.xy * ScreenPosition.xy + VPosScaleBias.zw;
 half3 CubeDir    = GetCubeDir(xy.x, xy.y);

 // Intersection with OBB: convert to unit box space
 half3 RayWS = normalize(CubeDir); // Current direction
 half3 RayLS = mul((half3x3)WorldToLocal, RayWS);
 half3 ReflCameraLS = mul(WorldToLocal, float4(ReflCameraWS, 1.0f)).xyz; // Can be precalculated

 // Same code as before but for ReflCameraLS and with unit box
 half3 Unitary = half3(1.0f, 1.0f, 1.0f);
 half3 FirstPlaneIntersect  = (Unitary - ReflCameraLS) / RayLS;
 half3 SecondPlaneIntersect = (-Unitary - ReflCameraLS) / RayLS;
 half3 FurthestPlane = max(FirstPlaneIntersect, SecondPlaneIntersect);
 float Distance = min(FurthestPlane.x, min(FurthestPlane.y, FurthestPlane.z));
 // Use Distance in WS directly to recover intersection
 half3 IntersectPositionWS = ReflCameraWS + RayWS * Distance;
 half3 ReflDirectionWS = IntersectPositionWS - CubemapPositionWS;

 half3 CubeColor = texCUBElod(BlendCubeTexture, half4(ReflDirectionWS, MipmapIndex)).rgb;
 CubeColor       = CubeColor * EnvMapScale.xyz;
 OutColor        = half4(CubeColor / 8, 1.0f);
}

As in the “Efficient GPU cubemaps blending” section, there is an alternative way to get the same result: render the box geometry proxy (instead of the infinite box) into the 6 views of the resulting cubemap, from the point of view of the reflected camera (instead of the origin). Each cubemap is additively blended with its corresponding weight. But in this case we can do even better. Why limit ourselves to rendering a box? With rasterization we can throw out the limitation on the shape and start to handle convex volumes! Concave volumes still have artifacts due to hidden features. For this, we allow our artists to define BSP geometry (with a brush tool in an editor, for example) and convert the BSP to triangles. We then render this triangle list instead of the simple box.

Here is an example of a convex volume that fits the environment (right figure), used in place of a box volume (middle figure). The left figure shows the original uncorrected cubemap:

And here is the code to perform the parallax-correction for a convex volume. Shader code is for one face of the cubemap.

// C++
SetShader(...)
SetAdditiveBlending()

for each view of the blended cubemap
{
    for each mipmap
    {
        for each cubemap to blend
        {
            Matrix Mirror = CreateMirrorMatrix(ReflectionPlaneWS of this cubemap);
            Vector ReflectedViewPositionWS = Mirror.TransformVector(ViewPositionWS);
            SetEnvMapScale(EnvMapScale * Blendweights);
            SetMipmapIndex(...);
            SetViewProjectionMatrix(CalcCubeFaceViewMatrix(ReflectedViewPositionWS) * ProjectionMatrix);
            DrawConvexVolume(...);
        }
    }
}
// Shader
float4    MipmapIndex;
float4x4  LocalToWorld;
float4x4  ViewProjectionMatrix;
samplerCUBE BlendCubeTexture;
float3    EnvMapScale;
float3    CubemapPos;

void BlendCubemapVertexMain(
    in float4 InPosition : POSITION,
    out float4 OutPosition : POSITION,
    out float3 UVW : TEXCOORD0
    )
{
    float4 WorldPos = mul( LocalToWorld, InPosition);
    float4 ScreenPos = mul( ViewProjectionMatrix, WorldPos);

    OutPosition = ScreenPos;
    UVW = WorldPos.xyz - CubemapPos; // Current direction
}

void BlendCubemapPixelMain(
    in float3 UVW : TEXCOORD0,
    out float4 OutColor : COLOR0
    )
{
    float3 CubeSample = texCUBElod(BlendCubeTexture, float4(UVW, MipmapIndex.x)).xyz;
    OutColor = half4(EnvMapScale.xyz * CubeSample / 8.0f, 1.0f);
}

Note that even if we use a BSP as the geometry proxy for the parallax correction, we still use a simple box or sphere as the influence volume so as not to overload the cubemap gathering algorithm.

With all this in hand, we can now apply the “Local IBL approach with parallax-corrected cubemap”. The steps are:

– Retrieve the cubemaps closest to the POI with the help of the influence volumes
– Calculate the blending weights as in “Local cubemaps blending weights calculation”
– Perform the mixing step on the GPU with the convex parallax-correction blending code just above
– Apply the result in the main pass as usual

Here is a video example of the result with 3 cubemaps with 3 sphere influence volumes and box geometry proxy: Local IBL approach with parallax-corrected cubemap

Video: https://www.youtube.com/embed/Bvar6X0dUGs

 

Now let’s see some performance results. Blending 128x128x6 DXT1 cubemaps with convex parallax correction gives the following results:

It is useful to compare these numbers to the actual cost of applying the parallax correction directly in the object’s shader (i.e. in the main pass, without a mixing step). The left number is for 25% screen coverage and the right one is for 75% screen coverage:


The numbers in the second table were obtained by subtracting the cost of the original shader from the cost of the parallax-correction shader. Reading these results requires some explanation. With per-pixel correction (i.e. the second table), the result depends on the screen coverage of the scene object. On PS3, each additional cubemap to mix costs around 0.25ms at 25% screen coverage and around 0.75ms at 75%. Compare this with the mixing step approach, where an additional cubemap costs only 0.08ms. It can be observed that for the per-pixel correction, the XBOX360 performs better with one cubemap but degrades more quickly as the number of cubemaps increases. To sum up, our mixing approach scales better with many cubemaps.

Finally, the parallax-correction step described above only works perfectly with mirror surfaces. Handling normal perturbation for glossy materials implies that we can access the lower hemisphere below the reflection plane. This means that our reflected camera’s view vector will miss the intersection with the box volume. To avoid this problem, artists must ensure that the reflected camera always remains within the box volume:

Moreover, with glossy material, we introduce a distortion compared to the true result. The effect increases with the angle between the reflection plane normal and the perturbed normal.

This distortion comes from the way we generate our parallax-corrected cubemap. With our algorithm, the parallax-corrected cubemap is valid only for pixels with normals perpendicular to the reflection plane.


In the left figure, several different intersections are calculated for different camera views. The entire ground surface has the same normal, perpendicular to the reflection plane. In the right figure, we take the example of a single position and perturb its normal in different directions. In this case, the camera position needs to be moved. Our previous parallax-corrected camera is not able to provide the correct value in this case. We could generate a parallax-corrected cubemap for this position for every possible normal, but the result would be wrong for other positions.

Reference

[1] McTaggart, “Half-Life 2 Valve Source Shading” http://www2.ati.com/developer/gdc/D3DTutorial10_Half-Life2_Shading.pdf
[2]  http://developer.valvesoftware.com/wiki/Cubemaps
[3]  Van Beek, “Killzone lighting pipeline” not available publicly
[4] Andersson, Tatarchuk, “Rendering Architecture and Real-time Procedural Shading & Texturing Techniques” http://www.slideshare.net/repii/frostbite-rendering-architecture-and-realtime-procedural-shading-texturing-techniques-presentation
[5] Magnusson, “Lighting you up in Battlefield 3” http://www.slideshare.net/DICEStudio/lighting-you-up-in-battlefield-3
[6] Sousa, Kasyan, Schulz, “Secrets of CryENGINE 3  Graphics Technology” http://advances.realtimerendering.com/s2011/SousaSchulzKazyan%20-%20CryEngine%203%20Rendering%20Secrets%20((Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).ppt
[7] Karis, “comment” http://blog.selfshadow.com/2011/07/22/specular-showdown/#comments
[8] Gotanda, “Real-time Physically Based Rendering – Implementation” http://research.tri-ace.com/Data/cedec2011_RealtimePBR_Implementation.pptx
[9] “Unity RGBM toy” http://beta.unity3d.com/jcupisz/rgbm/index.html
[10] Hoffman, “Crafting Physically Motivated Shading Models for Game Development” and “Background: Physically-Based Shading” http://renderwonk.com/publications/s2010-shading-course/
[11] Kaplanyan, “CryENGINE 3: Reaching the Speed of Light” http://advances.realtimerendering.com/s2010/index.html
[12] Karis, “RGBM Color encoding” http://graphicrants.blogspot.com/2009/04/rgbm-color-encoding.html
[13] Cupisz, “LightProbe” http://blogs.unity3d.com/2011/03/09/light-probes/
[14] van der Leeuw, “The playstation3 spus in the real world” http://www.slideshare.net/guerrillagames/the-playstation3s-spus-in-the-real-world-a-killzone-2-case-study-9886224
[15] Personal communication with Michal Valient of Guerrilla Games
[16] Bjorke, “Image Based-Lighting” http://http.developer.nvidia.com/GPUGems/gpugems_ch19.html
[17] behc, “Box Projected Cubemap Environment Mapping” http://www.gamedev.net/topic/568829-box-projected-cubemap-environment-mapping/
[18] Mad Mod Mike demo, “The Naked Truth Behind NVIDIA’s Demos”  http://ftp.up.ac.za/mirrors/www.nvidia.com/developer/presentations/2005/SIGGRAPH/Truth_About_NVIDIA_Demos.pdf
[19] Lazarov, “Physically Based Lighting in Call of Duty: Black Ops”, http://advances.realtimerendering.com/s2011/Lazarov-Physically-Based-Lighting-in-Black-Ops%20(Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).pptx 
[20] Hall, Edwards, “Rendering in Cars 2”, http://advances.realtimerendering.com/s2011/Hall,%20Hall%20and%20Edwards%20-%20Rendering%20in%20Cars%202%20(Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).pptx
[21] Cupisz, “Light probe interpolation using tetrahedral tessellations”, http://robert.cupisz.eu/stuff/Light_Probe_Interpolation-RobertCupisz-GDC2012.pdf
[22] Szirmay-Kalos, Aszódi, Lazányi, and Premecz, “Approximate Ray-Tracing on the GPU with Distance Impostors.”, http://sirkan.iit.bme.hu/~szirmay/ibl3.pdf
[23] Brennan, “Accurate Environment Mapped Reflections and Refractions by Adjusting for Object Distance.”, http://developer.amd.com/media/gpu_assets/ShaderX_CubeEnvironmentMapCorrection.pdf
[24] Van Waveren, Castaño, “Real-Time YCoCg-DXT Compression”, http://developer.download.nvidia.com/whitepapers/2007/Real-Time-YCoCg-DXT-Compression/Real-Time%20YCoCg-DXT%20Compression.pdf






Windows AC/Row/Infinite

I’m neither a carpenter nor a stalker but I strongly love windows. Allow me to show you the most advanced windows I ever saw in a game.

The NPCs in Assassins Creed 3 definitely suspect that Connor is a stalker because we made him stare at windows for hours in the last article about windows. As a reminder, this is how windows looked in the game:

Source: Assassins Creed 3

The interior is like a smooth texture which tiles and is moved via Bump Offset Mapping to fake some parallax movement. At least I believe so, because there’s a guy who explains how it can be done in UDK.

For Assassins Creed this “blurry” texture is in my eyes the perfect solution because it gives the glass a frosted/milky look which totally fits the period in which the game is set. But there are other ways to do windows, and this article is about those.

Thanks to InvisGhost for pointing me to the great windows of Saints Row 3!

Saints Row is a crazy game. Even the windows are crazy (good)! Some windows have a parallax effect similar to the one in Assassins Creed, but with sharp textures instead of smooth ones. I wouldn’t have thought that it works that well:

Source: Saints Row 3

Texture time! What you see is the “Barber (Night)” texture (see below), but isn’t there something missing? Yes, the posters!


Source: Saints Row 3

They are a layer on top of the room texture. Below you can find the poster-texture.

Source: Saints Row 3

Even now as I write this article and look at the GIF, I’m just thinking “Wow…works so great!”. Only at really narrow angles does it become visible that something is weird with the interior.

Source: Saints Row 3

Also nice: when you take a closer look you can see that the texture is tiling (like in Assassins Creed), and even this looks perfectly OK because it results in a room corner. Nothing weird about that, right? Below is a non-ingame demonstration of that effect:

Source: Saints Row 3

Here is an ingame example. Works great, doesn’t it?

Source: Saints Row 3

I hope you like it as much as I do. :) But we’re not done yet! Let’s visit Elizabeth.

Thanks to Lino for pointing me to the great windows of Bioshock Infinite!

Now let’s get to one of the most advanced windows implementations. What do you see here?:

Source: Bioshock Infinite

It just looks like a wooden room containing some objects on billboards/planes to get some nice parallax effects, right? But not only Lino suspected something here:


Source: Bioshock Infinite

Side note for those who don’t know the game: the level designers placed points where Elizabeth can interact with the environment, a really cool feature! But what she’s not doing (and Lino did) is looking at this from another angle:

Source: Bioshock Infinite

That’s unexpected, isn’t it? If it were “just” glass + billboards, shouldn’t we

a) see through to the other side and
b) still notice the billboards ?

Short answer: it’s another crazy shader trick. Besides looking at the geometry wireframe, I found the “proof” in one of the textures. I added a subtle mark:


Source: Bioshock Infinite

I’m not sure what the other channels are about (I suspect it’s the glass distortion, because you only need two channels for such a normal map), but the red channel definitely marks the eagle area which looks like a sprite/billboard. And what does the actual texture look like, which is controlled/masked by this channel?

Source: Bioshock Infinite

But wait! When the eagle area moves a bit faster than the four guys in the background (when I strafe around)…why is there no gap? It’s like when you photoshop an image and want to move an object: you have to cut it out, move it and then fill the now uncovered area with something…that’s why there’s another texture just for the background:

Source: Bioshock Infinite

Apart from the exciting double use of (I think) bump offsets for the two parallax layers, there are two other interesting details:

1. If we look at the following GIF (again) we’ll see that the glass has a kind of total reflection (you can’t look “inside” anymore from narrow angles, you just see a bluish reflection). In my eyes this is a good way to

a) make the glass look more realistic
b) avoid any visible texture deformation at narrow angles like we could observe in SR3.

Source: Bioshock Infinite

2. The eagle area turns toward the player camera even if it’s not a “real” billboard! How cool is that? Look carefully and you’ll see that it’s always oriented to you:

Source: Bioshock Infinite

So we saw epic shaders crafted by code magicians but … why? Why not just use a “simple” glass shader and build the interior by hand? I’m sure there are more answers to this, but I see two major reasons:

1. Adding such a shader to a surface and letting the code do the rest surely saves a lot of time compared to building all this by hand, especially when it comes to global changes where you want to change all windows/interiors at once.

2. Transparency. Rendering transparent objects is expensive and produces artifacts, e.g. in combination with fog (see this article as an example). I don’t know if the above shaders are cheaper than doing it with real transparency, but at least I would assume that you don’t run into sorting problems, which would be a good thing, right? :)

As always, feel free to comment, mail or tweet me and share your knowledge. :) I’m sure there are a lot of people out there who can explain in detail why it’s better to do it the mentioned way.


