Unreal Engine 3 Rendering Architecture

Overview


This page covers all areas of the rendering subsystem in Unreal Engine 3.



Getting Started

There's a lot of rendering code in UE3, so it's hard to get a quick high-level view of what's going on. A good place to start reading through the code is FSceneRenderer::Render, which is where a new frame is rendered on the rendering thread. It's also useful to take a PIX capture on Xbox 360, look through the draw events, and inspect the render targets to see what is being accumulated. You can then search the code for where these draw events are issued to find the corresponding C++.



Technical rendering pages


  • Threaded Rendering
  • VisibilityCulling
  • GammaCorrection
  • ShaderHome
  • ShadowBufferFilteringTech
  • CreatingMaterialExpressions
  • ShaderModel2Fallback
  • PerfHUD
  • LightmassTechnicalGuide

All rendering pages

RenderingTopics


Shaders

See ShaderHome for low level information.


Global shaders

Global shaders are shaders which operate on fixed geometry (like a full screen quad) and don't need to interface with materials. Examples would be shadow filtering or post processing. Only one shader of any given global shader type exists in memory, and global shaders are stored separately from packages in the global shader cache file. This allows them to be loaded earlier in the startup process so they can be used for things like Bink startup movies.
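The "one shader per global shader type" rule can be pictured as a map keyed by type. The sketch below is only an illustration under that assumption: GlobalShaderSketch, GlobalShaderCacheSketch and FindOrCreate are hypothetical names, not UE3 API, and no real shader compilation or file loading is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a compiled global shader (not a UE3 type).
struct GlobalShaderSketch {
    std::string TypeName;
};

// Hypothetical cache that holds at most one shader per global shader type.
class GlobalShaderCacheSketch {
public:
    // Returns the single instance for TypeName, creating it on first request.
    GlobalShaderSketch& FindOrCreate(const std::string& TypeName) {
        // emplace leaves an existing entry untouched, so repeated requests
        // for the same type return the same object.
        return Shaders.emplace(TypeName, GlobalShaderSketch{TypeName}).first->second;
    }

    std::size_t Num() const { return Shaders.size(); }

private:
    std::map<std::string, GlobalShaderSketch> Shaders;
};
```

Requesting the same type twice yields the same object, which is the property the paragraph above describes.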




Material and Mesh types

Materials are defined by a set of states that control how the material is rendered (blend mode, two sided, etc) and a set of material inputs that control how the material interacts with the various rendering passes.




Vertex Factories

Materials have to support being applied to different mesh types, and this is accomplished with vertex factories. An FVertexFactoryType represents a unique mesh type, and an FVertexFactory instance stores the per-instance data to support that unique mesh type. For example, FGPUSkinVertexFactory stores the bone matrices needed for skinning, as well as references to the various vertex buffers that the GPU skin vertex factory shader code needs as input. The vertex factory shader code is an implicit interface which is used by the various pass shaders to abstract mesh type differences. Some important components of the vertex factory shader code are:


 

  • FVertexFactoryInput - defines what the vertex factory needs as input to the vertex shader. These must match the vertex declaration in the C++ side FVertexFactory. For example, LocalVertexFactory's FVertexFactoryInput has float4 Position : POSITION;, which corresponds to the position stream declaration in FStaticMeshRenderData::SetupVertexFactory.
  • VertexFactoryGetWorldPosition - this is called from the vertex shader to get the world space vertex position. For static meshes this merely transforms the local space positions from the vertex buffer into world space using the LocalToWorld matrix. For GPU skinned meshes, the position is skinned first and then transformed to world space.
  • VertexFactoryGetInterpolants - transforms the FVertexFactoryInput to FVertexFactoryInterpolants, which will be interpolated by the graphics hardware before getting passed into the pixel shader.
  • GetMaterialPixelParameters - this is called in the pixel shader and converts vertex factory specific interpolants (FVertexFactoryInterpolants) to the FMaterialPixelParameters structure which is used by the pass pixel shaders.
  • And so on for all the other vertex factory interface functions.
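The idea of an implicit interface can be illustrated in C++ with a template: the pass code is written once and works with any type that provides the expected function. This is only an analogy for the shader-side mechanism, under simplifying assumptions: the types and function names below are hypothetical, and LocalToWorld is reduced to a translation to keep the sketch short.

```cpp
#include <cassert>

struct Vec3 { float X, Y, Z; };

// Hypothetical static mesh factory: only transforms local space to world space.
struct LocalVertexFactorySketch {
    Vec3 LocalToWorldOffset;  // translation-only stand-in for the LocalToWorld matrix

    Vec3 GetWorldPosition(const Vec3& Local) const {
        return {Local.X + LocalToWorldOffset.X,
                Local.Y + LocalToWorldOffset.Y,
                Local.Z + LocalToWorldOffset.Z};
    }
};

// Hypothetical skinned factory: skins first, then transforms to world space.
struct SkinnedVertexFactorySketch {
    Vec3 BoneOffset;          // translation-only stand-in for a bone matrix
    Vec3 LocalToWorldOffset;

    Vec3 GetWorldPosition(const Vec3& Local) const {
        const Vec3 Skinned = {Local.X + BoneOffset.X,
                              Local.Y + BoneOffset.Y,
                              Local.Z + BoneOffset.Z};
        return {Skinned.X + LocalToWorldOffset.X,
                Skinned.Y + LocalToWorldOffset.Y,
                Skinned.Z + LocalToWorldOffset.Z};
    }
};

// A pass "shader" written once against the implicit interface. It never
// needs to know which mesh type it is drawing.
template <typename VertexFactory>
float DepthOnlyPassSketch(const VertexFactory& VF, const Vec3& LocalPosition) {
    return VF.GetWorldPosition(LocalPosition).Z;  // pretend depth is world Z
}
```

Adding a new mesh type then means adding a new factory type with the same function, without touching the pass code.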

Material Shaders

Shaders using FMaterialShaderType are pass specific shaders which need access to some of the material's attributes, and therefore must be compiled for each material, but do not need to access any mesh attributes. The light function pass shaders are an example of FMaterialShaderTypes. 

Shaders using FMeshMaterialShaderType are pass specific shaders which depend on the material's attributes AND the mesh type, and therefore must be compiled for each material/vertex factory combination. For example, TLightVertexShader and TLightPixelShader apply a dynamic light to a mesh; to do this they need access to the mesh's position and normal, as well as the material's lighting inputs such as Diffuse and Specular.

A material's set of required shaders (which is an FMaterialShaderMap) looks like this:


  • FMaterialShaderMap (shaders for a single UMaterial)
    • FLightFunctionPixelShader - FMaterialShaderType
    • FLocalVertexFactory - FVertexFactoryType
      • FDepthOnlyVertexShader - FMeshMaterialShaderType
      • FDepthOnlyPixelShader - FMeshMaterialShaderType
      • TLightVertexShaderFPointLightPolicyFNoStaticShadowingPolicy - FMeshMaterialShaderType
      • TLightPixelShaderFPointLightPolicyFNoStaticShadowingPolicy - FMeshMaterialShaderType
      • Etc
    • FGPUSkinVertexFactory - FVertexFactoryType
      • Etc

Vertex factories are included in this matrix based on their ShouldCache function, which depends on the material's usage. For example, bUsedWithSkeletalMesh being TRUE will include the GPU skin vertex factories. FMeshMaterialShaderTypes are included in this matrix based on their ShouldCache function, which depends on material and vertex factory attributes. This is a sparse matrix approach to caching shaders, and it quickly adds up to a large number of shaders, which takes up memory and increases compile times. The major advantage over storing a list of actually needed shaders is that no list has to be generated, so needed shaders have always already been compiled before run time on consoles. UE3 mitigates the shader memory problem with shader compression, and the compile time problem with multicore shader compilation.
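A minimal sketch of the sparse matrix idea follows, assuming each vertex factory type and each mesh material shader type exposes a ShouldCache predicate. All struct and function names here are hypothetical stand-ins, and the predicates are deliberately trivial; only bUsedWithSkeletalMesh and the type names used as labels come from the text above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct MaterialSketch {
    bool bUsedWithSkeletalMesh;
};

struct VertexFactoryTypeSketch {
    std::string Name;
    bool (*ShouldCache)(const MaterialSketch&);
};

struct MeshShaderTypeSketch {
    std::string Name;
    bool (*ShouldCache)(const MaterialSketch&, const VertexFactoryTypeSketch&);
};

// One entry per (vertex factory, shader type) pair that must be compiled.
using ShaderMapSketch = std::vector<std::pair<std::string, std::string>>;

inline std::vector<VertexFactoryTypeSketch> ExampleVertexFactoryTypes() {
    return {
        {"FLocalVertexFactory", [](const MaterialSketch&) { return true; }},
        {"FGPUSkinVertexFactory",
         [](const MaterialSketch& M) { return M.bUsedWithSkeletalMesh; }},
    };
}

inline std::vector<MeshShaderTypeSketch> ExampleMeshShaderTypes() {
    // Assumption for the sketch: both depth-only shaders are always cached.
    auto Always = [](const MaterialSketch&, const VertexFactoryTypeSketch&) { return true; };
    return {
        {"FDepthOnlyVertexShader", Always},
        {"FDepthOnlyPixelShader", Always},
    };
}

// Walks the full matrix and keeps only the combinations both predicates accept.
inline ShaderMapSketch BuildShaderMap(const MaterialSketch& Material) {
    ShaderMapSketch Map;
    for (const auto& VF : ExampleVertexFactoryTypes()) {
        if (!VF.ShouldCache(Material)) continue;
        for (const auto& Shader : ExampleMeshShaderTypes()) {
            if (Shader.ShouldCache(Material, VF)) Map.emplace_back(VF.Name, Shader.Name);
        }
    }
    return Map;
}
```

In this sketch, setting bUsedWithSkeletalMesh doubles the number of entries, which hints at how quickly the real matrix grows with more factories and passes.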


Creating a material shader

A material shader type is created with the DECLARE_SHADER_TYPE macro:

class FLightFunctionPixelShader : public FShader
{
    DECLARE_SHADER_TYPE(FLightFunctionPixelShader,Material);
    // ... remaining members omitted ...
};

This declares the necessary metadata and functions for a material shader type. The material shader type is instantiated with IMPLEMENT_MATERIAL_SHADER_TYPE:

IMPLEMENT_MATERIAL_SHADER_TYPE(,FLightFunctionPixelShader,TEXT("LightFunctionPixelShader")

This generates the material shader type's global metadata, which allows us to do things like iterate through all shaders using a given shader type at runtime.

A typical material pixel shader type will first create an FMaterialPixelParameters struct by calling the GetMaterialPixelParameters vertex factory function. GetMaterialPixelParameters transforms the vertex factory specific inputs into properties like WorldPosition, TangentNormal, etc. that any pass might want to access. Then a material shader will call CalcMaterialParameters, which writes out the rest of the members of FMaterialPixelParameters, after which FMaterialPixelParameters is fully initialized. The material shader will then access some of the material's inputs through functions in MaterialTemplate.usf (GetMaterialEmissive for the material's emissive input, for example), do some shading and output a final color for that pass.


Special Engine Materials

UMaterial has a setting called bUsedAsSpecialEngineMaterial that allows the material to be used with any vertex factory type. This means all vertex factories are compiled with the material, which will be a very large set. bUsedAsSpecialEngineMaterial is useful for:

  • Materials used with rendering viewmodes like lighting only
  • Materials used as fallbacks when there is a compilation error (DefaultDecalMaterial for decals and DefaultMaterial for everything else)
  • Materials whose shaders are used when rendering other materials in order to cut down on the number of shaders that have to be cached. For example, an opaque material's depth-only shaders will produce the same depth output as the DefaultMaterial, so the DefaultMaterial's shaders are used instead and the opaque material skips caching the depth-only shader.
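The depth-only substitution in the last bullet can be sketched as a simple selection function. This is an assumption-laden illustration: the enum, struct and function names are hypothetical, and the real engine may check more conditions than the blend mode before substituting the DefaultMaterial's shaders.

```cpp
#include <cassert>
#include <string>

enum class BlendModeSketch { Opaque, Masked, Translucent };

struct MaterialSketch {
    std::string Name;
    BlendModeSketch BlendMode;
};

// Picks which material's shaders to use for the depth-only pass.
// Opaque materials write the same depth as the default material, so the
// default material's shaders are reused and nothing extra has to be cached.
inline const MaterialSketch& SelectDepthOnlyShaderSource(
    const MaterialSketch& Material, const MaterialSketch& DefaultMaterial) {
    return Material.BlendMode == BlendModeSketch::Opaque ? DefaultMaterial : Material;
}
```

A masked material keeps its own shaders in this sketch, because its opacity mask changes which pixels write depth.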

Scene representation

In UE3, the scene as the renderer sees it is defined by primitive components and the various other lists stored in FScene. An octree of primitive components is maintained for accelerated spatial queries, such as view frustum culling.


Primitive components and proxies

Primitive components are the basic unit of visibility and relevance determination. For example, occlusion and frustum culling happen on a per-primitive basis. Therefore it's important when designing a system to think about how big to make components. Each component has bounds that are used for various operations like culling, shadow casting and light influence determination.
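To make the role of bounds concrete, here is a minimal sketch of a conservative frustum test against a bounding sphere. It is a generic technique, not UE3's implementation: the struct and function names are hypothetical, and planes are assumed to have unit-length normals that point into the frustum.

```cpp
#include <cassert>
#include <vector>

// Plane in the form N.P + D >= 0 for points inside the frustum.
// The normal (Nx, Ny, Nz) is assumed to be unit length.
struct PlaneSketch { float Nx, Ny, Nz, D; };

// Hypothetical per-component bounds: a sphere in world space.
struct SphereBoundsSketch { float X, Y, Z, Radius; };

// Returns false only when the sphere lies completely behind some plane.
// The test is conservative: it may keep a primitive that is actually outside.
inline bool IsPotentiallyVisible(const SphereBoundsSketch& Bounds,
                                 const std::vector<PlaneSketch>& Frustum) {
    for (const PlaneSketch& P : Frustum) {
        const float SignedDistance =
            P.Nx * Bounds.X + P.Ny * Bounds.Y + P.Nz * Bounds.Z + P.D;
        if (SignedDistance < -Bounds.Radius) {
            return false;  // entirely outside this plane
        }
    }
    return true;
}
```

Because the whole component is accepted or rejected by its bounds, a very large component is rarely culled, while many tiny components each pay the per-primitive test cost.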


FPrimitiveSceneProxy and FPrimitiveSceneInfo

FPrimitiveSceneProxy is the rendering thread counterpart of UPrimitiveComponent and is intended to be subclassed depending on the component type. FPrimitiveSceneInfo holds the rendering thread state of a UPrimitiveComponent that is common to all primitives.


Important FPrimitiveSceneProxy methods
  • GetViewRelevance - Called from InitViews at the beginning of the frame, and returns a populated FPrimitiveViewRelevance.
  • DrawDynamicElements - Called to draw the proxy in any passes which the proxy is relevant to. Only called if the proxy indicated it has dynamic relevance.
  • DrawStaticElements - Called to submit static mesh elements for the proxy when the primitive is being attached on the game thread. Only called if the proxy indicated it has static relevance.

Scene rendering order

The renderer processes the scene in the order that it wants to composite data to the render targets. For example, the Depth only pass is rendered before the Base pass, so that HiZ will be populated to reduce shading cost in the base pass. This order is statically defined by the order pass functions are called in C++.


Relevance

FPrimitiveViewRelevance is the information on what effects (and therefore passes) are relevant to the primitive. A primitive may have multiple elements with different relevance, so FPrimitiveViewRelevance is effectively a logical OR of all the elements' relevances. This means that a primitive can have both opaque and translucent relevance, or dynamic and static relevance; they are not mutually exclusive.

FPrimitiveViewRelevance also indicates whether a primitive needs to use the dynamic and/or static rendering path with bStaticRelevance and bDynamicRelevance.
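The "logical OR of all the elements' relevances" can be sketched as follows. Only bStaticRelevance and bDynamicRelevance are named in the text; the struct name, the opaque and translucent flags, and CombineRelevance are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <vector>

struct ViewRelevanceSketch {
    bool bStaticRelevance = false;
    bool bDynamicRelevance = false;
    bool bOpaqueRelevance = false;       // hypothetical flag name
    bool bTranslucentRelevance = false;  // hypothetical flag name

    // Merges another element's relevance into this one, flag by flag.
    ViewRelevanceSketch& operator|=(const ViewRelevanceSketch& Other) {
        bStaticRelevance = bStaticRelevance || Other.bStaticRelevance;
        bDynamicRelevance = bDynamicRelevance || Other.bDynamicRelevance;
        bOpaqueRelevance = bOpaqueRelevance || Other.bOpaqueRelevance;
        bTranslucentRelevance = bTranslucentRelevance || Other.bTranslucentRelevance;
        return *this;
    }
};

// The primitive's relevance is the OR of the relevance of each element.
inline ViewRelevanceSketch CombineRelevance(const std::vector<ViewRelevanceSketch>& Elements) {
    ViewRelevanceSketch Result;
    for (const ViewRelevanceSketch& Element : Elements) {
        Result |= Element;
    }
    return Result;
}
```

A primitive with one opaque static element and one translucent dynamic element ends up with all four flags set, so it is visited by both kinds of pass and by both rendering paths.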


Drawing Policies

Drawing policies contain the logic to render meshes with pass specific shaders. They use the FVertexFactory interface to abstract the mesh type, and the FMaterial interface to abstract the material details. At the lowest level, a drawing policy takes a set of mesh material shaders and a vertex factory, binds the vertex factory's buffers to the RHI (SetStreamSource for D3D9), binds the mesh material shaders to the RHI, sets the appropriate shader parameters, and issues the RHI draw call.


Drawing Policy methods
  • Constructor - Finds the appropriate shader from the given vertex factory and material shader map, stores these references
  • CreateBoundShaderState - Creates an RHI bound shader state for the drawing policy
  • Matches/Compare - Provides methods to sort the drawing policy with others in the static draw lists. Matches must compare on all the factors that DrawShared depends on.
  • DrawShared - Sets RHI state that is constant between drawing policies that return TRUE from Matches. For example, most drawing policies sort on material and vertex factory, so shader parameters depending only on the material can be set, and the vertex buffers specific to the vertex factory can be bound. State should always be set here if possible instead of SetMeshRenderState, since DrawShared is called fewer times in the static rendering path.
  • SetMeshRenderState - Sets RHI state that is specific to this mesh, or anything not set in DrawShared. This is called many more times than DrawShared so performance is especially critical here.
  • DrawMesh - Actually issues the RHI draw call

Dynamic vs Static rendering paths

UE3 has a dynamic path which provides more control but is slower to traverse, and a static rendering path which caches scene traversal as close to the RHI level as possible. The difference is mostly high level, since they both use drawing policies at the lowest level. Each rendering pass (drawing policy) needs to be sure to handle both rendering paths if needed.


Dynamic rendering path

The dynamic rendering path uses TDynamicPrimitiveDrawer and calls DrawDynamicElements on each primitive scene proxy to render. The set of primitives that need to use the dynamic path to be rendered is tracked by FViewInfo::VisibleDynamicPrimitives. Each rendering pass needs to iterate over this array, and call DrawDynamicElements on each primitive's proxy. DrawDynamicElements of the proxy then needs to assemble as many FMeshElements as it needs and submit them with DrawRichMesh or TDynamicPrimitiveDrawer::DrawMesh. This ends up creating a new temporary drawing policy, calling CreateBoundShaderState, DrawShared, SetMeshRenderState and finally DrawMesh.

The dynamic rendering path provides a lot of flexibility because each proxy has a callback in DrawDynamicElements where it can execute logic specific to that component type. It also has minimal insertion cost but high traversal cost, because there is no state sorting, and nothing is cached.


Static rendering path

The static rendering path is implemented through static draw lists. Meshes are inserted into the draw lists when they are attached to the scene. During this insertion, DrawStaticElements on the proxy is called to collect the FStaticMeshElements. A drawing policy instance is then created and stored, along with the result of CreateBoundShaderState. The new drawing policy is sorted based on its Compare and Matches functions and inserted into the appropriate place in the draw list (see TStaticMeshDrawList::AddMesh). In InitViews, a bitarray containing visibility data for the static draw list is initialized and passed into TStaticMeshDrawList::DrawVisible where the draw list is actually drawn. DrawShared is only called once for all the drawing policies that match each other, while SetMeshRenderState and DrawMesh are called for each FStaticMeshElement (see TStaticMeshDrawList::DrawElement).

The static rendering path moves a lot of work to attach time, which significantly speeds up scene traversal at rendering time. Static draw list rendering is about 3x faster on the rendering thread for static meshes, which allows a lot more static meshes in the scene. Because static draw lists cache data at attach time, they can only cache view independent state. Primitives that are rarely reattached but often rendered are good candidates for the static draw lists. Currently only speedtrees, non-movable static meshes, placed decals and BSP use the static draw lists.
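The difference in call counts between the two paths can be sketched with a toy draw list. This is a simplified model under stated assumptions: meshes "match" when they share a material and vertex factory id, the class and function names are hypothetical, and the counters stand in for the real RHI work.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct MeshSketch { int MaterialId; int VertexFactoryId; };

struct DrawStats {
    int DrawShared = 0;
    int SetMeshRenderState = 0;
    int DrawMesh = 0;
};

// Dynamic path: a temporary policy per mesh, so every step runs per mesh.
inline DrawStats DrawDynamicSketch(const std::vector<MeshSketch>& Meshes) {
    DrawStats Stats;
    for (std::size_t i = 0; i < Meshes.size(); ++i) {
        ++Stats.DrawShared;
        ++Stats.SetMeshRenderState;
        ++Stats.DrawMesh;
    }
    return Stats;
}

// Static path: meshes are bucketed by matching state at attach time.
class StaticDrawListSketch {
public:
    // Attach time: files the mesh into its bucket and returns its index,
    // which is also its position in the visibility bit array.
    std::size_t AddMesh(const MeshSketch& Mesh) {
        Buckets[{Mesh.MaterialId, Mesh.VertexFactoryId}].push_back(NumMeshes);
        return NumMeshes++;
    }

    // Render time: shared state is set once per bucket that has at least
    // one visible mesh; per-mesh state and the draw call run per visible mesh.
    DrawStats DrawVisible(const std::vector<bool>& Visible) const {
        DrawStats Stats;
        for (const auto& Bucket : Buckets) {
            bool bSharedStateSet = false;
            for (std::size_t Index : Bucket.second) {
                if (Index >= Visible.size() || !Visible[Index]) continue;
                if (!bSharedStateSet) {
                    ++Stats.DrawShared;
                    bSharedStateSet = true;
                }
                ++Stats.SetMeshRenderState;
                ++Stats.DrawMesh;
            }
        }
        return Stats;
    }

private:
    std::map<std::pair<int, int>, std::vector<std::size_t>> Buckets;
    std::size_t NumMeshes = 0;
};
```

With four meshes in two buckets, the dynamic sketch sets shared state four times and the static sketch only twice. It also shows why anything set in DrawShared must truly be identical for every mesh in a bucket.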

The static rendering path can expose bugs because of the way it only calls DrawShared once per state bucket. These bugs can be difficult to detect, since they depend on the rendering order and the attach order of meshes in the scene. Special viewmodes such as lighting only, unlit, etc will force all primitives to use the dynamic path, so if a bug goes away when forcing the dynamic rendering path, there's a good chance it is due to an incorrect implementation of a drawing policy's DrawShared and/or the Matches function.


High level Rendering order

Here's a description of the control flow when rendering a frame starting from FSceneRenderer::Render:


  • Foreach Depth Priority Group
    • InitViews - Initializes primitive visibility for the views through various culling methods, sets up dynamic shadows that are visible this frame, intersects shadow frustums with the world if necessary (for whole scene shadows or preshadows).
    • PrePass / Depth only pass - RenderPrePass / FDepthDrawingPolicy. First clears the depth buffer, which causes the foreground DPG to sort over the world DPG. Then renders opaque and conditionally masked materials, outputting only depth to the depth buffer. The purpose of this pass is to initialize Hierarchical Z to reduce the shading cost of the Base pass, which has expensive pixel shaders. The depths of the depth only pass may also be used for shadow projection by dominant lights.
    • Base pass - RenderBasePass / TBasePassDrawingPolicy. Renders opaque and masked materials, initializes the linear HDR scene color render target with emissive, lightmap and sky lighting, as well as any dynamic lights that have been merged into the base pass. This is the only shading pass for lightmapped geometry not being affected by any dynamic effects.
    • Additive shadowed lighting passes - RenderLights / FMeshLightingDrawingPolicyFactory. Additively accumulates the contribution of dynamic lights on opaque/masked pixels, optionally masked by normal projected shadows or light functions. This is done with forward multi-pass lighting which requires re-rendering receiver meshes, and has a high rendering thread and GPU cost as a result.
    • Modulative shadows - RenderModulatedShadows. Projects modulative shadows and multiplies their result against the existing scene color. Modulated shadows darken emissive, lighting from other lights and backfaces, because they are just multiplying against scene color blindly, but they are very efficient because they don't require the light they are coming from to be rendered dynamically.
    • Additive unshadowed lighting passes - Accumulates the contribution of lights that shouldn't be affected by modulated shadows, like Spherical Harmonic lights used to emulate indirect lighting for light environments.
    • Issue Occlusion Queries - BeginOcclusionTests. Kicks off latent occlusion queries that will be used in the next frame's InitViews. These are done by rendering bounding boxes around the objects being queried, and sometimes grouping bounding boxes together to reduce draw calls.
    • Screen space ambient occlusion - RenderPostProcessEffects. Calculates an AO factor from a heuristic based only on scene depth, and modulates the existing scene color with the SSAO result.
    • Global fog on opaque pixels - RenderFog. Applies global fog to opaque pixels in a deferred way by using the depth buffer positions to reconstruct world space positions.
    • Distortion - RenderDistortion / TDistortionMeshDrawingPolicyFactory. First accumulates distortion offsets to a render target, then applies that to scene color. This approach handles multiple layers of distortion gracefully but has significant RT and GPU overhead because it requires rendering distorting materials again to accumulate the offsets.
    • Translucency - RenderTranslucency / TBasePassDrawingPolicy. Composites translucent meshes approximately back to front, applying per-vertex fog if appropriate. Fog volumes are sorted with the rest of translucency, since they are order dependent as well.
    • Post process - RenderPostProcessEffects. Applies post process effects as controlled by the post process chain and the current PP settings. The last effect in the chain converts to LDR gamma space and copies to the backbuffer for presentation.
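The control flow above can be summarized as a loop over depth priority groups with a fixed pass sequence inside. The sketch below only records the order as text; the function names are hypothetical and the pass strings are descriptive labels taken from the list, not the engine's function names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pass labels in the order listed above.
inline const std::vector<std::string>& PassOrderSketch() {
    static const std::vector<std::string> Passes = {
        "InitViews",
        "Depth only pass",
        "Base pass",
        "Additive shadowed lighting",
        "Modulative shadows",
        "Additive unshadowed lighting",
        "Issue occlusion queries",
        "Screen space ambient occlusion",
        "Global fog",
        "Distortion",
        "Translucency",
        "Post process",
    };
    return Passes;
}

// Runs every pass, in order, once per depth priority group and returns a trace.
inline std::vector<std::string> RenderFrameSketch(
    const std::vector<std::string>& DepthPriorityGroups) {
    std::vector<std::string> Trace;
    for (const std::string& DPG : DepthPriorityGroups) {
        for (const std::string& Pass : PassOrderSketch()) {
            Trace.push_back(DPG + ": " + Pass);
        }
    }
    return Trace;
}
```

The order is fixed in code rather than data driven, which mirrors how the real order is defined by the sequence of pass function calls.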

This is a fairly simplified and high level view. To get more details, look through the relevant code or a PIX capture on Xbox 360. Here's the draw event hierarchy of an example scene:

[Figure: draw event hierarchy of an example scene]



Gamma in UE3

The scene color buffer stores linear space HDR colors while the scene is being composited (from the Base pass to the post process passes), which requires a high precision render target format. See GammaCorrection for more info.
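As a rough illustration of the final conversion from linear HDR scene color to an LDR gamma space value, the sketch below clamps a linear channel and applies a simple power curve. The gamma value of 2.2, the function name, and the absence of tone mapping are all assumptions made for the example; see GammaCorrection for what the engine actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Converts one linear color channel to an 8-bit gamma space value.
// Assumes a plain power curve; HDR values above 1.0 are simply clamped.
inline std::uint8_t LinearToGamma8(float Linear, float Gamma = 2.2f) {
    const float Clamped = std::min(std::max(Linear, 0.0f), 1.0f);
    const float Encoded = std::pow(Clamped, 1.0f / Gamma);
    return static_cast<std::uint8_t>(std::lround(Encoded * 255.0f));
}
```

Linear mid-gray (0.5) encodes to a value well above 127, which is why lighting math has to be done on linear values before this step rather than on the stored 8-bit values.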


Render Hardware Interface (RHI)

The RHI is a thin layer above the platform specific graphics API. The RHI abstraction level in UE3 is as low level as possible, with the intention that most features can be written in platform independent code and 'just work' on all platforms that support the required feature set.


Rendering state grouping

Render states are grouped based on what part of the pipeline they affect. For example, RHISetDepthState sets all state relevant to depth buffering.


Rendering state defaults

Since there are so many rendering states, it's not practical to set them all every time we want to draw something. Instead, UE3 has an implicit set of states which are assumed to be set to the defaults (and therefore must be restored to those defaults after they are changed), and a much smaller set of states which have to be set explicitly. The set of states that don't have implicit defaults are:


  • RHISetDepthState
  • RHISetBlendState
  • RHISetRasterizerState
  • RHISetBoundShaderState
  • RHISetStreamSource (if applicable)

All other states are assumed to be at their defaults, as defined by the relevant TStaticState. For example, the default stencil state is set by RHISetStencilState(TStaticStencilState<>::GetRHI()).
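The "restore implicit defaults after changing them" rule can be sketched with a scope guard. This is an alternative pattern shown for illustration only, not how UE3 does it: per the text, UE3 code restores defaults explicitly with calls such as RHISetStencilState(TStaticStencilState<>::GetRHI()). All types below are hypothetical stand-ins for the RHI.

```cpp
#include <cassert>

// Hypothetical stencil state; the default is "disabled".
struct StencilStateSketch { bool bEnabled; };

// Hypothetical stand-in for the RHI's current implicit state.
struct RHISketch {
    StencilStateSketch Stencil{false};
};

// Sets a non-default stencil state for the lifetime of the guard and
// restores the default when the guard goes out of scope.
class ScopedStencilStateSketch {
public:
    ScopedStencilStateSketch(RHISketch& InRHI, StencilStateSketch NewState) : RHI(InRHI) {
        RHI.Stencil = NewState;
    }
    ~ScopedStencilStateSketch() {
        RHI.Stencil = StencilStateSketch{false};  // back to the implicit default
    }
    ScopedStencilStateSketch(const ScopedStencilStateSketch&) = delete;
    ScopedStencilStateSketch& operator=(const ScopedStencilStateSketch&) = delete;

private:
    RHISketch& RHI;
};
```

Whichever mechanism is used, the invariant is the same: code that runs afterwards can assume the stencil state is at its default without setting it.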

