This page covers all areas of the rendering subsystem in Unreal Engine 3.
There's a lot of rendering code in UE3, so it's hard to get a quick high-level view of what's going on. A good place to start reading through the code is FSceneRenderer::Render, which is where a new frame is rendered on the rendering thread. It's also useful to take a PIX capture on Xbox 360, look through the draw events, and inspect the render targets to see what is being accumulated. You can then search the code for where these draw events are issued to find the corresponding C++.
RenderingTopics
See ShaderHome for low level information.
Global shaders are shaders which operate on fixed geometry (like a full screen quad) and don't need to interface with materials. Examples include shadow filtering and post processing. Only one shader of any given global shader type exists in memory, and global shaders are stored separately from packages in the global shader cache file. This allows them to be loaded earlier in the startup process so they can be used for things like Bink startup movies.
Materials are defined by a set of states that control how the material is rendered (blend mode, two sided, etc) and a set of material inputs that control how the material interacts with the various rendering passes.
Materials have to support being applied to different mesh types, and this is accomplished with vertex factories. An FVertexFactoryType represents a unique mesh type, and an FVertexFactory instance stores the per-instance data to support that unique mesh type. For example, FGPUSkinVertexFactory stores the bone matrices needed for skinning, as well as references to the various vertex buffers that the GPU skin vertex factory shader code needs as input. The vertex factory shader code is an implicit interface which is used by the various pass shaders to abstract mesh type differences. Some important components of the vertex factory shader code are:
Shaders using FMaterialShaderType are pass specific shaders which need access to some of the material's attributes, and therefore must be compiled for each material, but do not need to access any mesh attributes. The light function pass shaders are an example of FMaterialShaderTypes.
Shaders using FMeshMaterialShaderType are pass specific shaders which depend on the material's attributes AND the mesh type, and therefore must be compiled for each material/vertex factory combination. For example TLightVertex/PixelShader need to apply a dynamic light to a mesh, but they need to access the mesh's position and normal to do this and they need to access the material's lighting inputs like Diffuse, Specular, etc.
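The idea of an implicit interface that must be compiled once per material/vertex factory combination can be loosely illustrated with C++ templates. This is only an analogy, not how the engine compiles HLSL: every type and function below (ToyStaticMeshFactory, ToySkinnedMeshFactory, ToyDullMaterial, ToyBrightMaterial, ToyLightPass) is hypothetical, and the lighting formula is a made-up falloff chosen to keep the arithmetic simple.

```cpp
// Toy analogy: the pass code is written once against an implicit interface
// (GetWorldPosition, GetDiffuse) and the compiler emits a separate
// instantiation for every vertex factory / material pair that is used.

struct ToyVec3 { float x, y, z; };

// Two hypothetical mesh types exposing the same implicit interface.
struct ToyStaticMeshFactory {
    ToyVec3 localPosition;
    ToyVec3 GetWorldPosition() const { return localPosition; }
};

struct ToySkinnedMeshFactory {
    ToyVec3 localPosition;
    ToyVec3 boneOffset;  // stands in for the effect of bone matrices
    ToyVec3 GetWorldPosition() const {
        return ToyVec3{localPosition.x + boneOffset.x,
                       localPosition.y + boneOffset.y,
                       localPosition.z + boneOffset.z};
    }
};

// Two hypothetical materials exposing the same lighting input.
struct ToyDullMaterial   { float GetDiffuse() const { return 0.5f; } };
struct ToyBrightMaterial { float GetDiffuse() const { return 1.0f; } };

// Pass-specific code: needs both the mesh position and the material input,
// so it depends on both template parameters.
template <typename VertexFactory, typename Material>
float ToyLightPass(const VertexFactory& vf, const Material& material, float lightHeight) {
    const ToyVec3 position = vf.GetWorldPosition();
    const float dy = lightHeight - position.y;
    return material.GetDiffuse() / (1.0f + dy * dy);  // made-up falloff
}
```

Calling ToyLightPass with two factories and two materials would produce up to four distinct instantiations, which mirrors why the number of compiled shaders grows with both axes.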
A material's set of required shaders (which is an FMaterialShaderMap) looks like this:
Vertex factories are included in this matrix based on their ShouldCache function, which depends on the material's usage. For example, bUsedWithSkeletalMesh being TRUE will include the GPU skin vertex factories. FMeshMaterialShaderTypes are included in this matrix based on their ShouldCache function, which depends on material and vertex factory attributes. This is a sparse matrix approach to caching shaders, and it quickly adds up to a large number of shaders, which takes up memory and increases compile times. The major advantage over storing a list of actually needed shaders is that no list has to be generated, so needed shaders have always already been compiled before run time on consoles. UE3 mitigates the shader memory problem with shader compression, and the compile time problem with multicore shader compilation.
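The sparse matrix idea can be sketched as two nested loops filtered by ShouldCache-style predicates. This is a minimal model under stated assumptions: the Toy* types, the predicate functions, and the specific caching rules (including the ToySkinVelocity shader type) are invented for illustration and are not the engine's data structures.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; none of these are engine types.
struct ToyMaterial {
    bool usedWithSkeletalMesh;
};

struct ToyVertexFactoryType {
    std::string name;
    bool (*shouldCache)(const ToyMaterial&);
};

struct ToyMeshShaderType {
    std::string name;
    bool (*shouldCache)(const ToyMaterial&, const ToyVertexFactoryType&);
};

// Hypothetical caching rules.
inline bool CacheAlways(const ToyMaterial&) { return true; }
inline bool CacheIfSkeletal(const ToyMaterial& m) { return m.usedWithSkeletalMesh; }
inline bool CacheForAnyFactory(const ToyMaterial&, const ToyVertexFactoryType&) { return true; }
inline bool CacheForSkinOnly(const ToyMaterial&, const ToyVertexFactoryType& vf) {
    return vf.name == "ToyGPUSkin";
}

// One entry per (vertex factory, shader type) cell that must be compiled.
using ToyShaderKey = std::pair<std::string, std::string>;

std::vector<ToyShaderKey> BuildToyShaderMap(const ToyMaterial& material) {
    const std::vector<ToyVertexFactoryType> factories = {
        {"ToyLocal", &CacheAlways},
        {"ToyGPUSkin", &CacheIfSkeletal},
    };
    const std::vector<ToyMeshShaderType> shaderTypes = {
        {"ToyDepthOnly", &CacheForAnyFactory},
        {"ToyBasePass", &CacheForAnyFactory},
        {"ToySkinVelocity", &CacheForSkinOnly},
    };

    std::vector<ToyShaderKey> required;
    for (const ToyVertexFactoryType& vf : factories) {
        if (!vf.shouldCache(material)) {
            continue;  // whole row is skipped based on material usage
        }
        for (const ToyMeshShaderType& shaderType : shaderTypes) {
            if (shaderType.shouldCache(material, vf)) {
                required.emplace_back(vf.name, shaderType.name);
            }
        }
    }
    return required;
}
```

In this toy setup, enabling skeletal mesh usage on the material adds a whole row to the matrix, which is the mechanism by which usage flags multiply the shader count.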
A material shader type is created with the DECLARE_SHADER_TYPE macro:
class FLightFunctionPixelShader : public FShader
{
    DECLARE_SHADER_TYPE(FLightFunctionPixelShader,Material);
This declares the necessary metadata and functions for a material shader type. The material shader type is instantiated with IMPLEMENT_MATERIAL_SHADER_TYPE:
IMPLEMENT_MATERIAL_SHADER_TYPE(,FLightFunctionPixelShader,TEXT("LightFunctionPixelShader")
This generates the material shader type's global metadata, which allows us to do things like iterate through all shaders using a given shader type at runtime.
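One common way to get this kind of runtime-iterable metadata is a self-registering static object per type. The sketch below shows that general pattern only; it does not reproduce the engine's macros. ToyShaderType, the TOY_* macros, the two shader classes, and the "ToyShadowFilter" source name are all assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical metadata object: registers itself in a global list on construction.
class ToyShaderType {
public:
    ToyShaderType(const char* name, const char* sourceFile)
        : name_(name), sourceFile_(sourceFile) {
        Registry().push_back(this);
    }

    const std::string& Name() const { return name_; }
    const std::string& SourceFile() const { return sourceFile_; }

    // Function-local static avoids static initialization order problems.
    static std::vector<const ToyShaderType*>& Registry() {
        static std::vector<const ToyShaderType*> types;
        return types;
    }

private:
    std::string name_;
    std::string sourceFile_;
};

// Declares the per-class metadata member.
#define TOY_DECLARE_SHADER_TYPE(ShaderClass) \
    public: static ToyShaderType StaticType

// Defines the metadata member, which triggers registration at startup.
#define TOY_IMPLEMENT_SHADER_TYPE(ShaderClass, SourceFile) \
    ToyShaderType ShaderClass::StaticType(#ShaderClass, SourceFile)

class ToyLightFunctionPixelShader {
    TOY_DECLARE_SHADER_TYPE(ToyLightFunctionPixelShader);
};

class ToyShadowFilterPixelShader {
    TOY_DECLARE_SHADER_TYPE(ToyShadowFilterPixelShader);
};

TOY_IMPLEMENT_SHADER_TYPE(ToyLightFunctionPixelShader, "LightFunctionPixelShader");
TOY_IMPLEMENT_SHADER_TYPE(ToyShadowFilterPixelShader, "ToyShadowFilter");

// Iterates the registry, which is what the global metadata makes possible.
const ToyShaderType* FindToyShaderType(const std::string& name) {
    for (const ToyShaderType* type : ToyShaderType::Registry()) {
        if (type->Name() == name) {
            return type;
        }
    }
    return nullptr;
}
```

Because registration happens in the constructor of a static object, simply linking a shader class into the program is enough for it to show up when iterating the registry.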
A typical material pixel shader type will first create an FMaterialPixelParameters struct by calling the GetMaterialPixelParameters vertex factory function. GetMaterialPixelParameters transforms the vertex factory specific inputs into properties like WorldPosition, TangentNormal, etc. that any pass might want to access. Then the material shader will call CalcMaterialParameters, which writes out the rest of the members of FMaterialPixelParameters, after which FMaterialPixelParameters is fully initialized. The material shader will then access some of the material's inputs through functions in MaterialTemplate.usf (for example, GetMaterialEmissive for the material's emissive input), do some shading, and output a final color for that pass.
UMaterial has a setting called bUsedAsSpecialEngineMaterial that allows the material to be used with any vertex factory type. This means all vertex factories are compiled with the material, which will be a very large set. bUsedAsSpecialEngineMaterial is useful for:
In UE3, the scene as the renderer sees it is defined by primitive components and the various other lists stored in FScene. An octree of primitive components is maintained to accelerate spatial queries, such as view frustum culling.
Primitive components are the basic unit of visibility and relevance determination. For example, occlusion and frustum culling happen on a per-primitive basis. Therefore it's important when designing a system to think about how big to make components. Each component has a bounds that is used for various operations like culling, shadow casting and light influence determination.
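To make the effect of component bounds on culling concrete, here is a small sphere-versus-plane frustum test. It is a generic textbook technique rather than the engine's culling code, and the names (ToyVec3, ToyPlane, ToySphereBounds, IsToyBoundsVisible, CullToyPrimitives) are assumptions. Plane normals are assumed to be unit length and to point into the visible volume.

```cpp
#include <cstddef>
#include <vector>

struct ToyVec3 { float x, y, z; };

// Points p with Dot(normal, p) + d >= 0 are on the visible side of the plane.
struct ToyPlane { ToyVec3 normal; float d; };

struct ToySphereBounds { ToyVec3 center; float radius; };

inline float Dot(const ToyVec3& a, const ToyVec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// A primitive is rejected only if its whole bounding sphere lies behind some plane.
bool IsToyBoundsVisible(const ToySphereBounds& bounds, const std::vector<ToyPlane>& frustum) {
    for (const ToyPlane& plane : frustum) {
        const float signedDistance = Dot(plane.normal, bounds.center) + plane.d;
        if (signedDistance < -bounds.radius) {
            return false;
        }
    }
    return true;
}

// Visibility is decided once per primitive, never per triangle.
std::vector<bool> CullToyPrimitives(const std::vector<ToySphereBounds>& primitives,
                                    const std::vector<ToyPlane>& frustum) {
    std::vector<bool> visible(primitives.size(), false);
    for (std::size_t i = 0; i < primitives.size(); ++i) {
        visible[i] = IsToyBoundsVisible(primitives[i], frustum);
    }
    return visible;
}
```

Note how a primitive with very large bounds passes the test even when its center is far outside the view volume; that is the trade-off to keep in mind when deciding how big to make components.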
FPrimitiveSceneProxy is the rendering thread version of UPrimitiveComponent and is intended to be subclassed depending on the component type. FPrimitiveSceneInfo holds the rendering thread state of a UPrimitiveComponent that is common to all primitives.
The renderer processes the scene in the order that it wants to composite data to the render targets. For example, the Depth only pass is rendered before the Base pass, so that HiZ will be populated to reduce shading cost in the base pass. This order is statically defined by the order pass functions are called in C++.
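Why a depth-only pass lowers base pass shading cost can be shown with a toy depth buffer simulation. This models only the general early depth rejection idea, not HiZ hardware behavior, and the names (ToyFragment, CountShadedWithoutPrepass, CountShadedWithPrepass) are hypothetical. Depth 1.0 is assumed to be the far plane, with smaller values closer to the camera.

```cpp
#include <cstddef>
#include <vector>

// One rasterized fragment: which pixel it covers and how far away it is.
struct ToyFragment { std::size_t pixel; float depth; };

// Base pass alone: every fragment that passes the depth test gets shaded,
// so fragments submitted back to front are all shaded and then overwritten.
int CountShadedWithoutPrepass(const std::vector<ToyFragment>& fragments, std::size_t numPixels) {
    std::vector<float> depthBuffer(numPixels, 1.0f);
    int shaded = 0;
    for (const ToyFragment& f : fragments) {
        if (f.depth < depthBuffer[f.pixel]) {
            depthBuffer[f.pixel] = f.depth;
            ++shaded;
        }
    }
    return shaded;
}

// Depth-only pass first, then a base pass that shades only fragments whose
// depth equals the final depth buffer value.
int CountShadedWithPrepass(const std::vector<ToyFragment>& fragments, std::size_t numPixels) {
    std::vector<float> depthBuffer(numPixels, 1.0f);
    for (const ToyFragment& f : fragments) {  // depth-only pass, no shading
        if (f.depth < depthBuffer[f.pixel]) {
            depthBuffer[f.pixel] = f.depth;
        }
    }
    int shaded = 0;
    for (const ToyFragment& f : fragments) {  // base pass
        if (f.depth == depthBuffer[f.pixel]) {
            ++shaded;
        }
    }
    return shaded;
}
```

With overlapping geometry submitted back to front, the version without a prepass shades every layer, while the prepass version shades one fragment per covered pixel.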
FPrimitiveViewRelevance is the information on what effects (and therefore passes) are relevant to the primitive. A primitive may have multiple elements with different relevance, so FPrimitiveViewRelevance is effectively a logical OR of all the elements' relevances. This means that a primitive can have both opaque and translucent relevance, or both dynamic and static relevance; they are not mutually exclusive.
FPrimitiveViewRelevance also indicates whether a primitive needs to use the dynamic and/or static rendering path with bStaticRelevance and bDynamicRelevance.
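The logical OR behaviour described above can be sketched with a small flag struct. The field names and helper below are simplified assumptions for illustration and do not mirror the real FPrimitiveViewRelevance layout.

```cpp
#include <vector>

// Hypothetical, reduced set of relevance flags.
struct ToyViewRelevance {
    bool staticRelevance;
    bool dynamicRelevance;
    bool opaqueRelevance;
    bool translucentRelevance;

    // Merging relevance never clears a flag; it can only add passes.
    ToyViewRelevance& operator|=(const ToyViewRelevance& other) {
        staticRelevance = staticRelevance || other.staticRelevance;
        dynamicRelevance = dynamicRelevance || other.dynamicRelevance;
        opaqueRelevance = opaqueRelevance || other.opaqueRelevance;
        translucentRelevance = translucentRelevance || other.translucentRelevance;
        return *this;
    }
};

// The primitive's relevance is the OR of the relevance of each of its elements.
ToyViewRelevance CombineToyElementRelevance(const std::vector<ToyViewRelevance>& elements) {
    ToyViewRelevance combined{};  // all flags start out false
    for (const ToyViewRelevance& element : elements) {
        combined |= element;
    }
    return combined;
}
```

A primitive with one opaque static element and one translucent dynamic element ends up with all four flags set, so it has to be visited by both rendering paths and by both the opaque and translucent passes.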
Drawing policies contain the logic to render meshes with pass specific shaders. They use the FVertexFactory interface to abstract the mesh type, and the FMaterial interface to abstract the material details. At the lowest level, a drawing policy takes a set of mesh material shaders and a vertex factory, binds the vertex factory's buffers to the RHI (SetStreamSource for D3D9), binds the mesh material shaders to the RHI, sets the appropriate shader parameters, and issues the RHI draw call.
UE3 has a dynamic path which provides more control but is slower to traverse, and a static rendering path which caches scene traversal as close to the RHI level as possible. The difference is mostly high level, since they both use drawing policies at the lowest level. Each rendering pass (drawing policy) needs to be sure to handle both rendering paths if needed.
The dynamic rendering path uses TDynamicPrimitiveDrawer and calls DrawDynamicElements on each primitive scene proxy to render. The set of primitives that need to use the dynamic path to be rendered is tracked by FViewInfo::VisibleDynamicPrimitives. Each rendering pass needs to iterate over this array, and call DrawDynamicElements on each primitive's proxy. DrawDynamicElements of the proxy then needs to assemble as many FMeshElements as it needs and submit them with DrawRichMesh or TDynamicPrimitiveDrawer::DrawMesh. This ends up creating a new temporary drawing policy, calling CreateBoundShaderState, DrawShared, SetMeshRenderState and finally DrawMesh.
The dynamic rendering path provides a lot of flexibility because each proxy has a callback in DrawDynamicElements where it can execute logic specific to that component type. It also has minimal insertion cost but high traversal cost, because there is no state sorting, and nothing is cached.
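The shape of the dynamic path, a per-proxy callback with nothing cached, can be sketched as follows. This is a simplified model: ToyDynamicDrawer, ToySceneProxy, the two proxy subclasses, and the counters are hypothetical and only count the work that the text says is repeated for every mesh.

```cpp
#include <string>
#include <vector>

struct ToyMeshElement { std::string material; };

// Stand-in for the drawer: every submitted mesh pays the full setup cost
// because no drawing policy or bound shader state is reused.
class ToyDynamicDrawer {
public:
    void DrawMesh(const ToyMeshElement& mesh) {
        ++policiesCreated;    // new temporary drawing policy
        ++sharedStateSetups;  // shared state set again, even if unchanged
        ++drawCalls;
        drawnMaterials.push_back(mesh.material);
    }

    int policiesCreated = 0;
    int sharedStateSetups = 0;
    int drawCalls = 0;
    std::vector<std::string> drawnMaterials;
};

class ToySceneProxy {
public:
    virtual ~ToySceneProxy() = default;
    // Callback where each component type runs its own logic every frame.
    virtual void DrawDynamicElements(ToyDynamicDrawer& drawer) const = 0;
};

class ToySpriteProxy : public ToySceneProxy {
public:
    explicit ToySpriteProxy(int spriteCount) : spriteCount_(spriteCount) {}
    void DrawDynamicElements(ToyDynamicDrawer& drawer) const override {
        for (int i = 0; i < spriteCount_; ++i) {
            drawer.DrawMesh(ToyMeshElement{"ToySpriteMaterial"});
        }
    }
private:
    int spriteCount_;
};

class ToyDebugBoxProxy : public ToySceneProxy {
public:
    explicit ToyDebugBoxProxy(bool enabled) : enabled_(enabled) {}
    void DrawDynamicElements(ToyDynamicDrawer& drawer) const override {
        if (enabled_) {  // proxy-specific decision made at draw time
            drawer.DrawMesh(ToyMeshElement{"ToyWireMaterial"});
        }
    }
private:
    bool enabled_;
};

// What a rendering pass does for the dynamic path: walk the visible list.
void RenderToyDynamicPath(const std::vector<const ToySceneProxy*>& visibleDynamicPrimitives,
                          ToyDynamicDrawer& drawer) {
    for (const ToySceneProxy* proxy : visibleDynamicPrimitives) {
        proxy->DrawDynamicElements(drawer);
    }
}
```

In this model the number of shared state setups always equals the number of draw calls, which is the traversal cost the static path is designed to avoid.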
The static rendering path is implemented through static draw lists. Meshes are inserted into the draw lists when they are attached to the scene. During this insertion, DrawStaticElements on the proxy is called to collect the FStaticMeshElements. A drawing policy instance is then created and stored, along with the result of CreateBoundShaderState. The new drawing policy is sorted based on its Compare and Matches functions and inserted into the appropriate place in the draw list (see TStaticMeshDrawList::AddMesh). In InitViews, a bitarray containing visibility data for the static draw list is initialized and passed into TStaticMeshDrawList::DrawVisible where the draw list is actually drawn. DrawShared is only called once for all the drawing policies that match each other, while SetMeshRenderState and DrawMesh are called for each FStaticMeshElement (see TStaticMeshDrawList::DrawElement).
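A reduced model of a static draw list helps show where the savings come from: meshes are bucketed by matching policy at insertion time, and shared state is set once per bucket at draw time. The class below is a sketch under assumptions, not the TStaticMeshDrawList implementation. ToyDrawingPolicy uses a single string to stand in for all shared state, and the visibility array is indexed by a hypothetical element id returned from AddMesh.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a drawing policy; one string represents all shared state.
struct ToyDrawingPolicy {
    std::string sharedState;
    bool Matches(const ToyDrawingPolicy& other) const {
        return sharedState == other.sharedState;
    }
};

struct ToyDrawStats {
    int drawSharedCalls;
    int drawMeshCalls;
};

class ToyStaticDrawList {
public:
    // Done at attach time: find (or create) the bucket for this policy.
    // Buckets are kept sorted by shared state. Returns the element id.
    std::size_t AddMesh(const ToyDrawingPolicy& policy) {
        const std::size_t elementId = elementCount_++;
        auto it = std::lower_bound(
            buckets_.begin(), buckets_.end(), policy,
            [](const Bucket& bucket, const ToyDrawingPolicy& p) {
                return bucket.policy.sharedState < p.sharedState;
            });
        if (it == buckets_.end() || !it->policy.Matches(policy)) {
            it = buckets_.insert(it, Bucket{policy, {}});
        }
        it->elementIds.push_back(elementId);
        return elementId;
    }

    // Done every frame: shared state once per bucket, per-mesh work per element.
    ToyDrawStats DrawVisible(const std::vector<bool>& visible) const {
        ToyDrawStats stats{0, 0};
        for (const Bucket& bucket : buckets_) {
            bool sharedStateSet = false;
            for (std::size_t id : bucket.elementIds) {
                if (id >= visible.size() || !visible[id]) {
                    continue;
                }
                if (!sharedStateSet) {
                    ++stats.drawSharedCalls;
                    sharedStateSet = true;
                }
                ++stats.drawMeshCalls;
            }
        }
        return stats;
    }

    std::size_t BucketCount() const { return buckets_.size(); }

private:
    struct Bucket {
        ToyDrawingPolicy policy;
        std::vector<std::size_t> elementIds;
    };
    std::vector<Bucket> buckets_;
    std::size_t elementCount_ = 0;
};
```

If Matches reported two policies as equal when their shared state actually differed, the second mesh in the bucket would silently be drawn with the first mesh's state, which is the class of bug that depends on attach and rendering order.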
The static rendering path moves a lot of work to attach time, which significantly speeds up scene traversal at rendering time. Static draw list rendering is about 3x faster on the rendering thread for static meshes, which allows a lot more static meshes in the scene. Because static draw lists cache data at attach time, they can only cache view independent state. Primitives that are rarely reattached but often rendered are good candidates for the static draw lists. Currently only SpeedTrees, non-movable static meshes, placed decals, and BSP use the static draw lists.
The static rendering path can expose bugs because of the way it only calls DrawShared once per state bucket. These bugs can be difficult to detect, since they depend on the rendering order and the attach order of meshes in the scene. Special viewmodes such as lighting only, unlit, etc will force all primitives to use the dynamic path, so if a bug goes away when forcing the dynamic rendering path, there's a good chance it is due to an incorrect implementation of a drawing policy's DrawShared and/or the Matches function.
Here's a description of the control flow when rendering a frame starting from FSceneRenderer::Render:
This is a fairly simplified and high level view. To get more details, look through the relevant code or a PIX capture on Xbox 360. Here's the draw event hierarchy of an example scene:
The scene color buffer stores linear space HDR colors while the scene is being composited (from the Base pass to the post process passes), which requires a high precision render target format. See GammaCorrection for more info.
The RHI is a thin layer above the platform specific graphics API. The RHI abstraction level in UE3 is as low level as possible, with the intention that most features can be written in platform independent code and 'just work' on all platforms that support the required feature set.
Render states are grouped based on what part of the pipeline they affect. For example, RHISetDepthState sets all state relevant to depth buffering.
Since there are so many rendering states, it's not practical to set them all every time we want to draw something. Instead, UE3 has an implicit set of states which are assumed to be set to the defaults (and therefore must be restored to those defaults after they are changed), and a much smaller set of states which have to be set explicitly. The set of states that don't have implicit defaults are:
All other states are assumed to be at their defaults (as defined by the relevant TStaticState; for example, the default stencil state is set by RHISetStencilState(TStaticStencilState<>::GetRHI())).
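One way to honour the "restore the defaults after changing them" convention is a scope guard that resets the state in its destructor. This is a general RAII sketch and not something the engine is claimed to provide: ToyRHI, ToyStencilState, and ScopedToyStencilState are hypothetical, and the default state chosen here (stencil disabled, reference value 0) is an assumption.

```cpp
// Hypothetical, reduced stencil state.
struct ToyStencilState {
    bool enabled;
    int referenceValue;
};

inline bool operator==(const ToyStencilState& a, const ToyStencilState& b) {
    return a.enabled == b.enabled && a.referenceValue == b.referenceValue;
}

// Stand-in for the device wrapper that tracks the currently set state.
class ToyRHI {
public:
    ToyRHI() : stencil_(DefaultStencilState()), stateChanges_(0) {}

    static ToyStencilState DefaultStencilState() { return ToyStencilState{false, 0}; }

    void SetStencilState(const ToyStencilState& state) {
        stencil_ = state;
        ++stateChanges_;
    }

    const ToyStencilState& StencilState() const { return stencil_; }
    int StateChanges() const { return stateChanges_; }

private:
    ToyStencilState stencil_;
    int stateChanges_;
};

// Sets a non-default state for the lifetime of the guard, then restores the
// implicit default so later passes can keep assuming it.
class ScopedToyStencilState {
public:
    ScopedToyStencilState(ToyRHI& rhi, const ToyStencilState& state) : rhi_(rhi) {
        rhi_.SetStencilState(state);
    }
    ~ScopedToyStencilState() {
        rhi_.SetStencilState(ToyRHI::DefaultStencilState());
    }

    ScopedToyStencilState(const ScopedToyStencilState&) = delete;
    ScopedToyStencilState& operator=(const ScopedToyStencilState&) = delete;

private:
    ToyRHI& rhi_;
};
```

The guard makes the restore happen on every exit path from the pass, including early returns, at the cost of one extra state change per scope.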