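Environment: Unreal Engine 4.25.2
This article shows how to use ISM, i.e. UInstancedStaticMeshComponent, and analyzes its editing and rendering code. The engine describes the component as: "A component that efficiently renders multiple instances of the same StaticMesh."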
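TL;DR
All instances of an ISM are always rendered together with a single draw call.
The LOD of an ISM is computed from the bounding box of all its instances, so every instance switches LOD at the same time.
ISM has no culling logic of its own.
In UE 4.25, ISM is not rendered in Shipping builds because of a bug; this was fixed in 4.26.
The per-instance transform of an ISM is applied before the local-to-world transform.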
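Usage
Create an Actor, attach a UInstancedStaticMeshComponent to it, and assign a StaticMesh to the ISM component.
In the ISM's Details panel, add instances manually and set the Transform of each one.
As a test, a Blueprint randomly generated 1000 instances of rock type A and 1000 instances of rock type B in an empty scene.
The number of mesh draw calls only went from 4 to 6, an increase of just 2.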
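Editing
The editor stores instance data in UInstancedStaticMeshComponent::PerInstanceSMData, of type TArray<FInstancedStaticMeshInstanceData>. Optional per-instance custom data is stored in UInstancedStaticMeshComponent::PerInstanceSMCustomData, of type TArray<float>. Both can be read from Blueprints and materials.
Editing instances in the ISM's Details panel triggers PostEditChangeChainProperty, which updates PerInstanceSMData and PerInstanceSMCustomData accordingly:
Add an instance: UInstancedStaticMeshComponent::AddInstanceInternal
Remove an instance: UInstancedStaticMeshComponent::RemoveInstanceInternal
Clear all instances: UInstancedStaticMeshComponent::ClearInstances
Move an instance, i.e. modify FInstancedStaticMeshInstanceData::Transform: modifies UInstancedStaticMeshComponent::NumEdits
Add, modify or remove per-instance custom data (omitted)
Almost all of these paths finish with the following code:
// Force recreation of the render data
InstanceUpdateCmdBuffer.Edit();
MarkRenderStateDirty();
This causes CreateSceneProxy to run again, which regenerates the render data and draws all instances with their current transforms. Set a breakpoint at the start of UInstancedStaticMeshComponent::CreateSceneProxy to see the call stack at that point: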
UInstancedStaticMeshComponent::CreateSceneProxy() InstancedStaticMesh.cpp:1473
FScene::AddPrimitive(UPrimitiveComponent *) RendererScene.cpp:1372
UPrimitiveComponent::CreateRenderState_Concurrent(FRegisterComponentContext *) PrimitiveComponent.cpp:646
UStaticMeshComponent::CreateRenderState_Concurrent(FRegisterComponentContext *) StaticMeshComponent.cpp:620
UActorComponent::ExecuteRegisterEvents(FRegisterComponentContext *) ActorComponent.cpp:1561
...
UInstancedStaticMeshComponent::PostEditChangeChainProperty(FPropertyChangedChainEvent &) InstancedStaticMesh.cpp:3085
When the instance data of an ISM component is modified in the Details panel of a Blueprint, UInstancedStaticMeshComponent::ApplyComponentInstanceData is called to update its PerInstanceSMData:
UInstancedStaticMeshComponent::ApplyComponentInstanceData(FInstancedStaticMeshComponentInstanceData * InstancedMeshData) Line 1413 C++
FInstancedStaticMeshComponentInstanceData::ApplyToComponent
FComponentInstanceDataCache::ApplyToActor(AActor * Actor, const ECacheApplyPhase CacheApplyPhase) Line 596 C++
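As before, this ends with CreateSceneProxy being called again to regenerate the render data and redraw.
The Details panel also exposes culling-related parameters. As shown later, they only affect the rendered result when dithered LOD transition is enabled, where they produce a fade-out effect.
/** Distance from which instances start to fade out */
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category=Culling)
int32 InstanceStartCullDistance;
/** Distance from which instances are culled completely */
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category=Culling)
int32 InstanceEndCullDistance;
Whenever instance data is needed (CreateSceneProxy, SerializeRenderData and so on), UInstancedStaticMeshComponent::BuildRenderData is called to convert PerInstanceSMData into FStaticMeshInstanceData.
UE4 currently does not support moving a single instance directly in the 3D viewport. There is related customization code (FInstancedStaticMeshSCSEditorCustomization), but it has no effect on an ISM component added directly to an actor. It appears to be meant for editing inside the Blueprint editor, though I could not work out how to use it there either. For now, an instance has to be moved by adjusting its Transform in the Details panel (or by changing the Transform of that instance from Blueprint or C++). There is, however, a plugin called Instance Tool that claims to support editing individual instances in the 3D viewport.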
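Runtime
Creating the SceneProxy
The proxy type for UInstancedStaticMeshComponent is FInstancedStaticMeshSceneProxy.
UInstancedStaticMeshComponent::CreateSceneProxy() performs the following steps:
Copy the instances' render data PerInstanceRenderData to FStaticMeshSceneProxy::InstancedRenderData
Store the StaticMesh's LOD data in FStaticMeshSceneProxy::LODModels
Set the static LightMap and ShadowMap offsets of each instance
SetupProxy(InComponent) (analyzed below)
The methods that actually write the instance data are the various SetInstance and SetInstanceXXX methods; their code is easy to follow when read in detail.
One point deserves a special mention, though: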
void UInstancedStaticMeshComponent::BuildRenderData(/*...*/)
{
    FRandomStream RandomStream = FRandomStream(InstancingRandomSeed);
    //..
    OutData.SetInstance(RenderIndex, InstanceData.Transform, RandomStream.GetFraction(), LightmapUVBias, ShadowmapUVBias);
    //..
}
void SetInstance(int32 InstanceIndex, const FMatrix& Transform, float RandomInstanceID, const FVector2D& LightmapUVBias, const FVector2D& ShadowmapUVBias)
{
    FVector4 Origin(Transform.M[3][0], Transform.M[3][1], Transform.M[3][2], RandomInstanceID);
    SetInstanceOriginInternal(InstanceIndex, Origin);
    //...
}
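RandomInstanceID above is a random number stored in the w component of the FVector4 Origin. It comes from the random sequence seeded with UInstancedStaticMeshComponent::InstancingRandomSeed, and logic built on ISM and its subclass HISM uses it to randomize instances. A future analysis of the LandscapeGrass code will show it in use.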
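The packing can be illustrated outside the engine. The following is a minimal standalone sketch, not engine code: `Float4`, `Translation`, `SeededStream` and `PackOrigins` are hypothetical names, and the generator is an ordinary LCG standing in for FRandomStream, so the actual values the engine produces will differ. It only shows the idea that the translation goes into xyz, a seeded fraction in [0, 1) goes into w, and the same seed reproduces the same per-instance values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for FVector4.
struct Float4 { float X, Y, Z, W; };

// Hypothetical stand-in for the translation row of an instance transform.
struct Translation { float X, Y, Z; };

// Hypothetical stand-in for FRandomStream: a plain LCG, NOT the engine's generator.
class SeededStream {
public:
    explicit SeededStream(uint32_t Seed) : State(Seed) {}

    // Returns a value in [0, 1), built from the top 24 bits of the state.
    float GetFraction() {
        State = State * 1664525u + 1013904223u;
        return static_cast<float>(State >> 8) / 16777216.0f;
    }

private:
    uint32_t State;
};

// Packs each translation into xyz and a seeded random fraction into w.
std::vector<Float4> PackOrigins(const std::vector<Translation>& Instances, uint32_t Seed) {
    SeededStream Stream(Seed);
    std::vector<Float4> Origins;
    Origins.reserve(Instances.size());
    for (const Translation& T : Instances) {
        Origins.push_back({T.X, T.Y, T.Z, Stream.GetFraction()});
    }
    return Origins;
}
```

Because the stream is re-created from the seed on every rebuild, rebuilding the render data does not reshuffle the per-instance random values as long as the instance order is unchanged.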
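Next, the constructor of FInstancedStaticMeshSceneProxy calls FInstancedStaticMeshSceneProxy::SetupProxy(InComponent).
First, it checks that the materials of every LOD of the StaticMesh can be used for instanced rendering (EMaterialUsage::MATUSAGE_InstancedStaticMeshes). Then it copies the LOD parameters into FInstancedStaticMeshSceneProxy::UserData_AllInstances, of type FInstancingUserData, which holds the data needed at render time: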
struct FInstancingUserData
{
    class FInstancedStaticMeshRenderData* RenderData; // Per-instance render data (mainly the InstanceBuffer); set to nullptr in SetupProxy
    class FStaticMeshRenderData* MeshRenderData; // Render data of the static mesh
    int32 StartCullDistance; // Fade-out start distance
    int32 EndCullDistance; // Cull distance
    int32 MinLOD; // Minimum LOD
    bool bRenderSelected;
    bool bRenderUnselected;
    FVector AverageInstancesScale; // Average scale of the instances
    FVector InstancingOffset; // Center of the static mesh's bounding box
};
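The proxy's UserData_AllInstances is later referenced by the MeshBatch, through which it eventually reaches the shader parameter bindings.
Culling (there is none)
The conclusion first: by default, UE does not cull individual instances. The case where dithered LOD transition is enabled is not considered here; a separate article may analyze how dithered LOD transition is implemented.
The shader binding data for ISM is gathered in FInstancedStaticMeshVertexFactoryShaderParameters::GetElementShaderBindings.
First, it computes the following shader parameters from the instance data in the MeshBatch and the View, and adds their bindings. They are all used by dithered LOD transition only, so the analysis of that code is omitted.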
LocalVertexFactory.ush
#if USE_DITHERED_LOD_TRANSITION
float4 InstancingViewZCompareZero;
float4 InstancingViewZCompareOne;
float4 InstancingViewZConstant;
float4 InstancingWorldViewOriginZero;
float4 InstancingWorldViewOriginOne;
#endif
//...
ShaderBindings.Add(InstancingViewZCompareZeroParameter, InstancingViewZCompareZero);
ShaderBindings.Add(InstancingViewZCompareOneParameter, InstancingViewZCompareOne);
ShaderBindings.Add(InstancingViewZConstantParameter, InstancingViewZConstant);
ShaderBindings.Add(InstancingOffsetParameter, InstancingOffset);
ShaderBindings.Add(InstancingWorldViewOriginZeroParameter, InstancingWorldViewOriginZero);
ShaderBindings.Add(InstancingWorldViewOriginOneParameter, InstancingWorldViewOriginOne);
Then it takes the StartCullDistance and EndCullDistance set earlier from the MeshBatch's user data and packs them into the shader parameter InstancingFadeOutParams (bound through InstancingFadeOutParamsParameter):
const float MaxDrawDistanceScale = GetCachedScalabilityCVars().ViewDistanceScale;
const float StartDistance = InstancingUserData->StartCullDistance * MaxDrawDistanceScale;
const float EndDistance = InstancingUserData->EndCullDistance * MaxDrawDistanceScale;
InstancingFadeOutParams.X = StartDistance;
if( EndDistance > 0 )
{
if( EndDistance > StartDistance )
{
InstancingFadeOutParams.Y = 1.f / (float)(EndDistance - StartDistance);
}
else
{
InstancingFadeOutParams.Y = 1.f;
}
}
else
{
InstancingFadeOutParams.Y = 0.f;
}
// Debug console variable that forces every instance to be culled in the vertex shader
if (CVarCullAllInVertexShader.GetValueOnRenderThread() > 0)
{
InstancingFadeOutParams.Z = 0.0f;
InstancingFadeOutParams.W = 0.0f;
}
else
{
InstancingFadeOutParams.Z = InstancingUserData->bRenderSelected ? 1.f : 0.f;
InstancingFadeOutParams.W = InstancingUserData->bRenderUnselected ? 1.f : 0.f;
}
InstancingFadeOutParams.xy therefore holds the fade-out and culling information. Next, here is how the vertex shader uses it to fade and cull instances.
float4 InstancingFadeOutParams;
float GetPerInstanceFadeAmount(FMaterialPixelParameters Parameters)
{
return float(Parameters.PerInstanceParams.y);
}
FVertexFactoryIntermediates GetVertexFactoryIntermediates(FVertexFactoryInput Input)
{
Intermediates.PerInstanceParams.y = 1.0 - saturate((length(InstanceLocation + ResolvedView.PreViewTranslation.xyz) - InstancingFadeOutParams.x) * InstancingFadeOutParams.y);
}
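// However, PerInstanceParams.y is not used by the entry point function Main, in either the VS or the PS
Although the vertex shader computes these values, they never reach the output of the vertex shader or the pixel shader, and frame captures confirm this. So on mobile, ISM instances are not culled. The relevant code has almost no SM5 or ES3_1 branches, so the PC implementation very likely behaves the same way.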
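To make the numbers concrete, here is a small standalone sketch of the two halves of that computation. It assumes only what the excerpts above show: the CPU side stores the scaled start distance in x and the reciprocal of the fade range in y, and the shader evaluates `1 - saturate((distance - x) * y)`. `FadeOutParams`, `MakeFadeOutParams` and `FadeAmount` are hypothetical names for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical mirror of InstancingFadeOutParams.xy.
struct FadeOutParams { float X; float Y; };

// Follows the branch structure of the CPU-side excerpt:
// X = scaled start distance, Y = 1 / (end - start), 1, or 0.
FadeOutParams MakeFadeOutParams(int StartCullDistance, int EndCullDistance, float ViewDistanceScale) {
    const float Start = StartCullDistance * ViewDistanceScale;
    const float End = EndCullDistance * ViewDistanceScale;
    FadeOutParams Params = {Start, 0.0f};
    if (End > 0.0f) {
        Params.Y = (End > Start) ? 1.0f / (End - Start) : 1.0f;
    }
    return Params;
}

// Follows the shader expression for PerInstanceParams.y:
// 1 means fully visible, 0 means fully faded out.
float FadeAmount(float DistanceToView, const FadeOutParams& Params) {
    const float T = (DistanceToView - Params.X) * Params.Y;
    return 1.0f - std::min(1.0f, std::max(0.0f, T));
}
```

With a start distance of 1000 and an end distance of 2000, an instance 1500 units away gets a fade amount of about 0.5. With an end distance of 0, y is 0 and the fade amount stays at 1 at any distance.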
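Coarse LOD calculation
The conclusion first: all instances switch the StaticMesh LOD together. The switch happens when the screen-space projection of the union of all instances' bounding boxes reaches the switch size of one of the StaticMesh's LODs. In practice, if the instances are spread far apart they stay at LOD0 all the time, because the group as a whole can hardly become small enough on screen to switch to LOD1. All instances of the same StaticMesh are rendered with a single draw call.
The LOD logic is in FInstancedStaticMeshSceneProxy::GetDynamicMeshElements.
First, a look at how ISM computes the bounds of the proxy, which the LOD calculation relies on.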
FBoxSphereBounds UInstancedStaticMeshComponent::CalcBounds(const FTransform& BoundTransform) const
{
    // Check the static mesh and the number of instances
    if(GetStaticMesh() && PerInstanceSMData.Num() > 0)
    {
        // Apply the transform passed in as the function argument
        FMatrix BoundTransformMatrix = BoundTransform.ToMatrixWithScale();
        FBoxSphereBounds RenderBounds = GetStaticMesh()->GetBounds();
        FBoxSphereBounds NewBounds = RenderBounds.TransformBy(PerInstanceSMData[0].Transform * BoundTransformMatrix);
        // Apply the transform of each instance and take the union of all instances' bounds
        for (int32 InstanceIndex = 1; InstanceIndex < PerInstanceSMData.Num(); InstanceIndex++)
        {
            NewBounds = NewBounds + RenderBounds.TransformBy(PerInstanceSMData[InstanceIndex].Transform * BoundTransformMatrix);
        }
        return NewBounds;
    }
    else // Invalid StaticMesh or no instances: return zero-sized bounds located at the given transform
    {
        return FBoxSphereBounds(BoundTransform.GetLocation(), FVector::ZeroVector, 0.f);
    }
}
So in the normal case, the bounds of an ISM are a box that encloses the static mesh of every instance, taking the position of each instance into account.
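The effect of this union is easy to reproduce in isolation. The sketch below is a simplified, hypothetical model rather than the engine code: `Vec3`, `Box` and `CalcUnionBounds` are made-up names, and instance transforms are reduced to pure translations so that the example stays short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float X, Y, Z; };
struct Box { Vec3 Min; Vec3 Max; };

// Moves a box by a translation.
Box TranslateBox(const Box& B, const Vec3& T) {
    return { {B.Min.X + T.X, B.Min.Y + T.Y, B.Min.Z + T.Z},
             {B.Max.X + T.X, B.Max.Y + T.Y, B.Max.Z + T.Z} };
}

// Smallest axis-aligned box containing both inputs.
Box UnionBox(const Box& A, const Box& B) {
    return { {std::min(A.Min.X, B.Min.X), std::min(A.Min.Y, B.Min.Y), std::min(A.Min.Z, B.Min.Z)},
             {std::max(A.Max.X, B.Max.X), std::max(A.Max.Y, B.Max.Y), std::max(A.Max.Z, B.Max.Z)} };
}

// Simplification: every instance transform is a translation only.
// Returns a zero-sized box at the component location when there are no instances.
Box CalcUnionBounds(const Box& MeshBounds, const std::vector<Vec3>& InstanceOffsets, const Vec3& ComponentLocation) {
    if (InstanceOffsets.empty()) {
        return { ComponentLocation, ComponentLocation };
    }
    Box Result = TranslateBox(TranslateBox(MeshBounds, InstanceOffsets[0]), ComponentLocation);
    for (std::size_t Index = 1; Index < InstanceOffsets.size(); ++Index) {
        const Box Placed = TranslateBox(TranslateBox(MeshBounds, InstanceOffsets[Index]), ComponentLocation);
        Result = UnionBox(Result, Placed);
    }
    return Result;
}
```

Two instances of a 2-unit cube placed 100 units apart along X produce bounds that are 102 units wide, which is the size the LOD calculation ends up working with.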
Next, here is GetDynamicMeshElements, which generates the MeshBatch (unrelated logic removed):
void FInstancedStaticMeshSceneProxy::GetDynamicMeshElements(const TArray<const FSceneView*>& Views, const FSceneViewFamily& ViewFamily, uint32 VisibilityMap, FMeshElementCollector& Collector) const
{
#if !(UE_BUILD_SHIPPING || UE_BUILD_TEST)
    const bool bSelectionRenderEnabled = GIsEditor && ViewFamily.EngineShowFlags.Selection;
    // If the first pass rendered selected instances only, we need to render the deselected instances in a second pass
    const int32 NumSelectionGroups = (bSelectionRenderEnabled && bHasSelectedInstances) ? 2 : 1;
    const FInstancingUserData* PassUserData[2] =
    {
        bHasSelectedInstances && bSelectionRenderEnabled ? &UserData_SelectedInstances : &UserData_AllInstances,
        &UserData_DeselectedInstances
    };
    // The selection outline needs its own MeshBatch
    bool BatchRenderSelection[2] =
    {
        bSelectionRenderEnabled && IsSelected(),
        false
    };
    // Per view
    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
    {
        if (VisibilityMap & (1 << ViewIndex))
        {
            // Only add mesh batches for visible views
            const FSceneView* View = Views[ViewIndex];
            for (int32 SelectionGroupIndex = 0; SelectionGroupIndex < NumSelectionGroups; SelectionGroupIndex++)
            {
                // Per selection group (the extra one is the outline shown when the ISM is selected)
                // Compute the LOD the static mesh should use from the proxy bounds and the view position
                const int32 LODIndex = GetLOD(View);
                const FStaticMeshLODResources& LODModel = StaticMesh->RenderData->LODResources[LODIndex];
                for (int32 SectionIndex = 0; SectionIndex < LODModel.Sections.Num(); SectionIndex++)
                {
                    const int32 NumBatches = GetNumMeshBatches(); // == 1, only one MeshBatch is created
                    for (int32 BatchIndex = 0; BatchIndex < NumBatches; BatchIndex++)
                    {
                        FMeshBatch& MeshElement = Collector.AllocateMesh();
                        // Fill the MeshBatch for the current LOD
                        if (GetMeshElement(LODIndex, BatchIndex, SectionIndex, GetDepthPriorityGroup(View), BatchRenderSelection[SelectionGroupIndex], true, MeshElement))
                        {
                            MeshElement.Elements[0].UserData = PassUserData[SelectionGroupIndex]; // Instance data
                            MeshElement.Elements[0].bUserDataIsColorVertexBuffer = false;
                            MeshElement.bCanApplyViewModeOverrides = true;
                            Collector.AddMesh(ViewIndex, MeshElement);
                        }
                    }
                }
            }
        }
    }
#endif
}
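The important part is the LOD calculation. It uses the LOD that corresponds to the on-screen size of the union of all instances' bounds, not that of a single instance. If the instances are widely scattered and the ISM's bounds become very large, the static mesh may well be shown at LOD0 no matter how far the camera is from the instances. You can create an ISM yourself, place a few instances either far apart or close together, and move the camera to see the effect (remember to enable the LOD coloration view mode). (The original post embeds a Zhihu video here.)
In that video, the scene contains three instances of a rock mesh some distance apart, and as the camera moves back and forth all three change LOD at the same time.
An answer from Epic staff to a related question confirms that the LODs of an ISM always change together: "Instanced Static Meshes will LOD together, as they are all on one singular draw call for the engine, which is where you are getting your performance increases. In you case it is most likely ok not to have LODs and still get good performance."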
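The GetDynamicMeshElements code above has another problem: the whole method body is wrapped in
#if !(UE_BUILD_SHIPPING || UE_BUILD_TEST)
#endif
so in an actual shipping package ISM is not rendered at all. When asked about this on UDN, Epic staff replied that it is a bug and has already been fixed in 4.26. If you need ISM on 4.25, simply remove those two lines.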
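The LOD behavior described in this section can be illustrated with a rough standalone sketch. It is deliberately simplified and should not be read as the engine's algorithm: `ApproxScreenSize` is a hypothetical stand-in that uses the ratio of bounds radius to camera distance (the engine's screen-size computation involves the projection matrix), and the LOD thresholds are made-up values. What it shows is only the consequence of feeding the union bounds, instead of a single instance's bounds, into the LOD selection.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplification: screen size as radius / distance, clamped to 1
// when the camera is inside the bounds.
float ApproxScreenSize(float BoundsRadius, float DistanceToCenter) {
    if (DistanceToCenter <= BoundsRadius) {
        return 1.0f;
    }
    return BoundsRadius / DistanceToCenter;
}

// LODScreenSizes[i] is the (made-up) screen size below which LOD i may be used;
// the coarsest LOD whose threshold is still above the current size wins.
int SelectLOD(float ScreenSize, const std::vector<float>& LODScreenSizes) {
    for (int LOD = static_cast<int>(LODScreenSizes.size()) - 1; LOD > 0; --LOD) {
        if (ScreenSize < LODScreenSizes[LOD]) {
            return LOD;
        }
    }
    return 0;
}
```

With thresholds {1.0, 0.3, 0.1}, a single rock of radius 1 seen from 100 units away would select LOD2. If the same rocks are scattered so that their union has a radius of about 101, the same camera is effectively inside the bounds, the screen size saturates at 1, and every instance stays at LOD0.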
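Render data - Vertex Buffer
The vertex data format is defined by FInstancedStaticMeshVertexFactory::FDataType, which inherits from FInstancedStaticMeshDataType and FLocalVertexFactory::FDataType. In other words, apart from the StaticMesh-related fields it holds the ISM instance data: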
struct FInstancedStaticMeshDataType
{
    // Position in xyz, random instance ID in w
    FVertexStreamComponent InstanceOriginComponent;
    // The 3x3 part of the transform (everything except the position)
    FVertexStreamComponent InstanceTransformComponent[3];
    // UV bias of the LightMap and ShadowMap
    // The engine comment on this field reads: The stream to read the Lightmap Bias and Random instance ID from.
    // That seems to be a mistake: the random instance ID is stored in the w of InstanceOriginComponent, not here
    FVertexStreamComponent InstanceLightmapAndShadowMapUVBiasComponent;
    FRHIShaderResourceView* InstanceOriginSRV = nullptr;
    FRHIShaderResourceView* InstanceTransformSRV = nullptr;
    FRHIShaderResourceView* InstanceLightmapSRV = nullptr;
    FRHIShaderResourceView* InstanceCustomDataSRV = nullptr;
    // Custom data: number of floats per instance
    int32 NumCustomDataFloats = 0;
};
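Once the RHI objects for the instance data (including custom data) and for the StaticMesh's vertex and index buffers have been created, they are bound to the shader (/Engine/Private/LocalVertexFactory.ush) through the ISM's vertex factory: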
ENQUEUE_RENDER_COMMAND(InstancedStaticMeshRenderData_InitVertexFactories)(
    [this, LightMapCoordinateIndex](FRHICommandListImmediate& RHICmdList)
{
    for (int32 LODIndex = 0; LODIndex < VertexFactories.Num(); LODIndex++)
    {
        // Bind per LOD
        const FStaticMeshLODResources* RenderData = &LODModels[LODIndex];
        FInstancedStaticMeshVertexFactory::FDataType Data;
        // Assign to the vertex factory for this LOD.
        FInstancedStaticMeshVertexFactory& VertexFactory = VertexFactories[LODIndex];
        // The StaticMesh's vertex buffers: position, tangent, uv, light-map uv
        RenderData->VertexBuffers.PositionVertexBuffer.BindPositionVertexBuffer(&VertexFactory, Data);
        RenderData->VertexBuffers.StaticMeshVertexBuffer.BindTangentVertexBuffer(&VertexFactory, Data);
        RenderData->VertexBuffers.StaticMeshVertexBuffer.BindPackedTexCoordVertexBuffer(&VertexFactory, Data);
        if (LightMapCoordinateIndex < (int32)RenderData->VertexBuffers.StaticMeshVertexBuffer.GetNumTexCoords() && LightMapCoordinateIndex >= 0)
        {
            RenderData->VertexBuffers.StaticMeshVertexBuffer.BindLightMapVertexBuffer(&VertexFactory, Data, LightMapCoordinateIndex);
        }
        RenderData->VertexBuffers.ColorVertexBuffer.BindColorVertexBuffer(&VertexFactory, Data);
        // Per-instance buffers: Origin (position), Transform, LightMap and CustomData
        check(PerInstanceRenderData);
        PerInstanceRenderData->InstanceBuffer.BindInstanceVertexBuffer(&VertexFactory, Data);
        VertexFactory.SetData(Data);
        VertexFactory.InitResource();
    }
});
The per-instance buffer binding in BindInstanceVertexBuffer() deserves a closer look:
void FStaticMeshInstanceBuffer::BindInstanceVertexBuffer(const class FVertexFactory* VertexFactory, FInstancedStaticMeshDataType& InstancedStaticMeshData) const
{
// On mobile platforms RHISupportsManualVertexFetch returns false, so instance data is not set up through manual vertex fetch
if (InstanceData->GetNumInstances() && RHISupportsManualVertexFetch(GMaxRHIShaderPlatform))
{
check(InstanceOriginSRV);
check(InstanceTransformSRV);
check(InstanceLightmapSRV);
check(InstanceCustomDataSRV); // Should not be nullptr, but can be assigned a dummy buffer
}
{
InstancedStaticMeshData.InstanceOriginSRV = InstanceOriginSRV;
InstancedStaticMeshData.InstanceTransformSRV = InstanceTransformSRV;
InstancedStaticMeshData.InstanceLightmapSRV = InstanceLightmapSRV;
InstancedStaticMeshData.InstanceCustomDataSRV = InstanceCustomDataSRV;
InstancedStaticMeshData.NumCustomDataFloats = InstanceData->GetNumCustomDataFloats();
}
// Create the FVertexStreamComponent for each kind of instance data
{
InstancedStaticMeshData.InstanceOriginComponent = FVertexStreamComponent(
&InstanceOriginBuffer,
0,
16,
VET_Float4,
EVertexStreamUsage::ManualFetch | EVertexStreamUsage::Instancing
);
EVertexElementType TransformType = InstanceData->GetTranslationUsesHalfs() ? VET_Half4 : VET_Float4;
uint32 TransformStride = InstanceData->GetTranslationUsesHalfs() ? 8 : 16;
InstancedStaticMeshData.InstanceTransformComponent[0] = FVertexStreamComponent(
&InstanceTransformBuffer,
0 * TransformStride,
3 * TransformStride,
TransformType,
EVertexStreamUsage::ManualFetch | EVertexStreamUsage::Instancing
);
InstancedStaticMeshData.InstanceTransformComponent[1] = FVertexStreamComponent(
&InstanceTransformBuffer,
1 * TransformStride,
3 * TransformStride,
TransformType,
EVertexStreamUsage::ManualFetch | EVertexStreamUsage::Instancing
);
InstancedStaticMeshData.InstanceTransformComponent[2] = FVertexStreamComponent(
&InstanceTransformBuffer,
2 * TransformStride,
3 * TransformStride,
TransformType,
EVertexStreamUsage::ManualFetch | EVertexStreamUsage::Instancing
);
InstancedStaticMeshData.InstanceLightmapAndShadowMapUVBiasComponent = FVertexStreamComponent(
&InstanceLightmapBuffer,
0,
8,
VET_Short4N,
EVertexStreamUsage::ManualFetch | EVertexStreamUsage::Instancing
);
}
}
Note that the first argument of the FVertexStreamComponent constructor specifies which vertex buffer the stream reads from. The source vertex buffers here are FStaticMeshInstanceBuffer::InstanceOriginBuffer, FStaticMeshInstanceBuffer::InstanceTransformBuffer and FStaticMeshInstanceBuffer::InstanceLightmapBuffer. They are all created in FStaticMeshInstanceBuffer::InitRHI() by calling FStaticMeshInstanceBuffer::CreateVertexBuffer(); on non-mobile platforms the corresponding Shader Resource Views are created as well.
void FStaticMeshInstanceBuffer::CreateVertexBuffer(FResourceArrayInterface* InResourceArray, uint32 InUsage, uint32 InStride, uint8 InFormat, FVertexBufferRHIRef& OutVertexBufferRHI, FShaderResourceViewRHIRef& OutInstanceSRV)
{
    // Create the vertex buffer
    FRHIResourceCreateInfo CreateInfo(InResourceArray);
    OutVertexBufferRHI = RHICreateVertexBuffer(InResourceArray->GetResourceDataSize(), InUsage, CreateInfo);
    // If manual vertex fetch is supported, create the matching SRV
    if (RHISupportsManualVertexFetch(GMaxRHIShaderPlatform))
    {
        OutInstanceSRV = RHICreateShaderResourceView(OutVertexBufferRHI, InStride, InFormat);
    }
}
void FStaticMeshInstanceBuffer::InitRHI()
{
check(InstanceData);
if (InstanceData->GetNumInstances() > 0)
{
auto AccessFlags = BUF_Static;
CreateVertexBuffer(InstanceData->GetOriginResourceArray(), AccessFlags | BUF_ShaderResource, 16, PF_A32B32G32R32F, InstanceOriginBuffer.VertexBufferRHI, InstanceOriginSRV);
CreateVertexBuffer(InstanceData->GetTransformResourceArray(), AccessFlags | BUF_ShaderResource, InstanceData->GetTranslationUsesHalfs() ? 8 : 16, InstanceData->GetTranslationUsesHalfs() ? PF_FloatRGBA : PF_A32B32G32R32F, InstanceTransformBuffer.VertexBufferRHI, InstanceTransformSRV);
CreateVertexBuffer(InstanceData->GetLightMapResourceArray(), AccessFlags | BUF_ShaderResource, 8, PF_R16G16B16A16_SNORM, InstanceLightmapBuffer.VertexBufferRHI, InstanceLightmapSRV);
if (InstanceData->GetNumCustomDataFloats() > 0)
{
CreateVertexBuffer(InstanceData->GetCustomDataResourceArray(), AccessFlags | BUF_ShaderResource, 4, PF_R32_FLOAT, InstanceCustomDataBuffer.VertexBufferRHI, InstanceCustomDataSRV);
}
else
{
InstanceCustomDataSRV = GDummyFloatBuffer.ShaderResourceViewRHI;
}
}
}
void FInstancedStaticMeshVertexFactory::InitRHI()
{
#if !ALLOW_DITHERED_LOD_FOR_INSTANCED_STATIC_MESHES // position(and normal) only shaders cannot work with dithered LOD
// If the vertex buffer containing position is not the same vertex buffer containing the rest of the data,
// then initialize PositionStream and PositionDeclaration.
if(Data.PositionComponent.VertexBuffer != Data.TangentBasisComponents[0].VertexBuffer)
{
auto AddDeclaration = [&Data](EVertexInputStreamType InputStreamType, bool bInstanced, bool bAddNormal)
{
FVertexDeclarationElementList StreamElements;
StreamElements.Add(AccessPositionStreamComponent(Data.PositionComponent, 0));
if (bAddNormal)
{
StreamElements.Add(AccessPositionStreamComponent(Data.TangentBasisComponents[2], 2));
}
if (bInstanced)
{
// toss in the instanced location stream
StreamElements.Add(AccessPositionStreamComponent(Data.InstanceOriginComponent, 8));
StreamElements.Add(AccessPositionStreamComponent(Data.InstanceTransformComponent[0], 9));
StreamElements.Add(AccessPositionStreamComponent(Data.InstanceTransformComponent[1], 10));
StreamElements.Add(AccessPositionStreamComponent(Data.InstanceTransformComponent[2], 11));
}
InitDeclaration(StreamElements, InputStreamType);
};
AddDeclaration(EVertexInputStreamType::PositionOnly, bInstanced, false);
AddDeclaration(EVertexInputStreamType::PositionAndNormalOnly, bInstanced, true);
}
#endif
// Create the FVertexDeclarationElementList and fill it with the FVertexStreamComponents created earlier in FStaticMeshInstanceBuffer::BindInstanceVertexBuffer
FVertexDeclarationElementList Elements;
// Per-vertex elements
if(Data.PositionComponent.VertexBuffer != NULL)
{
Elements.Add(AccessStreamComponent(Data.PositionComponent,0));
}
// (omitted) other per-vertex elements
//...
// Per-instance elements
check(Data.InstanceOriginComponent.VertexBuffer);
if (Data.InstanceOriginComponent.VertexBuffer)
{
Elements.Add(AccessStreamComponent(Data.InstanceOriginComponent, 8));
}
check(Data.InstanceTransformComponent[0].VertexBuffer);
if (Data.InstanceTransformComponent[0].VertexBuffer)
{
Elements.Add(AccessStreamComponent(Data.InstanceTransformComponent[0], 9));
Elements.Add(AccessStreamComponent(Data.InstanceTransformComponent[1], 10));
Elements.Add(AccessStreamComponent(Data.InstanceTransformComponent[2], 11));
}
if (Data.InstanceLightmapAndShadowMapUVBiasComponent.VertexBuffer)
{
Elements.Add(AccessStreamComponent(Data.InstanceLightmapAndShadowMapUVBiasComponent,12));
}
// we don't need per-vertex shadow or lightmap rendering
InitDeclaration(Elements);
// Get the SRVs of the per-instance data, created in FStaticMeshInstanceBuffer::CreateVertexBuffer; without manual vertex fetch support they are null and the vertex shader never accesses them
// Create the uniform buffer and store them in it; with manual vertex fetch support the shader reads instance data through these uniform buffer members
{
FInstancedStaticMeshVertexFactoryUniformShaderParameters UniformParameters;
UniformParameters.VertexFetch_InstanceOriginBuffer = GetInstanceOriginSRV();
UniformParameters.VertexFetch_InstanceTransformBuffer = GetInstanceTransformSRV();
UniformParameters.VertexFetch_InstanceLightmapBuffer = GetInstanceLightmapSRV();
UniformParameters.InstanceCustomDataBuffer = GetInstanceCustomDataSRV();
UniformParameters.NumCustomDataFloats = Data.NumCustomDataFloats;
UniformBuffer = TUniformBufferRef<FInstancedStaticMeshVertexFactoryUniformShaderParameters>::CreateUniformBufferImmediate(UniformParameters, UniformBuffer_MultiFrame, EUniformBufferValidation::None);
}
}
The extra manual vertex fetch logic in this code directly determines how the vertex shader obtains instance data. It is summarized after the vertex shader walkthrough.
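The offset and stride arguments passed to FVertexStreamComponent in BindInstanceVertexBuffer describe an interleaved layout: the three transform rows of one instance sit next to each other, so row i starts at i * TransformStride and consecutive instances are 3 * TransformStride bytes apart. The sketch below models that addressing on a plain byte buffer. `StreamComponent`, `TransformRowComponent` and `ReadRow` are hypothetical names, and only the full-precision float4 case (16-byte rows) is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Float4 { float X, Y, Z, W; };

// Hypothetical mirror of the (offset, stride) pair given to a vertex stream component.
struct StreamComponent { std::size_t Offset; std::size_t Stride; };

// Row i of the transform: offset i * RowSize, stride 3 * RowSize,
// matching the three InstanceTransformComponent entries.
StreamComponent TransformRowComponent(std::size_t Row, std::size_t RowSize) {
    return { Row * RowSize, 3 * RowSize };
}

// Reads the element for a given instance the way a per-instance stream would:
// base + offset + instance index * stride.
Float4 ReadRow(const std::vector<Float4>& Buffer, const StreamComponent& Component, std::size_t InstanceIndex) {
    const unsigned char* Bytes = reinterpret_cast<const unsigned char*>(Buffer.data());
    Float4 Out;
    std::memcpy(&Out, Bytes + Component.Offset + InstanceIndex * Component.Stride, sizeof(Float4));
    return Out;
}
```

For 16-byte rows, the third row therefore has offset 32 and stride 48, and reading it for instance 1 lands on the sixth float4 in the buffer.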
Rendering - Vertex Shader
(Code unrelated to the vertex position has been removed below.)
Vertex shader entry point:
// Entry point: the usual local-space to clip-space transform
void Main(FVertexFactoryInput Input, out FMobileShadingBasePassVSToPS Output)
{
ResolvedView = ResolveView();
FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);//A
float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);//B
float4 WorldPosition = WorldPositionExcludingWPO;
half3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);
FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);
half3 WorldPositionOffset = GetMaterialWorldPositionOffset(VertexParameters);
WorldPosition.xyz += WorldPositionOffset;
float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
Output.Position = mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip);
}
The key parts are A, GetVertexFactoryIntermediates, and B, VertexFactoryGetWorldPosition.
GetVertexFactoryIntermediates
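Reads the bound vertex data from FVertexFactoryInput: the Primitive ID (the number of the instance; has no effect by default), the first three rows of the transform matrix, the instance position, the RandomInstanceID (has no effect by default), and the LightMap and ShadowMap UV bias.
float3 TransformLocalToWorld(float3 LocalPosition, uint PrimitiveId/* unused parameter */)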
{
float4x4 LocalToWorld = Primitive_LocalToWorld;
return ((LocalToWorld[0].xyz * LocalPosition.xxx + LocalToWorld[1].xyz * LocalPosition.yyy + LocalToWorld[2].xyz * LocalPosition.zzz) + LocalToWorld[3].xyz);
}
FVertexFactoryIntermediates GetVertexFactoryIntermediates(FVertexFactoryInput Input)
{
    FVertexFactoryIntermediates Intermediates;
    // Fetch the instance data from Input
    // Primitive ID
    Intermediates.PrimitiveId = Input.InstanceTransform1.w;
    // LocalVertexFactory.ush
#if USE_INSTANCING && MANUAL_VERTEX_FETCH
    uint InstanceId = GetInstanceId(Input.InstanceId);
    // The 3 rows of the instance transform matrix, read from the vertex factory uniform buffer member VertexFetch_InstanceTransformBuffer
    Intermediates.InstanceTransform1 = InstanceVF.VertexFetch_InstanceTransformBuffer[3 * (InstanceId + InstanceOffset) + 0];
    Intermediates.InstanceTransform2 = InstanceVF.VertexFetch_InstanceTransformBuffer[3 * (InstanceId + InstanceOffset) + 1];
    Intermediates.InstanceTransform3 = InstanceVF.VertexFetch_InstanceTransformBuffer[3 * (InstanceId + InstanceOffset) + 2];
    // Instance position in xyz, random instance ID in w, read from the vertex factory uniform buffer member VertexFetch_InstanceOriginBuffer
    Intermediates.InstanceOrigin = InstanceVF.VertexFetch_InstanceOriginBuffer[(InstanceId + InstanceOffset)];
    // LightMap and ShadowMap UV bias of the instance, read from the vertex factory uniform buffer member VertexFetch_InstanceLightmapBuffer
    Intermediates.InstanceLightmapAndShadowMapUVBias = InstanceVF.VertexFetch_InstanceLightmapBuffer[(InstanceId + InstanceOffset)];
#elif USE_INSTANCING
    // The 3 rows of the instance transform matrix
    Intermediates.InstanceTransform1 = Input.InstanceTransform1;
    Intermediates.InstanceTransform2 = Input.InstanceTransform2;
    Intermediates.InstanceTransform3 = Input.InstanceTransform3;
    // Instance position in xyz, random instance ID in w
    Intermediates.InstanceOrigin = Input.InstanceOrigin;
    // LightMap and ShadowMap UV bias of the instance
    Intermediates.InstanceLightmapAndShadowMapUVBias = Input.InstanceLightmapAndShadowMapUVBias;
#endif
    Intermediates.PerInstanceParams.x = Intermediates.InstanceOrigin.w; // RandomInstanceID
    float3 InstanceLocation = TransformLocalToWorld(Intermediates.InstanceOrigin.xyz, Intermediates.PrimitiveId/* this parameter is not used */).xyz;
    // This line is for dithered LOD transition; omitted from the analysis
    Intermediates.PerInstanceParams.y = 1.0 - saturate((length(InstanceLocation + ResolvedView.PreViewTranslation.xyz) - InstancingFadeOutParams.x) * InstancingFadeOutParams.y);
    Intermediates.PerInstanceParams.w = 0;
    return Intermediates;
}
VertexFactoryGetWorldPosition
Transforms the vertex position from local space to world space, taking the per-instance transform into account.
float4 TransformLocalToTranslatedWorld(float3 LocalPosition, uint PrimitiveId)
{
    // Primitive_LocalToWorld is a field of FPrimitiveUniformShaderParameters.
    // Its value is FPrimitiveSceneProxy::LocalToWorld, i.e. the local-to-world transform of the SceneProxy.
    float4x4 LocalToWorld = Primitive_LocalToWorld;
    float3 RotatedPosition =
        LocalToWorld[0].xyz * LocalPosition.xxx +
        LocalToWorld[1].xyz * LocalPosition.yyy +
        LocalToWorld[2].xyz * LocalPosition.zzz;
    return float4(RotatedPosition + (LocalToWorld[3].xyz + ResolvedView.PreViewTranslation.xyz), 1);
}
// Reassemble the instance's Origin and Transform rows into a 4x4 matrix
float4x4 GetInstanceTransform(FVertexFactoryIntermediates Intermediates)
{
    return float4x4(
        float4(Intermediates.InstanceTransform1.xyz, 0.0f),
        float4(Intermediates.InstanceTransform2.xyz, 0.0f),
        float4(Intermediates.InstanceTransform3.xyz, 0.0f),
        float4(Intermediates.InstanceOrigin.xyz, 1.0f));
}
float4 VertexFactoryGetWorldPosition(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
    // Get the transform of the instance the current vertex belongs to
    float4x4 InstanceTransform = GetInstanceTransform(Intermediates);
    // Apply that instance transform to the vertex position (※※)
    float4 TranslatedWorld =
        TransformLocalToTranslatedWorld(mul(Input.Position, InstanceTransform).xyz,
            Intermediates.PrimitiveId/* unused parameter */);
    return TranslatedWorld * Intermediates.PerInstanceParams.z;
}
As the line marked (※※) shows, the ISM vertex shader applies the per-instance transform stored in the vertex buffer before the local-to-world transform.
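The order matters once the component itself is scaled or rotated. The following standalone sketch uses the same row-vector convention as HLSL's `mul(v, M)` to compose the two steps. It is an illustration under simplifying assumptions: `Vec4`, `Mat4`, `MulRowVector`, `MakeScaleTranslate` and `InstanceVertexToWorld` are hypothetical names, transforms are limited to uniform scale plus translation, and the PreViewTranslation term of the real shader is left out.

```cpp
#include <cassert>

struct Vec4 { float V[4]; };
struct Mat4 { float M[4][4]; };

// Row vector times matrix, the same convention as HLSL mul(v, M).
Vec4 MulRowVector(const Vec4& In, const Mat4& Mat) {
    Vec4 Out = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int Col = 0; Col < 4; ++Col) {
        for (int Row = 0; Row < 4; ++Row) {
            Out.V[Col] += In.V[Row] * Mat.M[Row][Col];
        }
    }
    return Out;
}

// Uniform scale followed by a translation; the translation lives in the last row.
Mat4 MakeScaleTranslate(float S, float Tx, float Ty, float Tz) {
    Mat4 Result = {{
        {S, 0.0f, 0.0f, 0.0f},
        {0.0f, S, 0.0f, 0.0f},
        {0.0f, 0.0f, S, 0.0f},
        {Tx, Ty, Tz, 1.0f}
    }};
    return Result;
}

// Instance transform first, component local-to-world second.
Vec4 InstanceVertexToWorld(const Vec4& LocalPosition, const Mat4& InstanceTransform, const Mat4& LocalToWorld) {
    return MulRowVector(MulRowVector(LocalPosition, InstanceTransform), LocalToWorld);
}
```

For a vertex at (1, 0, 0), an instance transform of scale 2 with offset (10, 0, 0), and a component transform of scale 3 with offset (0, 100, 0), the result is (36, 100, 0). The instance offset is scaled by the component, which is what one expects if instance transforms are expressed in the component's local space.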
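About Manual Vertex Fetch
The vertex factory binding code and the shader code above show that instance buffer data can be accessed in two ways, with or without manual vertex fetch:
With manual vertex fetch: the vertex buffers for the instance buffer are created together with matching SRVs; the shader uses InstanceId to index the SRV members of the vertex factory's uniform buffer (such as VertexFetch_InstanceTransformBuffer) and fetch the instance data.
Without manual vertex fetch: the vertex buffers for the instance buffer are created without SRVs; the shader reads the instance data directly from the members of the vertex factory's vertex input.
Mobile platforms do not support manual vertex fetch: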
/** Whether Manual Vertex Fetch is supported for the specified shader platform.
    Shader Platform must not use the mobile renderer, and for Metal, the shader language must be at least 2. */
inline bool RHISupportsManualVertexFetch(const FStaticShaderPlatform InShaderPlatform)
{
return (!IsOpenGLPlatform(InShaderPlatform) || IsSwitchPlatform(InShaderPlatform)) && !IsMobilePlatform(InShaderPlatform);
}
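The manual vertex fetch path boils down to plain array indexing. Here is a minimal CPU-side sketch of it, written as an illustration rather than as engine code: `InstanceData`, `FetchInstance` and `SupportsManualVertexFetch` are hypothetical names, the index formula `3 * (InstanceId + InstanceOffset) + row` is taken from the shader excerpt above, and the platform checks are reduced to three booleans.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Float4 { float X, Y, Z, W; };

// The per-instance values the vertex shader needs to place a vertex.
struct InstanceData { Float4 Transform1, Transform2, Transform3, Origin; };

// Manual-fetch style access: three consecutive float4 rows per instance in the
// transform buffer, one float4 per instance in the origin buffer.
InstanceData FetchInstance(const std::vector<Float4>& TransformBuffer,
                           const std::vector<Float4>& OriginBuffer,
                           uint32_t InstanceId, uint32_t InstanceOffset) {
    const uint32_t Index = InstanceId + InstanceOffset;
    assert(3u * Index + 2u < TransformBuffer.size());
    assert(Index < OriginBuffer.size());
    return { TransformBuffer[3u * Index + 0u],
             TransformBuffer[3u * Index + 1u],
             TransformBuffer[3u * Index + 2u],
             OriginBuffer[Index] };
}

// Boolean restatement of the predicate quoted above; the three inputs stand in
// for IsOpenGLPlatform, IsSwitchPlatform and IsMobilePlatform.
bool SupportsManualVertexFetch(bool bOpenGL, bool bSwitch, bool bMobile) {
    return (!bOpenGL || bSwitch) && !bMobile;
}
```

When the predicate is false, as on mobile, none of this indexing happens in the shader and the same values arrive through the per-instance vertex streams instead.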
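Problems
Normal LOD behavior is not supported: all instances use the same mesh LOD and switch together.
Instances are not culled in any way.
In 4.25, ISM is not rendered in shipping builds of an application; this bug was fixed in 4.26.
All of these problems are solved by the HISM (Hierarchical Instanced Static Mesh) component. The next article covers how to use HISM and analyzes its code in detail.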