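A GPU Animation System for Creating Large Numbers of Characters

UWA's "博物纳新" column recommends novel, easy-to-use, and interesting open-source projects to developers. It aims to help you discover popular projects, cutting-edge techniques, and striking visual effects from around the world alongside your own development work, and to explore whether they can be applied to your own projects. Often we don't know what we want until the day we come across it.

For more, visit: lab.uwa4d.com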

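Introduction

In Unity, the number of animated characters you can create is often limited by CPU-side costs such as draw calls, IK, and CPU skinning. The project introduced here performs animation rendering on the GPU instead, which lightens the CPU load and makes it possible to create ten thousand or more animated characters.

Open-source repository: https://lab.uwa4d.com/lab/5d0167a272745c25a80ac832

Preparing the Data Structures

1. The LodData struct stores meshes at different levels of detail.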

public struct LodData
{
        public Mesh Lod1Mesh;
        public Mesh Lod2Mesh;
        public Mesh Lod3Mesh;

        public float Lod1Distance;
        public float Lod2Distance;
        public float Lod3Distance;
}
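
The article does not show how the three distances are consumed. As a minimal sketch, assuming each distance is the upper bound of the range in which that LOD is displayed, selection could look like the following. The `LodSelector` class, the `SelectLod` name, and the "return 0 when out of range" convention are all hypothetical and are not taken from the project.

```csharp
using System;

// Hypothetical helper, not part of the project: chooses a LOD slot from a camera distance.
public static class LodSelector
{
    // Returns 1, 2 or 3 for Lod1Mesh/Lod2Mesh/Lod3Mesh.
    // Returning 0 beyond the last threshold is an assumption made for this sketch.
    public static int SelectLod(float distance, float lod1Distance, float lod2Distance, float lod3Distance)
    {
        if (distance < lod1Distance) return 1;
        if (distance < lod2Distance) return 2;
        if (distance < lod3Distance) return 3;
        return 0;
    }
}
```

For example, with thresholds of 10, 30 and 80, a character 20 units away would use the second LOD.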

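2. The AnimationTextures struct stores animation clips that have been converted to Texture2D data. Each clip is sampled three times per vertex, once from each texture.

public struct AnimationTextures : IEquatable<AnimationTextures>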
{
        public Texture2D Animation0;
        public Texture2D Animation1;
        public Texture2D Animation2;
}

3. The AnimationClipData class stores the original animation clip together with the start and end pixels that the clip occupies in the Texture2D.

public class AnimationClipData
{
        public AnimationClip Clip;
        public int PixelStart;
        public int PixelEnd;
}

4. The BakedData class stores the animation clip data after conversion to Texture2D, along with the mesh, LODs, frame rate, and so on.

public class BakedData
{
        public AnimationTextures AnimationTextures;
        public Mesh NewMesh;
        public LodData lods;
        public float Framerate;
        ...
}

5. The BakedAnimationClip struct stores where an animation clip's data is located within the texture.

public struct BakedAnimationClip
{
        internal float TextureOffset;
        internal float TextureRange;
        internal float OnePixelOffset;
        internal float TextureWidth;
        internal float OneOverTextureWidth;
        internal float OneOverPixelOffset;
        public float AnimationLength;
        public bool  Looping;
        ...
}

6. The GPUAnimationState struct stores the animation clip's current time and index.

public struct GPUAnimationState : IComponentData
{
        public float Time;
        public int   AnimationClipIndex;
        ...
}
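
To make the relationship between a clip's `Time`, `AnimationLength`, `Looping`, `TextureOffset` and `TextureRange` concrete, here is a rough sketch of how a playback time might be turned into a horizontal texture coordinate. The formula is an assumption for illustration only. The article does not show the project's actual conversion, and `ClipCoordinate.ComputeU` is a hypothetical name.

```csharp
using System;

// Hypothetical helper: maps playback time to a texture coordinate inside a clip's range.
public static class ClipCoordinate
{
    public static float ComputeU(float time, float animationLength, bool looping,
                                 float textureOffset, float textureRange)
    {
        // Progress through the clip, where 1.0 means one full play-through.
        float normalized = time / animationLength;

        if (looping)
            normalized -= (float)Math.Floor(normalized);           // wrap into [0, 1)
        else
            normalized = Math.Min(Math.Max(normalized, 0f), 1f);   // hold on the first/last frame

        // Assumed layout: the clip occupies [textureOffset, textureOffset + textureRange].
        return textureOffset + normalized * textureRange;
    }
}
```

With a 2-second looping clip that starts at offset 0.25 and spans a range of 0.5, a time of 2.5 seconds wraps to a quarter of the way through the clip and yields 0.375.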

7. The RenderCharacter struct: once the Material, Animation Texture, and Mesh are ready, the character can be drawn.

struct RenderCharacter : ISharedComponentData, IEquatable<RenderCharacter>
{
        public Material                Material;
        public AnimationTextures        AnimationTexture;
        public Mesh                Mesh;
        ...
}

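Preparing the Functions and Methods

1. CreateMesh()

Creates a new Mesh from an existing SkinnedMeshRenderer and a Mesh. If the second parameter, mesh, is null, the resulting newMesh is a copy of the original renderer's sharedMesh.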

private static Mesh CreateMesh(SkinnedMeshRenderer originalRenderer, Mesh mesh = null)

boneIds is generated from boneIndex0 and boneIndex1 of boneWeights, and boneInfluences from weight0 and weight1. The two arrays are stored as newMesh's UV2 and UV3.

boneIds[i] = new Vector2((boneIndex0 + 0.5f) / bones.Length, (boneIndex1 + 0.5f) / bones.Length);
float mostInfluentialBonesWeight = boneWeights[i].weight0 + boneWeights[i].weight1;
boneInfluences[i] = new Vector2(boneWeights[i].weight0 / mostInfluentialBonesWeight, boneWeights[i].weight1 / mostInfluentialBonesWeight);
...
newMesh.uv2 = boneIds;
newMesh.uv3 = boneInfluences;
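
A small worked example may help here. The sketch below reproduces the two formulas above in isolation: the bone index is turned into a coordinate in [0, 1] (the added 0.5 places the sample at the center of that bone's texel, given that the animation texture's height equals the bone count), and the two strongest weights are rescaled so they sum to 1. The `BoneEncoding` class and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper that isolates the arithmetic used when building UV2/UV3.
public static class BoneEncoding
{
    // Same formula as in CreateMesh: (boneIndex + 0.5) / bones.Length.
    public static float EncodeBoneId(int boneIndex, int boneCount)
    {
        return (boneIndex + 0.5f) / boneCount;
    }

    // Rescales the two most influential weights so that they sum to 1.
    public static (float w0, float w1) NormalizeTopTwo(float weight0, float weight1)
    {
        float sum = weight0 + weight1;
        return (weight0 / sum, weight1 / sum);
    }
}
```

With 4 bones, bone 0 maps to 0.125 and bone 3 maps to 0.875. Weights of 0.6 and 0.2 (the remaining 0.2 belonging to bones that are dropped) become roughly 0.75 and 0.25.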

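If the second parameter, mesh, is not null, the method finds where that mesh's bind poses appear among the sharedMesh's bindPoses and remaps boneIndex0 and boneIndex1, so that the given mesh's bone indices refer to the original renderer's bones.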

...
boneRemapping[i] = Array.FindIndex(originalBindPoseMatrices, x => x == newBindPoseMatrices[i]);
boneIndex0 = boneRemapping[boneIndex0];
boneIndex1 = boneRemapping[boneIndex1];
...
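
The remapping step can be illustrated without Unity types. In the sketch below, strings stand in for bind pose matrices, and `BoneRemap.BuildRemap` is a hypothetical name. For each entry of the LOD mesh's bind poses it looks up the matching entry in the original array using `Array.FindIndex`, just as the snippet above does.

```csharp
using System;

// Hypothetical stand-alone version of the remapping lookup.
public static class BoneRemap
{
    // remap[i] is the index in originalPoses of newPoses[i], or -1 if there is no match.
    public static int[] BuildRemap<T>(T[] originalPoses, T[] newPoses) where T : IEquatable<T>
    {
        var remap = new int[newPoses.Length];
        for (int i = 0; i < newPoses.Length; i++)
        {
            T target = newPoses[i];
            remap[i] = Array.FindIndex(originalPoses, x => x.Equals(target));
        }
        return remap;
    }
}
```

If the original skeleton is ordered hips, spine, head, hand and the LOD mesh only lists head and hips, the remap table is { 2, 0 }, so a vertex that referenced bone 0 in the LOD mesh ends up referencing bone 2 of the original skeleton. A result of -1 signals that no matching bind pose was found and would need handling.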

2. SampleAnimationClip()

SampleAnimationClip takes the animated object, a single animation clip, the SkinnedMeshRenderer, and the frame rate as input, and outputs the boneMatrices produced by sampling the clip.

private static Matrix4x4[,] SampleAnimationClip(GameObject root, AnimationClip clip, SkinnedMeshRenderer renderer, float framerate)
...
// Convert the current frame index to a normalized time and sample the clip at that point
  float t = (float)(i - 1) / (boneMatrices.GetLength(0) - 3);
  clip.SampleAnimation(root, t * clip.length);
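
The expression for t is easier to follow with numbers. The sketch below evaluates the same formula for every frame index of a hypothetical 7-frame matrix array. `SampleTimes.NormalizedTimes` is an illustrative name, and the guard on small frame counts is an addition for this example.

```csharp
using System;

// Hypothetical helper that evaluates t = (i - 1) / (frameCount - 3) for each frame index.
public static class SampleTimes
{
    public static float[] NormalizedTimes(int frameCount)
    {
        if (frameCount <= 3)
            throw new ArgumentException("frameCount must be greater than 3", nameof(frameCount));

        var times = new float[frameCount];
        for (int i = 0; i < frameCount; i++)
            times[i] = (float)(i - 1) / (frameCount - 3);
        return times;
    }
}
```

For 7 frames this gives -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25. Index 1 lands exactly on the start of the clip and index frameCount - 2 exactly on its end, while the first and last indices fall outside [0, 1]. That matches the PixelStart and PixelEnd offsets used later and suggests the outer two frames act as padding. How the project fills them is not shown in this article.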

3. BakeClips()

BakeClips takes the animation root object, the array of animation clips, the frame rate, and the LOD data as input, and outputs a BakedData.

public static BakedData BakeClips(GameObject animationRoot, AnimationClip[] animationClips, float framerate, LodData lods)

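// First, get the SkinnedMeshRenderer from the children of the animation root object
        var skinRenderer = instance.GetComponentInChildren<SkinnedMeshRenderer>();

// Use this skinRenderer as the argument to CreateMesh to generate BakedData's NewMesh
        bakedData.NewMesh = CreateMesh(skinRenderer);

// The meshes in BakedData's LodData are also generated with CreateMesh, except that the second argument is the corresponding mesh from the input lods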
        var lod1Mesh = CreateMesh(skinRenderer, lods.Lod1Mesh);
       ...
        bakedData.lods = new LodData(lod1Mesh, lod2Mesh, lod3Mesh, lods.Lod1Distance, lods.Lod2Distance, lods.Lod3Distance);
        
// BakedData's Framerate is taken directly from the input framerate
        bakedData.Framerate = framerate;

// Sample each animation clip with SampleAnimationClip to get sampledMatrix, then add it to the list
        var sampledMatrix = SampleAnimationClip(instance, animationClips[i], skinRenderer, bakedData.Framerate);
        sampledBoneMatrices.Add(sampledMatrix);

// Use the dimensions of the sampled matrices to count keyframes and bones
        numberOfKeyFrames += sampledMatrix.GetLength(0);
        int numberOfBones = sampledBoneMatrices[0].GetLength(1);

// Create the texture, sized by the number of keyframes and the number of bones
        var tex0 = bakedData.AnimationTextures.Animation0 = new Texture2D(numberOfKeyFrames, numberOfBones, TextureFormat.RGBAFloat, false);

// Store all of the sampledBoneMatrices data in the texture's colors
        texture0Color[index] = sampledBoneMatrices[i][keyframeIndex, boneIndex].GetRow(0);

// Create the Dictionary field
        bakedData.AnimationsDictionary = new Dictionary();

// Compute the start and end positions needed by AnimationClipData
        PixelStart = runningTotalNumberOfKeyframes + 1,
        PixelEnd = runningTotalNumberOfKeyframes + sampledBoneMatrices[i].GetLength(0) - 1

This completes the generation of the BakedData.
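
To get a feel for the numbers involved in baking, the sketch below computes two things: the raw size of the baked textures, and the PixelStart/PixelEnd pair of each clip. It rests on assumptions that are not confirmed by the article: that all three animation textures share the dimensions shown for Animation0, and that runningTotalNumberOfKeyframes advances by each clip's full frame count. `BakeLayout` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for reasoning about the baked texture layout.
public static class BakeLayout
{
    // TextureFormat.RGBAFloat stores 4 channels of 32-bit floats per pixel.
    public const int BytesPerPixel = 16;

    // Raw pixel data size, assuming textureCount textures of identical size and no mipmaps.
    public static long TextureBytes(int numberOfKeyFrames, int numberOfBones, int textureCount = 3)
    {
        return (long)numberOfKeyFrames * numberOfBones * BytesPerPixel * textureCount;
    }

    // Applies the PixelStart/PixelEnd formulas from BakeClips to a list of per-clip frame counts.
    public static List<(int PixelStart, int PixelEnd)> ClipRanges(int[] framesPerClip)
    {
        var ranges = new List<(int PixelStart, int PixelEnd)>();
        int runningTotal = 0;
        foreach (int frames in framesPerClip)
        {
            ranges.Add((runningTotal + 1, runningTotal + frames - 1));
            runningTotal += frames; // assumption: the running total grows by the clip's frame count
        }
        return ranges;
    }
}
```

For two clips with 10 and 20 sampled frames, the ranges come out as (1, 9) and (11, 29). With 30 keyframes in total and 50 bones, the three textures hold 30 x 50 x 16 x 3 = 72,000 bytes of pixel data.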

4. AddCharacterComponents()

// AddCharacterComponents is the key step that converts a character into one that can be rendered on the GPU
public static void AddCharacterComponents(EntityManager manager, Entity entity, GameObject characterRig, AnimationClip[] clips, float framerate)

// Use manager to add the animation state, texture coordinate, and render character to the entity in turn
var animState = default(GPUAnimationState);
animState.AnimationClipSet = CreateClipSet(bakedData);
manager.AddComponentData(entity, animState);
manager.AddComponentData(entity, default(AnimationTextureCoordinate));
manager.AddSharedComponentData(entity, renderCharacter);

5. InstancedSkinningDrawer()

public unsafe InstancedSkinningDrawer(Material srcMaterial, Mesh meshToDraw, AnimationTextures animTexture)

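// Each instance needs only 76 bytes across the two ComputeBuffers (64 for the matrix, 12 for the texture coordinates), which is the main reason CPU usage is low. The data passed is each instance's object-to-world transform matrix and its coordinates in the animation texture.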
objectToWorldBuffer = new ComputeBuffer(PreallocatedBufferSize, 16 * sizeof(float));
textureCoordinatesBuffer = new ComputeBuffer(PreallocatedBufferSize, 3 * sizeof(float));
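
The 76-byte figure follows directly from the two strides above. The sketch below spells out the arithmetic and scales it to a given instance count. `InstanceBufferSize` is a hypothetical name, and the instance count in the note is only an example, since the value of PreallocatedBufferSize is not given in the article.

```csharp
using System;

// Hypothetical helper that adds up the per-instance data uploaded each frame.
public static class InstanceBufferSize
{
    public const int MatrixStride = 16 * sizeof(float);    // 4x4 object-to-world matrix = 64 bytes
    public const int TexCoordStride = 3 * sizeof(float);   // animation texture coordinates = 12 bytes
    public const int BytesPerInstance = MatrixStride + TexCoordStride; // 76 bytes

    public static long TotalBytes(int instanceCount)
    {
        return (long)instanceCount * BytesPerInstance;
    }
}
```

For 10,000 characters this comes to 760,000 bytes per upload, which is small compared with sending a full set of skinned bone matrices for every character.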

Call DrawMeshInstancedIndirect to draw the specified number of characters in the scene.

Graphics.DrawMeshInstancedIndirect(mesh, 0, material, new Bounds(Vector3.zero, 1000000 * Vector3.one), argsBuffer, 0, new MaterialPropertyBlock(), shadowCastingMode, receiveShadows);
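
The argsBuffer passed to DrawMeshInstancedIndirect is not shown in this article. According to Unity's documentation, it must contain five unsigned integers: index count per instance, instance count, start index location, base vertex location, and start instance location. The sketch below only builds that array on the CPU side. `IndirectArgs.Build` is a hypothetical name, and how the project actually fills its buffer is not confirmed here.

```csharp
using System;

// Hypothetical helper that lays out the five arguments expected by DrawMeshInstancedIndirect.
public static class IndirectArgs
{
    public static uint[] Build(uint indexCountPerInstance, uint instanceCount,
                               uint startIndex = 0, uint baseVertex = 0, uint startInstance = 0)
    {
        return new[] { indexCountPerInstance, instanceCount, startIndex, baseVertex, startInstance };
    }
}
```

In a Unity project the resulting array would typically be uploaded with ComputeBuffer.SetData into a buffer created with ComputeBufferType.IndirectArguments. For instance, a mesh of 180 triangles has 540 indices, so drawing 100 copies would use Build(540, 100).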

Drawing

1. Create the list of characters to draw

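private List<RenderCharacter> _Characters = new List<RenderCharacter>();
private Dictionary<RenderCharacter, InstancedSkinningDrawer> _Drawers = new Dictionary<RenderCharacter, InstancedSkinningDrawer>();
private EntityQuery m_Characters;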

2. Instantiate a Drawer for each character to be drawn

drawer = new InstancedSkinningDrawer(character.Material, character.Mesh, character.AnimationTexture);
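
Since a drawer owns GPU buffers, it makes sense to create one per distinct character type and reuse it afterwards, which is presumably what the dictionary of drawers is for. The generic get-or-create pattern below is a sketch of that idea using plain .NET types. `DrawerCache.GetOrCreate` is a hypothetical helper, not the project's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper showing a get-or-create cache keyed by character type.
public static class DrawerCache
{
    public static TDrawer GetOrCreate<TKey, TDrawer>(
        Dictionary<TKey, TDrawer> cache, TKey key, Func<TKey, TDrawer> factory)
    {
        // Only build a new drawer the first time this key is seen.
        if (!cache.TryGetValue(key, out TDrawer drawer))
        {
            drawer = factory(key);
            cache.Add(key, drawer);
        }
        return drawer;
    }
}
```

Using a struct as the dictionary key is also where an IEquatable implementation pays off, because lookups can compare keys without boxing.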

3. Pass the coordinates and LocalToWorld matrices

var coords = m_Characters.ToComponentDataArray<AnimationTextureCoordinate>(Allocator.TempJob, out jobA);
var localToWorld = m_Characters.ToComponentDataArray<LocalToWorld>(Allocator.TempJob, out jobB);

4. Call the Draw() method

This is what ultimately invokes DrawMeshInstancedIndirect().

drawer.Draw(coords.Reinterpret_Temp(), localToWorld.Reinterpret_Temp(), character.CastShadows, character.ReceiveShadows);

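Results

(400 characters)

(10,000 characters)

Performance Analysis

Because GPU performance on Android is limited, we reduced the number of characters generated and used a lower-detail LOD model: 100 characters, an LOD of about 180 faces, and the same animation clips.

The test devices were the Redmi 4X, Redmi Note 2, and Xiaomi Mi 8.

In addition, because the project uses Unity's Jobs system, a large amount of computation is moved to worker threads, which greatly reduces the time spent on the CPU main thread.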

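Use the UWA Lab collections to bookmark the projects you like!

That's all for today's recommendation. You may be able to use it as is, it may need some polish from you, or it may simply spark an idea...

Please be generous with your likes and shares, so we know we are doing the right thing. And if you can leave a comment with your feedback, we will keep getting better.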
