4. Fill-rate, Canvases and input - 填充率,切分Canvas和输入
This chapter discusses broader issues with structuring Unity UIs.
Remediating fill-rate issues - 解决填充率问题
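There are two courses of action that can be taken to reduce the stress on the GPU’s fragment pipeline: reducing the complexity of the fragment shaders (see the UI shaders and low-spec devices section), and reducing the number of pixels that must be sampled.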
解决GPU的片元渲染管线有两个步骤
As the UI shader is generally standardized, the most common problem is simply excessive fill-rate usage. This is most commonly due to a large number of overlapping UI elements and/or having multiple UI elements that occupy significant portions of the screen. Both of these problems can lead to extremely high levels of overdraw.
由于UI着色器通常是标准化的,所以最常见的问题就是过多的填充率使用。这通常是由于大量重叠的UI元素和/或多个UI元素占据了屏幕的重要部分。这两个问题都可能导致极高的过度绘制。
In order to alleviate fill-rate overutilization and reduce overdraw, consider the following possible remediations.
为了缓解填充率过度使用和减少过度绘制,考虑以下可能的措施。
Eliminating invisible UI - 去掉看不见的UI
The method that requires the least redesigning of existing UI elements is to simply disable elements that are not visible to the player. The most common case where this is applicable is opening full-screen UIs with opaque backgrounds. In this case, any UI elements placed beneath the full-screen UI can be disabled.
在不重新设计UI元素的基础上最简单的方法是禁用对玩家不可见的元素。最常见的情况是当打开一个不透明的全屏ui背景时。可以禁用放置在全屏UI之下的所有UI元素。
The simplest way to do this is to disable the root GameObject or GameObjects containing the invisible UI elements. For an alternate solution, see the Disabling Canvases section.
最简单的方法是禁用根GameObject或包含不可见UI元素的GameObjects。有关替代解决方案,请参阅 Disabling Canvases一节。
Finally, make sure that no UI elements are hidden by setting their alpha to 0, as the element will still be sent to the GPU and may take precious rendering time. If a UI element doesn’t need a Graphic component, you can simply remove it and raycasting will still work.
即使将元素的alpha值设置为0变得不可见,元素仍然会被发送到GPU,并且可能会花费宝贵的渲染时间。如果一个UI元素不需要Graphic组件,直接移除它就好,射线仍然可以工作。
Simplify UI structure - 简化UI结构
To reduce the time required to rebuild and render the UI, it is important to keep the number of UI objects as low as possible. Try to bake things as much as you can. For example, don’t use a blended GameObject just to change the hue of an element; do this via material properties instead. Also, don’t create GameObjects that act like folders and serve no purpose other than organizing your Scenes.
为了减少重绘(rebuild)和渲染UI所需的时间,一定要尽可能地减少UI对象的数量。尽量多合并成一个元素。例如,不要使用多个GameObject混合来实现更改色调的效果,而是通过材质球属性来实现。此外,不要创建像文件夹一样的游戏对象结构,除了组织你的场景没有任何用处。
Disabling invisible camera output - 关掉看不见东西的相机
If a full-screen UI with an opaque background is opened, the world-space camera will still render the standard 3D scene behind the UI. The renderer is not aware that the full-screen Unity UI will obscure the entire 3D scene.
如果打开一个背景不透明的全屏UI, world-space相机仍然会渲染UI背后的3D场景。这种渲染完全是浪费的。
Therefore, if a completely full-screen UI is opened, disabling any and all of the obscured world-space cameras will help reduce GPU stress by simply eliminating the useless work of rendering the 3D world.
因此,如果一个完全全屏的UI被打开,那么禁用被覆盖的世界空间相机将有助于减少GPU的压力,因为这样可以简化渲染3D世界的无用工作。
If the UI doesn’t cover the whole 3D scene, you may want to render the scene to a texture once and use that texture instead of rendering the scene continuously. You will lose the ability to see animated content in the 3D scene, but that should be acceptable most of the time.
Note: If a Canvas is set as “Screen Space – Overlay”, then it will be drawn irrespective of the number of cameras active in the scene.
如果UI没有覆盖整个3D场景,你可以把场景渲染到一个贴图上,而不是连续渲染。虽然这样场景就变成静止不动的了,但是一般是可以被接受的。
注意:如果一个Canvas被设置为“Screen Space – Overlay”,那么不管场景里有几个相机,Canvas都会被绘制。
Majority-obscured cameras - 能看见一点的相机
Many “full-screen” UIs do not actually obscure the entire 3D world, but leave a small portion of the world visible. In these cases, it may be more optimal to capture just the portions of the world that are visible into a render texture. If the visible portion of the world is “cached” in a render texture, then the actual world-space camera can be disabled, and the cached render texture can be drawn behind the UI screen to provide an impostor version of the 3D world.
许多“全屏”ui实际上并没有覆盖整个3D世界,而是让世界的一小部分可见。在这些情况下,更理想的方法就是将这部分捕捉到一张render texture上。如果世界的可见部分被“缓存”在render texture中,那么实际的相机可以被禁用,缓存的render texture可以被绘制到UI屏幕后面,以提供3D世界的视点。
Composition-based UIs - 基于组合的UI
It is very common for designers to create UIs via composition – combining and layering standard backgrounds and elements to create the final UI. While this is relatively simple to do, and very friendly to iteration, it is non-performant due to Unity UI’s use of the transparent rendering queue.
对于UI设计师来说,使用各种背景、元素组成一个最终UI是很常见的一件事。这样做相对简单,而且对迭代非常友好。但是由于UnityUI使用了透明渲染队列,所以它的性能很糟。
Consider a simple UI with a background, a button and some text on the button. Because objects in the transparent queue are sorted from back to front, in the case that a pixel falls within a text glyph, the GPU must sample the background’s texture, then the button’s texture, and finally the text atlas’ texture, for a total of three samples. As the complexity of the UI grows, and more decorative elements are layered onto the background, the number of samples can rise rapidly.
想想看一个具有背景、按钮和按钮上的一些文本的简单UI。由于透明队列中的对象是从后往前排序的,在一个像素落在一个文本字形内的情况下,GPU必须采样背景的纹理,然后是按钮的纹理,最后是文本图集的纹理,总共三个样本。随着UI复杂性的增加,更多的装饰元素被添加到背景中,样本的数量会迅速增加。
If a large UI is discovered to be fill-rate bound, the best recourse is to create specialized UI sprites that merge as many of the UI’s decorative/invariant elements as possible into its background texture. This reduces the number of elements that must be layered atop one another to achieve the desired design, but is labor-intensive and increases the size of the project’s texture atlases.
如果发现一个大型UI的填充率达到瓶颈,最好的办法是创建专门的UI sprites,将UI所有的装饰/不变元素合并到它的背景纹理中。这样就减少了必须层叠在一起才能实现设计的元素,但这个工作量很大,而且也增加了项目图集的大小。
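The sample counting described above can be checked with a small model. The sketch below is plain Python, not Unity code; the `Rect` type, the helper names and the 100x50 pixel layout are all made up for illustration. It assumes every layer is drawn in the transparent queue, so each layer that covers a pixel costs one texture sample.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned pixel rectangle; the max edges are exclusive."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def contains(self, x: int, y: int) -> bool:
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max

    @property
    def area(self) -> int:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def samples_at(layers: Sequence[Rect], x: int, y: int) -> int:
    """Texture samples needed for one pixel: one per layer covering it."""
    return sum(1 for layer in layers if layer.contains(x, y))


def overdraw_factor(layers: Sequence[Rect], screen: Rect) -> float:
    """Average samples per screen pixel (layers are assumed to lie on screen)."""
    return sum(layer.area for layer in layers) / screen.area


# Hypothetical layout: full-screen background, a button, and the text quad on it.
screen = Rect(0, 0, 100, 50)
background = Rect(0, 0, 100, 50)
button = Rect(20, 10, 80, 40)
text = Rect(30, 20, 70, 30)

layered = [background, button, text]
# After baking the button art into the background sprite, two layers remain.
merged = [background, text]

print(samples_at(layered, 50, 25), overdraw_factor(layered, screen))
print(samples_at(merged, 50, 25), overdraw_factor(merged, screen))
```

In this made-up layout, a pixel inside the text quad drops from three samples to two once the button art is baked into the background. The real saving depends entirely on how much screen area the merged elements covered.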
This principle of condensing the number of layered elements necessary to create a given UI onto specialized UI sprites can also be used for sub-elements. Consider a store UI with a scrolling pane of products. Each product UI element has a border, a background, and some icons to denote price, name and other information.
这个做法也适用于子元素上。想想看一个带有滚动商品列表的商店UI。每个商品UI元素都有一个边框、一个背景和一些表示价格、名称和其他信息的图标。
The store UI will need a background, but because its products scroll across the background, the product elements cannot be merged onto the store UI’s background texture. However, the border, price, name and other elements of the product’s UI element could be merged onto the product’s background. Depending on the size and number of icons, the fill-rate savings can be considerable.
商店UI需要一个背景,但是因为它的商品在背景上滚动,商品元素不能合并到商店UI的背景贴图上。但是,商品UI元素的边框、价格、名称和其他元素可以合并到商品的背景上。根据图标的大小和数量,可以节省相当多的填充率。
There are several drawbacks to combining layered elements. Specialized elements can no longer be reused, and require additional artist resources to create. The addition of large new textures may significantly increase the amount of memory needed to hold the UI textures, particularly if the UI textures are not loaded and unloaded on demand.
这种优化方式有几个缺点,这种定制的元素不能再被重用,并且需要美术同学帮忙修改。添加大的新贴图可能会显著增加保存UI贴图所需的内存数量,特别是在UI贴图没有按需加载和卸载的情况下。
UI shaders and low-spec devices - UI着色器(shader)和低端设备
The built-in shader used by Unity UI incorporates support for masking, clipping and numerous other complex operations. Because of this added complexity, the UI shader performs poorly compared to the simpler Unity 2D shader on low-end devices such as the iPhone 4.
Unity UI使用的内置着色器包含了对遮罩、裁剪和许多其他复杂操作的支持。由于增加了复杂性,UI着色器在低端设备(如iPhone 4)上的性能表现不如简单的Unity 2D着色器。
If masking, clipping and other “fancy” features are unneeded for an application targeted at low-end devices, it is possible to create a custom shader that omits the unused operations, such as this minimal UI shader:
如果遮罩、裁剪和其他“花哨”的功能对于针对低端设备的应用程序来说是不需要的,那么可以创建一个自定义着色器来省略不需要的操作,比如这个最小的UI着色器:
Shader "UI/Fast-Default"
{
    Properties
    {
        [PerRendererData] _MainTex ("Sprite Texture", 2D) = "white" {}
        _Color ("Tint", Color) = (1,1,1,1)
    }
    SubShader
    {
        Tags
        {
            "Queue"="Transparent"
            "IgnoreProjector"="True"
            "RenderType"="Transparent"
            "PreviewType"="Plane"
            "CanUseSpriteAtlas"="True"
        }
        // Usual UI render state: no culling, lighting or depth writes; alpha blending on.
        Cull Off
        Lighting Off
        ZWrite Off
        ZTest [unity_GUIZTestMode]
        Blend SrcAlpha OneMinusSrcAlpha
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"
            #include "UnityUI.cginc"
            // Per-vertex input supplied by the UI mesh.
            struct appdata_t
            {
                float4 vertex : POSITION;
                float4 color : COLOR;
                float2 texcoord : TEXCOORD0;
            };
            // Data passed from the vertex stage to the fragment stage.
            struct v2f
            {
                float4 vertex : SV_POSITION;
                fixed4 color : COLOR;
                half2 texcoord : TEXCOORD0;
                float4 worldPosition : TEXCOORD1; // kept from the default shader; unused below
            };
            fixed4 _Color;
            // Supplied by Unity UI and added to every texture sample.
            fixed4 _TextureSampleAdd;
            v2f vert(appdata_t IN)
            {
                v2f OUT;
                OUT.worldPosition = IN.vertex;
                // Newer Unity versions prefer UnityObjectToClipPos(OUT.worldPosition) here.
                OUT.vertex = mul(UNITY_MATRIX_MVP, OUT.worldPosition);
                OUT.texcoord = IN.texcoord;
                // Half-texel correction on platforms that define UNITY_HALF_TEXEL_OFFSET.
                #ifdef UNITY_HALF_TEXEL_OFFSET
                OUT.vertex.xy += (_ScreenParams.zw-1.0)*float2(-1,1);
                #endif
                // Combine the vertex color with the material tint once per vertex.
                OUT.color = IN.color * _Color;
                return OUT;
            }
            sampler2D _MainTex;
            // One texture sample and one multiply per pixel; no masking or clipping.
            fixed4 frag(v2f IN) : SV_Target
            {
                return (tex2D(_MainTex, IN.texcoord) + _TextureSampleAdd) * IN.color;
            }
            ENDCG
        }
    }
}
UI Canvas rebuilds - Canvas 重绘
To display any UI, the UI system must construct geometry for each UI component represented on-screen. This includes running dynamic layout code, generating polygons to represent characters in UI text strings, and merging as much geometry as possible into single meshes in order to minimize draw calls. This is a multi-step process and is described in detail in the Fundamentals section at the beginning of this guide.
要显示任何UI, UI系统必须为屏幕上表示的每个UI组件构造图形。这包括运行动态布局代码,生成多边形来表示UI文本字符串中的字符,并将尽可能多的图形合并到单个网格中以最小化draw calls。这是一个多步骤的过程,在本指南开头的基础部分有详细的描述。
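Canvas rebuilds can become performance problems for two primary reasons: first, if the number of drawable UI elements on a Canvas is large, calculating the batch itself becomes very expensive, because the cost of sorting and analyzing the elements grows more than linearly with the number of drawable elements; second, if the Canvas is dirtied frequently, excessive time may be spent refreshing a Canvas that has relatively few changes.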
Canvas重绘可以成为性能问题的两个主要原因:
Both of these problems tend to become acute as the number of elements on a Canvas increases.
Important reminder: Whenever any drawable UI element on a given Canvas changes, the Canvas must re-run the batch building process. This process re-analyzes every drawable UI element on the Canvas, regardless of whether it has changed or not. Note that a “change” is any change which affects a UI object’s appearance, including the sprite assigned to a sprite renderer, transform position & scale, the text contained in a text mesh, etc.
随着Canvas上元素数量的增加,这两个问题都变得越来越严重。
重要提示:当Canvas上的任何可绘制UI元素发生更改时,画布必须重新进行一遍合批过程。这个过程重新分析Canvas上的每个可绘制的UI元素,不管它是否已经更改。注意,“更改”是影响UI对象外观的任何更改,包括替换sprite、修改位置和大小、修改文本网格中包含的文本等。
Child order - 子物体的排序
Unity UIs are constructed back-to-front, with objects’ order in the hierarchy determining their sort order. Objects earlier in the hierarchy are considered behind objects later in the hierarchy. Batches are built by walking the hierarchy top-to-bottom and collecting all objects which use the same material, the same texture and do not have intermediate layers. An “intermediate layer” is a graphical object with a different material, whose bounding box overlaps two otherwise-batchable objects and is placed in the hierarchy between the two batchable objects. Intermediate layers force batches to be broken.
Unity的ui对象在层次结构(hierarchy面板)中的顺序决定了它们的排序顺序。层次结构中靠上的对象将排在层次结构中靠下的对象之后。合批是通过遍历整个层次结构并收集所有使用相同材质球、相同贴图且没有中间层的对象来构建的。“中间层”是一个具有不同材质球的图形对象,它会使原本可以合批的两个对象被打破合批。
As discussed in the Unity UI Profiling Tools section, the UI profiler and frame debugger can be used to inspect a UI for intermediate layers. This is the situation where one drawable object interposes itself between two other drawable objects that are otherwise batchable.
正如在 Unity UI Profiling Tools章节中所讨论的,UI profiler和 frame debugger可用于检查UI界面中的“中间层”。在这种情况下,一个可绘制对象将自己置于另外两个可绘制对象之间,而这两个可绘制对象在其他情况下是可以合批的。
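This problem most commonly occurs when text and sprites are located near one another: the text’s bounding box can invisibly overlap nearby sprites, because much of a text glyph’s polygon is transparent. This can be solved in two ways: reorder the drawables so that the non-batchable object is no longer interposed between the batchable objects (that is, move it above or below them in the hierarchy), or adjust the positions of the objects to eliminate the invisible overlap.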
这个问题通常发生在text 和sprites组合的时候:text 的边界框可以在不可见的情况下重叠在精灵附近,因为大多数文本字形的多边形是透明的。这可以通过两种方式解决:
Both of these operations can be carried out in the Unity Editor with the Unity Frame Debugger open and enabled. By simply observing the number of draw calls visible in the Unity Frame Debugger, it is possible to find an order and position that minimizes the number of draw calls wasted due to overlapping UI elements.
这两个操作都可以在Unity编辑器中执行,打开并启用Frame Debugger。通过观察Frame Debugger中可见的drawcalls的数量,就可以找到一个顺序和位置来最小化由于重叠UI元素而浪费的drawcalls。
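The effect of an intermediate layer, and of changing order or position to remove it, can be illustrated with a toy model. The sketch below is plain Python and is not Unity's implementation. It assumes a simplified scheme in which each element gets a depth from the overlapping elements beneath it, elements are then sorted by depth and material, and consecutive elements with the same material form one batch. The `Element` type, the material names and the rectangles are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Element:
    name: str
    material: str                     # stands in for the material + texture pair
    rect: Tuple[int, int, int, int]   # bounding box: x_min, y_min, x_max, y_max


def overlaps(a: Element, b: Element) -> bool:
    """True if the two bounding boxes share any area."""
    ax0, ay0, ax1, ay1 = a.rect
    bx0, by0, bx1, by1 = b.rect
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def count_batches(hierarchy: Sequence[Element]) -> int:
    """Count batches for elements given in hierarchy (back-to-front) order."""
    depths: List[int] = []
    for i, elem in enumerate(hierarchy):
        depth = 0
        for j in range(i):
            below = hierarchy[j]
            if overlaps(elem, below):
                # Same material: may share the lower element's depth.
                # Different material: must be drawn in a later step.
                step = 0 if below.material == elem.material else 1
                depth = max(depth, depths[j] + step)
        depths.append(depth)
    # Sort by depth, then material; consecutive equal materials form one batch.
    ordered = sorted((d, e.material) for d, e in zip(depths, hierarchy))
    return sum(1 for _ in groupby(ordered, key=lambda pair: pair[1]))


icon_a = Element("icon_a", "sprites", (0, 0, 10, 10))
icon_b = Element("icon_b", "sprites", (20, 0, 30, 10))
label = Element("label", "font", (8, 0, 22, 10))         # box overlaps both icons
tight_label = Element("label", "font", (11, 0, 19, 10))  # box overlaps neither

print(count_batches([icon_a, label, icon_b]))        # label sits between the icons
print(count_batches([icon_a, icon_b, label]))        # label moved after both icons
print(count_batches([icon_a, tight_label, icon_b]))  # overlap removed instead
```

In this model the interposed label splits the two icons into separate batches, while either fix lets them share one. The Frame Debugger remains the authority on what a real Canvas does.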
Splitting Canvases - 拆分Canvas
In all but the most trivial cases, it is generally a good idea to split up a Canvas, either by moving elements to a Sub-canvas or to a sibling Canvas.
Sibling Canvases are best used in cases where certain portions of a UI must have their draw depth controlled separately from the rest of the UI, to be always above or below other layers (e.g. tutorial arrows).
In most other cases, Sub-canvases are more convenient as they inherit their display settings from their parent Canvas.
While it may seem at first glance that it is a best practice to subdivide a UI into many Sub-canvases, remember that the Canvas system also does not combine batches across separate Canvases. Performant UI design requires a balance between minimizing the cost of rebuilds and minimizing wasted draw calls.
通常最好的方法是将元素移动到Sub-canvas或同级的canvas来拆分canvas。
当UI的某些部分必须与UI的其他部分分开控制绘制深度时,新建一个同级的canvas是最好的选择,因为它们总是在其他层的上面或下面(例如,新手教程的箭头)。
在大多数其他情况下,Sub-canvas(子canvas)更方便,因为它们从父canvas继承显示设置。
乍一看,将UI细分为多个Sub-canvas似乎是最佳实践,但请记住,canvas系统也不会跨多个canvas进行合批。高性能UI设计需要在rebuild和drawcalls之间取得平衡。
General guidelines - 常见指导方针
Because a Canvas rebatches any time any of its constituent drawable components changes, it is generally best to split any non-trivial Canvas into at least two parts. Further, it is best to try to co-locate elements on the same Canvas if the elements are expected to change simultaneously. An example might be a progress bar and a countdown timer. These both rely on the same underlying data and therefore will require updates at the same time, and so they should be placed on the same Canvas.
On one Canvas, place all elements that are static and unchanging, such as backgrounds and labels. These will batch once, when the Canvas is first displayed, and then will no longer need to rebatch afterwards.
因为Canvas 在其组成的可绘制组件发生任何更改时都会重绘,所以通常最好将重要的Canvas至少分割为两部分。此外,如果有很多元素都需要频繁修改,那么把他们放到同一个Canvas就好了,例如进度条和倒计时条。
在一个Canvas上,放置所有静态和不变的元素,如背景和标签。这些将合批一次,当Canvas第一次显示,然后将不再需要重绘/重新批处理。
On the second Canvas, place all of the “dynamic” elements – the ones that change frequently. This will ensure that this Canvas is rebatching primarily dirty elements. If the number of dynamic elements grows very large, it may be necessary to further subdivide the dynamic elements into a set of elements that are constantly changing (e.g. progress bars, timer readouts, anything animated) and a set of elements that change only occasionally.
This is actually rather difficult in practice, especially when encapsulating UI controls into prefabs. Many UIs instead elect to subdivide a Canvas by splitting out the costlier controls onto a Sub-canvas.
在第二个Canvas上,放置所有的“动态”元素——那些频繁变化的元素。确保此Canvas主要重新合批那些dirty元素。如果动态元素的数量增长得非常大,那么可能需要将动态元素进一步细分为一组不断变化的元素(例如进度条、计时器读数、任何动画)和一组偶尔变化的元素。
实际上,这在实践中相当困难,特别是在将UI控件封装到prefab中时。许多ui选择通过将更复杂的控件拆分到Sub-canvas上来细分Canvas。
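The trade-off between rebuild cost and draw calls can be made concrete with a rough count. The sketch below is plain Python with made-up numbers and hypothetical helper names. It uses the number of drawables re-analyzed per frame as a stand-in for rebatch cost, which understates the real cost if that cost grows faster than linearly, and it uses one batch per non-empty Canvas as a lower bound on draw calls.

```python
from typing import Dict, Set


def elements_reanalyzed(canvases: Dict[str, int], dirty: Set[str]) -> int:
    """Drawables re-analyzed in one frame: all drawables on every dirty Canvas."""
    return sum(count for name, count in canvases.items() if name in dirty)


def minimum_batches(canvases: Dict[str, int]) -> int:
    """Lower bound on batches: separate Canvases never share a batch."""
    return sum(1 for count in canvases.values() if count > 0)


# Hypothetical HUD: 200 static drawables and 10 that change every frame.
single = {"hud": 210}
split = {"static": 200, "dynamic": 10}

# One Canvas: a change to any element dirties all 210 drawables.
print(elements_reanalyzed(single, {"hud"}), minimum_batches(single))
# Two Canvases: only the 10 dynamic drawables are re-analyzed, at the cost of a batch.
print(elements_reanalyzed(split, {"dynamic"}), minimum_batches(split))
```

Splitting further keeps shrinking the first number while raising the second, which is the balance the guidelines above describe.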
Unity 5.2 and Optimized Batching - Unity 5.2 优化合批
In Unity 5.2, the batching code was substantially rewritten, and is considerably more performant compared to Unity 4.6, 5.0 and 5.1. Further, on devices with more than 1 core, the Unity UI system will move most of the processing to worker threads. In general, Unity 5.2 reduces the need for aggressively splitting a UI into dozens of Sub-canvases. Many UIs on mobile devices can now be made performant with as few as two or three Canvases.
More information on the optimizations in Unity 5.2 can be found in this blog post.
在Unity 5.2中,合批代码被大量重写,与Unity 4.6、5.0和5.1相比,它的性能要高得多。此外,在拥有1个以上内核的设备上,Unity UI系统将把大部分处理工作转移到子线程。一般来说,Unity 5.2减少了将UI划分成许多Sub-canvas的需求。许多移动设备上的ui现在只需两三个 Canvase就可以实现性能。
更多关于Unity 5.2优化的信息可以在 this blog post中找到。
Input and raycasting in Unity UI - 输入和射线
By default, Unity UI uses the Graphic Raycaster component to handle input events, such as touch events and pointer-hover events. This is generally handled by the Standalone Input Module component (StandaloneInputModule). Despite the name, the Standalone Input Module is meant to be a “universal” input manager system, and will handle both pointers and touches.
默认情况下,Unity UI使用Graphic Raycaster组件来处理输入事件,比如触摸事件和鼠标悬停事件。这通常由Standalone Input Module组件处理。尽管叫这个名字(Standalone多指电脑),但它是一个“通用”的输入管理器系统,可以处理指针和触摸。
Erroneous mouse input detection on mobile (5.3) - 移动端鼠标输入检测错误(5.3)
Prior to Unity 5.4, each active Canvas with a Graphic Raycaster attached will run a raycast once per frame to check the position of the pointer so long as there is currently no touch input available. This will occur regardless of platform; iOS and Android devices without mice will still query the mouse’s position and attempt to discover which UI elements are beneath that position to determine if any hover events need to be sent.
This is a waste of CPU time, and has been witnessed consuming 5% or more of a Unity application’s CPU frame time.
在Unity 5.4之前,每一个带有Raycaster的Canvas会在每一帧运行一次raycast来检查指针的位置,即使当前没有可用的触摸输入。所有平台上都是这样;没有鼠标的iOS和Android设备仍然会查询鼠标的位置,并试图发现哪些UI元素位于该位置之下,以确定是否需要发送悬停事件。
这是对CPU的浪费,并且已经见证了它消耗了Unity应用程序5%或更多的CPU帧时间。
This issue is resolved in Unity 5.4. From 5.4 onward, devices without mice will not query for the mouse position and will not perform unnecessary raycasts.
If using a version of Unity older than 5.4, it is strongly recommended that mobile developers create their own input module class. This can be as simple as copying Unity’s Standalone Input Module (StandaloneInputModule) from the Unity UI source and commenting out the ProcessMouseEvent method as well as all calls to that method.
就是说5.4之后解决了,如果还在用旧版本请自己实现一个Input Manager类。
Raycast optimization - 射线优化
The Graphic Raycaster is a relatively straightforward implementation that iterates over all Graphic components that have the ‘Raycast Target’ setting set to true. For each Raycast Target, the Raycaster performs a set of tests. If a Raycast Target passes all of its tests, then it is added to the list of hits.
Graphic Raycaster是一个相对简单的实现,它遍历所有将' Raycast Target '设置为true的Graphic组件。每一个Raycast Target都会被进行测试。如果一个Raycast Target通过了所有的测试,那么它就会被添加到“被命中”列表中。
Raycast implementation details - 射线实现细节
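The tests are: whether the Raycast Target is active, enabled and drawn (that is, has geometry); whether the input point lies within the RectTransform to which the Raycast Target is attached; and whether every ICanvasRaycastFilter on the Raycast Target or on any of its ancestors permits the raycast.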
The list of hit Raycast Targets is then sorted by depth, filtered for reversed targets, and filtered to ensure that elements rendered behind the camera (i.e. not visible on the screen) are removed.
The Graphic Raycaster also may cast a ray into the 3D or 2D physics system if the respective flag is set on the Graphic Raycaster’s “Blocking Objects” property. (From script, the property is named blockingObjects.)
If 2D or 3D blocking objects are enabled, then any Raycast Targets that draw beneath a 2D or 3D object on a raycast-blocking Physics Layer will also be eliminated from the list of hits.
The final list of hits is then returned.
测试内容如下:
被命中的Raycast Target列表根据深度排序,并进行过滤,以确保在相机后面的元素(即在屏幕上不可见)的被删除掉。
如果将Graphic Raycaster的“Blocking Objects”属性打开,Graphic Raycaster就可以将射线投射到3D或2D物理系统中。(在脚本中,属性名为blockingObjects。)
意思就是打开这个选项之后,射线就会被3D或2D物体挡住,影响对UI的点击。
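The per-target testing and depth sorting can be sketched as a small filter pipeline. This is a plain-Python model, not Unity's GraphicRaycaster. It assumes three checks per target (enabled as a raycast target and active, point inside the rectangle, all raycast filters permit the hit) and leaves out blocking objects, reversed targets and behind-camera filtering. The `Graphic` type and every name in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# A filter receives the pointer position and returns True to permit the hit.
RaycastFilter = Callable[[float, float], bool]


@dataclass
class Graphic:
    name: str
    rect: Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max
    depth: int                                 # higher values draw on top
    raycast_target: bool = True
    active: bool = True
    filters: List[RaycastFilter] = field(default_factory=list)


def raycast(graphics: Sequence[Graphic], x: float, y: float) -> List[str]:
    """Return the names of all hit targets, topmost first."""
    hits = []
    for g in graphics:
        if not (g.raycast_target and g.active):
            continue  # never tested further, so it costs almost nothing
        x0, y0, x1, y1 = g.rect
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # pointer is outside this element's rectangle
        if not all(permits(x, y) for permits in g.filters):
            continue  # a mask or similar filter rejected the hit
        hits.append(g)
    hits.sort(key=lambda g: g.depth, reverse=True)
    return [g.name for g in hits]


panel = Graphic("panel", (0, 0, 100, 100), depth=0)
button = Graphic("button", (10, 10, 60, 40), depth=1)
label = Graphic("label", (10, 10, 60, 40), depth=2, raycast_target=False)
# Hypothetical masked element that only accepts hits in the upper half.
masked = Graphic("masked_item", (0, 0, 100, 100), depth=3,
                 filters=[lambda x, y: y > 50])

scene = [panel, button, label, masked]
print(raycast(scene, 20, 20))
print(raycast(scene, 20, 80))
```

The label shares the button's rectangle but never appears in the hit list, because its raycast target flag is off.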
Raycasting optimization tips - 射线优化技巧
Given that all Raycast Targets must be tested by the Graphic Raycaster, it is a best practice to only enable the ‘Raycast Target’ setting on UI components that must receive pointer events. The smaller the list of Raycast Targets, and the shallower the hierarchy that must be traversed, the faster each Raycast test will be.
For composite UI controls that have multiple drawable UI objects that must respond to pointer events, such as a button that wishes to have its background and text both change colors, it is generally better to place a single Raycast Target at the root of the composite UI control. When that single Raycast Target receives a pointer event, it can then forward the event to each interested component within the composite control.
考虑到所有的Raycast Targets都必须经过Graphic Raycaster的测试,最好的做法是只在必须接收指针事件的UI组件上启用“Raycast Targets”设置。Raycast目标的列表越小,必须遍历的层次越浅,每个Raycast测试的速度就越快。
对于具有多个可绘制UI对象的复合UI控件,这些对象必须响应指针事件,例如希望点击按钮背景和文字都发生变化,通常最好将单个Raycast Target放在复合UI控件的根节点下。当单个Raycast Target 接收到指针事件时,它可以影响所有符合组件。(比如说新建个Button,Image和Text都是开着“Raycast Targets”的,其实Text的应该关上)
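The forwarding pattern for composite controls can be outlined as follows. This is a plain-Python sketch of the design rather than Unity code: `CompositeButton` stands for the single Raycast Target at the root of the control, `Tintable` stands for a child drawable that has its own Raycast Target setting switched off, and the event names are hypothetical.

```python
from typing import Callable, List


class Tintable:
    """A child drawable that changes color but is not itself a raycast target."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.color = "normal"

    def handle(self, event: str) -> None:
        if event == "enter":
            self.color = "highlighted"
        elif event == "exit":
            self.color = "normal"


class CompositeButton:
    """The only raycast target in the control; forwards events to its children."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str], None]] = []

    def register(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def on_pointer_event(self, event: str) -> None:
        # One raycast hit on the root fans out to every interested child.
        for listener in self._listeners:
            listener(event)


background = Tintable("background")
caption = Tintable("caption")

composite = CompositeButton()
composite.register(background.handle)
composite.register(caption.handle)

composite.on_pointer_event("enter")
print(background.color, caption.color)
```

Both children react to the hover even though only the root was ever tested by the raycaster.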
Hierarchy depth and raycast filters - 层次深度和射线过滤器
Each Graphic Raycast traverses the Transform hierarchy all the way to the root when searching for raycast filters. The cost of this operation grows linearly in proportion to the depth of the hierarchy. All components found attached to each Transform in the hierarchy must be tested to see if they implement ICanvasRaycastFilter, so this is not a cheap operation.
There are several standard Unity UI components that make use of ICanvasRaycastFilter, such as CanvasGroup, Image, Mask and RectMask2D, so this traversal cannot be eliminated trivially.
在搜索Raycast过滤器时,每个Graphic Raycast遍历层次结构(hierarchy面板)直到根节点。此操作的成本与层次结构的深度成正比线性增长。在层次结构中,所有包含Transform的对象都必须进行测试,以确定它们是否实现了ICanvasRaycastFilter,因此这个操作消耗不小。
有几个标准的Unity UI组件使用了ICanvasRaycastFilter,比如CanvasGroup、Image、Mask和RectMask2D,因此这种遍历不能被简单地去掉。
Sub-canvases and the OverrideSorting property - Sub-canvas和OverrideSorting
The overrideSorting property on a Sub-canvas will cause a Graphic Raycast test to stop climbing the transform hierarchy. If it can be enabled without causing sorting or raycast detection issues, then it should be used to decrease the cost of raycast hierarchy traversals.
Sub-canvas上的overrideSorting属性将导致Graphic Raycast测试停止继续向根节点遍历。如果它不需要射线相关功能 的话,那么可以通过这个方式来降低遍历到根节点的性能消耗。
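The cost of climbing the hierarchy, and the saving from stopping at a Sub-canvas with overrideSorting enabled, can be estimated with a toy walk. The sketch below is plain Python, not Unity's traversal code. The `Node` type, the component counts and the object names are invented, and the number of components inspected is used as a rough proxy for cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    component_count: int = 1               # components that must be inspected here
    override_sorting_canvas: bool = False  # a Sub-canvas with overrideSorting set


def components_inspected(start: Node) -> int:
    """Walk from a raycast target toward the root, counting inspected components."""
    inspected = 0
    node: Optional[Node] = start
    while node is not None:
        inspected += node.component_count
        if node.override_sorting_canvas:
            break  # the climb stops at this Sub-canvas
        node = node.parent
    return inspected


def build_chain(override: bool) -> Node:
    """Hypothetical hierarchy: RootCanvas > Panel > SubCanvas > Item > Icon."""
    root = Node("RootCanvas", component_count=3)
    panel = Node("Panel", parent=root, component_count=2)
    sub = Node("SubCanvas", parent=panel, component_count=3,
               override_sorting_canvas=override)
    item = Node("Item", parent=sub, component_count=2)
    return Node("Icon", parent=item, component_count=2)


print(components_inspected(build_chain(override=False)))
print(components_inspected(build_chain(override=True)))
```

The deeper the hierarchy above the Sub-canvas, the larger the difference between the two counts becomes.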