UI Profiling Tools
Verified with version: 5.3
Difficulty: Advanced
There are several profiling tools useful for analyzing a Unity UI's performance. The key tools are:
· Unity Profiler
· Unity Frame Debugger
· Xcode's Instruments or Intel VTune
· Xcode's Frame Debugger or Intel GPA
The external tools provide method-level CPU profiling with millisecond (or better) resolution, as well as detailed draw-call and shader profiling.
Instructions for setting up and using the above tools lie beyond the scope of this guide. Note that the Xcode Frame Debugger and Instruments are only usable on IL2CPP builds for Apple platforms, and therefore can currently only be used to profile iOS builds.
Unity Profiler
The primary use for the Unity Profiler is to perform comparative profiling: enabling and disabling elements of a UI while the Unity Profiler is running can quickly narrow down the portions of a UI hierarchy that are most responsible for performance issues.
To analyze this, watch the "Canvas.BuildBatch" and "Canvas.SendWillRenderCanvases" lines in the profiler's output.
Canvas.BuildBatch is the native-code calculation that performs the Canvas Batch Building process, as described above.
Canvas.SendWillRenderCanvases contains the invocation of the C# scripts that are subscribed to the Canvas component's willRenderCanvases event. Unity UI's CanvasUpdateRegistry class receives this event and uses it to run the Rebuild process, described above. It is expected that any dirty UI components will update their Canvas Renderers at this time.
Note: To more easily see differences in UI performance, it is generally advisable to disable all of the trace categories aside from "Rendering" and "Scripts". This can be done by clicking on the colored boxes beside the name of the trace category on the left-hand side of the CPU Usage profiler.
Also note that the categories can be re-ordered in the CPU profiler. Click and drag the names of the categories upwards or downwards to re-order them.
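The comparative profiling approach described in this section can be organized as a small bookkeeping exercise. The sketch below is not part of any Unity tooling: the function name, the subtree names and all timings are hypothetical stand-ins for numbers read off the Unity Profiler by hand, once with the whole UI enabled and once per disabled subtree.

```python
# Toy comparative-profiling helper. All names and timings are made up;
# in practice the numbers would be read off the Unity Profiler manually.

def rank_by_savings(baseline_ms, disabled_ms):
    """Return (subtree, ms_saved) pairs, largest saving first.

    baseline_ms: cost of one profiler line with the whole UI enabled.
    disabled_ms: maps a subtree name to the cost measured with only
                 that subtree disabled.
    """
    savings = {name: baseline_ms - ms for name, ms in disabled_ms.items()}
    return sorted(savings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical Canvas.BuildBatch timings, in milliseconds per frame.
baseline = 4.0
with_subtree_disabled = {
    "InventoryPanel": 1.5,
    "ChatLog": 3.5,
    "Minimap": 3.0,
}

ranking = rank_by_savings(baseline, with_subtree_disabled)
for name, saved in ranking:
    print(f"{name}: saves {saved:.1f} ms")
```

The subtree whose removal saves the most time is the first candidate for closer inspection.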
Unity Frame Debugger
The Unity Frame Debugger is a useful tool for reducing the number of draw calls generated by a Unity UI. This built-in tool can be accessed via the Window menu within the Unity Editor. When enabled, it will display all draw calls generated by Unity, including those generated by Unity UI.
Notably, the frame debugger will update itself with the draw calls generated to display the Game View in the Unity Editor, and therefore can be used to try out different UI configurations without even entering Play Mode.
The location of the Unity UI draw calls depends on the Render Mode selected on the Canvas component being drawn:
· Screen Space - Overlay will appear within the Canvas.RenderOverlays group
· Screen Space - Camera will appear within the Camera.Render group of the selected Render Camera, as a subgroup of Render.TransparentGeometry
· World Space will appear as a subgroup of Render.TransparentGeometry for each World Space camera in which the Canvas is visible
All UIs can be identified by the "Shader: UI/Default" line (1) in the group or draw call's details. See the highlighted red boxes in the below screenshot.
By watching this set of lines while tweaking a UI, it is relatively simple to maximize the Canvas' ability to combine UI elements into batches. The most common design-related cause of broken batches is unintentional overlap.
All Unity UI components generate their geometry as a series of quads. However, many UI sprites or UI text glyphs occupy only a fraction of the quads used to represent them, with the rest being empty space. As a result, it is quite common to find that the UI's designer has unintentionally overlapped multiple different quads whose textures come from different materials and therefore cannot be batched.
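To make the idea of unintentional overlap concrete, here is a minimal sketch that flags pairs of quads whose bounding rectangles overlap while using different materials. It is an illustration only, not Unity's batching code: the Quad type, the element names and the coordinates are all hypothetical, and quads are treated as simple axis-aligned rectangles.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Quad:
    """Hypothetical stand-in for one UI quad (axis-aligned rectangle)."""
    name: str
    material: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def overlaps(a, b):
    """True if the two rectangles share any interior area."""
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def mixed_material_overlaps(quads):
    """List name pairs that overlap and use different materials."""
    return [(a.name, b.name) for a, b in combinations(quads, 2)
            if a.material != b.material and overlaps(a, b)]


quads = [
    Quad("Icon", "atlas_a", 0, 0, 64, 64),
    # The label's quad extends 4 units into the icon's quad, even though
    # the visible glyphs may not touch the visible part of the icon.
    Quad("Label", "font", 60, 10, 200, 40),
    Quad("Badge", "atlas_a", 300, 0, 332, 32),
]

print(mixed_material_overlaps(quads))
```

Here only the Icon and Label pair is reported, because their rectangles intersect and their materials differ.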
As Unity UI operates entirely in the transparent queue, any quads that have unbatchable quads overlaid atop them must be drawn before the unbatchable quads, and therefore cannot be batched with other quads placed atop the unbatchable quads.
Consider a case of three quads, A, B, and C. Assume all three quads overlap one another, and also assume quads A and C use the same material while quad B uses a separate material. Quad B therefore cannot be batched with A or C.
If the order in the hierarchy (from top to bottom) is A, B, C then A and C cannot be batched, because B must be drawn atop A and beneath C. However, if B is placed before or after the batchable quads, then the batchable quads can actually be batched: B needs only to be drawn before or after the batched quads and no longer sits between them.
For further discussion of this issue, see the Child order section of the Canvas chapter.
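The A, B, C case can be reproduced with a toy model. The sketch below assumes, as in the example, that every quad overlaps every other quad, so draw order must be preserved and only neighbouring quads with the same material can share a batch. The function name and material labels are hypothetical, and this is a simplification rather than Unity's actual batching algorithm.

```python
def count_batches(draw_order):
    """Count draw batches for quads that all overlap one another.

    draw_order: list of (quad_name, material) in hierarchy order.
    Because every quad overlaps every other, only neighbours in the
    list that share a material can be merged into one batch.
    """
    batches = 0
    previous_material = None
    for _, material in draw_order:
        if material != previous_material:
            batches += 1
            previous_material = material
    return batches


interleaved = [("A", "m1"), ("B", "m2"), ("C", "m1")]  # B sits between A and C
regrouped = [("B", "m2"), ("A", "m1"), ("C", "m1")]    # B moved out of the way

print(count_batches(interleaved))  # 3
print(count_batches(regrouped))    # 2
```

Moving B before (or after) A and C removes one batch in this model, which matches the reasoning in the text.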
Instruments & VTune
Xcode's Instruments and Intel's VTune allow for extremely deep profiling of Unity UI rebuilds and Canvas batch calculations on Apple or Intel CPUs, respectively. The method names are nearly identical to the profiler labels discussed in the Unity Profiler section, above:
Canvas::SendWillRenderCanvases is the C++ parent that calls the Canvas.SendWillRenderCanvases C# method and governs that line in the Unity Profiler. It will contain the code used to run the Rebuild process, as described in the previous chapter.
Canvas::UpdateBatches is identical to Canvas.BuildBatch, but includes additional boilerplate code not covered by the Unity Profiler label. It runs the actual Canvas Batch Building process, described above.
When used in conjunction with a Unity app built via IL2CPP, these tools can be used to drill down deeper into the transpiled C# code of Canvas::SendWillRenderCanvases. Of primary interest will be the cost of the following methods. (Note: transpiled method names are approximate.)
IndexedSet_Sort and CanvasUpdateRegistry_SortLayoutList are used to sort the list of dirty Layout components before the layouts are recalculated. As described above, this involves calculating the number of parent transforms above each Layout component.
ClipperRegistry.Cull calls all registered implementors of the IClipRegion interface. Built-in implementors include RectMask2D, which uses the IClippable interface.
During ClipperRegistry.Cull calls, RectMask2D components loop over all clippable elements contained within their hierarchy and ask them to update their culling information.
Graphic_Rebuild will contain the cost of actually calculating the meshes needed to represent Image, Text or other Graphic-derived components. Beneath this will be several other methods, like Graphic_UpdateGeometry and, most notably, Text_OnPopulateMesh.
Text_OnPopulateMesh is generally a hotspot when Best Fit is enabled. This is discussed in more detail later in this guide.
Mesh modifiers, such as Shadow_ModifyMesh and Outline_ModifyMesh, will also run here. The cost of calculating component drop shadows, outlines and other special effects can be seen via these methods.
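The sort performed by IndexedSet_Sort and CanvasUpdateRegistry_SortLayoutList, described earlier in this list, orders dirty Layout components by how many parent transforms sit above them. The sketch below illustrates that idea only. The Node class and element names are hypothetical, and the ascending order (shallowest first) is an assumption made for the example rather than something stated in this guide.

```python
class Node:
    """Hypothetical stand-in for a transform with an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def parent_count(node):
    """Walk up the hierarchy and count the transforms above this node."""
    count = 0
    current = node.parent
    while current is not None:
        count += 1
        current = current.parent
    return count


canvas = Node("Canvas")
panel = Node("Panel", canvas)
row = Node("Row", panel)
cell = Node("Cell", row)

# Dirty layouts arrive in arbitrary order and are sorted by depth.
dirty_layouts = [cell, panel, row]
ordered = sorted(dirty_layouts, key=parent_count)
print([node.name for node in ordered])
```

Each comparison requires a walk up the hierarchy, so the cost of this step grows with both the number of dirty Layout components and the depth at which they are nested.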
Xcode Frame Debugger & Intel GPA
Low-level frame debugging tools are essential for profiling the cost of individual portions of the batched UI, as well as monitoring the cost of UI overdraw. UI overdraw is discussed in more detail later in this guide.
Using the Xcode Frame Debugger
To test whether a given UI is overstressing the GPU, Xcode's built-in GPU diagnostics tools can be employed. First, configure the project in question to use Metal or OpenGL ES 3, then make a build and open the resulting Xcode project. Xcode cannot profile Unity if Unity is running under OpenGL ES 2, and therefore these techniques cannot be used on older devices.
Note: On some versions of Xcode, it is necessary to select the appropriate Graphics API in the Build Scheme in order to make the graphics profiler work. To do this, go to the Product menu in Xcode, expand the Scheme menu item, and choose Edit Scheme.... Select the Run target and go to the Options tab. Change the GPU Frame Capture option to match the API used by your project. Assuming the Unity project is set up to automatically select a graphics API, then most modern iPads will default to using Metal. If in doubt, start the project and look at the debug logs in Xcode. One of the early lines should indicate which rendering path (Metal, GLES3 or GLES2) is being initialized.
Note: The above adjustments should not be necessary as of Xcode 7.4, but are still occasionally observed to be necessary in Xcode 7.3.1 and older.
Build and run the project on an iOS device. The GPU profiler can be found by showing the Debug pane in Xcode's Navigator sidebar, and clicking on the "FPS" entry.
The first point of interest in the GPU profiler is the set of three bars in the center of the screen, labeled "Tiler", "Renderer", and "Device". Of these three:
"Tiler" is generally a measure of how stressed the GPU is by processing geometry, which includes time spent in vertex shaders. Generally, a high "Tiler" usage indicates either excessively slow vertex shaders or an excessive number of vertices being drawn.
"Renderer" is generally a measure of how stressed the GPU's pixel pipelines are. Generally, high "Renderer" usage indicates that an application is exceeding the maximum fill-rate of the GPU, or has inefficient fragment shaders.
"Device" is a composite measure of overall GPU usage, which includes both "Tiler" and "Renderer" performance. It can generally be ignored, as it will roughly track the higher of the "Tiler" or "Renderer" measurements.
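The reading of the "Tiler" and "Renderer" bars described above can be summarized as a simple rule of thumb. The sketch below is only that: the function name and the 80% threshold are hypothetical choices made for illustration, not values taken from Xcode or from this guide. The "Device" bar is left out because it roughly tracks the higher of the other two.

```python
def likely_bottleneck(tiler_pct, renderer_pct, busy_threshold=80.0):
    """Interpret Tiler/Renderer utilisation percentages.

    busy_threshold is an arbitrary, hypothetical cut-off for what
    counts as "high" usage.
    """
    if max(tiler_pct, renderer_pct) < busy_threshold:
        return "GPU not saturated"
    if tiler_pct >= renderer_pct:
        return "geometry: slow vertex shaders or too many vertices"
    return "pixels: fill-rate exceeded or inefficient fragment shaders"


print(likely_bottleneck(tiler_pct=20.0, renderer_pct=95.0))
print(likely_bottleneck(tiler_pct=90.0, renderer_pct=40.0))
print(likely_bottleneck(tiler_pct=30.0, renderer_pct=35.0))
```

For a Unity UI, the second branch is the one most often hit, since UI geometry consists only of quads.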
For more information on Xcode's GPU profiler, see this documentation article.
Xcode's Frame Debugger can be triggered by clicking on the small "Camera" icon hidden at the bottom of the GPU profiler. It is highlighted by an arrow and a red box in the following screenshot.
After a brief pause, the Frame Debugger's summary view should appear.
When using the default UI shader, the cost of rendering geometry generated by the Unity UI system will show up under the "UI/Default" shader pass, assuming the default UI shader has not been replaced with a custom shader. It is possible to see this default UI shader in the above screenshot as Render Pipeline "UI/Default".
Unity UI only generates quads and so the vertex shader is unlikely to stress the tiler pipeline of the GPU. Any problems that appear in this shader pass are likely due to fill-rate issues.
Analyzing profiler results
After gathering profiling data, several conclusions might be drawn.
If Canvas.BuildBatch or Canvas::UpdateBatches seems to be using an excessive amount of CPU time, then the likely problem is an excessive number of Canvas Renderer components on a single Canvas. See the Splitting Canvases section of the Canvas chapter.
If an excessive amount of time is spent drawing the UI on the GPU, and the frame debugger indicates that the fragment shader pipeline is the bottleneck, then the UI is likely exceeding the pixel fill rate which the GPU is capable of. The most likely cause is excessive UI overdraw. See the Remediating fill-rate issues section of the Fill-rate, Canvases and input chapter.
If Graphic Rebuilds are using excessive CPU, as seen by a large portion of CPU time going to Canvas.SendWillRenderCanvases or Canvas::SendWillRenderCanvases, then deeper analysis is needed. Some portion of the Graphic Rebuild process is likely responsible.
In the case that a large portion of Canvas.SendWillRenderCanvases is spent inside IndexedSet_Sort or CanvasUpdateRegistry_SortLayoutList, then time is being spent sorting the list of dirty Layout components. Consider reducing the number of Layout components on the Canvas. See the Replacing layouts with RectTransforms and Splitting Canvases sections for possible remediations.
If excessive time seems to be spent in Text_OnPopulateMesh, then the culprit is simply the generation of text meshes. See the Best Fit and Disabling Canvas Renderers sections for possible remediations, and consider the advice inside Splitting Canvases if much of the text being rebuilt is not actually having its underlying string data changed.
If time is spent inside Shadow_ModifyMesh or Outline_ModifyMesh (or any other implementation of ModifyMesh), then the problem is excessive time spent calculating mesh modifiers. Consider removing these components and achieving their visual effect via static images.
If there is no particular hotspot within Canvas.SendWillRenderCanvases, or it appears to be running every frame, then the problem is likely that dynamic elements have been grouped together with static elements and are forcing the entire Canvas to rebuild too frequently. See the Splitting Canvases section.
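The conclusions above amount to a lookup from the observed hotspot to the section of the guide worth reading next. The sketch below encodes that mapping as a small table. The function and table names are hypothetical, and the advice strings only restate the section names mentioned in the text.

```python
# Hypothetical lookup table built from the conclusions listed above.
ADVICE_BY_HOTSPOT = {
    "Canvas.BuildBatch": "Splitting Canvases",
    "Canvas::UpdateBatches": "Splitting Canvases",
    "IndexedSet_Sort": "Replacing layouts with RectTransforms; Splitting Canvases",
    "CanvasUpdateRegistry_SortLayoutList": "Replacing layouts with RectTransforms; Splitting Canvases",
    "Text_OnPopulateMesh": "Best Fit; Disabling Canvas Renderers; Splitting Canvases",
    "Shadow_ModifyMesh": "Replace mesh modifiers with static images",
    "Outline_ModifyMesh": "Replace mesh modifiers with static images",
}


def advise(hotspot=None):
    """Return the suggested follow-up for a profiler hotspot.

    With no clear hotspot (or an unrecognized one), fall back to the
    advice for rebuilds that run every frame.
    """
    return ADVICE_BY_HOTSPOT.get(hotspot, "Splitting Canvases")


print(advise("Text_OnPopulateMesh"))
print(advise())
```

The fallback mirrors the last conclusion: when no single method dominates, separating dynamic from static elements is the first thing to try.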
Endnotes
1. Assuming that the UI shader has not been replaced with a custom shader.
Original article: http://unity3d.com/jp/learn/tutorials/topics/best-practices/unity-ui-profiling-tools?playlist=30089