优化图形性能 Optimizing Graphics Performance

Good performance is critical to the success of many games. Below are some simple guidelines for maximizing the speed of your game's graphical rendering.

好的性能,是很多游戏成功的关键。下面是一些简单的指引,最大限度地提高你的游戏图形渲染速度。

Optimizing Meshes 优化网格

You only pay a rendering cost for objects that have a Mesh Renderer attached and are within the view frustum. There is no rendering cost from empty GameObjects in the scene or from objects that are out of the view of any camera.

您只需支付附有网格渲染器(Mesh Renderer)的、而且在摄像机视景体内部的对象渲染的开销。有很多空的游戏物体(GameObjects )在你的场景,并不产生渲染开销。

Modern graphics cards are really good at handling a lot of polygons but there is a significant overhead for each batch (ie, mesh) that you submit to the graphics card. So if you have a 100-triangle object it is going to be just as expensive to render as a 1500-triangle object. The "sweet spot" for optimal rendering performance is somewhere around 1500-4000 triangles per mesh.

现在的显卡性能都很好,可以处理大量的多边形,但您提交给显卡的每个批处理都会造成相当大的开销。如果你有一个100个三角面的物件和有1500个三角面的物件渲染的开销一样大。大约每1500-4000三角形一个网格,优化渲染性能"最合适"。

Usually, the best way to improve rendering performance is to combine objects together so that each mesh has around 1500 or more triangles and uses only one Material for the entire mesh. It is important to understand that combining two objects which don't share a material does not give you any performance increase at all. The most common reason for having multiple materials is that two meshes don't share the same textures, so to optimize rendering performance, you should ensure that any objects you combine share the same textures.

通常,提高渲染性能最好的办法是把对象合并在一起,使每个网格有1500左右或更多的三角面和整个网格仅使用一个材质(Material )。重要的是要明白,只合并两个物体而没有共享材质,这样不会给你带来任何性能提高。如果你想有效地合并,你需要确保你的网格结合后,只使用一种材质。多维材质最常见的原因是两个网格没有共享相同的纹理。所以,如果你要优化渲染性能,你需要确保合并的物体共享纹理。
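The grouping rule above can be sketched outside the engine. The following is a minimal Python model, not Unity code: the object names, material names and triangle counts are invented for illustration, and it assumes that objects sharing a material can be combined into a single batch.

```python
from collections import defaultdict

def plan_batches(objects):
    """Group (name, material, triangles) tuples by material.

    Returns a dict mapping each material to the total triangle count of the
    objects that share it, i.e. one candidate combined mesh per material.
    """
    groups = defaultdict(int)
    for _name, material, triangles in objects:
        groups[material] += triangles
    return dict(groups)

# Hypothetical scene: three objects share "wood", one uses "metal".
scene = [
    ("crate_a", "wood", 400),
    ("crate_b", "wood", 600),
    ("barrel", "wood", 700),
    ("lamp", "metal", 300),
]

batches = plan_batches(scene)
print(len(scene), "separate meshes ->", len(batches), "combined batches")
for material, triangles in batches.items():
    print(material, triangles)
```

In this made-up scene the three wooden objects end up as one 1700-triangle mesh, which falls inside the range the text recommends, while the lamp stays on its own because it does not share a material.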

However, when using many pixel lights in the Forward rendering path, there are situations where combining objects may not make sense, as explained below.

然而,当在正向渲染路径下使用一些像素灯,有一些情况会使得合并物体不奏效,下面解释说明。

Pixel Lights in the Forward Rendering Path 
正向渲染路径的像素灯

Note: this applies only to the Forward rendering path.
注:这仅适用于正向渲染路径。

If you use pixel lighting then each mesh has to be rendered as many times as there are pixel lights illuminating it. If you combine two meshes that are very far apart, it will increase the effective size of the combined object. All pixel lights that illuminate any part of this combined object will be taken into account during rendering, so the number of rendering passes that need to be made could be increased. Generally, the number of passes that must be made to render the combined object is the sum of the number of passes for each of the separate objects, and so nothing is gained by combining. For this reason, you should not combine meshes that are far enough apart to be affected by different sets of pixel lights.

如果您使用像素光照,那么每个网格渲染的次数和被像素灯照亮的物体渲染的次数一样多。如果你把两个相距很远的物体合并,这会增加物体的有效大小。照亮这个合并后物体的任何一小部分的所有像素灯都会在渲染过程中计算。因此需要的渲染通道数量就会增加。一般情况下,要渲染合并物体的通道数是每个单独物体的通道数之和,所以通过合并没有得到好处。出于这个原因,你不应该把相距很远而不会同时受到不同的像素灯影响的这些网格合并。
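The effect of combining distant meshes can be illustrated with a small Python model. It is a simplification under stated assumptions, not how Unity computes lighting: each object is approximated by a bounding sphere, a light affects an object when its range overlaps that sphere, and a mesh is drawn once per pixel light that affects it. All names and numbers are hypothetical.

```python
import math

def lights_touching(center, radius, lights):
    """Return the names of lights whose range overlaps a bounding sphere."""
    hit = set()
    for name, position, light_range in lights:
        if math.dist(center, position) <= radius + light_range:
            hit.add(name)
    return hit

def passes(light_count):
    # Simplified model: one pass per pixel light, and at least one pass.
    return max(1, light_count)

lights = [
    ("lamp_left", (0.0, 2.0, 0.0), 5.0),
    ("lamp_right", (100.0, 2.0, 0.0), 5.0),
]

# Two 1000-triangle objects, 100 units apart, each lit by one lamp.
left = ((0.0, 0.0, 0.0), 1.0)
right = ((100.0, 0.0, 0.0), 1.0)
separate_passes = sum(passes(len(lights_touching(c, r, lights))) for c, r in (left, right))
separate_triangles = sum(1000 * passes(len(lights_touching(c, r, lights))) for c, r in (left, right))

# Combined mesh: a single bounding sphere that encloses both objects.
combined_center, combined_radius = (50.0, 0.0, 0.0), 51.0
combined_passes = passes(len(lights_touching(combined_center, combined_radius, lights)))
combined_triangles = 2000 * combined_passes

print("separate:", separate_passes, "passes,", separate_triangles, "triangles drawn")
print("combined:", combined_passes, "passes,", combined_triangles, "triangles drawn")
```

Under these assumptions the combined mesh needs as many passes as the two separate objects did in total, and every pass now processes all 2000 triangles, so combining gains nothing here.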

During rendering, Unity finds all lights surrounding a mesh and calculates which of those lights affect it most. The Quality Settings are used to modify how many of the lights end up as pixel lights and how many as vertex lights. Each light calculates its importance based on how far away it is from the mesh and how intense its illumination is. Furthermore, some lights are more important than others purely from the game context. For this reason, every light has a Render Mode setting which can be set to Important or Not Important; lights marked as Not Important will typically have a lower rendering overhead.

渲染网格时,Unity 找到网格周围的所有灯光。然后计算出哪些灯光影响网格最大。 质量设置是用来修改最终的灯有多少是像素光照,有多少是顶点光照。每个灯光根据灯光离网格的距离,和灯光的强度计算出它的重要性。取决于游戏环境,有些灯比其他更重要。出于这个原因,每一个灯光可以设置渲染模式( Render Mode),可以设置重要(Important )或不重要(Not Important)。灯光标记为不重要( Not Important)通常具有较低的渲染开销。

As an example, consider a driving game where the player's car is driving in the dark with headlights switched on. The headlights are likely to be the most visually significant light sources in the game, so their Render Mode would probably be set to Important. On the other hand, there may be other lights in the game that are less important (other cars' rear lights, say) and which don't improve the visual effect much by being pixel lights. The Render Mode for such lights can safely be set to Not Important so as to avoid wasting rendering capacity in places where it will give little benefit.

举个例子,试想一下一款赛车游戏,玩家的汽车打开车头灯在夜间行驶。车头灯是在游戏中最重要的灯光。出于这个原因,车头灯的渲染模式应设置为"重要"(Important)。另一方面,在游戏中其他不太重要的灯(比如汽车的尾灯),不会由像素灯而提升太多的视觉效果。这种灯的渲染模式可以放心地设置为不重要 Not Important ,以避免在不会让你得到多少好处的地方浪费渲染性能。

Per-Layer Cull Distances 每层消隐距离

In some games, it may be appropriate to cull small objects more aggressively than large ones in order to reduce the number of draw calls. For example, small rocks and debris could be made invisible at long distances while large buildings would still be visible. To accomplish this culling, you can put small objects into a separate layer and set up per-layer cull distances using the Camera.layerCullDistances script property.

在一些游戏中,您可能需要将小物件剔除,以减少绘图调用的数量。例如,在足够远的距离,大型建筑物仍然可见,小石块和碎片可以隐藏掉。要做到这一点,小物件放入一个单独的层(separate layer)和使用Camera.layerCullDistances函数,设置每一层消隐距离。
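As a rough illustration of the idea, here is a Python model of a per-layer distance table. It is not the Unity scripting API. It assumes a table of 32 entries, one per layer, where a value of 0 means "fall back to the camera's far plane"; the layer index, the 60-unit distance and the far plane value are made up for the example.

```python
NUM_LAYERS = 32
SMALL_PROPS_LAYER = 10  # hypothetical layer holding rocks and debris

def make_cull_distances(overrides):
    """Build a 32-entry table; 0.0 means 'use the far plane distance'."""
    distances = [0.0] * NUM_LAYERS
    for layer, distance in overrides.items():
        distances[layer] = distance
    return distances

def is_culled(layer, distance_to_camera, cull_distances, far_plane):
    """An object is culled once it is farther away than its layer's limit."""
    limit = cull_distances[layer] or far_plane
    return distance_to_camera > limit

cull_distances = make_cull_distances({SMALL_PROPS_LAYER: 60.0})

# A rock 80 units away is hidden; a building on layer 0 at the same distance is not.
print(is_culled(SMALL_PROPS_LAYER, 80.0, cull_distances, far_plane=1000.0))
print(is_culled(0, 80.0, cull_distances, far_plane=1000.0))
```

Keeping the default at "use the far plane" means only the layers that are explicitly overridden get the more aggressive culling.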

Shadows 阴影

If you are deploying for Desktop platforms then you should be careful when using shadows because they can add a lot of rendering overhead to your game if not used correctly. For further details, see the Shadows page.

如果你是组建的目标是台式机平台,那么你要注意阴影;阴影开销一般较大。如果不正确使用,它们可能会为你的游戏带来大量的性能开销。有关阴影的更多细节,请阅读阴影页。

Note: Shadows are not currently supported on iOS or Android devices.

注意:请记住目前iOS或Android设备不支持阴影。

See Also 另请参见

  • Modeling Characters for Optimal Performance 人物建模优化性能
  • Rendering Statistics Window 渲染统计窗口

iOS

Useful background on iOS optimization can be found on the iOS hardware page.

IOS硬件页上可以找到iOS优化有用的资料。

Alpha-Testing (Alpha测试)

Unlike desktop machines, iOS devices incur a high performance overhead for alpha-testing (or use of the discard and clip operations in pixel shaders). You should replace alpha-test shaders with alpha-blend if at all possible. Where alpha-testing cannot be avoided, you should keep the overall number of visible alpha-tested pixels to a minimum.

与台式机不同,iOS设备的alpha测试产生比较高的性能开销。您应该替换带有alpha混色的alpha 测试着色器,尽一切可能。alpha测试无法避免的,你应该保持整体可见alpha 测试的像素数目减少到最低。

Vertex Performance 顶点性能

Generally you should aim to have no more than 40,000 vertices visible per frame when targeting iPhone 3GS or newer devices. You should keep the vertex count below 10,000 for older devices equipped with the MBX GPU, such as the original iPhone, iPhone 3G and iPod Touch 1st and 2nd Generation.

一般来说,针对iPhone 3GS或更新的设备时,目标应该使每帧可见的顶点不超过40,000。在配备MBX GPU的旧设备,你应该保持顶点数低于10,000,如原来的iPhone,iPhone 3G和iPod Touch一代和第二代。

Lighting Performance 照明性能

Per-pixel dynamic lighting will add significant rendering overhead to every affected pixel and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light illuminating any single object and use directional lights as far as possible. Note that a Pixel Light is one which has its Render Mode option set to Important.

逐像素的动态照明将显着增加每个受影响的像素的渲染开销,并可能导致对象多次渲染。避免多于一个像素灯 Pixel Light照亮任何单一的物件,并尽量使用方向灯。请注意,一个像素灯,渲染模式选项设置为重要Important。

Per-vertex dynamic lighting can add significant cost to vertex transformations. Try to avoid situations where multiple lights illuminate any given object. For static objects, baked lighting is much more efficient.

逐顶点的动态照明显着增加顶点转变的开销。尽量避免多个灯照亮任何给定的物体的情况,对于静态对象,烘焙照明更加高效。

Optimize Model Geometry (优化模型几何体)

When optimizing the geometry of a model, there are two basic rules:

优化模型的几何体时,有两个基本规则:
  • Don't use any more triangles than necessary 
    三角形数目不要太多
  • Try to keep the number of UV mapping seams and hard edges (ie, doubled-up vertices) as low as possible 
    尽量保持尽可能少的UV贴图接缝和硬边缘的数目(顶点增加一倍)

Note that the actual number of vertices that graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the geometric vertex count, ie, the number of distinct corner points that make up a model.

请注意,图形硬件处理顶点的实际数量通常和3D应用程序显示的不一样。建模应用程序通常显示几何顶点的数量,例如构建模型不同角点的数量。

For a graphics card, however, some geometric vertices will need to be split into two or more logical vertices for rendering purposes. A vertex must be split if it has multiple normals, UV coordinates or vertex colors. Consequently, the vertex count in Unity is invariably a lot higher than the count given by the 3D application.

然而,对于一个图形卡,将需要一些几何顶点拆分成两个或两个以上的逻辑顶点来渲染。如果有多个法线,UV坐标或顶点颜色的顶点必须分割。因此,在Unity 的顶点计数总是比3D应用程序计数高了很多。
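The difference between the two counts can be reproduced with a short Python sketch. It assumes a simple model in which every face corner carries a (position, normal, UV) triple and the hardware needs one logical vertex per distinct triple; the cube generator is a hypothetical helper written only for this example.

```python
def count_vertices(face_corners):
    """face_corners: list of (position, normal, uv) tuples, one per face corner.

    Returns (geometric_count, logical_count): distinct positions versus
    distinct full attribute combinations.
    """
    geometric = {position for position, _normal, _uv in face_corners}
    logical = set(face_corners)
    return len(geometric), len(logical)

def cube_corners(smooth):
    """Generate the 24 face corners of a cube (6 faces x 4 corners).

    With smooth=False every face uses its own flat normal (hard edges).
    With smooth=True each corner uses one shared normal direction, stored
    unnormalized because it only serves as an identifier here.
    """
    corners = []
    for axis in range(3):
        others = [i for i in range(3) if i != axis]
        for sign in (-1, 1):
            face_normal = tuple(sign if i == axis else 0 for i in range(3))
            for a in (-1, 1):
                for b in (-1, 1):
                    position = [0, 0, 0]
                    position[axis] = sign
                    position[others[0]] = a
                    position[others[1]] = b
                    position = tuple(position)
                    normal = position if smooth else face_normal
                    corners.append((position, normal, (0, 0)))
    return corners

print("hard edges:", count_vertices(cube_corners(smooth=False)))
print("smooth:    ", count_vertices(cube_corners(smooth=True)))
```

A modeling tool would report 8 vertices for either cube, but with hard edges each corner is shared by three faces with three different normals, so the hard-edged version needs 24 logical vertices.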

Texture Compression 纹理压缩

Using iOS's native PVRT compression formats will decrease the size of your textures (resulting in faster load times and smaller memory footprint) and can also dramatically increase rendering performance. Compressed textures use only a fraction of the memory bandwidth needed for uncompressed 32bit RGBA textures. A comparison of uncompressed vs compressed texture performance can be found in the iOS Hardware Guide.

使用IOS的原生PVRT压缩格式,将减少纹理的大小(结果是更快的加载时间和较小的内存占用),也可以大大提高渲染性能。压缩纹理仅使用未压缩32位的RGBA纹理所需的内存带宽的一小部分。未压缩与压缩纹理性能的比较,可以在iOS硬件指南找到。

Some images are prone to visual artifacts in the alpha channels of PVRT-compressed textures. In such cases, you might need to tweak the PVRT compression parameters directly in your imaging software. You can do that by installing the PVR export plugin or using PVRTexTool from Imagination Tech, the creators of the PVRT format. The resulting compressed image file with a .pvr extension will be imported by the Unity editor directly and the specified compression parameters will be preserved.

有些图片在PVRT压缩纹理的alpha通道容易产生视觉缺陷。在这种情况下,你可能需要在图像处理软件直接调整PVRT的压缩参数。你可以通过安装PVR导出插件PVR export plugin或使用Imagination Tech的 PVRTexTool,用于创建PVRT格式。产生的扩展名为 .pvr的压缩图像文件将通过Unity 编辑器直接导入和指定的压缩参数将被保留。

If PVRT-compressed textures do not give good enough visual quality or you need especially crisp imaging (as you might for GUI textures, say) then you should consider using 16-bit textures instead of 32-bit RGBA textures. By doing so, you will reduce the memory bandwidth by half.

如果PVRT的压缩纹理没有给出足够好的视觉质量,或者您需要特别明快的显像(可能是GUI的纹理),那么你应该考虑使用16位的纹理,而不是RGBA的纹理。这样做,你将减少一半的内存带宽。
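The size differences between these formats follow directly from their bits per pixel. The Python sketch below works the arithmetic for a hypothetical 1024x1024 texture without mipmaps, assuming the 4 and 2 bits-per-pixel modes of PVRTC; the format labels are only descriptive names for this example.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of a single texture level at the given bit depth."""
    return width * height * bits_per_pixel // 8

# Bits per pixel for each format considered in this example.
FORMATS = {
    "RGBA 32-bit": 32,
    "RGBA 16-bit": 16,
    "PVRTC 4bpp": 4,
    "PVRTC 2bpp": 2,
}

for name, bpp in FORMATS.items():
    kib = texture_bytes(1024, 1024, bpp) // 1024
    print(f"{name:12s} {kib:5d} KiB")
```

The 16-bit texture is half the size of the 32-bit one, and the 4bpp compressed texture is one eighth of it, which is where the bandwidth savings described above come from.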

Tips for writing high-performance shaders
编写高性能着色器的小技巧

The GPUs on iOS devices have fully supported pixel and vertex shaders since the iPhone 3GS. However, the performance is nowhere near what you would get from a desktop machine, so you should not expect desktop shaders to port to iOS unchanged. Typically, shaders will need to be hand optimized to reduce calculations and texture reads in order to get good performance.

从iPhone 3GS开始,iOS设备的GPU已经完全支持像素和顶点着色器。然而,性能远不及台式机,所以你不应该指望台式机的着色器运用到iOS设备上效果会维持不变。通常情况下,着色器将需要手工优化,以减少计算和纹理读取,以获得良好的性能。

Complex mathematical operations 复杂的数学运算

Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc) will tax the GPU greatly, so a good rule of thumb is to have no more than one such operation per fragment. Consider using lookup textures as an alternative where applicable.

复杂的数学函数(如pow,exp,log,cos,sin,tan等等)会大大增加GPU负担,所以一个好的经验法则是,每一个片段不超过一个这样的操作。考虑使用查找纹理作为替代品。
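The lookup-texture idea can be sketched in Python as a precomputed table that replaces a call to pow. This is only a CPU-side illustration of the principle: the table size of 256, the exponent of 8 and the function names are assumptions, and the interpolation stands in for the linear filtering a texture sampler would perform.

```python
def build_pow_lut(exponent, size=256):
    """Precompute x ** exponent for evenly spaced x in [0, 1]."""
    return [(i / (size - 1)) ** exponent for i in range(size)]

def sample_lut(lut, x):
    """Linearly interpolated lookup for x in [0, 1], like a filtered 1D texture."""
    x = min(max(x, 0.0), 1.0)
    position = x * (len(lut) - 1)
    i = int(position)
    j = min(i + 1, len(lut) - 1)
    fraction = position - i
    return lut[i] * (1.0 - fraction) + lut[j] * fraction

# Hypothetical specular falloff with exponent 8.
lut = build_pow_lut(8.0)
for x in (0.25, 0.5, 0.9):
    print(x, sample_lut(lut, x), x ** 8.0)
```

The table is built once, and each lookup then costs a read and a blend instead of a transcendental operation. The trade-off is an extra texture read, so this only helps where texture bandwidth is not already the bottleneck.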

It is not advisable to attempt to write your own normalize, dot, inversesqrt operations, however. If you use the built-in ones then the driver will generate much better code for you.

尝试编写自己的normalize,dot,inversesqrt 等操作,这是不可取。然而如果您使用内置的驱动程序会为你产生更好的代码。

Bear in mind also that the discard operation will make your fragments slower.

紧记discard 操作,会使你的片段速度变慢。

Floating point operations 浮点运算

You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance.

编写自定义的着色器时,你总是指定浮点变量精度。关键是挑选精度尽可能小的浮点格式,以获得最佳的性能。

If the shader is written in GLSL ES then the floating point precision is specified as follows:

如果用GLSL ES书写的着色器,浮点精确度规定如下: -

  • highp - full 32-bit floating point format, suitable for vertex transformations but has the slowest performance. 
    highp  - 32位浮点格式,适合用于顶点变换,但性能最慢。
  • mediump - reduced 16-bit floating point format, suitable for texture UV coordinates and roughly twice as fast as highp
    mediump  - 16位浮点格式,适用于纹理UV坐标和比highp 大约快两倍
  • lowp - 10-bit fixed point format, suitable for colors, lighting calculation and other high-performance operations and roughly four times faster than highp
    lowp  - 10位的定点格式,适合对色彩,照明计算和其它高性能操作,速度大约是highp 的4倍

If the shader is written in Cg or it is a surface shader then precision is specified as follows:

如果是用CG书写的着色器或它是一个表面着色器,指定精度如下: -

  • float - analogous to highp in GLSL ES, slowest 
    float  - 类似于在GLSL ES 的highp ,最慢
  • half - analogous to mediump in GLSL ES, roughly twice as fast as float
    half  - 类似于在GLSL ES 的mediump ,比float大约快两倍
  • fixed - analogous to lowp in GLSL ES, roughly four times faster than float
    fixed - 类似于在GLSL ES的lowp,速度大约是float 的4倍

For further details about shader performance, please read the Shader Performance page.

有关着色器性能的更多细节,请阅读的着色器性能页面。
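To get a feel for what the reduced formats can represent, the following Python sketch emulates them on the CPU. It rests on two assumptions that are not taken from the text above: that mediump / half behaves like IEEE 754 half precision, and that lowp / fixed is a 10-bit fixed point value covering [-2, 2) in steps of 1/256. Actual GPU behaviour varies by hardware and driver.

```python
import struct

def to_half(x):
    """Round x to IEEE 754 half precision (stand-in for mediump / half)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_lowp(x):
    """Quantize x to steps of 1/256 in [-2, 2) (model of lowp / fixed)."""
    step = 1.0 / 256.0
    clamped = min(max(x, -2.0), 2.0 - step)
    return round(clamped / step) * step

# Colors in [0, 1] survive lowp with an error of at most half a step.
color = 0.7
print("lowp color error:", abs(to_lowp(color) - color))

# A UV coordinate in [0, 1] keeps a small error at half precision.
uv = 0.3
print("half UV error:", abs(to_half(uv) - uv))

# A large world-space position does not: neighbouring values collapse.
print("half of 4097.0:", to_half(4097.0))
```

This matches the guidance in the lists above: values with a small, known range such as colors and UVs tolerate the reduced formats, while vertex positions need the full-precision type.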

Hardware documentation 硬件说明文件

Take the time to study Apple's documentation on hardware and best practices for writing shaders. Note, however, that we would suggest being more aggressive with floating point precision hints.

花一点时间去学习苹果的文档, hardware and best practices for writing shaders。

Bake Lighting into Lightmaps 烘焙光照到光照贴图

Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

使用Unity内置的产生光照贴图工具,将你场景中的静态光照烘焙至纹理。产生使用光照贴图的环境的过程仅仅比放置一个灯光在Unity的场景多花一点点时间,但:

  • It is going to run a lot faster (2-3 times faster with, e.g., 2 pixel lights) 
    它运行速度快很多(2-3倍,例如:2个像素灯)
  • And look a lot better since you can bake global illumination and the lightmapper can smooth the results 
    看上去好很多,因为你可以烘焙全局照明和光照贴图工具可以平滑结果

Share Materials 共享材质

If a number of objects being rendered by the same camera use the same material, then Unity iOS will be able to employ a large variety of internal optimizations such as:

相同相机,如果被渲染的物体使用相同的材质, Unity IOS能够运用多种内部优化,如:

  • Avoiding setting various render states in OpenGL ES. 
    避免设置各种渲染状态OpenGL ES的。
  • Avoiding calculation of different parameters required to set up vertex and pixel processing 
    避免设置顶点和像素处理所需的不同参数的计算
  • Batching small moving objects to reduce draw calls 
    批处理小的移动物体,减少绘制调用
  • Batching both big and small objects that have the "static" property enabled to reduce draw calls 
    批处理大的和小的物体,启用"静态static"的属性,以减少绘制调用

All these optimizations will save you precious CPU cycles. Therefore, putting in the extra work to combine textures into a single atlas and making a number of objects use the same material will always pay off. Do it!

所有这些优化会为您节省宝贵的CPU周期。因此,把额外的工作放在合并纹理成单一的图集的和让物体使用相同的材质,总会有回报的。做到这一点!
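When several textures are packed into one atlas so that objects can share a material, each mesh's UV coordinates have to be remapped into its sub-rectangle of the atlas. The Python sketch below shows that remapping. The atlas layout and the names in it are hypothetical, and rectangles are expressed as fractions of the atlas size.

```python
def remap_uv(uv, rect):
    """Map a UV from the original texture's [0, 1] space into an atlas rectangle.

    rect is (x, y, width, height), each given as a fraction of the atlas size.
    """
    u, v = uv
    x, y, width, height = rect
    return (x + u * width, y + v * height)

# Hypothetical 2x2 atlas layout: two textures packed side by side in the bottom row.
ATLAS = {
    "crate": (0.0, 0.0, 0.5, 0.5),
    "barrel": (0.5, 0.0, 0.5, 0.5),
}

crate_uvs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print([remap_uv(uv, ATLAS["crate"]) for uv in crate_uvs])
print([remap_uv(uv, ATLAS["barrel"]) for uv in crate_uvs])
```

One caveat of this approach is that UVs which tile outside the [0, 1] range no longer repeat correctly once they are remapped, so tiling textures are usually poor candidates for an atlas.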

Simple Checklist to make Your Game Faster
简要清单,让你的游戏速度更快

  • Keep vertex count below:
    保持顶点数如下:
    • 40K per frame when targeting iPhone 3GS and newer devices (with SGX GPU) 
      针对iPhone 3GS和更加新的设备(带SGX GPU),每帧40K
    • 10K per frame when targeting older devices (with MBX GPU) 
      针对旧设备(带MBX GPU)每帧10K时
  • If you're using built-in shaders, pick ones from the Mobile category. Keep in mind that Mobile/VertexLit is currently the fastest shader. 
    如果您使用的是内置的着色器,对于移动平台。记住,Mobile / VertexLit 是目前最快的着色器。
  • Keep the number of different materials per scene low - share as many materials between different objects as possible. 
    使每场景中不同的材质的数量尽可能少 – 物体之间尽可能共享相同的材质
  • Set the Static property on non-moving objects to allow internal optimizations. 
    非移动的物体上设置静态Static 属性,使可以进行内部优化。
  • Use PVRTC formats for textures when possible, otherwise choose 16bit textures over 32bit. 
    在可能的情况下使用PVRTC格式的纹理,否则选择16bit纹理优于32bit。
  • Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach. 
    使用合并器或像素着色器,混合每帧的多个纹理,而不是用多通道方法。
  • If writing custom shaders, always use the smallest possible floating point format:
    如果编写自定义的着色器,用尽可能小的浮点格式:
    • fixed / lowp -- perfect for color, lighting information and normals, 
      用于颜色,灯光信息和法线
    • half / mediump -- for texture UV coordinates, 
      用于纹理UV坐标
    • float / highp -- avoid in pixel shaders, fine to use in vertex shader for vertex position calculations. 
      避免在像素着色器,而是使用顶点着色器,计算顶点的位置。
  • Minimize use of complex mathematical operations such as pow, sin, cos etc in pixel shaders. 
    尽量减少在像素着色器使用复杂的数学运算,如pow, sin, cos 等。
  • Do not use Pixel Lights when it is not necessary -- choose to have only a single (preferably directional) pixel light affecting your geometry. 
    当没有必要时,不要使用像素灯- 选择只有一个(最好是方向光)像素灯的光线影响您的几何图形。
  • Do not use dynamic lights when it is not necessary -- choose to bake lighting instead. 
    当没有必要时,不要使用动态光源- 选择烘焙照明。
  • Choose to use fewer textures per fragment. 
    每段使用较少的纹理。
  • Avoid alpha-testing, choose alpha-blending instead. 
    避免使用alpha测试,而是选择alpha混合。
  • Do not use fog when it is not necessary. 
    当没有必要时,不要使用雾效。
  • Learn the benefits of occlusion culling and use it to reduce the amount of visible geometry and draw calls in complex static scenes with lots of occlusion. Plan your levels to benefit from occlusion culling. 
    了解遮挡剔除的好处,在复杂的静态场景的情况下,有很多遮挡,用它来减少可见几何体的数量和绘制调用。计划你的关卡,受益于遮挡剔除。
  • Use skyboxes to "fake" distant geometry. 
    使用天空盒制造出"假"遥远的几何体。

See Also 另见

  • Optimizing iOS Performance 优化的iOS性能
  • iOS Hardware Guide iOS硬件指南
  • iOS Automatic Draw Call Batching iOS自动描绘批处理
  • Modeling Optimized Characters 优化人物建模
  • Rendering Statistics 渲染统计

Android

Lighting Performance 光照性能

Per-pixel dynamic lighting will add significant cost to every affected pixel and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light affecting any single object, and prefer it to be a directional light. Note that a Pixel Light is a light which has its Render Mode setting set to Important.

逐像素的动态照明将显着增加每个受影响的像素的渲染开销,并可能导致对象多次渲染。避免多于一个像素灯 Pixel Light照亮任何单一的物件,并尽量使用方向灯。请注意,一个像素灯,渲染模式选项设置为重要Important。

Per-vertex dynamic lighting can add significant cost to vertex transformations. Avoid multiple lights affecting single objects. Bake lighting for static objects.

逐顶点的动态照明显着增加顶点变换的开销。尽量避免多个灯照亮任何给定的物体的情况,对于静态对象,烘焙静态物体的照明。

Optimize Model Geometry 优化模型几何体

When optimizing the geometry of a model, there are two basic rules:

优化模型的几何体时,有两个基本规则:

  • Don't use an excessive number of faces if you don't have to 
    如果没有必要,不要使用过多的面
  • Keep the number of UV mapping seams and hard edges as low as possible 
    尽量保持尽可能少的UV贴图接缝和硬边缘的数目

Note that the actual number of vertices that graphics hardware has to process is usually not the same as what is displayed in a 3D application. Modeling applications usually display the geometric vertex count, i.e. the number of points that make up a model.

请注意,图形硬件处理顶点的实际数量通常和3D应用程序显示的不一样。建模应用程序通常显示几何顶点的数量,例如构建模型不同角点的数量。

For a graphics card, however, some vertices have to be split into separate ones. If a vertex has multiple normals (it's on a "hard edge"), or has multiple UV coordinates, or has multiple vertex colors, it has to be split. So the vertex count you see in Unity is almost always different from the one displayed in the 3D application.

然而,对于一个图形卡,将需要一些几何顶点拆分成两个或两个以上的逻辑顶点来渲染。如果有多个法线("硬边缘上"),或者多个UV坐标或有多个顶点颜色的顶点必须分割。因此,在Unity 的顶点计数总是比3D应用程序计数高了很多。

Texture Compression 纹理压缩

All Android devices with support for OpenGL ES 2.0 also support the ETC1 compression format; you are therefore encouraged to use ETC1 as the preferred texture format whenever possible. Using compressed textures is important not only to decrease the size of your textures (resulting in faster load times and a smaller memory footprint), but also because it can increase your rendering performance dramatically. Compressed textures require only a fraction of the memory bandwidth compared to full-blown 32bit RGBA textures.

支持OpenGL ES 2.0的所有Android设备还支持ETC1压缩格式(ETC1 compression format) ,因此,它鼓励尽可能优选使用ETC1纹理格式。使用压缩纹理不仅是重要的,以减少您的纹理的大小(导致更快的加载时间和较小的内存占用),但也可以极大地提高渲染性能!压缩纹理仅使用未压缩32位的RGBA纹理所需的内存带宽的一小部分。

If targeting a specific graphics architecture, such as the Nvidia Tegra or Qualcomm Snapdragon, it may be worth considering the proprietary compression formats available on those architectures. The Android Market also allows filtering based on supported texture compression format, meaning a distribution archive (.apk) with, for example, DXT compressed textures can be prevented from being downloaded to a device which doesn't support it.

如果针对一个特定的图形架构,如Nvidia Tegra或者Qualcomm Snapdragon,可能要考虑在这些架构上使用专有的压缩格式。 Android Market还可以根据支持的纹理压缩格式进行过滤, 意味着分发的.apk包带有比如DXT压缩纹理,可以防止下载的设备上不支持它。

Enable Mip Maps 启动多级纹理

As a rule of thumb, always have Generate Mip Maps enabled. In the same way that Texture Compression can help limit the amount of texture data transferred when the GPU is rendering, a mip mapped texture will enable the GPU to use a lower-resolution texture for smaller triangles. The only exception to this rule is when a texel (texture pixel) is known to map 1:1 to the rendered screen pixel, as with UI elements or in a pure 2D game.

根据经验,总是启用生成多级纹理。当GPU渲染时,在同样的方式纹理压缩可以帮助限制纹理数据传输量。多级纹理让GPU能让较小的三角形使用较低分辨率的纹理,此规则的唯一例外是当texel(纹理像素)1:1映射到渲染屏幕像素,像UI元素或在纯2D游戏。
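The cost of enabling mip maps can be worked out with a few lines of Python. The sketch assumes the usual scheme in which each level halves the previous one until it reaches 1x1; the 1024x1024 base size is just an example.

```python
def mip_chain(width, height):
    """Return the (width, height) of every mip level, from the base down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels

levels = mip_chain(1024, 1024)
base_texels = 1024 * 1024
total_texels = sum(w * h for w, h in levels)

print("levels:", len(levels))
print("total texels:", total_texels)
print("overhead vs base:", total_texels / base_texels)
```

The full chain costs roughly one third more memory than the base level alone. In exchange, a small or distant triangle can be textured from a level with far fewer texels, which reduces the texture data the GPU has to fetch.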

Tips for writing well performing shaders 编写性能好的着色器的技巧

Although all Android OpenGL ES 2.0 GPUs fully support pixel and vertex shaders, do not expect to grab a desktop shader with complex per-pixel functionality and run it on an Android device at 30 frames per second. Most often, shaders will have to be hand optimized, with calculations and texture reads kept to a minimum, in order to achieve good frame rates.

尽管所有的Android的OpenGL ES 2.0 GPU完全支持像素和顶点着色器,不要指望可以拿一个复杂的逐像素功能的台式机着色器在Android设备上运行每秒30帧。大多数情况下着色器将必须手工优化,计算和纹理读取保持在最低限度,以达到良好的帧速率。

Complex arithmetic operations 复杂的数学运算

Mathematical operations such as pow, exp, log, cos, sin, tan etc. tax the GPU heavily. A rule of thumb is to have no more than one such operation per fragment. Consider that lookup textures can sometimes be a better alternative.

复杂的数学函数(如pow,exp,log,cos,sin,tan等等)会大大增加GPU负担,所以一个好的经验法则是,每一个片段不超过一个这样的操作。考虑使用查找纹理作为替代品。

Do NOT try to roll your own normalize, dot, inversesqrt operations, however. Always use the built-in ones; this way the driver will generate much better code for you.

不要尝试编写自己的normalize,dot,inversesqrt 等操作。然而如果您使用内置的,会为你产生更好的代码。

Keep in mind that the discard operation will make your fragments slower.

紧记discard 操作,会使你的帧速度变慢。

Floating point operations 浮点运算

Always specify the precision of floating point variables when writing custom shaders. It is crucial to pick the smallest possible format in order to achieve the best performance.

编写自定义的着色器时,你总是指定浮点变量精度。关键是挑选精度尽可能小的浮点格式,以获得最佳的性能。

If the shader is written in GLSL ES, then precision is specified as follows:

如果用GLSL ES编写的着色器,浮点精确度规定如下: -

  • highp - full 32-bit floating point format, well suited to vertex transformations, slowest 
    highp  - 32位浮点格式,适合用于顶点变换,但性能最慢。
  • mediump - reduced 16-bit floating point format, well suited to texture UV coordinates, roughly twice as fast as highp
    mediump  - 16位浮点格式,适用于纹理UV坐标和比highp 大约快两倍
  • lowp - 10-bit fixed point format, well suited to colors, lighting calculation and other high-performance operations, roughly four times faster than highp
    lowp - 10位的定点格式,适合对颜色,照明计算和其它高性能操作,速度大约是highp 的4倍

If the shader is written in Cg or it is a surface shader, then precision is specified as follows:

如果是用CG编写的着色器或是一个表面着色器,指定精度如下: -

  • float - analogous to highp in GLSL ES, slowest 
    float  - 类似于在GLSL ES 的highp ,最慢
  • half - analogous to mediump in GLSL ES, roughly twice as fast as float 
    half  - 类似于在GLSL ES 的mediump ,比float大约快两倍
  • fixed - analogous to lowp in GLSL ES, roughly four times faster than float 
    fixed - 类似于在GLSL ES的lowp,速度大约是float 的4倍

For more details about general shader performance, please read the Shader Performance page. Quoted performance figures are based on the PowerVR graphics architecture, available in devices such as the Samsung Nexus S. Other hardware architectures may experience less (or more) benefit from using reduced register precision.

有关着色器性能的更多细节,请阅读的着色器性能页面。引述的性能数据是基于PowerVR图形架构,可用设备如Samsung Nexus S。其他硬件架构可能遇到从使用减少的寄存器精度或多或少的好处。

Bake Lighting into Lightmaps 烘焙光照到光照贴图

Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

使用Unity内置的产生光照贴图的工具,将你场景中的静态光照烘焙至纹理。产生使用光照贴图的环境的过程仅仅比放置一个灯光在Unity的场景多花一点点时间,但:

  • It is going to run a lot faster (2-3 times faster with, e.g., 2 pixel lights) 
    它运行速度快很多(2-3倍,例如:2个像素灯)
  • And look a lot better since you can bake global illumination and the lightmapper can smooth the results 
    看上去好很多,因为你可以烘焙全局照明和光照贴图工具可以平滑结果

Share Materials 共享材质

If a number of objects being rendered by the same camera use the same material, then Unity Android will be able to employ a large variety of internal optimizations such as:

相同相机,如果被渲染的物体使用相同的材质, Unity Android能够运用多种内部优化,如:

  • Avoiding setting various render states in OpenGL ES. 
    避免设置各种渲染状态为OpenGL ES。
  • Avoiding calculation of different parameters required to set up vertex and pixel processing 
    避免设置顶点和像素处理所需的不同参数的计算
  • Batching small moving objects to reduce draw calls 
    批处理小的移动物体,减少绘制调用
  • Batching both big and small objects that have the "static" property enabled to reduce draw calls 
    批处理大的和小的物体,启用"静态static"的属性,以减少绘制调用

All these optimizations will save you precious CPU cycles. Therefore, putting in the extra work to combine textures into a single atlas and making a number of objects use the same material will always pay off. Do it!

所有这些优化会为您节省宝贵的CPU周期。因此,把额外的工作放在合并纹理成单一的图集的和让物体使用相同的材质,总会有回报的。做到这一点!

Simple Checklist to make Your Game Faster
简要清单,让你的游戏速度更快

  • If you're using built-in shaders, pick ones from the Mobile category. Keep in mind that Mobile/VertexLit is currently the fastest shader. 
    如果您使用的是内置的着色器,对于移动平台。记住,Mobile / VertexLit 是目前最快的着色器。
  • Keep the number of different materials per scene low - share as many materials between different objects as possible. 
    每场景中不同的材质的数量尽可能少 – 物体之间尽可能共享相同的材质
  • Set the Static property on non-moving objects to allow internal optimizations. 
    非移动的物体上设置静态Static 属性,使可以进行内部优化。
  • Use ETC1 format for textures when possible, otherwise choose 16bit textures over 32bit for uncompressed texture data. 
    在可能的情况下使用ETC1格式的纹理,否则,选择16bit纹理优于32bit未压缩的纹理数据。
  • Use mipmaps. 使用多级纹理
  • Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach. 
    使用合并器或像素着色器,混合每帧的多个纹理,而不是用多通道方法
  • If writing custom shaders, always use the smallest possible floating point format:
    如果编写自定义的着色器,用尽可能小的浮点格式
    • fixed / lowp -- perfect for color, lighting information and normals, 
      用于颜色,灯光信息和法线
    • half / mediump -- for texture UV coordinates, 
      用于纹理UV坐标,
    • float / highp -- avoid in pixel shaders, fine to use in vertex shader for vertex position calculations. 
      避免在像素着色器,而是使用顶点着色器,计算顶点的位置
  • Minimize use of complex mathematical operations such as pow, sin, cos etc in pixel shaders. 
    尽量减少在像素着色器使用复杂的数学运算,如pow, sin, cos 等。
  • Do not use Pixel Lights when it is not necessary -- choose to have only a single (preferably directional) pixel light affecting your geometry. 
    当没有必要时,不要使用像素灯- 选择只有一个(最好是方向光)像素灯的光线影响您的几何图形
  • Do not use dynamic lights when it is not necessary -- choose to bake lighting instead. 
    当没有必要时,不要使用动态光源- 选择烘焙照明
  • Choose to use fewer textures per fragment. 
    每段使用较少的纹理
  • Avoid alpha-testing, choose alpha-blending instead. 
    避免使用alpha测试,而是选择alpha混合。
  • Do not use fog when it is not necessary. 
    当没有必要时,不要使用雾效。
  • Learn the benefits of occlusion culling and use it to reduce the amount of visible geometry and draw calls in complex static scenes with lots of occlusion. Plan your levels to benefit from occlusion culling. 
    了解遮挡剔除的好处,在复杂的静态场景的情况下,有很多遮挡,用它来减少可见几何体的数量和绘制调用。计划你的关卡,受益于遮挡剔除。
  • Use skyboxes to "fake" distant geometry. 
    使用天空盒制造出"假"遥远的几何体

See Also 参见

  • iPhone Optimizing Graphics Performance (for when the graphics architecture is known to be Imagination Tech's PowerVR.) 
    iPhone的优化图形性能(Imagination Tech的PowerVR图形架构。)
