Original article: http://game.ceeger.com/Components/SL-ShaderPerformance.html
Compute only what you need; anything that is not actually needed can be eliminated. For example, supporting a per-material color makes a shader more flexible, but if you always leave that color set to white, it is a useless computation performed for every vertex or pixel rendered on screen.
Another thing to keep in mind is the frequency of computations. Usually many more pixels are rendered (and hence pixel shader executions) than there are vertices (vertex shader executions), and there are more vertices than objects being rendered. So, where you can, move computations out of the pixel shader into the vertex shader, or out of shaders completely and set the values once from a script.
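As a rough illustration of moving work to a lower frequency, the sketch below computes a scrolling UV offset once per vertex and lets the rasterizer interpolate it, so the fragment shader only samples the texture. The shader name, the _ScrollSpeed property and the scrolling effect itself are hypothetical choices made for illustration; they do not come from the original article. The sketch assumes Unity's built-in render pipeline and UnityCG.cginc.

```hlsl
Shader "Examples/PerVertexScrollSketch" {
    Properties {
        _MainTex ("Texture", 2D) = "white" {}
        // Hypothetical property: UV units scrolled per second (xy only).
        _ScrollSpeed ("Scroll Speed (XY)", Vector) = (0.1, 0, 0, 0)
    }
    SubShader {
        Pass {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            float4 _MainTex_ST;   // tiling/offset, required by TRANSFORM_TEX
            float4 _ScrollSpeed;

            struct v2f {
                float4 pos : SV_POSITION;
                float2 uv  : TEXCOORD0;
            };

            v2f vert (appdata_base v) {
                v2f o;
                // Newer Unity versions spell this UnityObjectToClipPos(v.vertex).
                o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
                // The offset is the same for every vertex, so adding it here
                // and interpolating gives the same UVs as adding it per pixel.
                o.uv = TRANSFORM_TEX(v.texcoord, _MainTex)
                     + _ScrollSpeed.xy * _Time.y;
                return o;
            }

            fixed4 frag (v2f i) : SV_Target {
                // Per-pixel work is reduced to a single texture fetch.
                return tex2D(_MainTex, i.uv);
            }
            ENDCG
        }
    }
}
```

The same offset could also be computed once per frame in a script and passed in as a material property, which would take it out of the shader entirely.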
Surface Shaders are great for writing shaders that interact with lighting. However, their default options are tuned for the general case. In many cases you can tweak them to make shaders run faster, or at least be smaller:
approxview: for shaders that use the view direction (e.g. Specular), this directive normalizes the view direction per vertex instead of per pixel. This is approximate, but often good enough.
halfasview: for Specular shader types, this is even faster. The half-vector (halfway between the lighting direction and the view vector) is computed and normalized per vertex, and the lighting function receives the half-vector as a parameter instead of the view vector.
noforwardadd: makes a shader fully support only one directional light in Forward rendering. The remaining lights can still have an effect as per-vertex lights or spherical harmonics. This makes the shader smaller and ensures it always renders in one pass, even with multiple lights present.
noambient: disables ambient lighting and spherical harmonics lights on a shader. This can be slightly faster.
When writing shaders in Cg/HLSL, there are three basic number types: float, half and fixed (as well as vector and matrix variants of them, e.g. half3 and float4x4):
float: high-precision floating point. Generally 32 bits, just like the float type in regular programming languages.
half: medium-precision floating point. Generally 16 bits, with a range of -60000 to +60000 and 3.3 decimal digits of precision.
fixed: low-precision fixed point. Generally 11 bits, with a range of -2.0 to +2.0 and 1/256th precision.
Use the lowest precision possible; this is especially important on mobile platforms such as iOS and Android. Good rules of thumb are:
For colors and unit-length vectors, use fixed.
For everything else, use half if its range and precision are sufficient; otherwise use float.
On mobile platforms, the key is to keep as much as possible in low precision in the fragment shader. On most mobile GPUs, applying swizzles to low-precision (fixed/lowp) types is costly; converting between fixed/lowp and higher-precision types is quite costly as well.
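The Surface Shader directives and the precision guidance above can be combined in one shader. The following is a minimal sketch, not a definitive implementation: the shader name and the CheapSpecular lighting model are hypothetical names introduced here. A custom lighting function is used because, with halfasview, the third parameter is the half-vector rather than the view direction. Colors and unit-length vectors are kept in fixed, and the shininess value in half.

```hlsl
Shader "Examples/CheapSpecularSketch" {
    Properties {
        _MainTex ("Base (RGB) Gloss (A)", 2D) = "white" {}
        _Shininess ("Shininess", Range (0.03, 1)) = 0.078125
    }
    SubShader {
        Tags { "RenderType" = "Opaque" }

        CGPROGRAM
        // halfasview:   lighting function receives a per-vertex half-vector
        // noforwardadd: only one directional light is fully supported
        // noambient:    no ambient or spherical harmonics lighting
        #pragma surface surf CheapSpecular halfasview noforwardadd noambient

        sampler2D _MainTex;
        half _Shininess;

        struct Input {
            float2 uv_MainTex;   // texture coordinates stay in float
        };

        // Hypothetical lighting model. Because of halfasview, the third
        // argument is already the normalized half-vector.
        inline fixed4 LightingCheapSpecular (SurfaceOutput s, fixed3 lightDir,
                                             fixed3 halfDir, fixed atten) {
            fixed diff = max(0, dot(s.Normal, lightDir));
            fixed nh   = max(0, dot(s.Normal, halfDir));
            fixed spec = pow(nh, s.Specular * 128) * s.Gloss;

            fixed4 c;
            // _LightColor0 is declared by Unity's lighting include.
            c.rgb = (s.Albedo * diff + spec) * _LightColor0.rgb * atten;
            c.a = 0;
            return c;
        }

        void surf (Input IN, inout SurfaceOutput o) {
            fixed4 tex = tex2D(_MainTex, IN.uv_MainTex);  // color: fixed
            o.Albedo   = tex.rgb;
            o.Gloss    = tex.a;
            o.Specular = _Shininess;
        }
        ENDCG
    }
    Fallback "VertexLit"
}
```

Declaring tex as fixed4 and writing it straight into the fixed outputs avoids conversions between fixed and higher-precision types in the fragment stage.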
Fixed-function AlphaTest, or its programmable equivalent clip(), has different performance characteristics on different platforms.
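For reference, the programmable form looks roughly like the sketch below, which discards pixels whose alpha is below a cutoff. In a fixed-function pass the comparable state would be AlphaTest Greater [_Cutoff]. The shader name and the _Cutoff property are assumptions made for illustration, and the sketch makes no claim about how fast either form is on a given GPU.

```hlsl
Shader "Examples/CutoutSketch" {
    Properties {
        _MainTex ("Base (RGB) Alpha (A)", 2D) = "white" {}
        _Cutoff ("Alpha Cutoff", Range (0, 1)) = 0.5
    }
    SubShader {
        Tags { "Queue" = "AlphaTest" "RenderType" = "TransparentCutout" }
        Pass {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            float4 _MainTex_ST;   // tiling/offset, required by TRANSFORM_TEX
            fixed _Cutoff;

            struct v2f {
                float4 pos : SV_POSITION;
                float2 uv  : TEXCOORD0;
            };

            v2f vert (appdata_base v) {
                v2f o;
                // Newer Unity versions spell this UnityObjectToClipPos(v.vertex).
                o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
                o.uv = TRANSFORM_TEX(v.texcoord, _MainTex);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target {
                fixed4 col = tex2D(_MainTex, i.uv);
                // clip() discards the pixel when its argument is negative,
                // i.e. when alpha is below the cutoff.
                clip(col.a - _Cutoff);
                return col;
            }
            ENDCG
        }
    }
}
```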
On some platforms (mostly mobile GPUs found in iOS and Android devices), using ColorMask to leave out some channels (e.g. ColorMask RGB) can be expensive, so use it only if really necessary.