GPU Shader ALU Reading Notes

The vertex shader ALU is a multithreaded vector processor that operates on quad-float data. It consists of two functional units. The SIMD Vector Unit is responsible for the mov, mul, add, mad, dp3, dp4, dst, min, max, slt, and sge instructions. The Special Function Unit is responsible for the rcp, rsq, logp, expp, and lit instructions. Most of these instructions take one cycle to execute, although rcp and rsq take more than one cycle under specific circumstances. They take only one slot in the vertex shader, but they actually take longer than one cycle to execute when the result is used immediately, because that leads to a register stall.
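To make the split between the two units and the rcp/rsq stall concrete, here is a minimal Python sketch. The opcode lists come from the paragraph above; everything else is an assumption for illustration: the one-cycle stall length (the text gives no number), the tuple layout for instructions, the register names, and the simplification that dependencies are checked per whole register rather than per component.

```python
# Opcode-to-unit assignment as listed in the text.
SIMD_VECTOR_UNIT = {"mov", "mul", "add", "mad", "dp3", "dp4",
                    "dst", "min", "max", "slt", "sge"}
SPECIAL_FUNCTION_UNIT = {"rcp", "rsq", "logp", "expp", "lit"}

# Assumption: the text does not say how long the stall lasts,
# so one extra cycle is used here purely as a placeholder.
ASSUMED_STALL_CYCLES = 1


def unit_for(opcode):
    """Return the name of the functional unit that executes the opcode."""
    if opcode in SIMD_VECTOR_UNIT:
        return "SIMD Vector Unit"
    if opcode in SPECIAL_FUNCTION_UNIT:
        return "Special Function Unit"
    raise ValueError(f"unknown opcode: {opcode}")


def estimate_cycles(program):
    """Estimate cycles for a list of (opcode, dest, sources) tuples.

    Every instruction costs one cycle. An rcp or rsq whose destination
    register is read by the very next instruction adds the assumed stall.
    """
    cycles = 0
    for i, (opcode, dest, _sources) in enumerate(program):
        unit_for(opcode)  # rejects opcodes that neither unit handles
        cycles += 1
        if opcode in {"rcp", "rsq"} and i + 1 < len(program):
            next_sources = program[i + 1][2]
            if dest in next_sources:
                cycles += ASSUMED_STALL_CYCLES
    return cycles


# Hypothetical programs: the rcp result is used immediately ...
back_to_back = [
    ("rcp", "r0", ["r1"]),
    ("mul", "r2", ["r0", "r3"]),
]
# ... or only after an unrelated instruction has been issued.
interleaved = [
    ("rcp", "r0", ["r1"]),
    ("add", "r4", ["r5", "r6"]),
    ("mul", "r2", ["r0", "r3"]),
]

print(estimate_cycles(back_to_back))
print(estimate_cycles(interleaved))
```

Under these assumptions both programs are estimated at the same cycle count, but the interleaved one gets an extra add done in the cycle that would otherwise be lost to the stall. That is the practical point of the paragraph: avoid reading an rcp or rsq result in the very next instruction.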

Now let’s see how these registers and instructions are typically used in the vertex shader ALU.

There are 16 input registers, 96 constant registers, 12 temporary registers, 1 address register, and up to 13 output registers per rasterizer. Each register can handle 4x32-bit values. Each 32-bit value is accessible via an x, y, z, and w subscript. That is, a 128-bit value consists of an x, y, z, and w value. To access these register components, you must add .x, .y, .z, and .w at the end of the register name.
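The register layout described above can be sketched in a few lines of Python. This is only an illustrative model, not how the hardware is implemented: the Register class and the helper functions are hypothetical names, Python floats are 64-bit rather than 32-bit, and dp3, dp4, and mad return their result directly instead of writing it to a destination register.

```python
from dataclasses import dataclass

# Register counts quoted in the text (the output count is an upper bound).
REGISTER_COUNTS = {
    "input": 16,
    "constant": 96,
    "temporary": 12,
    "address": 1,
    "output": 13,
}

COMPONENTS_PER_REGISTER = 4
BITS_PER_COMPONENT = 32
BITS_PER_REGISTER = COMPONENTS_PER_REGISTER * BITS_PER_COMPONENT


@dataclass
class Register:
    """One quad-float register with components x, y, z, and w."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    w: float = 0.0

    def components(self):
        return (self.x, self.y, self.z, self.w)


def dp3(a, b):
    """Dot product over the x, y, and z components only."""
    return a.x * b.x + a.y * b.y + a.z * b.z


def dp4(a, b):
    """Dot product over all four components."""
    return sum(p * q for p, q in zip(a.components(), b.components()))


def mad(a, b, c):
    """Component-wise multiply-add: a * b + c."""
    return Register(*(p * q + r for p, q, r in
                      zip(a.components(), b.components(), c.components())))


# Hypothetical values: a position and one row of a transform matrix.
position = Register(1.0, 2.0, 3.0, 1.0)
matrix_row = Register(0.5, 0.0, 0.0, 2.0)

print(BITS_PER_REGISTER)          # width of one register in bits
print(position.w)                 # component access, like r0.w in a shader
print(dp4(position, matrix_row))  # one component of a transformed position
```

Accessing position.w here mirrors appending .w to a register name in the shader, and the dp4 call shows why quad-float registers suit transforms: one instruction consumes all four components of both operands.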

 


 
