GPU Shader ALU Reading Notes

The Vertex Shader ALU is a multithreaded vector processor that operates on quad-float data. It consists of two functional units. The SIMD Vector Unit is responsible for the mov, mul, add, mad, dp3, dp4, dst, min, max, slt, and sge instructions. The Special Function Unit is responsible for the rcp, rsq, logp, expp, and lit instructions. Most of these instructions take one cycle to execute, although rcp and rsq take more than one cycle under specific circumstances. Each occupies only one instruction slot in the vertex shader, but it takes longer than one cycle to execute when its result is used immediately, because that causes a register stall.
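To make the "quad-float" idea concrete, here is a minimal Python sketch of a few of the SIMD Vector Unit instructions. It is only an illustration: the function names mirror the instruction mnemonics, and the per-component semantics (including dp3 replicating its scalar result into all four components, and slt/sge producing 1.0 or 0.0) are assumed to follow the usual Direct3D vs_1_1 definitions rather than anything stated in the text above. It models arithmetic only, not timing or stalls.

```python
from typing import Tuple

# A register value: four 32-bit floats (x, y, z, w), modeled as a Python tuple.
Vec4 = Tuple[float, float, float, float]


def mul(a: Vec4, b: Vec4) -> Vec4:
    """Component-wise multiply."""
    return tuple(x * y for x, y in zip(a, b))


def add(a: Vec4, b: Vec4) -> Vec4:
    """Component-wise add."""
    return tuple(x + y for x, y in zip(a, b))


def mad(a: Vec4, b: Vec4, c: Vec4) -> Vec4:
    """Multiply-add: a * b + c on each component."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


def dp3(a: Vec4, b: Vec4) -> Vec4:
    """3-component dot product; the scalar is replicated to all components (assumed)."""
    d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return (d, d, d, d)


def dp4(a: Vec4, b: Vec4) -> Vec4:
    """4-component dot product; the scalar is replicated to all components (assumed)."""
    d = sum(x * y for x, y in zip(a, b))
    return (d, d, d, d)


def slt(a: Vec4, b: Vec4) -> Vec4:
    """Set-on-less-than: 1.0 where a < b, else 0.0, per component."""
    return tuple(1.0 if x < y else 0.0 for x, y in zip(a, b))


def sge(a: Vec4, b: Vec4) -> Vec4:
    """Set-on-greater-or-equal: 1.0 where a >= b, else 0.0, per component."""
    return tuple(1.0 if x >= y else 0.0 for x, y in zip(a, b))
```

Each function processes all four components in one call, which mirrors how a single vector instruction handles a whole quad-float register at once.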


Now let’s see how these registers and instructions are typically used in the vertex shader ALU.

There are 16 input registers, 96 constant registers, 12 temporary registers, 1 address register, and up to 13 output registers per rasterizer. Each register holds four 32-bit values, and each 32-bit value is accessible via an x, y, z, or w subscript. That is, a 128-bit register value consists of an x, y, z, and w component. To access an individual component, append .x, .y, .z, or .w to the register name.
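The register layout described above can be sketched as a small Python model. This is a hypothetical illustration, not a description of real hardware: the `Register` class, the `REGISTER_COUNTS` table, the bank names, and `make_register_file` are all names invented here, and the output bank is simply sized at its stated maximum of 13.

```python
from dataclasses import dataclass


@dataclass
class Register:
    """One 128-bit register: four 32-bit float components."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    w: float = 0.0


# Register counts as given in the text; "output" uses the stated upper bound.
REGISTER_COUNTS = {
    "input": 16,
    "constant": 96,
    "temporary": 12,
    "address": 1,
    "output": 13,
}


def make_register_file() -> dict:
    """Build one zero-initialized list of registers per bank (hypothetical layout)."""
    return {
        bank: [Register() for _ in range(count)]
        for bank, count in REGISTER_COUNTS.items()
    }


regs = make_register_file()

# Load a constant register, then read single components via .x/.y/.z/.w.
regs["constant"][0] = Register(1.0, 2.0, 3.0, 4.0)
z_component = regs["constant"][0].z

# Write only one component of a temporary register; the others are untouched.
regs["temporary"][0].x = z_component
```

Attribute access such as `regs["constant"][0].z` plays the role of the .z subscript on a register name in shader assembly.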

