The Difference Between SIMD and SIMT

Midgard is also a Single Instruction Multiple Data (SIMD) architecture, such that most instructions operate on multiple data elements packed in 128-bit vector registers.

Source: https://community.arm.com/developer/tools-software/graphics/b/blog/posts/arm-mali-compute-architecture-fundamentals

The example below shows how a vec3 arithmetic operation may map onto a pure SIMD unit (pipeline executes one thread per clock):

[Figure: vec3 arithmetic operation mapped onto a pure SIMD unit]

... vs a quad-based unit (pipeline executes one lane per thread for four threads per clock):

[Figure: vec3 arithmetic operation mapped onto a quad-based unit]

The advantages in terms of the ability to keep the hardware units full of useful work, irrespective of the vector length in the program, is clearly highlighted by these diagrams.

Source: https://community.arm.com/developer/tools-software/graphics/b/blog/posts/the-mali-gpu-an-abstract-machine-part-4---the-bifrost-shader-core
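To make the utilization argument concrete, here is a rough sketch of the arithmetic behind the two diagrams. It is a toy model, not a description of the actual Mali hardware: the 4-lane width (LANES), the function names, and the "one vector operation per thread" workload are all assumptions made for illustration.

```python
LANES = 4  # assumed width of the arithmetic unit, in 32-bit lanes


def pure_simd(threads, vec_len):
    """Pure SIMD: one thread per clock, its vector components spread over the lanes."""
    if vec_len > LANES:
        raise ValueError("this toy model assumes vec_len <= LANES")
    cycles = threads                      # each thread occupies the unit for one clock
    useful = threads * vec_len            # lane-slots doing real work
    return cycles, useful / (cycles * LANES)


def quad_based(threads, vec_len):
    """Quad-based: four threads per clock, one lane per thread, one component per clock."""
    quads = -(-threads // LANES)          # ceiling division: number of 4-thread groups
    cycles = quads * vec_len              # each quad needs one clock per component
    useful = threads * vec_len
    return cycles, useful / (cycles * LANES)


# Four threads, each performing one vec3 operation.
print(pure_simd(4, 3))   # (4, 0.75): one lane idles every clock
print(quad_based(4, 3))  # (3, 1.0): all lanes busy, one clock fewer
```

In this model the quad-based unit stays fully utilized for any vector length as long as the thread count is a multiple of four, whereas the pure SIMD unit is only full when the program happens to use vec4 operations.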

In SIMD, you need to specify the data arrays, the instruction to apply to them, and the instruction width.
For example, to add two integer arrays of length 16, a SIMD instruction might look like this (the instruction is made up for demonstration):

add.16 arr1 arr2

SIMT, however, does not expose the instruction width to the programmer. So, essentially, you could write the above example as:

arr1[i] + arr2[i]

and then launch as many threads as there are elements in the array.
Note that if the array size were, say, 32, SIMD would expect you to explicitly issue two such 'add.16' instructions, whereas with SIMT you would simply launch 32 threads.

Source: http://www.gpgpu-sim.org/micro2012-tutorial/4-Microarchitecture.pptx
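The contrast in the excerpt above can be sketched in plain Python. This is only a model of the two programming styles, not real SIMD or GPU code: add_16 stands in for the made-up 'add.16' instruction, and add_thread and launch are hypothetical helpers that imitate a per-thread kernel and a kernel launch by running the "threads" one after another.

```python
SIMD_WIDTH = 16  # width baked into the made-up add.16 instruction


def add_16(a, b):
    """Model of add.16: adds exactly 16 element pairs in one instruction."""
    if len(a) != SIMD_WIDTH or len(b) != SIMD_WIDTH:
        raise ValueError("add.16 operates on exactly 16 elements")
    return [x + y for x, y in zip(a, b)]


def simd_add(arr1, arr2):
    """SIMD style: the programmer splits the arrays into 16-wide chunks.

    Assumes the array length is a multiple of SIMD_WIDTH.
    """
    out = []
    for start in range(0, len(arr1), SIMD_WIDTH):
        end = start + SIMD_WIDTH
        out += add_16(arr1[start:end], arr2[start:end])  # one add.16 per chunk
    return out


def add_thread(i, arr1, arr2, out):
    """SIMT style: scalar code for a single thread with index i."""
    out[i] = arr1[i] + arr2[i]


def launch(kernel, n_threads, *args):
    """Stand-in for a GPU launch; runs the threads sequentially here."""
    for i in range(n_threads):
        kernel(i, *args)


arr1 = list(range(32))
arr2 = [10] * 32

simd_result = simd_add(arr1, arr2)        # issues two add.16 calls for 32 elements

simt_result = [0] * 32
launch(add_thread, 32, arr1, arr2, simt_result)  # same kernel, just more threads

print(simd_result == simt_result)  # True
```

Growing the arrays changes the SIMD side's chunking loop count, while the SIMT side only changes the thread count passed to launch; the per-thread code stays the same.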
