[MetalKit] Working with Particles in Metal, Part 3

This series is a complete translation and study of the MetalKit content published at http://metalkit.org.


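Last time we looked at how to manipulate the vertices of Model I/O objects on the GPU. In this article we take a different approach and create particles with compute threads. We can reuse the playground from last time, starting by changing the Particle struct in the metal view delegate class so that it contains only the two members we will update on the GPU, position and velocity: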

struct Particle {
    var position: float2
    var velocity: float2
}
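
Because this struct is copied byte for byte into a buffer that the GPU reads, its memory layout matters. As a quick check, here is a minimal sketch that prints the size and stride of one particle. It uses SIMD2<Float> instead of float2 so that the snippet needs no imports; this assumes a recent Swift toolchain, where float2 is an alias for that type.

```swift
// Same shape as the article's Particle, written with the standard-library SIMD type.
struct Particle {
    var position: SIMD2<Float>
    var velocity: SIMD2<Float>
}

// size is the number of bytes one value occupies;
// stride is the distance between consecutive elements in an array or buffer.
let particleSize = MemoryLayout<Particle>.size
let particleStride = MemoryLayout<Particle>.stride
print("size:", particleSize, "stride:", particleStride)
```

Two 8-byte vectors give 16 bytes for both values, so the buffer holds the particles back to back with no padding between them.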

We can also delete the timer variable and the translate(by:) and update() methods. The method that changes the most is initializeBuffers():

func initializeBuffers() {
    for _ in 0 ..< particleCount {
        // Random position inside the square, random velocity in -0.5 ... 0.4 per axis.
        let particle = Particle(
                position: float2(Float(arc4random() % UInt32(side)),
                        Float(arc4random() % UInt32(side))),
                velocity: float2((Float(arc4random() % 10) - 5) / 10,
                        (Float(arc4random() % 10) - 5) / 10))
        particles.append(particle)
    }
    // Copy the whole array into a buffer the GPU can read and write.
    let size = particles.count * MemoryLayout<Particle>.size
    particleBuffer = device.makeBuffer(bytes: &particles, length: size, options: [])
}
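
The same random initialization can be sketched without Metal, which makes the value ranges easier to inspect. The helper name makeParticles and the use of Float.random(in:) in place of arc4random() are assumptions made for this illustration; they are not part of the original playground.

```swift
struct Particle {
    var position: SIMD2<Float>
    var velocity: SIMD2<Float>
}

// Hypothetical helper: builds `count` particles scattered over a square
// of edge length `side`, each with a small random velocity.
func makeParticles(count: Int, side: Float) -> [Particle] {
    return (0..<count).map { _ in
        Particle(
            position: SIMD2<Float>(Float.random(in: 0..<side),
                                   Float.random(in: 0..<side)),
            velocity: SIMD2<Float>(Float.random(in: -0.5...0.5),
                                   Float.random(in: -0.5...0.5)))
    }
}

let sampleParticles = makeParticles(count: 1_000, side: 1_200)
print(sampleParticles.count)
```

Unlike the modulo version, this draws continuous values rather than steps of 0.1, so the speeds are spread more evenly.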


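The most important part is configuring the command encoder. Set the threads per group to a 2D grid: one side is the thread execution width, and the other is derived from the maximum total threads per threadgroup by dividing it by that width. Both values are hardware characteristics, read here from the pipeline state, and they do not change during execution. Set the threads per grid to a one-dimensional size determined by the number of particles: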

let w = pipelineState.threadExecutionWidth
let h = pipelineState.maxTotalThreadsPerThreadgroup / w
let threadsPerGroup = MTLSizeMake(w, h, 1)
let threadsPerGrid = MTLSizeMake(particleCount, 1, 1)
commandEncoder.dispatchThreads(threadsPerGrid, threadsPerThreadgroup: threadsPerGroup)

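Note: in the new Metal 2, dispatchThreads(_:threadsPerThreadgroup:) works without your having to specify the number of threadgroups. Compared with the older dispatchThreadgroups(_:threadsPerThreadgroup:) method, the new method computes the number of groups itself, provides nonuniform threadgroups when the grid size is not a multiple of the group size, and ensures that no threads go unused.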
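
To see what the newer call saves you, here is a rough sketch of the arithmetic the older method requires: a ceiling division to get the number of groups, which generally leaves some threads with no particle to process. The particle count and group width are hypothetical values chosen for illustration, and threadgroupCount is a helper invented for this example.

```swift
// Number of threadgroups needed to cover `threadCount` threads when each
// group holds `groupWidth` threads (ceiling division).
func threadgroupCount(threadCount: Int, groupWidth: Int) -> Int {
    return (threadCount + groupWidth - 1) / groupWidth
}

let exampleParticleCount = 10_000   // hypothetical particle count
let exampleGroupWidth = 32          // hypothetical threadExecutionWidth

let groupsNeeded = threadgroupCount(threadCount: exampleParticleCount,
                                    groupWidth: exampleGroupWidth)
// Threads launched beyond the last particle; with the older method the
// kernel must skip these itself, for example by comparing its id to the count.
let unusedThreads = groupsNeeded * exampleGroupWidth - exampleParticleCount
print(groupsNeeded, unusedThreads)
```

With these numbers, 313 groups are launched and 16 threads have nothing to do, which is the case nonuniform threadgroups are meant to avoid.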


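In the kernel shader, the Particle struct must match the one on the CPU side. Each thread then updates the position and velocity of one particle and draws it:

#include <metal_stdlib>
using namespace metal;

// Must match the layout of the Swift Particle struct.
struct Particle {
    float2 position;
    float2 velocity;
};

// The signature is missing from this excerpt; the name and binding indices are assumptions.
kernel void compute(texture2d<half, access::write> output [[texture(0)]],
                    device Particle *particles [[buffer(0)]],
                    uint id [[thread_position_in_grid]]) {
    Particle particle = particles[id];
    float2 position = particle.position;
    float2 velocity = particle.velocity;
    int width = output.get_width();
    int height = output.get_height();
    // Reverse direction when the particle leaves the texture bounds.
    if (position.x < 0 || position.x > width) { velocity.x *= -1; }
    if (position.y < 0 || position.y > height) { velocity.y *= -1; }
    position += velocity;
    particle.position = position;
    particle.velocity = velocity;
    particles[id] = particle;
    // Draw the particle as a small cross: its own pixel plus four neighbors.
    uint2 pos = uint2(position.x, position.y);
    output.write(half4(1.), pos);
    output.write(half4(1.), pos + uint2( 1, 0));
    output.write(half4(1.), pos + uint2( 0, 1));
    output.write(half4(1.), pos - uint2( 1, 0));
    output.write(half4(1.), pos - uint2( 0, 1));
}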
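
The bounce-and-move logic in the shader can be mirrored on the CPU to check it by hand. The following is only an illustrative sketch: the step function is a hypothetical helper, and it leaves out the texture writes.

```swift
struct Particle {
    var position: SIMD2<Float>
    var velocity: SIMD2<Float>
}

// Hypothetical CPU mirror of the per-particle update done in the kernel.
func step(_ particle: Particle, width: Float, height: Float) -> Particle {
    var p = particle
    // Flip the velocity component when the particle is outside the bounds.
    if p.position.x < 0 || p.position.x > width { p.velocity.x *= -1 }
    if p.position.y < 0 || p.position.y > height { p.velocity.y *= -1 }
    // Advance by one frame.
    p.position += p.velocity
    return p
}

// A particle just past the left edge, still moving left.
let escaped = Particle(position: SIMD2<Float>(-1, 10),
                       velocity: SIMD2<Float>(-0.5, 0.25))
let bounced = step(escaped, width: 100, height: 100)
print(bounced.position, bounced.velocity)
```

Here the x velocity flips from -0.5 to 0.5, so the particle starts heading back toward the visible area, while the y component is left unchanged.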




That concludes the particle rendering system. Thanks to FlexMonkey for sharing insights on compute concepts. The source code is published on GitHub.

