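Over the past couple of days I ran a few small experiments on how to reach HLSL variables (mainly structured buffers) from DirectX 11 and read and write them. My notes follow.
CSSetShaderResources() is normally used to associate a DirectX resource (here mainly a structured buffer, so "resource" and "structured buffer" are used interchangeably below) with the corresponding structured buffer in HLSL.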
If the HLSL structured buffer names a register (for example StructuredBuffer<TestBufType> testInBuffer:register(t0)), then CSSetShaderResources(UINT startSlot,UINT numViews,ID3D11ShaderResourceView* const* srvViews) maps the DirectX resources onto registers starting at t<startSlot>. With startSlot==0, the first structured buffer view in srvViews is bound to register(t0); with startSlot==1 it is bound to register(t1). Each following view takes the next register.
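The slot arithmetic described above can be sketched in plain C++. This is only a toy model, not the D3D11 API: SlotTable, kSlotCount and bindViews are hypothetical names invented for the illustration, and the views are represented by strings.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the compute stage's t-register table. This is NOT the D3D11 API.
constexpr std::size_t kSlotCount = 8;  // hypothetical table size, chosen for the demo
using SlotTable = std::array<std::string, kSlotCount>;

// Mirrors the slot arithmetic described above: views[i] lands in slot startSlot + i.
inline void bindViews(SlotTable& slots, std::size_t startSlot,
                      const std::vector<std::string>& views) {
    assert(startSlot + views.size() <= slots.size());  // stay inside the table
    for (std::size_t i = 0; i < views.size(); ++i) {
        slots[startSlot + i] = views[i];
    }
}
```

Calling bindViews(slots, 1, {"bufA", "bufB"}) on an empty table leaves slot 0 untouched and places bufA in slot 1 and bufB in slot 2, which is the behaviour a shader declaring register(t1) and register(t2) would rely on.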
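If no register is specified but the structure types are the same, the buffers should presumably be assigned in declaration order, top to bottom. If the structure types differ, problems can appear; see the example further below.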
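For CSSetUnorderedAccessViews(), my first impression was that only one register, register(u0), is available, and that views are matched automatically by structure type.
That impression came from the following compile error, which I hit when declaring several register(ui):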
{
error X4509: maximum UAV register index exceeded, target has 1 slots, manual bind to
slot u1 failed.the following operation failed:CompileComputeShader( L"HandsOnLab_SimpleCS.hlsl", "SimpleCS" )
Press any key to continue . . .
}
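Later I read some tech papers saying that UnorderedAccessViews can use multiple registers, so how does that fit with the compile error above? The "target has 1 slots" wording points to the compile target: the cs_4_0 and cs_4_1 profiles expose a single UAV slot, whereas cs_5_0 exposes eight (u0 to u7), so the error probably came from compiling for a 4.x profile. I still need to check this against the documentation.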
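The same resource must not be read through a ShaderResourceView and written through an UnorderedAccessView at the same moment. Note the phrase "at the same moment": both views may be created, but they cannot both be used to access the resource simultaneously. Using the UnorderedAccessView alone for both reading and writing is sufficient.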
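Take the HLSL program below. If testInBuffer and testBuffer are bound to the ShaderResourceView and the UnorderedAccessView of the same DirectX structured buffer (element type struct Type{XMFLOAT2;XMFLOAT2;}; with every value set to 2.0f), then access through testInBuffer, that is, through the ShaderResourceView, appears to fail and every value read comes back as 0. This is consistent with the D3D11 runtime unbinding a shader resource view when its resource is bound for output.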
//==================================================
struct TestBufType
{
float2 pos;
float2 velocity;
};
StructuredBuffer<TestBufType> testInBuffer:register(t0);
RWStructuredBuffer<TestBufType> testBuffer:register(u0);
//--------------------------------------------------------------------------------------
// SimpleCS
// main entry point
//--------------------------------------------------------------------------------------
// execute one thread per group
[numthreads(1,1,1)]
void SimpleCS( uint3 Gid : SV_GroupID, uint3 DTid : SV_DispatchThreadID, uint3 GTid : SV_GroupThreadID, uint GI : SV_GroupIndex )
{
testBuffer[DTid.x].pos.x=testInBuffer[DTid.x].pos.x+0.5f;
testBuffer[DTid.x].pos.y=testInBuffer[DTid.x].pos.y;
testBuffer[DTid.x].velocity.x=testInBuffer[DTid.x].velocity.x+0.3f;
testBuffer[DTid.x].velocity.y=testInBuffer[DTid.x].velocity.y;
}
//===========================================
Program output:
The result is wrong: every value in the source buffer is 2.0f, yet each read of testInBuffer[DTid.x].pos.x and the other fields returned 0.
Successfully created a DirectCompute device
Successfully created simple data resources
Successfully compiled the "SimpleCS" compute shader
Successfully executed the simple compute shader
Output GPU result:
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
0.5 0 0.3 0
//=====================================================
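The zeros in the output above can be reproduced with a small toy model. The sketch assumes, based on my understanding of how the D3D11 runtime resolves read/write hazards, that binding a resource for writing evicts it from any read slot and that a read through an empty slot yields 0. BindingState, bindUav and readSrv are hypothetical names for this illustration and are not part of the D3D11 API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy binding state: each slot stores a resource id, 0 means "nothing bound".
// Hypothetical model only; it is not the D3D11 API.
struct BindingState {
    std::array<int, 4> srv{};  // stands in for t0..t3
    std::array<int, 4> uav{};  // stands in for u0..u3
};

// Binding a resource for writing evicts it from every read slot,
// which is the hazard rule this sketch assumes.
inline void bindUav(BindingState& s, std::size_t slot, int resource) {
    assert(slot < s.uav.size() && resource != 0);
    for (int& r : s.srv) {
        if (r == resource) r = 0;
    }
    s.uav[slot] = resource;
}

// A read through an empty slot yields 0, otherwise the stored value.
inline float readSrv(const BindingState& s, std::size_t slot, float storedValue) {
    assert(slot < s.srv.size());
    return s.srv[slot] == 0 ? 0.0f : storedValue;
}
```

In this model, putting resource 7 in srv[0] and then calling bindUav(s, 0, 7) clears srv[0], so readSrv(s, 0, 2.0f) + 0.5f gives 0.5 rather than 2.5, matching the first column of the output.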
If only the UnorderedAccessView is used for both reading and writing, the result is correct.
//==========================================================
struct TestBufType
{
float2 pos;
float2 velocity;
};
//a test buffer resource
RWStructuredBuffer<TestBufType> testBuffer:register(u0);
//--------------------------------------------------------------------------------------
// SimpleCS
// main entry point
//--------------------------------------------------------------------------------------
[numthreads(1,1,1)]
void SimpleCS( uint3 Gid : SV_GroupID, uint3 DTid : SV_DispatchThreadID, uint3 GTid : SV_GroupThreadID, uint GI : SV_GroupIndex )
{
testBuffer[DTid.x].pos.x=testBuffer[DTid.x].pos.x+0.5f;
testBuffer[DTid.x].pos.y=testBuffer[DTid.x].pos.y;
testBuffer[DTid.x].velocity.x=testBuffer[DTid.x].velocity.x+0.3f;
testBuffer[DTid.x].velocity.y=testBuffer[DTid.x].velocity.y;
}
The correct program output is as follows:
Successfully created a DirectCompute device
Successfully created simple data resources
Successfully compiled the "SimpleCS" compute shader
Successfully executed the simple compute shader
Output GPU result:
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
2.5 2 2.3 2
Press any key to continue . . .
//========================================================
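As a sanity check on the numbers above, the UAV-only shader can be mirrored on the CPU. This is a minimal sketch, assuming the buffer holds 30 elements (the number of output rows) all initialized to 2.0f; the CPU-side struct layout and the names makeBuffer and simpleCsOnCpu are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-side mirror of the HLSL TestBufType (the two float2 fields flattened).
struct TestBufType {
    float posX, posY;
    float velX, velY;
};

// Builds a buffer in which every field of every element holds `value`.
inline std::vector<TestBufType> makeBuffer(std::size_t count, float value) {
    assert(count > 0);
    return std::vector<TestBufType>(count, TestBufType{value, value, value, value});
}

// Applies the same per-element update as the UAV-only SimpleCS shader.
// Each loop iteration plays the role of one thread (DTid.x is the element index).
inline void simpleCsOnCpu(std::vector<TestBufType>& testBuffer) {
    for (TestBufType& e : testBuffer) {
        e.posX += 0.5f;
        e.velX += 0.3f;
        // posY and velY are written back unchanged in the shader.
    }
}
```

Running simpleCsOnCpu on makeBuffer(30, 2.0f) leaves each element at roughly 2.5, 2, 2.3, 2, the same row the GPU prints when the read and the write both go through the UAV.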
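When binding resources with CSSetShaderResources(): if the HLSL structured buffers are explicitly placed in register(t0), register(t1) and so on, each view goes to the matching slot according to CSSetShaderResources(UINT startSlot,……). If no register is declared and the structure types differ, problems can appear even when the views are passed in declaration order, as shown below.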
//=======================================================
// definition of the simple buffer element
struct SimpleBufType
{
int i;
float f;
};
struct TestBufType
{
float2 pos;
float2 velocity;
};
// a read-only view of the first input buffer resource
StructuredBuffer<SimpleBufType> InputBuffer0;
// a read-only view of the second input buffer resource
StructuredBuffer<SimpleBufType> InputBuffer1;
//a read-only view of the test buffer resource
StructuredBuffer<TestBufType> testInBuffer;
// a read-write view of the result buffer resource
RWStructuredBuffer<SimpleBufType> ResultBuffer:register(u0);
//a read-write view of the test result buffer resource
RWStructuredBuffer<TestBufType> testBuffer:register(u0);
//--------------------------------------------------------------------------------------
// SimpleCS
// main entry point
//--------------------------------------------------------------------------------------
// execute one thread per group
[numthreads(1,1,1)]
void SimpleCS( uint3 Gid : SV_GroupID, uint3 DTid : SV_DispatchThreadID, uint3 GTid : SV_GroupThreadID, uint GI : SV_GroupIndex )
{
testBuffer[DTid.x].pos.x=testInBuffer[DTid.x].pos.x+0.5f;
testBuffer[DTid.x].pos.y=testInBuffer[DTid.x].pos.y+1.5f;
testBuffer[DTid.x].velocity.x=testInBuffer[DTid.x].velocity.x+0.3f;
testBuffer[DTid.x].velocity.y=testInBuffer[DTid.x].velocity.y+2.5f;
}
//============================================================
//DirectX host code below:
/*give the compute shader access to the input buffer resources, via the shader resource views we created*/
ID3D11ShaderResourceView* aSRViews[ 3 ] ={m_pInputBuffer0SRV,m_pInputBuffer1SRV,m_pTestBuffer1SRV};
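In principle, the code above binds m_pTestBuffer1SRV, an ID3D11ShaderResourceView* whose data is all 2.0f, to the HLSL structured buffer testInBuffer, and the shader then writes its result to testBuffer. In practice this pairing is not guaranteed: the HLSL compiler typically assigns registers only to resources the shader actually references, so the unused InputBuffer0 and InputBuffer1 may get no slot and testInBuffer may end up in t0 instead of t2.
The output is shown below, and it is indeed wrong: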