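In this article I will show how to use a Direct3D compute shader (CS) to apply a Gaussian blur to an image. Let us start with a requirements analysis.
Frankly, the first requirement is already troublesome: we have to rely on another library to decode the image. Before decoding, there is a question to settle: which format should the image data be decoded into, that is, which data format will the GPU work with? In this article the GPU input is a Texture2D in DXGI_FORMAT_R32G32B32A32_FLOAT, and the output is a RWTexture2D<float4> in DXGI_FORMAT_R32G32B32A32_FLOAT.
The image itself, however, can be in any format, and the variety is dizzying. This means we need a dedicated class that handles image decoding and encoding. DirectX offers the function D3DX11CreateTextureFromFile, but I did not use it (to be honest, I do not know how to use it either). The reason is simple: we also need to draw the original image and the blurred image. You might say Direct3D could do that too, but that would be using a sledgehammer to crack a nut; Microsoft's WIC component and Direct2D do this job well enough. Another reason is that the data held by this class is used not only by the GPU but possibly by the CPU as well. We need to keep the image data in system memory rather than have D3DX11CreateTextureFromFile put it straight into video memory (if the user's graphics card does not support Direct3D 11, we still need to blur).
This is a tutorial, though, so the article will not go into image encoding and decoding. Let us assume such a class exists and that its definition looks like this: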
class WICFramePixels
{
public:
…….
bool Load(LPWSTR pszFilename);
bool Save(LPWSTR pszFilename);
///The texture2D format is R32G32B32A32float
void ConvertToTexture2D(ID3D11Texture2D * &);
void ConvertFromTexture2D(ID3D11Texture2D * &);
inline UINT Width() const {return mWidth;}
inline UINT Height() const{return mHeight;}
/// other function
private:
/// other …. members
};
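To make the R32G32B32A32_FLOAT requirement concrete, here is a minimal sketch of the per-pixel conversion that something like ConvertToTexture2D would need to perform before uploading. The names Float4, ToFloat4 and ToByte are hypothetical and not part of WICFramePixels; the sketch also assumes the decoder delivers tightly packed 8-bit RGBA, which the article does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One texel as laid out in DXGI_FORMAT_R32G32B32A32_FLOAT:
// four 32-bit floats, 16 bytes in total.
struct Float4 { float r, g, b, a; };

// Hypothetical helper: expands tightly packed 8-bit RGBA bytes
// into floats normalized to the range [0, 1].
std::vector<Float4> ToFloat4(const std::vector<std::uint8_t>& rgba8)
{
    assert(rgba8.size() % 4 == 0);
    std::vector<Float4> out(rgba8.size() / 4);
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        out[i].r = rgba8[4 * i + 0] / 255.0f;
        out[i].g = rgba8[4 * i + 1] / 255.0f;
        out[i].b = rgba8[4 * i + 2] / 255.0f;
        out[i].a = rgba8[4 * i + 3] / 255.0f;
    }
    return out;
}

// Hypothetical inverse for one channel: clamp to [0, 1],
// then round to the nearest 8-bit value.
std::uint8_t ToByte(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(v * 255.0f + 0.5f);
}
```

Keeping the float data in system memory is what lets the same pixels feed either a GPU texture or a CPU fallback, at the cost of 16 bytes per pixel instead of 4.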
Good. Following the same idea, we define a BlurFilter class that exposes the blur-related interface. Its definition looks like this:
class WICFramePixels;
class BlurFilter
{
public:
// Generate Gaussian blur weights.
static void SetGaussianWeights(float sigma);
// Manually specify blur weights.
// gRadius is 5, so the kernel has 2 * 5 + 1 = 11 weights (matches mWeights[11]).
static void SetWeights(const float weights[11]);
static void Blur(WICFramePixels&,UINT = 1);
static bool SetGpuCalcState(bool);
private:
const static int gRadius = 5;
static float mWeights[11];
static UINT mWidth;
static UINT mHeight;
static DXGI_FORMAT mFormat;
static ID3D11Texture2D * moffscreenTex;
static ID3D11ShaderResourceView * moffscreenSRV;
static ID3D11UnorderedAccessView * moffscreenUAV;
static ID3D11ShaderResourceView* mBlurredOutputTexSRV;
static ID3D11UnorderedAccessView* mBlurredOutputTexUAV;
static bool GpuCalcState;
private:
static void BuildView(WICFramePixels &);
static void GPUBlur();
static void CPUBlur(WICFramePixels &);
///<summary>
/// The width and height should match the dimensions of the input texture to blur.
/// It is OK to call Init() again to reinitialize the blur filter with a different
/// dimension or format.
///</summary>
static void Init(UINT width, UINT height);
private:
BlurFilter();
~BlurFilter();
};
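I will explain what the members of BlurFilter mean as we go. First, let us look at the Gaussian blur algorithm itself; please refer to this article: http://www.cnblogs.com/leohawke/p/3257903.html
Because the code runs on the GPU, we split the blur into two steps (which suits the GPU better): one horizontal pass and one vertical pass. Note that the vertical pass must operate on the result of the horizontal pass.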
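As a rough illustration of what SetGaussianWeights(float sigma) has to produce, the sketch below computes normalized 1D Gaussian weights. GaussianWeights is a hypothetical free function, not the author's implementation; the radius of 5 used in the usage note comes from gRadius in the class above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns 2 * radius + 1 weights proportional to exp(-x^2 / (2 sigma^2))
// for x = -radius .. radius, normalized so that they sum to 1.
std::vector<float> GaussianWeights(int radius, float sigma)
{
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> w(2 * radius + 1);
    const float twoSigma2 = 2.0f * sigma * sigma;
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i)
    {
        const float x = static_cast<float>(i);
        w[i + radius] = std::exp(-x * x / twoSigma2);
        sum += w[i + radius];
    }
    // Normalizing keeps the overall image brightness unchanged.
    for (float& v : w)
        v /= sum;
    return w;
}
```

Calling GaussianWeights(5, 2.0f) yields 11 values, symmetric around the center entry, which is the shape mWeights[11] expects.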
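The reason two 1D passes can replace one 2D pass is that the Gaussian kernel is separable. The short derivation below is standard math rather than something taken from the linked article; r denotes the blur radius.

```latex
G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)
        = g(x)\, g(y),
\qquad
g(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{t^{2}}{2\sigma^{2}}\right)

(I * G)(x, y) = \sum_{j=-r}^{r} g(j) \left[ \sum_{i=-r}^{r} g(i)\, I(x - i,\, y - j) \right]
```

The inner sum is the horizontal pass and the outer sum is the vertical pass applied to its result. Per pixel this costs 2(2r + 1) texture reads instead of (2r + 1)^2; for r = 5 that is 22 instead of 121.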
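The careful reader will have noticed a problem: the input and output of the GPU algorithm are of different types. On the CPU side, the variables that connect to them are of type ID3D11ShaderResourceView* and ID3D11UnorderedAccessView*.
This seems to imply that we must convert the output of the first pass, an ID3D11UnorderedAccessView* (UAV for short), into the input type of the second pass, an ID3D11ShaderResourceView* (SRV). In fact we do not need to: we can have both views refer to the same block of video memory. Put the other way round, we create both objects from the same block of video memory (a texture). The code that invokes the GPU actually looks like this (pseudocode):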
// HORIZONTAL blur pass.
SetInputMap(moffscreenSRV);
SetOutputMap(mBlurredOutputTexUAV);
Apply();
// VERTICAL blur pass.
SetInputMap(mBlurredOutputTexSRV);
SetOutputMap(moffscreenUAV);
Apply();
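moffscreenSRV and moffscreenUAV are both created from moffscreenTex, so the final result ends up in moffscreenTex. As for mBlurredOutputTexUAV and mBlurredOutputTexSRV, it is enough to create a texture of the same width and height for them.
This is what the Init function does. The code is as follows: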
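The ping-pong between the two textures can be simulated on the CPU, which is also roughly what a CPU fallback would have to do. The sketch below is not the author's CPUBlur: BlurPass and Blur are hypothetical free functions, the image is a single float channel for brevity, and clamping at the image edges is an assumption because the shader is not shown in this part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One 1D blur pass over a single-channel image, along x or along y.
// Samples outside the image are clamped to the nearest edge pixel (assumption).
void BlurPass(const std::vector<float>& src, std::vector<float>& dst,
              int width, int height,
              const std::vector<float>& weights, bool horizontal)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(weights.size() % 2 == 1);
    const int radius = static_cast<int>(weights.size() / 2);
    dst.resize(src.size());
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            float acc = 0.0f;
            for (int k = -radius; k <= radius; ++k)
            {
                const int sx = horizontal ? std::min(std::max(x + k, 0), width - 1) : x;
                const int sy = horizontal ? y : std::min(std::max(y + k, 0), height - 1);
                acc += weights[k + radius] * src[sy * width + sx];
            }
            dst[y * width + x] = acc;
        }
    }
}

// "offscreen" plays the role of moffscreenTex and "blurred" the role of the
// texture behind mBlurredOutputTexSRV / mBlurredOutputTexUAV.
void Blur(std::vector<float>& offscreen, int width, int height,
          const std::vector<float>& weights, unsigned blurCount)
{
    std::vector<float> blurred;
    for (unsigned i = 0; i < blurCount; ++i)
    {
        BlurPass(offscreen, blurred, width, height, weights, true);   // horizontal
        BlurPass(blurred, offscreen, width, height, weights, false);  // vertical
    }
}
```

Because the vertical pass writes back into the buffer the horizontal pass read from, the result is in place after every iteration and repeated blurs need no extra copies.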
void BlurFilter::Init(UINT width, UINT height)
{
mWidth = width;
mHeight = height;
mFormat = DXGI_FORMAT_R32G32B32A32_FLOAT;
if(!GpuCalcState)
return;
// Start fresh.
ReleaseCOM(mBlurredOutputTexSRV);
ReleaseCOM(mBlurredOutputTexUAV);
// Note, compressed formats cannot be used for UAV. We get error like:
// ERROR: ID3D11Device::CreateTexture2D: The format (0x4d, BC3_UNORM)
// cannot be bound as an UnorderedAccessView, or cast to a format that
// could be bound as an UnorderedAccessView. Therefore this format
// does not support D3D11_BIND_UNORDERED_ACCESS.
D3D11_TEXTURE2D_DESC blurredTexDesc;
blurredTexDesc.Width = width;
blurredTexDesc.Height = height;
blurredTexDesc.MipLevels = 1;
blurredTexDesc.ArraySize = 1;
blurredTexDesc.Format =DXGI_FORMAT_R32G32B32A32_FLOAT;
blurredTexDesc.SampleDesc.Count = 1;
blurredTexDesc.SampleDesc.Quality = 0;
blurredTexDesc.Usage = D3D11_USAGE_DEFAULT;
blurredTexDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
blurredTexDesc.CPUAccessFlags = 0;
blurredTexDesc.MiscFlags = 0;
ID3D11Texture2D* blurredTex = 0;
HR(Direct3D::md3dDevice->CreateTexture2D(&blurredTexDesc, 0, &blurredTex));
D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc;
srvDesc.Format =DXGI_FORMAT_R32G32B32A32_FLOAT;
srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
srvDesc.Texture2D.MostDetailedMip = 0;
srvDesc.Texture2D.MipLevels = 1;
HR(Direct3D::md3dDevice->CreateShaderResourceView(blurredTex, &srvDesc, &mBlurredOutputTexSRV));
D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc;
uavDesc.Format =DXGI_FORMAT_R32G32B32A32_FLOAT;
uavDesc.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
uavDesc.Texture2D.MipSlice = 0;
HR(Direct3D::md3dDevice->CreateUnorderedAccessView(blurredTex, &uavDesc, &mBlurredOutputTexUAV));
// Views save a reference to the texture so we can release our reference.
ReleaseCOM(blurredTex);
}
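The careful reader will have noticed Direct3D::md3dDevice. For how it is created, see this article: http://www.cnblogs.com/leohawke/p/3375953.html
Unlike in that article, I actually enumerate the graphics adapters, so some parameters have changed. It looks roughly like this: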
UINT createDeviceFlags = 0; // D3D11CreateDevice takes the flags as a UINT
#if defined( DEBUG) || defined(_DEBUG)
createDeviceFlags |= D3D11_CREATE_DEVICE_DEBUG;
#endif
IDXGIFactory * pFactory = nullptr;
IDXGIAdapter * pAdapter = nullptr;
DXGI_MODE_DESC * pDesc = nullptr;
D3D_FEATURE_LEVEL featureLevel;
if(FAILED(CreateDXGIFactory(__uuidof(IDXGIFactory),(void**)&pFactory)))
{
VSPRINTF(L"Error:Failed to call CreateDXGIFactory\n");
SupportState = false;
return;
}
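HRESULT hr = E_FAIL;
for(UINT i = 0;pFactory->EnumAdapters(i,&pAdapter) != DXGI_ERROR_NOT_FOUND;++i)
{
hr = D3D11CreateDevice(
pAdapter, // the ith adapter
D3D_DRIVER_TYPE_UNKNOWN, // required when an explicit adapter is passed
0, // no software device
createDeviceFlags,
0, 0, // default feature level array
D3D11_SDK_VERSION,
&md3dDevice,
&featureLevel,
&md3dImmediateContext);
// EnumAdapters added a reference; release it whether or not creation succeeded.
pAdapter->Release();
pAdapter = nullptr;
if( SUCCEEDED(hr) )
{
break;
}
}
// The factory is only needed for enumeration.
pFactory->Release();
pFactory = nullptr;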
if( FAILED(hr) )
{
VSPRINTF(L"Error:Failed to Enum Adapter,Please Check the Hardware\n");
SupportState = false;
return;
}
// Accept feature level 11_0 or anything higher.
if( featureLevel < D3D_FEATURE_LEVEL_11_0 )
{
VSPRINTF(L"Warning:Direct3D Feature Level 11 unsupported.\n");
SupportState = false;
return;
}
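There are still a few things to learn before we get to the GPU part, such as SetInputMap and SetOutputMap. The GPU code has these two variables: Texture2D gInput; RWTexture2D<float4> gOutput;. We need to establish a connection to them, and for that we use Effects11. That is outside the scope of this article, though; a dedicated class manages this kind of effect. Here I post only the definitions and inline implementations, because the implementation details would add cognitive load. The code is as follows: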
#pragma region Effect
class Effect
{
public:
Effect(ID3D11Device* device, const std::wstring& filename);
virtual ~Effect();
private:
Effect(const Effect& rhs);
Effect& operator=(const Effect& rhs);
protected:
ID3DX11Effect* mFX;
};
#pragma endregion
#pragma region BlurEffect
class BlurEffect : public Effect
{
public:
BlurEffect(ID3D11Device* device, const std::wstring& filename);
~BlurEffect();
// 11 weights = 2 * radius + 1 with a blur radius of 5 (see BlurFilter::mWeights).
void SetWeights(const float weights[11]) { Weights->SetFloatArray(weights, 0, 11); }
void SetInputMap(ID3D11ShaderResourceView* tex) { InputMap->SetResource(tex); }
void SetOutputMap(ID3D11UnorderedAccessView* tex) { OutputMap->SetUnorderedAccessView(tex); }
ID3DX11EffectTechnique* HorzBlurTech;
ID3DX11EffectTechnique* VertBlurTech;
ID3DX11EffectScalarVariable* Weights;
ID3DX11EffectShaderResourceVariable* InputMap;
ID3DX11EffectUnorderedAccessViewVariable* OutputMap;
};
#pragma endregion
#pragma region Effects
class Effects
{
public:
static void InitAll(ID3D11Device* device);
static void DestroyAll();
static BlurEffect* BlurFX;
};
#pragma endregion
Finally, the CPU-side calling code is shown below; I will explain these statements in detail:
void BlurFilter::Blur(WICFramePixels& pixels,UINT blurCount)
{
Init(pixels.Width(),pixels.Height());
if(GpuCalcState)
{
BuildView(pixels);
for(UINT i =0;i < blurCount;i++)
GPUBlur();
// Disable compute shader.
Direct3D::md3dImmediateContext->CSSetShader(0, 0, 0);
pixels.ConvertFromTexture2D(moffscreenTex);
}
}
//
// Run the compute shader to blur the offscreen texture.
//
void BlurFilter::GPUBlur()
{
// HORIZONTAL blur pass.
D3DX11_TECHNIQUE_DESC techDesc;
HR(Effects::BlurFX->HorzBlurTech->GetDesc( &techDesc ));
for(UINT p = 0; p < techDesc.Passes; ++p)
{
Effects::BlurFX->SetInputMap(moffscreenSRV);
Effects::BlurFX->SetOutputMap(mBlurredOutputTexUAV);
Effects::BlurFX->HorzBlurTech->GetPassByIndex(p)->Apply(0, Direct3D::md3dImmediateContext);
// How many groups do we need to dispatch to cover a row of pixels, where each
// group covers 256 pixels (the 256 is defined in the ComputeShader).
UINT numGroupsX = (UINT)ceilf(mWidth / 256.0f);
Direct3D::md3dImmediateContext->Dispatch(numGroupsX, mHeight, 1);
}
// Unbind the input texture from the CS for good housekeeping.
ID3D11ShaderResourceView* nullSRV[1] = { 0 };
Direct3D::md3dImmediateContext->CSSetShaderResources( 0, 1, nullSRV );
// Unbind output from compute shader (we are going to use this output as an input in the next pass,
// and a resource cannot be both an output and input at the same time).
ID3D11UnorderedAccessView* nullUAV[1] = { 0 };
Direct3D::md3dImmediateContext->CSSetUnorderedAccessViews( 0, 1, nullUAV, 0 );
// VERTICAL blur pass.
HR(Effects::BlurFX->VertBlurTech->GetDesc( &techDesc ));
for(UINT p = 0; p < techDesc.Passes; ++p)
{
Effects::BlurFX->SetInputMap(mBlurredOutputTexSRV);
Effects::BlurFX->SetOutputMap(moffscreenUAV);
Effects::BlurFX->VertBlurTech->GetPassByIndex(p)->Apply(0, Direct3D::md3dImmediateContext);
// How many groups do we need to dispatch to cover a column of pixels, where each
// group covers 256 pixels (the 256 is defined in the ComputeShader).
UINT numGroupsY = (UINT)ceilf(mHeight / 256.0f);
Direct3D::md3dImmediateContext->Dispatch(mWidth, numGroupsY, 1);
}
Direct3D::md3dImmediateContext->CSSetShaderResources( 0, 1, nullSRV );
Direct3D::md3dImmediateContext->CSSetUnorderedAccessViews( 0, 1, nullUAV, 0 );
}
void BlurFilter::BuildView(WICFramePixels & pixels)
{
ReleaseCOM(moffscreenTex);
ReleaseCOM(moffscreenSRV);
ReleaseCOM(moffscreenUAV);
pixels.ConvertToTexture2D(moffscreenTex);
// Null description means to create a view to all mipmap levels using
// the format the texture was created with.
HR(Direct3D::md3dDevice->CreateShaderResourceView(moffscreenTex, 0, &moffscreenSRV));
HR(Direct3D::md3dDevice->CreateUnorderedAccessView(moffscreenTex, 0, &moffscreenUAV));
}
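The group counts passed to Dispatch in GPUBlur come down to a ceiling division. The sketch below expresses the same arithmetic with integers instead of ceilf; GroupCount, DispatchSize, HorizontalDispatch and VerticalDispatch are hypothetical names, and the group size of 256 is taken from the comments in the code above, since the shader itself is not shown here.

```cpp
#include <cassert>

// Threads per group along the blurred axis. This must match the value
// declared in the compute shader (assumed to be 256, per the comments above).
constexpr unsigned kThreadsPerGroup = 256;

// Ceiling division: the number of groups needed to cover `pixels` pixels.
constexpr unsigned GroupCount(unsigned pixels)
{
    return (pixels + kThreadsPerGroup - 1) / kThreadsPerGroup;
}

struct DispatchSize { unsigned x, y, z; };

// Horizontal pass: each group covers up to 256 pixels of one row.
DispatchSize HorizontalDispatch(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    return { GroupCount(width), height, 1 };
}

// Vertical pass: each group covers up to 256 pixels of one column.
DispatchSize VerticalDispatch(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    return { width, GroupCount(height), 1 };
}
```

For a 1920 x 1080 image this gives 8 x 1080 x 1 groups for the horizontal pass and 1920 x 5 x 1 for the vertical pass. The last group along the blurred axis usually extends past the image, so the shader is expected to guard or clamp those extra threads.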
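This part can more or less wrap up here. Looking at the code above, you will surely think it is a lot of trouble and downright hard to use, and the GPU code and part of the implementation have not even been written yet. For an ordinary application that is indeed the case: using DirectX compute shaders is quite tedious.
For a game, however, it is different. The game's initialization has already done part of the work, so when we want a special effect we only need to apply it directly to the rendered texture.
The GPU code will take about the same amount of text to explain, but it is less than half as much code.
Note: the VSPRINTF macro prints a string to the Visual Studio Output window in debug mode.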