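Following on from the previous post, this one looks at why shadows go missing or shimmer, and how to fix both problems.
Here again is the missing or garbled shadowing in the test scene:
Incorrect result (the incomplete CSM algorithm from the previous tutorial):
Correct result (shown in advance; this is what the improvements in this post achieve):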
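Why is the CSM algorithm from the previous tutorial incomplete? When shadows render incorrectly there is no need to guess: the depth values stored in the ShadowMap are wrong. You can confirm this by setting a breakpoint in the shader with the Visual Studio Graphics Debugger. In the pixel shader of a pixel that should be shadowed but is not, the depth sampled from the ShadowMap and the depth computed for that pixel in LightSpace differ by no more than 0.0001. In other words, the ShadowMap holds the depth of that very pixel rather than the depth of its occluder, as with point A below: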
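From the direction of the directional light we can identify B, the point that occludes A. B is the point that should have been rendered into the ShadowMap and sampled by A, as shown in the figure below:
Why is B, the occluder of A, never rendered into the ShadowMap? The reason is simple: A and B do not belong to the same cascade. Consider the next figure: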
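(1) Case 1: B is a tall object such as a high building or a mountain, and B occludes A. A and B lie in different AABBs, that is, in different cascades. With the CSM algorithm from the previous post, A cannot receive a shadow, because each ShadowMap only renders the triangles inside its own cascade.
(2) Case 2: C is a skyscraper that lies outside every ShadowMap cascade, yet it occludes some point in the green cascade. The original CSM computation cannot handle this case either.
To sum up both cases: the box of a cascade must contain every point that casts a shadow into that cascade.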
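To make the failure in both cases concrete, here is a minimal sketch of the depth comparison a shadow-map pixel shader performs, written in C++ rather than HLSL. The helper name `IsInShadow` and the default bias are illustrative assumptions, not code from the engine.

```cpp
// Minimal sketch of the shadow-map depth test (hypothetical helper, not engine code).
// sampledDepth : depth read from the ShadowMap at the pixel's light-space XY
// pixelDepth   : depth of the pixel itself in light space
// bias         : small tolerance against self-shadowing (assumed value)
inline bool IsInShadow(float sampledDepth, float pixelDepth, float bias = 0.0001f)
{
    // The pixel is shadowed only if something closer to the light
    // was rendered into the ShadowMap at the same texel.
    return sampledDepth + bias < pixelDepth;
}
```

If occluder B is never rendered into the cascade's ShadowMap, the texel holds A's own depth, so the test returns false and A stays lit.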
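So what is the fix? It is simple: extend the far plane or near plane of each cascade (each AABB) so that every occluder of that cascade is included. For example, extend the far or near plane of the red cascade above so that it contains B, the occluder of A.
How do we compute the new near and far planes of each cascade so that the occluders are included? The steps are: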
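[1]. Compute the AABB of the whole scene in world space.
[2]. Transform the eight vertices of the scene AABB from world space into LightViewSpace. The box is still a cuboid, but relative to the light-space axes it is in general an OBB.
[3]. Split the scene OBB in LightViewSpace (I prefer to call it LightSpace) into 12 triangles and clip them against the AABB of the cascade, which produces N triangles. In more detail: an OBB or AABB is a cuboid with 6 faces, so it can be split into 12 triangles, and each of those triangles is then clipped against the AABB of the cascade.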
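Steps [1] and [2] above can be sketched in plain C++ as follows. The types `Vec3` and `Mat4` and the function names are hypothetical stand-ins for the DirectXMath types the engine uses. The corner ordering (bit 0 selects X, bit 1 selects Y, bit 2 selects Z) is an assumption, chosen so that each group of four indices in a 12-triangle index table shares one face.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Row-major 4x4 matrix applied to row vectors (v * M), the DirectXMath convention.
struct Mat4 { float m[4][4]; };

// Transform a point (w = 1) by an affine matrix.
inline Vec3 TransformPoint(const Vec3& p, const Mat4& M)
{
    return {
        p.x * M.m[0][0] + p.y * M.m[1][0] + p.z * M.m[2][0] + M.m[3][0],
        p.x * M.m[0][1] + p.y * M.m[1][1] + p.z * M.m[2][1] + M.m[3][1],
        p.x * M.m[0][2] + p.y * M.m[1][2] + p.z * M.m[2][2] + M.m[3][2]
    };
}

// Build the 8 corners of the world-space scene AABB and move them into light space.
inline std::array<Vec3, 8> SceneCornersInLightSpace(const Vec3& sceneMin,
                                                    const Vec3& sceneMax,
                                                    const Mat4& lightView)
{
    std::array<Vec3, 8> corners{};
    for (int i = 0; i < 8; ++i)
    {
        // Each bit of i picks the min or max value on one axis.
        Vec3 corner{ (i & 1) ? sceneMax.x : sceneMin.x,
                     (i & 2) ? sceneMax.y : sceneMin.y,
                     (i & 4) ? sceneMax.z : sceneMin.z };
        corners[i] = TransformPoint(corner, lightView);
    }
    return corners;
}
```

The resulting eight points play the role of the `pvPointsInCameraView` array that is passed to `ComputeNearAndFar`.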
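Each triangle is clipped against Xmin, Xmax, Ymin and Ymax of the cascade's AABB in turn. Clipping a triangle against one plane has four cases: (1) all three vertices are outside the plane, and the triangle is culled; (2) all three vertices are inside the plane, and the triangle is kept; (3) one vertex is inside and two are outside, and the triangle is replaced by one new, smaller triangle; (4) two vertices are inside and one is outside, and the triangle is replaced by two new triangles.
As shown in the figure below: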
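The triangles produced by the Xmin clip are then clipped against Xmax, those results against Ymin, and so on. After all four planes (Xmin, Xmax, Ymin, Ymax) a single triangle can therefore yield at most 16 smaller triangles (assuming every plane splits each triangle into two, 2 x 2 x 2 x 2 = 16). Incidentally, the CVV clipping in the earlier tutorial "Implementing a Software Rasterizer (SoftRastar)" can be done the same way, although the algorithm is somewhat crude.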
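The four cases for a single plane can be sketched in isolation as below. This is a simplified illustration, not the post's implementation: it returns a `std::vector` instead of editing a fixed array in place, and the names `Float3`, `Tri`, `MakeTri`, `Component`, `Intersect` and `ClipTriangle` are hypothetical. Winding order is not preserved, which is acceptable here because only the Z extents of the clipped triangles are used afterwards.

```cpp
#include <vector>

struct Float3 { float x, y, z; };
struct Tri { Float3 pt[3]; };

inline Tri MakeTri(const Float3& a, const Float3& b, const Float3& c)
{
    Tri t{};
    t.pt[0] = a; t.pt[1] = b; t.pt[2] = c;
    return t;
}

inline float Component(const Float3& v, int axis)
{
    return axis == 0 ? v.x : (axis == 1 ? v.y : v.z);
}

// Point on segment a->b whose `axis` component equals `edge`.
// Callers pass one inside and one outside vertex, so the divisor is never zero.
inline Float3 Intersect(const Float3& a, const Float3& b, int axis, float edge)
{
    float t = (edge - Component(a, axis)) / (Component(b, axis) - Component(a, axis));
    return { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y), a.z + t * (b.z - a.z) };
}

// keepGreater = true keeps the side where the component is > edge (a "min" plane);
// keepGreater = false keeps the side where it is < edge (a "max" plane).
inline std::vector<Tri> ClipTriangle(const Tri& tri, int axis, float edge, bool keepGreater)
{
    Float3 in[3], out[3];
    int nIn = 0, nOut = 0;
    for (const Float3& p : tri.pt)
    {
        bool inside = keepGreater ? Component(p, axis) > edge : Component(p, axis) < edge;
        if (inside) in[nIn++] = p; else out[nOut++] = p;
    }

    std::vector<Tri> result;
    if (nIn == 3)
    {
        result.push_back(tri);                        // fully inside: keep as is
    }
    else if (nIn == 1)
    {
        // One vertex inside: shrink to one smaller triangle.
        result.push_back(MakeTri(in[0],
                                 Intersect(in[0], out[0], axis, edge),
                                 Intersect(in[0], out[1], axis, edge)));
    }
    else if (nIn == 2)
    {
        // Two vertices inside: the remaining quad is split into two triangles.
        Float3 h0 = Intersect(out[0], in[0], axis, edge);
        Float3 h1 = Intersect(out[0], in[1], axis, edge);
        result.push_back(MakeTri(in[0], in[1], h0));
        result.push_back(MakeTri(in[1], h0, h1));
    }
    // nIn == 0: fully outside, nothing is returned.
    return result;
}
```

Feeding each output triangle into the next plane's clip gives the chained Xmin, Xmax, Ymin, Ymax clipping described above.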
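Each of the 12 triangles of the scene OBB is clipped against the cascade's bounding box in this way. We then iterate over all resulting triangles that lie inside the cascade's AABB and take the minimum and maximum Z of their vertices (minZ and maxZ). These two values are nearPlane and farPlane, and they are used as the near plane and far plane of the cascade's AABB. The code is as follows: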
void CascadedShadowsManager::ComputeNearAndFar(FLOAT& fNearPlane, FLOAT& fFarPlane, FXMVECTOR vLightCameraOrthographicMin, FXMVECTOR vLightCameraOrthographicMax,
XMVECTOR* pvPointsInCameraView)
{
// Initialize nearPlane and farPlane
fNearPlane = FLT_MAX;
fFarPlane = -FLT_MAX;
Triangle triangleList[16];
int iTriangleCnt = 1;
triangleList[0].pt[0] = pvPointsInCameraView[0];
triangleList[0].pt[1] = pvPointsInCameraView[1];
triangleList[0].pt[2] = pvPointsInCameraView[2];
triangleList[0].culled = false;
// Split the scene AABB into 12 triangles (indices into the 8 corner points)
static const int iAABBTriIndexes[] =
{
0,1,2, 1,2,3,
4,5,6, 5,6,7,
0,2,4, 2,4,6,
1,3,5, 3,5,7,
0,1,4, 1,4,5,
2,3,6, 3,6,7
};
int iPointsPassesCollision[3];
// Iterate over the 12 triangles of the scene AABB and clip each one against the Xmin, Xmax, Ymin and Ymax planes.
// The Zmin and Zmax of all clipped triangles become fNearPlane and fFarPlane.
float lightCameraMinX = XMVectorGetX(vLightCameraOrthographicMin);
float lightCameraMaxX = XMVectorGetX(vLightCameraOrthographicMax);
float lightCameraMinY = XMVectorGetY(vLightCameraOrthographicMin);
float lightCameraMaxY = XMVectorGetY(vLightCameraOrthographicMax);
for (int index = 0; index < 12; ++index)
{
triangleList[0].pt[0] = pvPointsInCameraView[iAABBTriIndexes[index * 3 + 0]];
triangleList[0].pt[1] = pvPointsInCameraView[iAABBTriIndexes[index * 3 + 1]];
triangleList[0].pt[2] = pvPointsInCameraView[iAABBTriIndexes[index * 3 + 2]];
iTriangleCnt = 1;
triangleList[0].culled = false;
// Clipping one triangle against the four planes (Xmin, Xmax, Ymin, Ymax) yields at most 16 triangles
for (int frustumPlaneIter = 0; frustumPlaneIter < 4; ++frustumPlaneIter)
{
float fEdge;
int iComponent;
if (frustumPlaneIter == 0)
{
fEdge = lightCameraMinX;
iComponent = 0;
}
else if (frustumPlaneIter == 1)
{
fEdge = lightCameraMaxX;
iComponent = 0;
}
else if (frustumPlaneIter == 2)
{
fEdge = lightCameraMinY;
iComponent = 1;
}
else if (frustumPlaneIter == 3)
{
fEdge = lightCameraMaxY;
iComponent = 1;
}
for (int triIter = 0; triIter < iTriangleCnt; ++triIter)
{
// Skip triangles that have been culled
if (!triangleList[triIter].culled)
{
int iInsideVertCount = 0;
XMVECTOR temOrder;
// Clip against the x = MinX plane
if (frustumPlaneIter == 0)
{
for (int triPiIter = 0; triPiIter < 3; ++triPiIter)
{
if (XMVectorGetX(triangleList[triIter].pt[triPiIter]) > XMVectorGetX(vLightCameraOrthographicMin))
{
iPointsPassesCollision[triPiIter] = 1;
}
else
{
iPointsPassesCollision[triPiIter] = 0;
}
iInsideVertCount += iPointsPassesCollision[triPiIter];
}
}
// Clip against the x = MaxX plane
else if (frustumPlaneIter == 1)
{
for (int triPiIter = 0; triPiIter < 3; ++triPiIter)
{
if (XMVectorGetX(triangleList[triIter].pt[triPiIter]) < XMVectorGetX(vLightCameraOrthographicMax))
{
iPointsPassesCollision[triPiIter] = 1;
}
else
{
iPointsPassesCollision[triPiIter] = 0;
}
iInsideVertCount += iPointsPassesCollision[triPiIter];
}
}
// Clip against the y = MinY plane
else if (frustumPlaneIter == 2)
{
for (int triPiIter = 0; triPiIter < 3; ++triPiIter)
{
if (XMVectorGetY(triangleList[triIter].pt[triPiIter]) > XMVectorGetY(vLightCameraOrthographicMin))
{
iPointsPassesCollision[triPiIter] = 1;
}
else
{
iPointsPassesCollision[triPiIter] = 0;
}
iInsideVertCount += iPointsPassesCollision[triPiIter];
}
}
// Clip against the y = MaxY plane
else if (frustumPlaneIter == 3)
{
for (int triPiIter = 0; triPiIter < 3; ++triPiIter)
{
if (XMVectorGetY(triangleList[triIter].pt[triPiIter]) < XMVectorGetY(vLightCameraOrthographicMax))
{
iPointsPassesCollision[triPiIter] = 1;
}
else
{
iPointsPassesCollision[triPiIter] = 0;
}
iInsideVertCount += iPointsPassesCollision[triPiIter];
}
}
// Move the vertices that passed the plane test to the front of the vertex array
if (iPointsPassesCollision[1] && !iPointsPassesCollision[0])
{
temOrder = triangleList[triIter].pt[0];
triangleList[triIter].pt[0] = triangleList[triIter].pt[1];
triangleList[triIter].pt[1] = temOrder;
iPointsPassesCollision[0] = true;
iPointsPassesCollision[1] = false;
}
if (iPointsPassesCollision[2] && !iPointsPassesCollision[1])
{
temOrder = triangleList[triIter].pt[1];
triangleList[triIter].pt[1] = triangleList[triIter].pt[2];
triangleList[triIter].pt[2] = temOrder;
iPointsPassesCollision[1] = true;
iPointsPassesCollision[2] = false;
}
if (iPointsPassesCollision[1] && !iPointsPassesCollision[0])
{
temOrder = triangleList[triIter].pt[0];
triangleList[triIter].pt[0] = triangleList[triIter].pt[1];
triangleList[triIter].pt[1] = temOrder;
iPointsPassesCollision[0] = true;
iPointsPassesCollision[1] = false;
}
// All three vertices are outside the plane
if (iInsideVertCount == 0)
{
triangleList[triIter].culled = true;
}
// One vertex is inside the plane
else if (iInsideVertCount == 1)
{
triangleList[triIter].culled = false;
// Vectors from the inside vertex (pt[0]) to the two outside vertices
XMVECTOR vVert0ToVert1 = triangleList[triIter].pt[1] - triangleList[triIter].pt[0];
XMVECTOR vVert0ToVert2 = triangleList[triIter].pt[2] - triangleList[triIter].pt[0];
// Compute how far along each edge the plane is hit
float fHitPointTimeRatio = fEdge - XMVectorGetByIndex(triangleList[triIter].pt[0], iComponent);
float fDistanceAlongVector01 = fHitPointTimeRatio/ XMVectorGetByIndex(vVert0ToVert1, iComponent);
float fDistanceAlongVector02 = fHitPointTimeRatio / XMVectorGetByIndex(vVert0ToVert2, iComponent);
vVert0ToVert1 *= fDistanceAlongVector01;
vVert0ToVert1 += triangleList[triIter].pt[0];
vVert0ToVert2 *= fDistanceAlongVector02;
vVert0ToVert2 += triangleList[triIter].pt[0];
triangleList[triIter].pt[1] = vVert0ToVert2;
triangleList[triIter].pt[2] = vVert0ToVert1;
}
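// Two vertices are inside the plane: the triangle is split into two
else if (iInsideVertCount == 2)
{
// Move the triangle that follows the current one (if any) to the end of the list, so the new triangle can take its slot
triangleList[iTriangleCnt] = triangleList[triIter + 1];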
triangleList[triIter].culled = false;
triangleList[triIter+1].culled = false;
XMVECTOR vVert2ToVert0 = triangleList[triIter].pt[0] - triangleList[triIter].pt[2];
XMVECTOR vVert2ToVert1 = triangleList[triIter].pt[1] - triangleList[triIter].pt[2];
float fHitPointTimeRatio_2_0 = fEdge - XMVectorGetByIndex(triangleList[triIter].pt[2], iComponent);
float fDistanceAlongVector_2_0 = fHitPointTimeRatio_2_0 / XMVectorGetByIndex(vVert2ToVert0, iComponent);
vVert2ToVert0 *= fDistanceAlongVector_2_0;
vVert2ToVert0 += triangleList[triIter].pt[2];
triangleList[triIter + 1].pt[0] = triangleList[triIter].pt[0];
triangleList[triIter + 1].pt[1] = triangleList[triIter].pt[1];
triangleList[triIter + 1].pt[2] = vVert2ToVert0;
float fHitPointTimeRatio_2_1 = fEdge - XMVectorGetByIndex(triangleList[triIter].pt[2], iComponent);
float fDistanceAlongVector_2_1 = fHitPointTimeRatio_2_1 / XMVectorGetByIndex(vVert2ToVert1, iComponent);
vVert2ToVert1 *= fDistanceAlongVector_2_1;
vVert2ToVert1 += triangleList[triIter].pt[2];
triangleList[triIter ].pt[0] = triangleList[triIter+1].pt[1];
triangleList[triIter ].pt[1] = triangleList[triIter+1].pt[2];
triangleList[triIter ].pt[2] = vVert2ToVert1;
// Increment the triangle count and skip the triangle that was just inserted
++iTriangleCnt;
++triIter;
}
else
{
triangleList[triIter].culled = false;
}
} // if (!culled)
} // for each triangle in triangleList
} // for each of the 4 clip planes
for (int i = 0; i < iTriangleCnt; ++i)
{
if (!triangleList[i].culled)
{
for (int j = 0; j < 3; ++j)
{
float fTriangleCoodZ = XMVectorGetZ(triangleList[i].pt[j]);
if (fNearPlane > fTriangleCoodZ)
{
fNearPlane = fTriangleCoodZ;
}
if (fFarPlane < fTriangleCoodZ)
{
fFarPlane = fTriangleCoodZ;
}
}
}
}
}
}
// Call site: these initial values are placeholders, ComputeNearAndFar overwrites both
float fNearPlane = -10000.0f;
float fFarPlane = 20000.0f;
ComputeNearAndFar(fNearPlane, fFarPlane, lightCameraFrustumOrthoMin, lightCameraFrustumOrthoMax, sceneAABBPointsLightSpace);
mShadowProj[iCascadeIndex] = XMMatrixOrthographicOffCenterLH(XMVectorGetX(lightCameraFrustumOrthoMin),
XMVectorGetX(lightCameraFrustumOrthoMax),
XMVectorGetY(lightCameraFrustumOrthoMin),
XMVectorGetY(lightCameraFrustumOrthoMax), fNearPlane, fFarPlane);
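To see why the recomputed near and far planes bring the occluder back, it helps to look at how a left-handed orthographic projection maps light-space Z to depth: z' = (z - near) / (far - near), and anything outside [0, 1] is clipped. The sketch below uses hypothetical helper names and made-up Z values purely for illustration.

```cpp
// Depth produced by a left-handed off-center orthographic projection.
inline float OrthoDepth(float zLightSpace, float nearZ, float farZ)
{
    return (zLightSpace - nearZ) / (farZ - nearZ);
}

// A point only reaches the ShadowMap if its depth lands inside [0, 1].
inline bool IsRenderedIntoShadowMap(float zLightSpace, float nearZ, float farZ)
{
    float depth = OrthoDepth(zLightSpace, nearZ, farZ);
    return depth >= 0.0f && depth <= 1.0f;
}
```

For example, an occluder at light-space Z = -50 is clipped when the cascade spans Z in [0, 100], but is rendered once the range is widened to the scene-derived [-200, 300].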
The final result is shown below:
// Adapt the light's projection to the camera to remove the shimmering at shadow edges
XMVECTOR vDiagonal = frustumPoints[0] - frustumPoints[6]; // diagonal of the cascade frustum in world space
vDiagonal = XMVector3Length(vDiagonal);
// Extent of the bounding box in world space
float fCascadeBoung = XMVectorGetX(vDiagonal);
// Pad the OrthoProj by an offset so that the AABB covers the whole cascade
XMVECTOR vBorderOffset = (vDiagonal - (lightCameraFrustumOrthoMax - lightCameraFrustumOrthoMin))*XMVectorSet(0.5f, 0.5f, 0.5f, 0.5f);
// Only X and Y of the OrthoProj are padded
vBorderOffset *= XMVectorSet(1.0f, 1.0f, 0.0f, 0.0f);
// Enlarge the frustum AABB
lightCameraFrustumOrthoMax += vBorderOffset;
lightCameraFrustumOrthoMin -= vBorderOffset;
// Compute the world-space size of one texel and snap the OrthoProj bounds to it, which prevents shadow shimmering when the camera moves
float fWorldUnitPerTexel = fCascadeBoung / (float)mCascadeConfig.m_iBufferSize;
vWorldUnitsPe1Texel = XMVectorSet(fWorldUnitPerTexel, fWorldUnitPerTexel, 0.0f, 0.0f);
float lightCameraOrthographicMinZ = XMVectorGetZ(lightCameraFrustumOrthoMin);
float lightCameraOrthographicMaxZ = XMVectorGetZ(lightCameraFrustumOrthoMax);
lightCameraFrustumOrthoMin /= vWorldUnitsPe1Texel;
lightCameraFrustumOrthoMin = XMVectorFloor(lightCameraFrustumOrthoMin);
lightCameraFrustumOrthoMin *= vWorldUnitsPe1Texel;
lightCameraFrustumOrthoMax/= vWorldUnitsPe1Texel;
lightCameraFrustumOrthoMax = XMVectorFloor(lightCameraFrustumOrthoMax);
lightCameraFrustumOrthoMax *= vWorldUnitsPe1Texel;
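The divide, floor, multiply sequence above snaps the orthographic bounds to whole texels. A scalar sketch of the same idea, with a hypothetical helper name and made-up numbers:

```cpp
#include <cmath>

// Snap a light-space coordinate down to the nearest multiple of the texel size.
inline float SnapToTexel(float value, float worldUnitsPerTexel)
{
    return std::floor(value / worldUnitsPerTexel) * worldUnitsPerTexel;
}
```

With a texel size of 0.5 world units, bounds of 10.1 and 10.3 both snap to 10.0. Camera movements smaller than one texel therefore leave the projection unchanged, and the shadow edges stop crawling from frame to frame.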
[1]. The CSM sample in the DirectX SDK Samples
[2].https://msdn.microsoft.com/en-us/library/windows/desktop/ee416307?ranMID=24542&ranEAID=TnL5HPStwNw&ranSiteID=TnL5HPStwNw-6XqNnhs2CFZrh4f6WF_b9w&tduid=(6936a70c84b18e2114dc180de68c9033)(256380)(2459594)(TnL5HPStwNw-6XqNnhs2CFZrh4f6WF_b9w)()
[3].http://blog.csdn.net/qq_29523119/article/details/7286251
[4]. GPU Gems 3, Chapter 10: Parallel-Split Shadow Maps on Programmable GPUs
The model assets are in the Media\powerplant directory of directx-sdk-samples.
Source code of the 3D rendering engine with the CSM algorithm integrated:
https://github.com/2047241149/SDEngine
DX11 version:
http://download.csdn.net/download/qq_29523119/10244944