SpeedTree Culling Optimization Principles

Motivation

This section outlines a culling algorithm designed specifically for rendering very large forests.  It has been tested on forests as large as 10,000,000 trees covering 1,600 square miles.  An algorithm over and above standard culling structures such as a quadtree or octree was needed to address three goals simultaneously:

  • Efficiently determine which trees are in the current frustum (a requirement of any culling algorithm), and quickly report the locations of hundreds of thousands of visible trees.

  • Efficiently determine the LOD value of each of the potentially hundreds of thousands of visible trees.

  • For a given frustum with a reasonably deep visibility, about 98% of the trees will be rendered as billboards.  Hence it is important to render these billboards using a batching technique that renders using as large a batch size as possible.  The culling algorithm should quickly provide the batches as part of the frustum updates.

 

 

Technique Overview

The entire forest is divided into a grid of evenly spaced 2D cells (Figure 1).  The grid extends across the ground extents of the forest.  The view frustum is then projected onto the grid using an orthographic projection.  Five points are used:  the camera location, and the four points that lie in the far plane of the frustum.  The axis-aligned (AA) extents of this frustum projection are computed and converted to the nearest cells.  The sub-grid defined by the AA extents defines the cells that are potentially visible while every cell outside of this box is automatically culled.
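The projection step above can be sketched as follows. This is only an illustration under stated assumptions, not the SDK's implementation: the types Vec3, ForestGrid, and CellRange and the function ComputeCellRange are hypothetical names, z is assumed to be the up axis, and the grid is assumed to be made of square cells.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical types for illustration only; they are not part of the SpeedTree SDK.
struct Vec3 { float x, y, z; };

struct ForestGrid
{
    float fMinX, fMinY;   // world-space corner of the grid on the ground plane
    float fCellSize;      // edge length of one square cell
    int   nCols, nRows;   // number of cells along x and y
};

struct CellRange
{
    int  nMinCol, nMaxCol;   // inclusive column range
    int  nMinRow, nMaxRow;   // inclusive row range
    bool bEmpty;             // true when the projection misses the grid entirely
};

// aPoints holds the camera position followed by the four far-plane corners.
// The orthographic projection onto the ground simply drops z (z-up is assumed).
CellRange ComputeCellRange(const ForestGrid& sGrid, const std::array<Vec3, 5>& aPoints)
{
    float fMinX = aPoints[0].x, fMaxX = aPoints[0].x;
    float fMinY = aPoints[0].y, fMaxY = aPoints[0].y;
    for (const Vec3& vPoint : aPoints)
    {
        fMinX = std::min(fMinX, vPoint.x);
        fMaxX = std::max(fMaxX, vPoint.x);
        fMinY = std::min(fMinY, vPoint.y);
        fMaxY = std::max(fMaxY, vPoint.y);
    }

    // convert the world-space extents to cell indices
    CellRange sRange;
    sRange.nMinCol = static_cast<int>(std::floor((fMinX - sGrid.fMinX) / sGrid.fCellSize));
    sRange.nMaxCol = static_cast<int>(std::floor((fMaxX - sGrid.fMinX) / sGrid.fCellSize));
    sRange.nMinRow = static_cast<int>(std::floor((fMinY - sGrid.fMinY) / sGrid.fCellSize));
    sRange.nMaxRow = static_cast<int>(std::floor((fMaxY - sGrid.fMinY) / sGrid.fCellSize));

    // a box that lies completely off the grid leaves nothing to test
    sRange.bEmpty = sRange.nMaxCol < 0 || sRange.nMinCol >= sGrid.nCols ||
                    sRange.nMaxRow < 0 || sRange.nMinRow >= sGrid.nRows;

    // clamp the remaining range so that it only addresses real cells
    sRange.nMinCol = std::max(sRange.nMinCol, 0);
    sRange.nMaxCol = std::min(sRange.nMaxCol, sGrid.nCols - 1);
    sRange.nMinRow = std::max(sRange.nMinRow, 0);
    sRange.nMaxRow = std::min(sRange.nMaxRow, sGrid.nRows - 1);
    return sRange;
}
```

Only the cells inside the returned range need any further testing; every other cell is rejected without being visited.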

 

Figure 1.  SpeedTree Large Forest Culling Algorithm

 

 

Each cell in the AA frustum bounding box is then tested against the frustum using a frustum/sphere intersection test, resulting in figure X: the final set of cells used by the rest of the algorithm.
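A common way to write a frustum/sphere test is shown below as a minimal sketch. It assumes the frustum is available as six planes with unit-length normals pointing into the frustum and that each cell is wrapped in a bounding sphere; the Plane and Vec3 types and the function name are hypothetical and do not come from the SDK.

```cpp
#include <array>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

// Plane in the form a*x + b*y + c*z + d = 0 with a unit-length normal (a, b, c)
// that points toward the inside of the frustum.
struct Plane { float a, b, c, d; };

// Returns false only when the sphere lies completely behind at least one plane.
bool SphereIntersectsFrustum(const std::array<Plane, 6>& aPlanes, const Vec3& vCenter, float fRadius)
{
    for (const Plane& sPlane : aPlanes)
    {
        // signed distance from the sphere center to the plane
        const float fSignedDistance = sPlane.a * vCenter.x + sPlane.b * vCenter.y +
                                      sPlane.c * vCenter.z + sPlane.d;
        if (fSignedDistance < -fRadius)
            return false;
    }
    return true;
}
```

This form of the test is conservative: a sphere near a frustum corner can pass even though it is just outside, which costs a little extra rendering work but never drops a visible cell.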

 

Each remaining cell is then assigned one of two LOD values: LOD_ALL_BILLBOARDS or LOD_MIXED.  LOD_ALL_BILLBOARDS is assigned when the cell is distant enough from the camera that every tree in it would be rendered at LOD level 0.0.  In this way, the LOD value of every tree in an LOD_ALL_BILLBOARDS cell is known without addressing the individual trees.  LOD_MIXED is assigned to the remaining cells, which are close enough to the camera to have a variety of LOD values.  The algorithm will use a simple equation to compute the LOD value for each of the trees in these cells:

 

 

    // SpeedTreeRT assumes 1.0 = highest, 0.0 = lowest
    // fDistance is the 3D distance from the tree to the camera
    float fLod = 1.0f - (fDistance - m_fNearLod) / (m_fFarLod - m_fNearLod);
    fLod = st_max(fLod, 0.0f);
    fLod = st_min(fLod, 1.0f);
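The equation above yields 0.0 for any tree at or beyond the far LOD distance, so a cell can be classified without visiting its trees whenever its nearest point is at least that far away. The sketch below shows one possible version of that check. It assumes each cell has a bounding sphere; the ECellLod enum, the Vec3 type, and ClassifyCell are hypothetical names, and fFarLod stands in for m_fFarLod.

```cpp
#include <cmath>

// Hypothetical names for illustration only.
enum ECellLod { LOD_ALL_BILLBOARDS, LOD_MIXED };

struct Vec3 { float x, y, z; };

// A cell is all billboards when even its nearest point (approximated here by the
// bounding sphere) is at least fFarLod away from the camera.
ECellLod ClassifyCell(const Vec3& vCamera, const Vec3& vCellCenter, float fCellRadius, float fFarLod)
{
    const float fDx = vCellCenter.x - vCamera.x;
    const float fDy = vCellCenter.y - vCamera.y;
    const float fDz = vCellCenter.z - vCamera.z;
    const float fNearestDistance = std::sqrt(fDx * fDx + fDy * fDy + fDz * fDz) - fCellRadius;

    return (fNearestDistance >= fFarLod) ? LOD_ALL_BILLBOARDS : LOD_MIXED;
}
```

Only the trees in LOD_MIXED cells then need the per-tree equation.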
 

 

At this point, the visibility and LOD values for the entire forest have been determined.  They are queried from the culling class using tree accessor functions for the 3D trees and cell accessor functions for the batched billboards.  More below.

 

 

Billboards

The culling algorithm creates a batch of billboards for each cell in the frustum.  There are two types of batches:

  • A batch of billboards where every tree is at LOD level 0.0.

  • A batch of billboards where some trees in the cell are 3D, some are fully billboards, and some are in a 3D-to-billboard transitional state.  These batches are the most computationally expensive to set up since each tree must be computed separately.

 

To handle the fact that thousands of billboards are coming into and going out of the frustum, the culling class uses a hash map that maps the active visible billboard batches to a rotating set of render buffers.  As a billboard cell comes into the frustum, the next available render buffer is assigned to it and filled with the billboard vertex data.  This data is static (all of the LOD and lighting dynamics are handled by shaders) and will stay in the render buffer until the cell moves out of the frustum.

 

As the cell moves out of the frustum, it surrenders its render buffer back to the render buffer stack to be reused by the next cell coming into the frustum.  Note that all of the render buffers are created with the capacity to store the densest cell in the forest, so that any buffer can be used with any cell.
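The buffer recycling described above can be sketched with a hash map and a free stack. This is an illustration under assumptions rather than the SDK's code: CRenderBufferPool and its methods are hypothetical, and buffers are represented by plain integer indices instead of real vertex buffers.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical pool for illustration only; buffers are identified by index.
class CRenderBufferPool
{
public:
    explicit CRenderBufferPool(int nNumBuffers)
    {
        // push in reverse so that buffer 0 is handed out first
        for (int i = nNumBuffers - 1; i >= 0; --i)
            m_vFree.push_back(i);
    }

    // Returns the buffer assigned to the cell, or -1 when the pool is exhausted.
    // bNeedsFill is true only when the cell has just entered the frustum and its
    // static billboard vertex data must be written into the buffer.
    int Acquire(int nCellId, bool& bNeedsFill)
    {
        bNeedsFill = false;
        auto iFound = m_mActive.find(nCellId);
        if (iFound != m_mActive.end())
            return iFound->second;      // already resident, nothing to upload
        if (m_vFree.empty())
            return -1;

        const int nBuffer = m_vFree.back();
        m_vFree.pop_back();
        m_mActive[nCellId] = nBuffer;
        bNeedsFill = true;
        return nBuffer;
    }

    // Called when the cell leaves the frustum; its buffer becomes reusable.
    void Release(int nCellId)
    {
        auto iFound = m_mActive.find(nCellId);
        if (iFound == m_mActive.end())
            return;
        m_vFree.push_back(iFound->second);
        m_mActive.erase(iFound);
    }

    std::size_t NumActive() const { return m_mActive.size(); }

private:
    std::unordered_map<int, int> m_mActive;   // cell id -> buffer index
    std::vector<int>             m_vFree;     // stack of unused buffer indices
};
```

Because every buffer has the same capacity, the pool never needs to match a cell to a particular buffer; it simply pops whichever one is on top of the stack.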

 

 

Accessing Functions

It isn't practical for the culling class to provide a convenient contiguous array or vector of visible trees, as just building this vector, if it contained hundreds of thousands of trees, would kill any efficiency already achieved.  Hence there are special accessor functions that internally move through the visible cells in the forest, returning the visible trees and their respective LOD values.  One set of functions is used to access the 3D trees, while another set facilitates returning the batches of billboards to be rendered.

 

The 3D tree accessor functions are CSpeedTreeCullingEngine::GetFirstTree(void) and CSpeedTreeCullingEngine::GetNextTree(void); both return CSpeedTreeInstance pointers.  NULL is returned when no more trees are in the frustum.  A simple example usage is below:

 

 

    CSpeedTreeInstance* pTree = m_cCullEngine.GetFirstTree( );
    while (pTree)
    {
        RenderMyTree(pTree);
        pTree = m_cCullEngine.GetNextTree( );
    }
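How such a first/next pair might walk the visible cells without building a contiguous array is sketched below. The SDK's internals are not shown in this document, so everything here is an assumption: TreeInstance stands in for CSpeedTreeInstance, and VisibleCell and CCellWalker are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only.
struct TreeInstance { float fLod; };

struct VisibleCell
{
    std::vector<TreeInstance*> vTrees;   // 3D trees stored in this cell
};

class CCellWalker
{
public:
    explicit CCellWalker(const std::vector<VisibleCell>& vCells)
        : m_pCells(&vCells), m_nCell(0), m_nTree(0) { }

    // Restarts the walk at the first visible cell.
    TreeInstance* GetFirstTree()
    {
        m_nCell = 0;
        m_nTree = 0;
        return Advance();
    }

    // Continues from the saved cursor; returns nullptr when the walk is done.
    TreeInstance* GetNextTree() { return Advance(); }

private:
    TreeInstance* Advance()
    {
        while (m_nCell < m_pCells->size())
        {
            const VisibleCell& sCell = (*m_pCells)[m_nCell];
            if (m_nTree < sCell.vTrees.size())
                return sCell.vTrees[m_nTree++];

            // current cell is used up (or empty), move on to the next one
            ++m_nCell;
            m_nTree = 0;
        }
        return nullptr;
    }

    const std::vector<VisibleCell>* m_pCells;
    std::size_t m_nCell;   // index of the cell being walked
    std::size_t m_nTree;   // index of the next tree within that cell
};
```

The only state kept between calls is a cell index and a tree index, so no per-frame list of visible trees is ever allocated.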

 

 

The billboard accessor functions are CSpeedTreeCullingEngine::GetFirstCell(int& nNumBBs) and CSpeedTreeCullingEngine::GetNextCell(int& nNumBBs).  They return CIdvInterleavedBuffer pointers (an example vertex buffer / index buffer class that works in OpenGL, DirectX 9.0c, Xbox 360, and PS3) while also filling out nNumBBs with the number of billboards in the vertex buffer.  A simple example usage is below:

 

 

    int nNumBBs = 0;
    CIdvInterleavedBuffer* pRenderBuffer = m_cCullEngine.GetFirstCell(nNumBBs);
    while (pRenderBuffer)
    {
        pRenderBuffer->DrawArrays(CIdvInterleavedBuffer::QUADS, 0, nNumBBs * 4);
        pRenderBuffer = m_cCullEngine.GetNextCell(nNumBBs);
    }

 

 

Notes

  • The number of cells in the forest grid plays an important part in the algorithm's performance.  Experimentation is encouraged to determine the ideal grid resolution for your application.  Fewer cells mean larger billboard batches, but this is not without a trade-off: the algorithm culls the 3D trees more efficiently when smaller cells are used.

 

 
