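There are not many articles about KD-Tree search. In OpenCV, "kd-tree search" is really just one of the index types offered by FLANN (Fast Library for Approximate Nearest Neighbors). In other words, kd-tree search means that the index built during the index-construction step is a kd-tree index.
So what this article really covers is the interface between OpenCV and the FLANN library. FLANN is a library containing a collection of algorithms optimized for fast nearest-neighbor search in large datasets and for high-dimensional features.
Environment: the author uses OpenCV 2.4.9 (things may have changed from 3.0 onward).
Header to include: #include "opencv2/flann/miniflann.hpp"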
flann::Index_<T>::Index_(const Mat& features, const IndexParams& params)
There are six index types: linear, KD-Tree, k-means, composite, LSH, and autotuned. Each one is described below together with its parameters:
LinearIndexParams // linear (brute-force) index
struct LinearIndexParams : public IndexParams
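A linear index takes no parameters because it does nothing clever: every query is compared against every stored point. The sketch below illustrates that idea with the standard library only. The function linearKnn is a hypothetical helper written for this article, not part of OpenCV or FLANN, and it assumes squared Euclidean distance.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCV/FLANN function): exact k-nearest-neighbour
// search by scanning every point, which is what a linear index amounts to.
// `points` holds one row per point; every row must be as long as `query`.
// Returns (squared distance, row index) pairs sorted by increasing distance.
std::vector<std::pair<float, int> > linearKnn(
    const std::vector<std::vector<float> >& points,
    const std::vector<float>& query,
    std::size_t k)
{
    std::vector<std::pair<float, int> > scored;
    scored.reserve(points.size());
    for (std::size_t i = 0; i < points.size(); ++i) {
        float d2 = 0.0f;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const float diff = points[i][j] - query[j];
            d2 += diff * diff;
        }
        scored.push_back(std::make_pair(d2, static_cast<int>(i)));
    }
    // Only the k best entries need to be ordered.
    k = std::min(k, scored.size());
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end());
    scored.resize(k);
    return scored;
}
```

The cost grows linearly with the number of points, which is why the tree-based and hash-based indexes below exist.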
KDTreeIndexParams // KD-Tree index
struct KDTreeIndexParams : public IndexParams
{
    KDTreeIndexParams( int trees = 4 );
};
- trees The number of parallel kd-trees to use. Good values are in the range [1, 16].
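/** Build the kd-tree index **/
/* features */
Mat source = cv::Mat(m_pOriPtXY).reshape(1); // m_pOriPtXY is a vector of points; reshape(1) flattens it to one single-channel row per point
/* params */
flann::KDTreeIndexParams indexParams(2); // trees is set to 2 here (it is the only parameter a kd-tree index needs)
flann::Index kdtree(source, indexParams); // the kd-tree index is now built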
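To make clear what a kd-tree index does with the point set, here is a minimal sketch of a single, exact kd-tree over 2D points using only the standard library. The names Point2 and KdTree2 are hypothetical and exist only for this illustration. FLANN builds several randomized trees and searches them approximately, so this shows the general idea rather than the library's implementation.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::array<float, 2> Point2;

// Hypothetical minimal kd-tree over 2D points (illustration only).
class KdTree2 {
public:
    explicit KdTree2(const std::vector<Point2>& pts) : pts_(pts), order_(pts.size()) {
        for (std::size_t i = 0; i < order_.size(); ++i) order_[i] = static_cast<int>(i);
        build(0, order_.size(), 0);
    }

    // Returns the index of the exact nearest point, or -1 for an empty tree.
    int nearest(const Point2& q) const {
        int best = -1;
        float bestD2 = std::numeric_limits<float>::max();
        search(0, order_.size(), 0, q, best, bestD2);
        return best;
    }

private:
    // Place the median along `axis` in the middle of order_[lo, hi),
    // then recurse on both halves using the other axis.
    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(order_.begin() + lo, order_.begin() + mid, order_.begin() + hi,
                         [&](int a, int b) { return pts_[a][axis] < pts_[b][axis]; });
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }

    void search(std::size_t lo, std::size_t hi, int axis, const Point2& q,
                int& best, float& bestD2) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const Point2& p = pts_[order_[mid]];
        const float dx = p[0] - q[0], dy = p[1] - q[1];
        const float d2 = dx * dx + dy * dy;
        if (d2 < bestD2) { bestD2 = d2; best = order_[mid]; }

        // Visit the side containing the query first; visit the other side
        // only if the splitting plane is closer than the best match so far.
        const float delta = q[axis] - p[axis];
        if (delta < 0) {
            search(lo, mid, 1 - axis, q, best, bestD2);
            if (delta * delta < bestD2) search(mid + 1, hi, 1 - axis, q, best, bestD2);
        } else {
            search(mid + 1, hi, 1 - axis, q, best, bestD2);
            if (delta * delta < bestD2) search(lo, mid, 1 - axis, q, best, bestD2);
        }
    }

    std::vector<Point2> pts_;
    std::vector<int> order_;  // point indices arranged in tree order
};
```

The pruning test on the splitting plane is what lets a kd-tree skip most of the data on low-dimensional inputs such as 2D coordinates.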
KMeansIndexParams // k-means index
struct KMeansIndexParams : public IndexParams
{
    KMeansIndexParams(
        int branching = 32,
        int iterations = 11,
        flann_centers_init_t centers_init = CENTERS_RANDOM,
        float cb_index = 0.2 );
};
- branching The branching factor to use for the hierarchical k-means tree
- iterations The maximum number of iterations to use in the k-means clustering stage when building the k-means tree. A value of -1 used here means that the k-means clustering should be iterated until convergence
- centers_init The algorithm to use for selecting the initial centers when performing a k-means clustering step. The possible values are CENTERS_RANDOM (picks the initial cluster centers randomly), CENTERS_GONZALES (picks the initial centers using Gonzales' algorithm) and CENTERS_KMEANSPP (picks the initial centers using the k-means++ algorithm of Arthur and Vassilvitskii, 2007)
- cb_index This parameter (cluster boundary index) influences the way exploration is performed in the hierarchical k-means tree. When cb_index is zero, the next k-means domain to be explored is chosen to be the one with the closest center. A value greater than zero also takes into account the size of the domain.
CompositeIndexParams // composite index: combines randomized kd-trees and a hierarchical k-means tree
struct CompositeIndexParams : public IndexParams
{
    CompositeIndexParams(
        int trees = 4,
        int branching = 32,
        int iterations = 11,
        flann_centers_init_t centers_init = CENTERS_RANDOM,
        float cb_index = 0.2 );
};
LshIndexParams // builds the index with the multi-probe LSH method
struct LshIndexParams : public IndexParams
{
    LshIndexParams(
        unsigned int table_number,
        unsigned int key_size,
        unsigned int multi_probe_level );
};
- table_number the number of hash tables to use (between 10 and 30 usually).
- key_size the size of the hash key in bits (between 10 and 20 usually).
- multi_probe_level the number of bits to shift to check for neighboring buckets (0 is regular LSH, 2 is recommended).
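The interaction between key_size and multi_probe_level is easier to see with a small example. The sketch below assumes the usual multi-probe idea: besides the bucket for the query's own hash key, the lookup also visits buckets whose keys differ from it in a few bits. The function probeKeys and its helper are hypothetical, written only for illustration, and FLANN's internal probing order may differ.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append every key reachable from `key` by flipping up to
// `flipsLeft` further bits, considering only bit positions >= firstBit so that
// each combination of flipped bits is produced exactly once.
static void flipFrom(std::uint32_t key, unsigned keySize, unsigned firstBit,
                     unsigned flipsLeft, std::vector<std::uint32_t>& out)
{
    if (flipsLeft == 0) return;
    for (unsigned bit = firstBit; bit < keySize; ++bit) {
        const std::uint32_t flipped = key ^ (std::uint32_t(1) << bit);
        out.push_back(flipped);
        flipFrom(flipped, keySize, bit + 1, flipsLeft - 1, out);
    }
}

// Hypothetical helper: list the bucket keys a multi-probe lookup might visit,
// i.e. all keys differing from `key` in at most `level` of its lowest
// `keySize` bits (keySize must not exceed 32). The unmodified key comes first.
std::vector<std::uint32_t> probeKeys(std::uint32_t key, unsigned keySize, unsigned level)
{
    std::vector<std::uint32_t> out(1, key);
    flipFrom(key, keySize, 0, level, out);
    return out;
}
```

With level 0 only the original bucket is visited (regular LSH). Each extra level adds more buckets per table, trading search time for recall, and the number of probed buckets grows quickly as key_size increases.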
AutotunedIndexParams // automatically picks the best index type for the data to get the best performance
struct AutotunedIndexParams : public IndexParams
{
    AutotunedIndexParams(
        float target_precision = 0.9,
        float build_weight = 0.01,
        float memory_weight = 0,
        float sample_fraction = 0.1 );
};
- target_precision Is a number between 0 and 1 specifying the percentage of the approximate nearest-neighbor searches that return the exact nearest-neighbor. Using a higher value for this parameter gives more accurate results, but the search takes longer. The optimum value usually depends on the application.
- build_weight Specifies the importance of the index build time relative to the nearest-neighbor search time. In some applications it's acceptable for the index build step to take a long time if the subsequent searches in the index can be performed very fast. In other applications it's required that the index be built as fast as possible even if that leads to slightly longer search times.
- memory_weight Is used to specify the tradeoff between time (index build time and search time) and memory used by the index. A value less than 1 gives more importance to the time spent and a value greater than 1 gives more importance to the memory usage.
- sample_fraction Is a number between 0 and 1 indicating what fraction of the dataset to use in the automatic parameter configuration algorithm. Running the algorithm on the full dataset gives the most accurate results, but for very large datasets can take longer than desired. In such case using just a fraction of the data helps speeding up this algorithm while still giving good approximations of the optimum parameters.
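The meaning of target_precision can be made concrete with a small measurement: run the same queries through an approximate index and through an exact search, then count how often the two agree. The helper searchPrecision below is hypothetical and uses only the standard library. It assumes each vector holds one nearest-neighbor index per query.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of queries for which the approximate search
// returned the same nearest-neighbour index as the exact search.
// Returns 0.0 when the inputs are empty or have different lengths.
double searchPrecision(const std::vector<int>& approxNearest,
                       const std::vector<int>& exactNearest)
{
    if (approxNearest.empty() || approxNearest.size() != exactNearest.size())
        return 0.0;
    std::size_t hits = 0;
    for (std::size_t i = 0; i < approxNearest.size(); ++i) {
        if (approxNearest[i] == exactNearest[i]) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(approxNearest.size());
}
```

A target_precision of 0.9 then corresponds to this ratio being about 0.9 on a representative set of queries.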
SavedIndexParams // loads a previously saved index file
struct SavedIndexParams : public IndexParams
{
    SavedIndexParams( std::string filename );
};
- filename The filename in which the index was saved.
flann::Index_<T>::knnSearch // finds the k nearest neighbors
flann::Index_<T>::radiusSearch // finds the neighbors within a given radius
void flann::Index_<T>::knnSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, int knn, const SearchParams& params) // all arguments are vectors
void flann::Index_<T>::knnSearch(const Mat& queries, Mat& indices, Mat& dists, int knn, const SearchParams& params) // all arguments are Mat
int flann::Index_<T>::radiusSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, float radius, const SearchParams& params)
int flann::Index_<T>::radiusSearch(const Mat& query, Mat& indices, Mat& dists, float radius, const SearchParams& params)
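Unlike knnSearch, a radius search returns a variable number of results, which is why it reports a count. The sketch below shows those semantics with a brute-force scan. The function linearRadiusSearch is a hypothetical stand-in rather than the FLANN call. It treats radius as a plain Euclidean distance and stores squared distances, which is a choice made for this example. Check the documentation of your OpenCV version for how the real radiusSearch interprets radius.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not the FLANN function): find every point whose
// Euclidean distance to `query` is at most `radius`. Fills `indices` and
// `dists` (squared distances), nearest first, and returns the number found.
int linearRadiusSearch(const std::vector<std::vector<float> >& points,
                       const std::vector<float>& query,
                       float radius,
                       std::vector<int>& indices,
                       std::vector<float>& dists)
{
    const float r2 = radius * radius;
    std::vector<std::pair<float, int> > hits;
    for (std::size_t i = 0; i < points.size(); ++i) {
        float d2 = 0.0f;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const float diff = points[i][j] - query[j];
            d2 += diff * diff;
        }
        if (d2 <= r2) hits.push_back(std::make_pair(d2, static_cast<int>(i)));
    }
    std::sort(hits.begin(), hits.end());

    indices.clear();
    dists.clear();
    for (std::size_t i = 0; i < hits.size(); ++i) {
        indices.push_back(hits[i].second);
        dists.push_back(hits[i].first);
    }
    return static_cast<int>(hits.size());
}
```

This sketch grows the output vectors as needed. A caller that preallocates fixed-size output buffers should use the returned count to know how many entries are valid.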
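Mat source = cv::Mat(m_pOriPtXY).reshape(1);
flann::KDTreeIndexParams indexParams(2);
flann::Index kdtree(source, indexParams); // builds the kd-tree index exactly as in the previous example, so it is not explained again
unsigned queryNum = 7; // number of nearest neighbors to return
vector<float> vecQuery(2); // holds the query point (this example uses vectors throughout)
vector<int> vecIndex(queryNum); // receives the indices of the returned points
vector<float> vecDist(queryNum); // receives the distances
flann::SearchParams params(32); // search parameters for knnSearch
vecQuery[0] = (float)dX; // x coordinate of the query point
vecQuery[1] = (float)dY; // y coordinate of the query point
kdtree.knnSearch(vecQuery, vecIndex, vecDist, queryNum, params);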