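Few articles cover KD-Tree search. In OpenCV, so-called kd-tree search is just one of the index types offered by FLANN (Fast Approximate Nearest Neighbor Search). In other words, "kd-tree search" means that the index built during the index-construction step is a kd-tree index.
So what this article really introduces is OpenCV's interface to the FLANN library. FLANN (Fast Library for Approximate Nearest Neighbors) is a library containing a collection of algorithms optimized for fast nearest-neighbor search in large datasets and for high-dimensional features.
Environment: I use OpenCV 2.4.9 (things may have changed in 3.0 and later).
Header to include: #include "opencv2/flann/miniflann.hpp"
Searching with flann takes two steps overall: first build an index, then search it.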
flann::Index_<T>::Index_(const Mat& features, const IndexParams& params)
Parameters:
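The constructor needs two things: the data, as a Mat matrix, and a set of parameters that depend on the type of index being built. Which index types are there?
There are six: linear index, KD-Tree index, K-means index, composite index, LSH index, and autotuned index. Each index type and its parameters are described below:
LinearIndexParams // linear (brute-force) index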
struct LinearIndexParams : public IndexParams
{
};
There are no parameters to set.
KDTreeIndexParams // KD-Tree index
struct KDTreeIndexParams : public IndexParams
{
KDTreeIndexParams( int trees = 4 );
};
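- trees The number of parallel kd-trees to use. Good values are in the range [1, 16].
Now that we know each index type takes its own parameters, how are they actually set? Take the kd-tree index as an example:
How to build a KD-Tree index:
/** Build the kd-tree index **/
/* features */
Mat source = cv::Mat(m_pOriPtXY).reshape(1); // m_pOriPtXY is a vector<Point2d>
// The index needs a Mat, so a Mat is created directly from m_pOriPtXY; reshape(1) turns it into a single-channel matrix with one point per row
source.convertTo(source,CV_32F);
/* params */
flann::KDTreeIndexParams indexParams(2); // trees is set to 2 here (the only parameter a kd-tree index needs)
flann::Index kdtree(source, indexParams); // the kd-tree index is now built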
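To make the idea of a kd-tree index more concrete, here is a minimal sketch of a single, exact kd-tree over 2-D points, written with the standard library only. It is not how FLANN implements its index (FLANN builds several randomized trees and searches them approximately); the names Pt and KdTree2D are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical 2-D point type for this sketch (not an OpenCV type).
struct Pt { float x, y; };

// A single exact kd-tree stored implicitly in a permutation of point indices.
class KdTree2D {
public:
    explicit KdTree2D(const std::vector<Pt>& pts) : pts_(pts), order_(pts.size()) {
        for (int i = 0; i < static_cast<int>(order_.size()); ++i) order_[i] = i;
        build(0, static_cast<int>(order_.size()), 0);
    }

    // Index (into the original vector) of the point closest to q, or -1 if empty.
    int nearest(const Pt& q) const {
        int best = -1;
        float bestD = std::numeric_limits<float>::max();
        search(0, static_cast<int>(order_.size()), 0, q, best, bestD);
        return best;
    }

private:
    static float coord(const Pt& p, int axis) {
        assert(axis == 0 || axis == 1);
        return axis == 0 ? p.x : p.y;
    }

    // Put the median along `axis` in the middle of order_[lo, hi), then recurse
    // on both halves with the other axis.
    void build(int lo, int hi, int axis) {
        if (hi - lo <= 1) return;
        int mid = lo + (hi - lo) / 2;
        std::nth_element(order_.begin() + lo, order_.begin() + mid, order_.begin() + hi,
                         [&](int a, int b) { return coord(pts_[a], axis) < coord(pts_[b], axis); });
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }

    void search(int lo, int hi, int axis, const Pt& q, int& best, float& bestD) const {
        if (lo >= hi) return;
        int mid = lo + (hi - lo) / 2;
        const Pt& p = pts_[order_[mid]];
        float dx = p.x - q.x, dy = p.y - q.y;
        float d = dx * dx + dy * dy;               // squared Euclidean distance
        if (d < bestD) { bestD = d; best = order_[mid]; }
        float diff = coord(q, axis) - coord(p, axis);
        // Visit the half that contains q first; visit the other half only if
        // the splitting plane is closer than the best match found so far.
        if (diff < 0) {
            search(lo, mid, 1 - axis, q, best, bestD);
            if (diff * diff < bestD) search(mid + 1, hi, 1 - axis, q, best, bestD);
        } else {
            search(mid + 1, hi, 1 - axis, q, best, bestD);
            if (diff * diff < bestD) search(lo, mid, 1 - axis, q, best, bestD);
        }
    }

    std::vector<Pt> pts_;
    std::vector<int> order_;
};
```

Usage would look like KdTree2D tree(points); int i = tree.nearest(query);. The pruning test in search() is what makes the tree faster than a linear scan: whole subtrees are skipped when they cannot contain a closer point.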
KMeansIndexParams // K-means index
struct KMeansIndexParams : public IndexParams
{
KMeansIndexParams(
int branching = 32,
int iterations = 11,
flann_centers_init_t centers_init = CENTERS_RANDOM,
float cb_index = 0.2 );
};
- branching The branching factor to use for the hierarchical k-means tree
- iterations The maximum number of iterations to use in the k-means clustering stage when building the k-means tree. A value of -1 used here means that the k-means clustering should be iterated until convergence
- centers_init The algorithm to use for selecting the initial centers when performing a k-means clustering step. The possible values are CENTERS_RANDOM (picks the initial cluster centers randomly), CENTERS_GONZALES (picks the initial centers using Gonzales' algorithm) and CENTERS_KMEANSPP (picks the initial centers using the k-means++ algorithm of Arthur and Vassilvitskii, 2007).
- cb_index This parameter (cluster boundary index) influences the way exploration is performed in the hierarchical kmeans tree. When cb_index is zero the next kmeans domain to be explored is chosen to be the one with the closest center. A value greater than zero also takes into account the size of the domain.
CompositeIndexParams // composite index: combines randomized kd-trees and a hierarchical k-means tree to build the index
struct CompositeIndexParams : public IndexParams
{
CompositeIndexParams(
int trees = 4,
int branching = 32,
int iterations = 11,
flann_centers_init_t centers_init = CENTERS_RANDOM,
float cb_index = 0.2 );
};
LshIndexParams // builds the index using the multi-probe LSH method
struct LshIndexParams : public IndexParams
{
LshIndexParams(
unsigned int table_number,
unsigned int key_size,
unsigned int multi_probe_level );
};
- table_number the number of hash tables to use (between 10 and 30 usually).
- key_size the size of the hash key in bits (between 10 and 20 usually).
- multi_probe_level the number of bits to shift to check for neighboring buckets (0 is regular LSH, 2 is recommended).
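The relationship between key_size and multi_probe_level can be illustrated with a small sketch. It assumes a simplified reading of multi-probe LSH: the hash key is formed by sampling key_size bits of a binary descriptor, and the neighboring buckets are the keys that differ from it in at most multi_probe_level bits. The function names lshKey and probeKeys are hypothetical, and FLANN's actual implementation may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a hash key by sampling the given bit positions of a binary descriptor.
// bitPositions.size() plays the role of key_size (at most 32 in this sketch).
std::uint32_t lshKey(const std::vector<std::uint8_t>& descriptor,
                     const std::vector<std::size_t>& bitPositions) {
    assert(bitPositions.size() <= 32);
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < bitPositions.size(); ++i) {
        std::size_t pos = bitPositions[i];
        assert(pos / 8 < descriptor.size());
        std::uint32_t bit = (descriptor[pos / 8] >> (pos % 8)) & 1u;
        key |= bit << i;
    }
    return key;
}

// Helper: append `key`, then every key reachable by flipping up to `level`
// more bits, choosing bit positions in increasing order to avoid duplicates.
void collectProbes(std::uint32_t key, unsigned keySize, unsigned level,
                   unsigned startBit, std::vector<std::uint32_t>& out) {
    out.push_back(key);
    if (level == 0) return;
    for (unsigned b = startBit; b < keySize; ++b)
        collectProbes(key ^ (1u << b), keySize, level - 1, b + 1, out);
}

// All bucket keys to probe for one query key: the key itself first, followed
// by every key within Hamming distance `level` of it.
std::vector<std::uint32_t> probeKeys(std::uint32_t key, unsigned keySize, unsigned level) {
    assert(keySize <= 32);
    std::vector<std::uint32_t> out;
    collectProbes(key, keySize, level, 0, out);
    return out;
}
```

With level 0 only the query's own bucket is visited, which corresponds to regular LSH. Each additional level widens the set of buckets quickly, which is one reason a small value is usually enough.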
AutotunedIndexParams // automatically picks the index type that gives the best performance for the data
struct AutotunedIndexParams : public IndexParams
{
AutotunedIndexParams(
float target_precision = 0.9,
float build_weight = 0.01,
float memory_weight = 0,
float sample_fraction = 0.1 );
};
- target_precision Is a number between 0 and 1 specifying the percentage of the approximate nearest-neighbor searches that return the exact nearest-neighbor. Using a higher value for this parameter gives more accurate results, but the search takes longer. The optimum value usually depends on the application.
- build_weight Specifies the importance of the index build time relative to the nearest-neighbor search time. In some applications it's acceptable for the index build step to take a long time if the subsequent searches in the index can be performed very fast. In other applications it's required that the index be built as fast as possible even if that leads to slightly longer search times.
- memory_weight Is used to specify the tradeoff between time (index build time and search time) and memory used by the index. A value less than 1 gives more importance to the time spent and a value greater than 1 gives more importance to the memory usage.
- sample_fraction Is a number between 0 and 1 indicating what fraction of the dataset to use in the automatic parameter configuration algorithm. Running the algorithm on the full dataset gives the most accurate results, but for very large datasets it can take longer than desired. In such cases using just a fraction of the data helps speed up this algorithm while still giving good approximations of the optimum parameters.
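How build_weight and memory_weight interact is easier to see as a formula. As far as I recall from the FLANN authors' description of the autotuning step, each candidate configuration is scored with a cost of roughly the following form, and the cheapest one is chosen. The symbols are my own labels: s is the search time, b the build time, m the ratio of index memory to dataset memory, w_b is build_weight and w_m is memory_weight. Treat this as a sketch of the idea rather than the exact expression used by the library.

```latex
\mathrm{cost} = \frac{s + w_b \, b}{(s + w_b \, b)_{\mathrm{opt}}} + w_m \, m
```

Here the denominator is the best time cost achievable when memory is ignored. With w_m = 0 memory plays no role, and with w_b close to 0 only the search time matters.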
SavedIndexParams // loads a previously saved index file
struct SavedIndexParams : public IndexParams
{
SavedIndexParams( std::string filename );
};
- filename The filename in which the index was saved.
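There are two ways to search:
flann::Index_<T>::knnSearch // k-nearest-neighbor search
flann::Index_<T>::radiusSearch // radius search
In terms of the returned results, the two differ as follows:
knnSearch returns the nearest neighbors (the user sets how many; ask for n and n results are returned);
radiusSearch returns all points within the search radius (so if no point qualifies, the result is empty).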
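The difference in result counts can be shown with a brute-force sketch that needs no index at all. The names Pt2, bruteKnn and bruteRadius are hypothetical, and this is a plain linear scan, not the FLANN implementation. One assumption to note: the sketch compares squared distances against radius * radius, so radius here is an ordinary Euclidean distance; whether the library expects a squared radius for its L2 distance should be checked against its documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 2-D point type for this sketch (not an OpenCV type).
struct Pt2 { float x, y; };

static float sqDist(const Pt2& a, const Pt2& b) {
    float dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// k-nearest-neighbor search by linear scan: returns the indices of the
// min(k, pts.size()) closest points, nearest first.
std::vector<int> bruteKnn(const std::vector<Pt2>& pts, const Pt2& q, std::size_t k) {
    std::vector<std::pair<float, int> > d;
    for (std::size_t i = 0; i < pts.size(); ++i)
        d.push_back(std::make_pair(sqDist(pts[i], q), static_cast<int>(i)));
    k = std::min(k, d.size());
    std::partial_sort(d.begin(), d.begin() + static_cast<long>(k), d.end());
    std::vector<int> out;
    for (std::size_t i = 0; i < k; ++i) out.push_back(d[i].second);
    return out;
}

// Radius search by linear scan: returns the indices of all points whose
// distance to q is at most `radius`, nearest first. May be empty.
std::vector<int> bruteRadius(const std::vector<Pt2>& pts, const Pt2& q, float radius) {
    assert(radius >= 0.0f);
    std::vector<std::pair<float, int> > d;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        float s = sqDist(pts[i], q);
        if (s <= radius * radius) d.push_back(std::make_pair(s, static_cast<int>(i)));
    }
    std::sort(d.begin(), d.end());
    std::vector<int> out;
    for (std::size_t i = 0; i < d.size(); ++i) out.push_back(d[i].second);
    return out;
}
```

In this sketch the number of k-NN results is fixed by k (capped by the dataset size), while the number of radius results depends entirely on the data and may be zero.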
Now for the usage in detail:
void flann::Index_<T>::knnSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, int knn, const SearchParams& params) // all arguments are vectors
void flann::Index_<T>::knnSearch(const Mat& queries, Mat& indices, Mat& dists, int knn, const SearchParams& params) // all arguments are Mat
Parameters:
int flann::Index_<T>::radiusSearch(const vector<T>& query, vector<int>& indices, vector<float>& dists, float radius, const SearchParams& params)
int flann::Index_<T>::radiusSearch(const Mat& query, Mat& indices, Mat& dists, float radius, const SearchParams& params)
Parameters:
Taking the kd-tree index as an example, using knnSearch:
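/** Build the kd-tree index **/
Mat source = cv::Mat(m_pOriPtXY).reshape(1);
source.convertTo(source,CV_32F);
flann::KDTreeIndexParams indexParams(2);
flann::Index kdtree(source, indexParams); // the kd-tree index is built as in the earlier example, so it is not explained again
/** Prepare the arguments and containers for knnSearch **/
unsigned queryNum = 7; // number of nearest neighbors to return
vector<float> vecQuery(2); // holds the query point (this example uses vectors throughout)
vector<int> vecIndex(queryNum); // holds the indices of the returned points
vector<float> vecDist(queryNum); // holds the distances
flann::SearchParams params(32); // search parameters for knnSearch
/** knn query on the kd-tree **/
vecQuery[0] = (float)dX; // x coordinate of the query point
vecQuery[1] = (float)dY; // y coordinate of the query point
kdtree.knnSearch(vecQuery, vecIndex, vecDist, queryNum, params);
// Note the logic of this line: the previously built kdtree index object calls knnSearch() to run the knn search for the point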
OpenCV 2.4.9 documentation
http://www.mamicode.com/info-detail-495502.html
http://blog.csdn.net/readzw/article/details/8591593