tree

kd-tree:

A kd-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are useful in several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches).

 

Construction

function kdtree (list of points pointList, int depth)
{
    if pointList is empty
        return nil;
    else
    {
        // Select axis based on depth so that axis cycles through all valid values
        var int axis := depth mod k;

        // Sort point list and choose median as pivot element
        select median by axis from pointList;                         // sort pointList along dimension `axis` and take the middle element

        // Create node and construct subtrees
        var tree_node node;
        node.location := median;
        node.leftChild := kdtree(points in pointList before median, depth+1);
        node.rightChild := kdtree(points in pointList after median, depth+1);
        return node;
    }
}
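The pseudocode above could be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the dimensionality K is fixed at 2, the names Point, Node, Iter and build are made up for this sketch, and std::nth_element is used to pick the median in place of a full sort.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t K = 2;                 // assumed dimensionality for this sketch
using Point = std::array<double, K>;
using Iter = std::vector<Point>::iterator;

struct Node {
    Point location;
    std::unique_ptr<Node> leftChild, rightChild;
};

// Builds a kd-tree over the points in [first, last), reordering them in place.
std::unique_ptr<Node> build(Iter first, Iter last, std::size_t depth = 0) {
    assert(first <= last);
    if (first == last) return nullptr;

    // Select axis based on depth so that it cycles through all dimensions.
    const std::size_t axis = depth % K;

    // Partition around the median along `axis` (average linear time, no full sort).
    Iter mid = first + (last - first) / 2;
    std::nth_element(first, mid, last, [axis](const Point& a, const Point& b) {
        return a[axis] < b[axis];
    });

    // Create the node and construct both subtrees recursively.
    auto node = std::make_unique<Node>();
    node->location = *mid;
    node->leftChild = build(first, mid, depth + 1);
    node->rightChild = build(mid + 1, last, depth + 1);
    return node;
}
```

Calling build(pts.begin(), pts.end()) on the six points (2,3), (5,4), (9,6), (4,7), (8,1), (7,2) places (7,2) at the root, because it is the upper median along the x axis.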
Nearest neighbor search

[Figure: animation of NN searching with a kd-tree in 2D]

The nearest neighbor (NN) algorithm aims to find the point in the tree which is nearest to a given input point. This search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space. Searching for a nearest neighbor in a kd-tree proceeds as follows:
  1. Starting with the root node, the algorithm moves down the tree recursively, in the same way that it would if the search point were being inserted (i.e. it goes right or left depending on whether the point is greater or less than the current node in the split dimension).
  2. Once the algorithm reaches a leaf node, it saves that node's point as the "current best".
  3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
    1. If the current node is closer than the current best, then it becomes the current best.
    2. The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. Conceptually, this is done by intersecting the splitting hyperplane with a hypersphere around the search point whose radius equals the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison: is the absolute difference between the splitting coordinate of the search point and that of the current node less than the distance (over all coordinates) from the search point to the current best?
      1. If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
      2. If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
  4. When the algorithm finishes this process for the root node, then the search is complete.
Generally the algorithm uses squared distances for comparison to avoid computing square roots. Additionally, it can save computation by holding the squared current best distance in a variable for comparison.
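The search steps above might look like this in C++. It is a sketch, not a reference implementation: K is assumed to be 2, and Node, build, Best, squaredDistance and nearest are names invented here. Instead of storing the first leaf as the current best, the sketch starts with an infinite best distance, which has the same effect. Squared distances are compared throughout, as suggested above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <vector>

constexpr std::size_t K = 2;                 // assumed dimensionality for this sketch
using Point = std::array<double, K>;
using Iter = std::vector<Point>::iterator;

struct Node {
    Point location;
    std::unique_ptr<Node> leftChild, rightChild;
};

// Median-split construction, as in the pseudocode of the construction section.
std::unique_ptr<Node> build(Iter first, Iter last, std::size_t depth = 0) {
    assert(first <= last);
    if (first == last) return nullptr;
    const std::size_t axis = depth % K;
    Iter mid = first + (last - first) / 2;
    std::nth_element(first, mid, last, [axis](const Point& a, const Point& b) {
        return a[axis] < b[axis];
    });
    auto node = std::make_unique<Node>();
    node->location = *mid;
    node->leftChild = build(first, mid, depth + 1);
    node->rightChild = build(mid + 1, last, depth + 1);
    return node;
}

double squaredDistance(const Point& a, const Point& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < K; ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
}

// Current best candidate; dist2 holds the squared distance to the query.
struct Best {
    const Node* node = nullptr;
    double dist2 = std::numeric_limits<double>::infinity();
};

void nearest(const Node* node, const Point& query, std::size_t depth, Best& best) {
    if (!node) return;
    const std::size_t axis = depth % K;
    const double diff = query[axis] - node->location[axis];
    const Node* nearSide = diff < 0 ? node->leftChild.get() : node->rightChild.get();
    const Node* farSide = diff < 0 ? node->rightChild.get() : node->leftChild.get();

    // Step 1: descend on the side where the query point would be inserted.
    nearest(nearSide, query, depth + 1, best);

    // Step 3.1: while unwinding, the current node may become the new best.
    const double d2 = squaredDistance(query, node->location);
    if (d2 < best.dist2) { best.node = node; best.dist2 = d2; }

    // Step 3.2: visit the far side only if the hypersphere crosses the splitting plane.
    if (diff * diff < best.dist2) nearest(farSide, query, depth + 1, best);
}
```

For a tree built over (2,3), (5,4), (9,6), (4,7), (8,1), (7,2) and the query point (9,2), the search ends with (8,1) as the best node and a squared distance of 2, without ever visiting the left subtree of the root.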
 
 
High-Dimensional Data
kd-trees are not suitable for efficiently finding the nearest neighbour in high-dimensional spaces. As a general rule, if the dimensionality is k, then the number of points in the data, N, should satisfy N >> 2^k. Otherwise, when kd-trees are used with high-dimensional data, most of the points in the tree will be evaluated and the efficiency is no better than exhaustive search,[3] so approximate nearest-neighbour methods are used instead.
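To get a feel for how quickly 2^k grows, the rule of thumb can be turned into a tiny check. The function name and the `margin` parameter are assumptions of this sketch; `margin` is just a tunable stand-in for the informal ">>" and has no theoretical justification.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Returns true when n is at least `margin` times 2^k.
// `margin` is an assumed factor standing in for ">>"; pick it to taste.
bool kdTreeLikelyWorthwhile(std::size_t n, unsigned k, double margin = 10.0) {
    assert(margin >= 1.0);
    const double threshold = margin * std::ldexp(1.0, static_cast<int>(k));  // margin * 2^k
    return static_cast<double>(n) >= threshold;
}
```

With one million points, the check passes for k = 3 (2^3 = 8) but fails for k = 20, where 2^20 = 1,048,576 is already about the size of the data set.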

Complexity

  • Building a static kd-tree from n points takes O(n log^2 n) time if an O(n log n) sort is used to compute the median at each level. The complexity is O(n log n) if a linear-time median-finding algorithm such as the one described in Cormen et al.[4] is used.
  • Inserting a new point into a balanced kd-tree takes O(log n) time.
  • Removing a point from a balanced kd-tree takes O(log n) time.
  • Querying an axis-parallel range in a balanced kd-tree takes O(n^(1-1/k) + m) time, where m is the number of reported points and k is the dimension of the kd-tree.
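The axis-parallel range query from the last bullet can be sketched as a recursive count that skips any subtree lying entirely outside the query box. As before, this is an illustration under assumptions: K is fixed at 2, and Node, build and countInRange are names chosen for this sketch, not part of any library.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t K = 2;                 // assumed dimensionality for this sketch
using Point = std::array<double, K>;
using Iter = std::vector<Point>::iterator;

struct Node {
    Point location;
    std::unique_ptr<Node> leftChild, rightChild;
};

// Median-split construction: left subtree <= split value <= right subtree.
std::unique_ptr<Node> build(Iter first, Iter last, std::size_t depth = 0) {
    assert(first <= last);
    if (first == last) return nullptr;
    const std::size_t axis = depth % K;
    Iter mid = first + (last - first) / 2;
    std::nth_element(first, mid, last, [axis](const Point& a, const Point& b) {
        return a[axis] < b[axis];
    });
    auto node = std::make_unique<Node>();
    node->location = *mid;
    node->leftChild = build(first, mid, depth + 1);
    node->rightChild = build(mid + 1, last, depth + 1);
    return node;
}

// Counts the points p with lo[i] <= p[i] <= hi[i] on every axis i.
std::size_t countInRange(const Node* node, const Point& lo, const Point& hi,
                         std::size_t depth = 0) {
    if (!node) return 0;
    const std::size_t axis = depth % K;
    const double split = node->location[axis];

    // Test the point stored at this node against the whole box.
    bool inside = true;
    for (std::size_t i = 0; i < K; ++i) {
        if (node->location[i] < lo[i] || node->location[i] > hi[i]) inside = false;
    }
    std::size_t count = inside ? 1 : 0;

    // Recurse only into the sides that the box can overlap along the split axis.
    if (lo[axis] <= split) count += countInRange(node->leftChild.get(), lo, hi, depth + 1);
    if (hi[axis] >= split) count += countInRange(node->rightChild.get(), lo, hi, depth + 1);
    return count;
}
```

For the points (2,3), (5,4), (9,6), (4,7), (8,1), (7,2), the box from (4,1) to (8,4) contains (5,4), (8,1) and (7,2), so the count is 3.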
libkdtree++ usage:
struct triplet { int d[3]; typedef int value_type; };
Using triplet with the kd-tree. The KDTree class template is declared as:
template <size_t const __K, typename _Val,
          typename _Acc = _Bracket_accessor<_Val>,
          typename _Dist = squared_difference<typename _Acc::result_type,
                                              typename _Acc::result_type>,
          typename _Cmp = std::less<typename _Acc::result_type>,
          typename _Alloc = std::allocator<_Node<_Val> > >
    class KDTree
Define the tree: typedef KDTree::KDTree<3, triplet, std::pointer_to_binary_function<triplet,size_t,double> > tree_type;
The KDTree constructor accepts an _Acc object.
tree_type::_Region_  r(cc540, std::ptr_fun(tac));                         // _Region_ takes an object and the accessor function used to index it (the role of operator[])
int c = src.count_within_range(r); // number of objects equal to cc540
triplet cc540(7, 4, 0);

tree_type::size_type c = src.count_within_range(cc540, int(1)); // number of objects within ±1 of cc540 in every dimension
