

1. Introduction

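  • Function: a function template that clusters a collection of elements of type _Tp into a number of classes.
  • Idea: the algorithm implements the disjoint-set structure described in the chapter "Data Structures for Disjoint Sets" of Introduction to Algorithms. For the underlying idea, see the blog post "(Algorithm) Disjoint-set".
  • It is a clustering algorithm from the hierarchical clustering family. In spirit it matches AGNES (Agglomerative Nesting), a bottom-up clustering algorithm, although the implementation differs in some respects.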
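
The disjoint-set idea behind the function can be sketched in a few lines. This is a minimal illustration, not OpenCV's code: the type name DisjointSetSketch and its members are made up for this post, and a root is marked by parent[i] == i rather than by the negative parent index that partition uses.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper type for illustration only; not part of OpenCV.
struct DisjointSetSketch {
    std::vector<int> parent;  // parent[i] == i marks a root
    std::vector<int> rank;    // upper bound on the height of the tree under a root

    explicit DisjointSetSketch(int n) : parent(n), rank(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);  // start with n singleton sets
    }

    // Find the root of x, then point every node on the path directly at it.
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Merge the sets containing a and b; returns false if they were already one set.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (rank[a] < rank[b]) std::swap(a, b);  // hang the shallower tree under the deeper one
        parent[b] = a;
        if (rank[a] == rank[b]) ++rank[a];       // equal ranks: the merged tree grows by one
        return true;
    }
};
```

Starting from singletons and repeatedly calling unite on pairs that belong together is the bottom-up (agglomerative) behaviour mentioned above; find tells whether two elements already share a class.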

2. Source code


// Headers and using-declaration needed to compile this excerpt on its own.
#include <cassert>
#include <vector>
using std::vector;

/** @brief Splits an element set into equivalency classes.

@param _vec Set of elements stored as a vector.
@param labels Output vector of labels. It contains as many elements as vec. Each label labels[i] is
a 0-based cluster index of `vec[i]`.
@param predicate Equivalence predicate (pointer to a boolean function of two arguments or an
instance of the class that has the method bool operator()(const _Tp& a, const _Tp& b) ). The
predicate returns true when the elements are certainly in the same class, and returns false if they
may or may not be in the same class.
*/
template<typename _Tp, class _EqPredicate> int
partition( const vector<_Tp>& _vec, vector<int>& labels,
           _EqPredicate predicate=_EqPredicate())
{
    int i, j, N = (int)_vec.size();
    const _Tp* vec = &_vec[0];

    const int PARENT=0;
    const int RANK=1;

    // Each node is a (parent, rank) pair stored in one flat array.
    vector<int> _nodes(N*2);
    int (*nodes)[2] = (int(*)[2])&_nodes[0];

    // The first O(N) pass: create N single-vertex trees
    for(i = 0; i < N; i++)
    {
        nodes[i][PARENT] = -1;   // a negative parent marks a root
        nodes[i][RANK] = 0;
    }

    // The main O(N^2) pass: merge connected components
    // Note:
    // root is the root of node i
    // root2 is the root of node j
    // The predicate is evaluated on nodes i and j, not on root and root2,
    // which guarantees that every pair of the original N elements is compared.
    for( i = 0; i < N; i++ )
    {
        int root = i;

        // find root
        while( nodes[root][PARENT] >= 0 )
            root = nodes[root][PARENT];

        for( j = 0; j < N; j++ )
        {
            if( i == j || !predicate(vec[i], vec[j]))
                continue;
            int root2 = j;

            while( nodes[root2][PARENT] >= 0 )
                root2 = nodes[root2][PARENT];

            if( root2 != root )
            {
                // unite both trees
                int rank = nodes[root][RANK], rank2 = nodes[root2][RANK];
                if( rank > rank2 )
                    nodes[root2][PARENT] = root;
                else
                {
                    nodes[root][PARENT] = root2;
                    nodes[root2][RANK] += rank == rank2;
                    root = root2;
                }
                assert( nodes[root][PARENT] < 0 );

                int k = j, parent;

                // compress the path from node2 to root
                while( (parent = nodes[k][PARENT]) >= 0 )
                {
                    nodes[k][PARENT] = root;
                    k = parent;
                }

                // compress the path from node to root
                k = i;
                while( (parent = nodes[k][PARENT]) >= 0 )
                {
                    nodes[k][PARENT] = root;
                    k = parent;
                }
            }
        }
    }

    // Final O(N) pass: enumerate classes
    labels.resize(N);
    int nclasses = 0;

    for( i = 0; i < N; i++ )
    {
        int root = i;
        while( nodes[root][PARENT] >= 0 )
            root = nodes[root][PARENT];
        // re-use the rank as the class label
        // This part is neat: the rank slot of the root of i's class is
        // overwritten with the class label, stored as its bitwise complement
        // so that it is negative. The next time the same root is reached,
        // the if below is skipped.
        if( nodes[root][RANK] >= 0 )
            nodes[root][RANK] = ~nclasses++;
        labels[i] = ~nodes[root][RANK];
    }

    return nclasses;
}
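
The label-assignment trick in the final pass is easier to see in isolation. The sketch below is only an illustration under assumptions: enumerateClassesSketch and the slot array are hypothetical names, and the parent links are passed in ready-made (with -1 marking a root) instead of being built by the merge pass.

```cpp
#include <vector>

// Hypothetical helper: turns parent links (-1 marks a root) into 0-based labels.
// slot plays the role of the RANK column: a value >= 0 means "no label assigned yet".
int enumerateClassesSketch(const std::vector<int>& parent, std::vector<int>& labels) {
    const int n = static_cast<int>(parent.size());
    std::vector<int> slot(n, 0);
    labels.assign(n, -1);
    int nclasses = 0;

    for (int i = 0; i < n; ++i) {
        int root = i;
        while (parent[root] >= 0) root = parent[root];

        // ~0 == -1, ~1 == -2, ...: the complement of a non-negative label is
        // always negative, so a labelled root never passes this test again.
        if (slot[root] >= 0) slot[root] = ~nclasses++;

        labels[i] = ~slot[root];  // complement again to recover the label
    }
    return nclasses;
}
```

For the parent links {-1, 0, -1, 2, 0}, nodes 0, 1 and 4 share root 0 and nodes 2 and 3 share root 2, so the function returns 2 and fills labels with {0, 0, 1, 1, 0}.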

3. Application

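In object detection, duplicate bounding boxes often have to be removed. This is done with the groupRectangles function, which is a thin wrapper around partition. In groupRectangles, the _EqPredicate that measures the similarity between rectangles is SimilarRects, defined below. Geometrically, it returns true if rectangles r1 and r2 are close to each other in space, and false otherwise. Note, however, that if r1 contains r2, the two are not necessarily judged to be close, so the groupRectangles used with HOG additionally performs a pass that removes contained rectangles.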

class CV_EXPORTS SimilarRects
{
public:
    SimilarRects(double _eps) : eps(_eps) {}
    inline bool operator()(const Rect& r1, const Rect& r2) const
    {
        // delta is eps times the average of the smaller width and the smaller height
        double delta = eps*(std::min(r1.width, r2.width) + std::min(r1.height, r2.height))*0.5;
        // The rectangles are similar if their left, top, right and bottom
        // edges each differ by at most delta
        return std::abs(r1.x - r2.x) <= delta &&
               std::abs(r1.y - r2.y) <= delta &&
               std::abs(r1.x + r1.width - r2.x - r2.width) <= delta &&
               std::abs(r1.y + r1.height - r2.y - r2.height) <= delta;
    }
    double eps;
};
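
To make the containment remark concrete, here is a small sketch of the same similarity test that compiles without OpenCV. RectSketch and similarRectsSketch are hypothetical stand-ins for cv::Rect and SimilarRects::operator(), and eps = 0.2 in the note that follows is only an illustrative value.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical stand-in for cv::Rect so the sketch has no OpenCV dependency.
struct RectSketch {
    int x, y, width, height;
};

// Same comparison as SimilarRects::operator(), written as a free function.
bool similarRectsSketch(const RectSketch& r1, const RectSketch& r2, double eps) {
    // Tolerance: eps times the average of the smaller width and the smaller height.
    double delta = eps * (std::min(r1.width, r2.width) +
                          std::min(r1.height, r2.height)) * 0.5;
    // Left, top, right and bottom edges must each agree within delta.
    return std::abs(r1.x - r2.x) <= delta &&
           std::abs(r1.y - r2.y) <= delta &&
           std::abs(r1.x + r1.width - r2.x - r2.width) <= delta &&
           std::abs(r1.y + r1.height - r2.y - r2.height) <= delta;
}
```

With eps = 0.2, the rectangles a = {100, 100, 50, 50} and b = {104, 102, 50, 50} give delta = 10 and no edge differs by more than 4, so they count as similar. For c = {110, 110, 30, 30}, which lies entirely inside a, delta drops to 6 while the left edges differ by 10, so a and c are not similar even though one contains the other.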
