

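  • 1. Experiment Objectives
  • 2. Close Reading of the Paper
  • 3. Experiment Principles
  • 4. Experiment Environment
  • 5. Program Design and Implementation
  • 6. Experiment Results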




We propose a new method for generating superpixels which is faster than existing methods, more memory efficient, exhibits state-of-the-art boundary adherence, and improves the performance of segmentation algorithms. Simple linear iterative clustering (SLIC) is an adaptation of k-means for superpixel generation, with two important distinctions:
1. The number of distance calculations in the optimization is dramatically reduced by limiting the search space to a region proportional to the superpixel size. This reduces the complexity to be linear in the number of pixels N and independent of the number of superpixels k.
2. A weighted distance measure combines color and spatial proximity while simultaneously providing control over the size and compactness of the superpixels.
SLIC is similar to the approach used as a preprocessing step for depth estimation described in [30], which was not fully explored in the context of superpixel generation.
3 Superpixels
3.1 Algorithm
SLIC is simple to use and understand. By default, the only parameter of the algorithm is k, the desired number of approximately equally sized superpixels. For color images in the CIELAB color space, the clustering procedure begins with an initialization step where k initial cluster centers C_i = [l_i a_i b_i x_i y_i]^T are sampled on a regular grid spaced S pixels apart. To produce roughly equally sized superpixels, the grid interval is S = √(N/k). The centers are moved to seed locations corresponding to the lowest gradient position in a 3 × 3 neighborhood. This is done to avoid centering a superpixel on an edge and to reduce the chance of seeding a superpixel with a noisy pixel.
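As a rough illustration of the initialization step, the sketch below places seeds on a regular grid with interval S = √(N/k). The names `Seed` and `gridSeeds` are hypothetical and appear neither in the paper nor in the program later in this report. Starting the grid at S / 2 is an assumption, chosen so that seeds sit near the middle of each grid cell.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper type: a seed position in (row, col) pixel coordinates.
struct Seed {
    int row;
    int col;
};

// Place seeds on a regular grid with interval S = sqrt(N / k), where N = rows * cols.
// The S / 2 offset is an assumption that keeps seeds away from the image border.
std::vector<Seed> gridSeeds(int rows, int cols, int k) {
    assert(rows > 0 && cols > 0 && k > 0);
    const double N = static_cast<double>(rows) * cols;
    int S = static_cast<int>(std::sqrt(N / k));
    if (S < 1) S = 1;  // guard against k > N
    std::vector<Seed> seeds;
    for (int r = S / 2; r < rows; r += S) {
        for (int c = S / 2; c < cols; c += S) {
            seeds.push_back(Seed{r, c});
        }
    }
    return seeds;
}
```

Calling `gridSeeds(100, 100, 100)` gives S = 10 and a 10 × 10 grid of seeds. In general the number of seeds is only approximately k, because S is rounded down to an integer.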
Next, in the assignment step, each pixel i is associated with the nearest cluster center whose search region overlaps its location, as depicted in Fig. 2. This is the key to speeding up our algorithm because limiting the size of the search region significantly reduces the number of distance calculations, and results in a significant speed advantage over conventional k-means clustering where each pixel must be compared with all cluster centers. This is only possible through the introduction of a distance measure D, which determines the nearest cluster center for each pixel, as discussed in Section 3.2. Since the expected spatial extent of a superpixel is a region of approximate size S × S, the search for similar pixels is done in a region 2S × 2S around the superpixel center.
Once each pixel has been associated to the nearest cluster center, an update step adjusts the cluster centers to be the mean [l a b x y]^T vector of all the pixels belonging to the cluster. The L2 norm is used to compute a residual error E between the new cluster center locations and previous cluster center locations. The assignment and update steps can be repeated iteratively until the error converges, but we have found that 10 iterations suffice for most images, and report all results in this paper using this criterion. Finally, a postprocessing step enforces connectivity by reassigning disjoint pixels to nearby superpixels. The entire algorithm is summarized in Algorithm 1.
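The update step and the residual error E can be sketched as follows. This is a minimal illustration, not the paper's code: the `Feature` struct and `updateCenters` function are hypothetical names, and computing E over the full 5D vector and summing it over clusters is an assumption, since the text only says that an L2 norm between old and new centers is used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 5D feature vector [l a b x y], used for both pixels and centers.
struct Feature {
    double l, a, b, x, y;
};

// Replace each center by the mean of the pixels labelled with it.
// Returns the residual E: the sum over clusters of the L2 distance
// between the previous and the recomputed center (assumed definition).
// Clusters without members keep their previous center.
double updateCenters(const std::vector<Feature>& pixels,
                     const std::vector<int>& labels,
                     std::vector<Feature>& centers) {
    assert(pixels.size() == labels.size());
    const int k = static_cast<int>(centers.size());
    std::vector<Feature> sum(centers.size(), Feature{0.0, 0.0, 0.0, 0.0, 0.0});
    std::vector<int> count(centers.size(), 0);
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        const int c = labels[i];
        if (c < 0 || c >= k) continue;  // skip unassigned or invalid labels
        sum[c].l += pixels[i].l;
        sum[c].a += pixels[i].a;
        sum[c].b += pixels[i].b;
        sum[c].x += pixels[i].x;
        sum[c].y += pixels[i].y;
        ++count[c];
    }
    double residual = 0.0;
    for (int c = 0; c < k; ++c) {
        if (count[c] == 0) continue;
        const Feature old = centers[c];
        const double n = count[c];
        centers[c] = Feature{sum[c].l / n, sum[c].a / n, sum[c].b / n,
                             sum[c].x / n, sum[c].y / n};
        const double dl = centers[c].l - old.l, da = centers[c].a - old.a;
        const double db = centers[c].b - old.b, dx = centers[c].x - old.x;
        const double dy = centers[c].y - old.y;
        residual += std::sqrt(dl * dl + da * da + db * db + dx * dx + dy * dy);
    }
    return residual;
}
```

A caller would alternate assignment and `updateCenters`, stopping either after a fixed number of iterations (10 in the paper) or once the returned residual falls below a chosen threshold.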
Fig. 2. Reducing the superpixel search regions. The complexity of SLIC is linear in the number of pixels in the image, O(N), while the conventional k-means algorithm is O(kNI), where I is the number of iterations. This is achieved by limiting the search space of each cluster center in the assignment step. (a) In the conventional k-means algorithm, distances are computed from each cluster center to every pixel in the image. (b) SLIC only computes distances from each cluster center to pixels within a 2S × 2S region. Note that the expected superpixel size is only S × S, indicated by the smaller square. This approach not only reduces distance computations but also makes SLIC's complexity independent of the number of superpixels.
3.2 Distance Measure
SLIC superpixels correspond to clusters in the labxy color-image plane space. This presents a problem in defining the distance measure D, which may not be immediately obvious. D computes the distance between a pixel i and cluster center C_k in Algorithm 1. A pixel's color is represented in the CIELAB color space [l a b]^T, whose range of possible values is known. The pixel's position [x y]^T, on the other hand, may take a range of values that varies according to the size of the image.
Simply defining D to be the 5D Euclidean distance in labxy space will cause inconsistencies in clustering behavior for different superpixel sizes. For large superpixels, spatial distances outweigh color proximity, giving more relative importance to spatial proximity than color. This produces compact superpixels that do not adhere well to image boundaries. For smaller superpixels, the converse is true.
To combine the two distances into a single measure, it is necessary to normalize color proximity and spatial proximity by their respective maximum distances within a cluster, N_c and N_s. Doing so, D' is written

$$d_c = \sqrt{(l_j - l_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2},$$
$$d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2},$$
$$D' = \sqrt{\left(\frac{d_c}{N_c}\right)^2 + \left(\frac{d_s}{N_s}\right)^2}. \qquad (1)$$

The maximum spatial distance expected within a given cluster should correspond to the sampling interval, N_s = S = √(N/k). Determining the maximum color distance N_c is not so straightforward, as color distances can vary significantly from cluster to cluster and image to image. This problem can be avoided by fixing N_c to a constant m so that (1) becomes

$$D' = \sqrt{\left(\frac{d_c}{m}\right)^2 + \left(\frac{d_s}{S}\right)^2}, \qquad (2)$$

which simplifies to the distance measure we use in practice:

$$D = \sqrt{d_c^2 + \left(\frac{d_s}{S}\right)^2 m^2}. \qquad (3)$$
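The distance measure used in practice, D = sqrt(d_c^2 + (d_s / S)^2 m^2), can be written as a small function. This is only a sketch: the `LabXY` struct and the name `slicDistance` are hypothetical and are not taken from the paper or from the program in this report.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 5D sample: CIELAB color plus image position.
struct LabXY {
    double l, a, b, x, y;
};

// D = sqrt(dc^2 + (ds / S)^2 * m^2), where dc is the CIELAB color distance,
// ds the spatial distance, S the grid interval and m the compactness weight.
double slicDistance(const LabXY& p, const LabXY& c, double S, double m) {
    assert(S > 0.0);
    const double dc2 = (p.l - c.l) * (p.l - c.l) +
                       (p.a - c.a) * (p.a - c.a) +
                       (p.b - c.b) * (p.b - c.b);
    const double ds2 = (p.x - c.x) * (p.x - c.x) +
                       (p.y - c.y) * (p.y - c.y);
    return std::sqrt(dc2 + ds2 / (S * S) * m * m);
}
```

Dividing this value by m gives the normalized form that uses d_c / m and d_s / S, so both forms rank cluster centers identically for m > 0.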
By defining D in this manner, m also allows us to weigh the relative importance between color similarity and spatial proximity. When m is large, spatial proximity is more important and the resulting superpixels are more compact (i.e., they have a lower area to perimeter ratio). When m is small, the resulting superpixels adhere more tightly to image boundaries, but have less regular size and shape. When using the CIELAB color space, m can be in the range [1, 40].
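To make the effect of m concrete, here is a small worked example with made-up numbers (the distances and S = 10 are assumptions chosen for illustration, not values from the paper). A pixel lies close to center A but differs from it in color, and lies farther from center B but matches its color:

```latex
\begin{aligned}
&\text{Center A: } d_c = 10,\ d_s = 2; \qquad \text{Center B: } d_c = 2,\ d_s = 8; \qquad S = 10 \\
m = 1:\quad  & D_A = \sqrt{10^2 + (2/10)^2 \cdot 1^2} = \sqrt{100.04} \approx 10.00, \quad
               D_B = \sqrt{2^2 + (8/10)^2 \cdot 1^2} = \sqrt{4.64} \approx 2.15 \\
m = 40:\quad & D_A = \sqrt{10^2 + (2/10)^2 \cdot 40^2} = \sqrt{164} \approx 12.81, \quad
               D_B = \sqrt{2^2 + (8/10)^2 \cdot 40^2} = \sqrt{1028} \approx 32.06
\end{aligned}
```

With m = 1 the pixel joins B, because color similarity dominates. With m = 40 it joins A, because spatial proximity dominates.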
Equation (3) can be adapted for grayscale images by setting

$$d_c = \sqrt{(l_j - l_i)^2}.$$
It can also be extended to handle 3D supervoxels, as depicted in Fig. 3, by including the depth dimension to the spatial proximity term of (3):

$$d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}. \qquad (4)$$
Fig. 3. SLIC supervoxels computed for a video sequence. (Top) Frames from a short video sequence of a flag waving. (Bottom left) A volume containing the video. The last frame appears at the top of the volume. (Bottom right) A supervoxel segmentation of the video. Supervoxels with orange cluster centers are removed for display purposes.
3.3 Postprocessing
Like some other superpixel algorithms [8], SLIC does not explicitly enforce connectivity. At the end of the clustering procedure, some “orphaned” pixels that do not belong to the same connected component as their cluster center may remain. To correct for this, such pixels are assigned the label of the nearest cluster center using a connected components algorithm.
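One possible way to implement this clean-up is sketched below. It is an assumption-laden illustration rather than the paper's procedure: it finds 4-connected components of the label map and merges any component smaller than a hypothetical `minSize` threshold into a neighbouring region that has already been processed, instead of searching for the nearest cluster center. Larger disconnected pieces are left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// labels is a row-major rows x cols map of cluster indices (all >= 0).
// Components with fewer than minSize pixels (hypothetical threshold) are
// relabelled with the label of an adjacent, already visited region.
void enforceConnectivity(std::vector<int>& labels, int rows, int cols, int minSize) {
    assert(rows > 0 && cols > 0);
    assert(labels.size() == static_cast<std::size_t>(rows) * cols);
    for (int v : labels) assert(v >= 0);  // -1 is reserved as the "unvisited" marker
    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};
    std::vector<int> out(labels.size(), -1);
    std::vector<int> component;  // pixel indices of the current component
    for (int start = 0; start < rows * cols; ++start) {
        if (out[start] != -1) continue;
        const int label = labels[start];
        // Remember the output label of an already visited 4-neighbour, if any.
        int adjacent = -1;
        for (int d = 0; d < 4; ++d) {
            const int r = start / cols + dr[d], c = start % cols + dc[d];
            if (r >= 0 && r < rows && c >= 0 && c < cols && out[r * cols + c] != -1)
                adjacent = out[r * cols + c];
        }
        // Flood-fill the 4-connected component that shares the start pixel's label.
        component.assign(1, start);
        out[start] = label;
        for (std::size_t head = 0; head < component.size(); ++head) {
            const int r0 = component[head] / cols, c0 = component[head] % cols;
            for (int d = 0; d < 4; ++d) {
                const int r = r0 + dr[d], c = c0 + dc[d];
                if (r < 0 || r >= rows || c < 0 || c >= cols) continue;
                const int idx = r * cols + c;
                if (out[idx] == -1 && labels[idx] == label) {
                    out[idx] = label;
                    component.push_back(idx);
                }
            }
        }
        // Absorb fragments that are too small into the neighbouring region.
        if (static_cast<int>(component.size()) < minSize && adjacent != -1) {
            for (std::size_t i = 0; i < component.size(); ++i) out[component[i]] = adjacent;
        }
    }
    labels.swap(out);
}
```

For a 3 × 3 map of zeros with a single 1 in the middle and `minSize = 2`, the isolated center pixel is absorbed into the surrounding region. A fragment that contains the very first pixel in scan order has no visited neighbour and is therefore kept as it is.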
3.4 Complexity
By localizing the search in the clustering procedure, SLIC avoids performing thousands of redundant distance calculations. In practice, a pixel falls in the neighborhood of fewer than eight cluster centers, meaning that SLIC is O(N). In contrast, the trivial upper bound for the classical k-means algorithm is O(k^N), and the practical time complexity is O(NkI), where I is the number of iterations required for convergence. While schemes to reduce the complexity of k-means have been proposed using prime number length sampling, random sampling, local cluster swapping, and by setting lower and upper bounds, these methods are very general in nature. SLIC is specifically tailored to the problem of superpixel clustering. Finally, unlike most superpixel methods and the aforementioned approaches to speed up k-means, the complexity of SLIC is linear in the number of pixels, irrespective of k.
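The linear bound can be checked with a short count, using the 2S × 2S search window and S = √(N/k) from Section 3.1. This is a back-of-the-envelope estimate that ignores image borders and rounding:

```latex
\underbrace{k}_{\text{centers}} \cdot \underbrace{(2S)^2}_{\text{pixels per window}}
  = k \cdot 4S^2 = k \cdot \frac{4N}{k} = 4N
  \quad \text{distance computations per iteration,}
\qquad \text{versus } kN \text{ for conventional k-means.}
```

With illustrative numbers N = 250000 and k = 150, this is about 1.0 × 10^6 distance evaluations per iteration instead of 3.75 × 10^7. On average each pixel is covered by about four windows, which is consistent with the "fewer than eight cluster centers" observation.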



Gradient(i,j) = |dx(i,j)| + |dy(i,j)|;
dx(i,j) = I(i+1,j) - I(i,j);
dy(i,j) = I(i,j+1) - I(i,j);
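The forward-difference gradient above can be sketched for a single-channel image stored in row-major order. The function name `forwardGradient` and the border handling are assumptions made for this illustration. Taking absolute values keeps the result non-negative, so the lowest value really is the flattest spot.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Forward-difference gradient magnitude at (i, j):
// |I(i+1, j) - I(i, j)| + |I(i, j+1) - I(i, j)|.
// On the last row or column the missing neighbour is replaced by the pixel
// itself, so that difference contributes zero (assumed border rule).
double forwardGradient(const std::vector<double>& img, int rows, int cols, int i, int j) {
    assert(i >= 0 && i < rows && j >= 0 && j < cols);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    const int i1 = (i + 1 < rows) ? i + 1 : i;
    const int j1 = (j + 1 < cols) ? j + 1 : j;
    const double center = img[i * cols + j];
    const double dx = img[i1 * cols + j] - center;
    const double dy = img[i * cols + j1] - center;
    return std::fabs(dx) + std::fabs(dy);
}
```

For a color image the same function can be applied per channel and the results summed.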






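#include <cfloat>
#include <cmath>
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

// Constants for the CIE XYZ <-> CIELAB conversion (D65 reference white).
const float param_13 = 1.0f / 3.0f;
const float param_16116 = 16.0f / 116.0f;
const float Xn = 0.950456f;
const float Yn = 1.0f;
const float Zn = 1.088754f;

using namespace std;
using namespace cv;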

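// sRGB gamma expansion: encoded value in [0, 1] -> linear intensity.
float gamma(float x) {
    return x > 0.04045 ? powf((x + 0.055f) / 1.055f, 2.4f) : (x / 12.92);
}

// Inverse of gamma(): linear intensity -> sRGB encoded value.
float gamma_XYZ2RGB(float x) {
    return x > 0.0031308 ? (1.055f * powf(x, (1 / 2.4f)) - 0.055) : (x * 12.92);
}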

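// Convert CIE XYZ to 8-bit sRGB.
void XYZ2RGB(float X, float Y, float Z, int *R, int *G, int *B) {
    float RR, GG, BB;
    RR = 3.2404542f * X - 1.5371385f * Y - 0.4985314f * Z;
    GG = -0.9692660f * X + 1.8760108f * Y + 0.0415560f * Z;
    BB = 0.0556434f * X - 0.2040259f * Y + 1.0572252f * Z;

    RR = gamma_XYZ2RGB(RR);
    GG = gamma_XYZ2RGB(GG);
    BB = gamma_XYZ2RGB(BB);

    // Scale to [0, 255] and round to the nearest integer.
    RR = int(RR * 255.0 + 0.5);
    GG = int(GG * 255.0 + 0.5);
    BB = int(BB * 255.0 + 0.5);

    // Clamp out-of-gamut values so they fit in an 8-bit channel.
    *R = RR < 0.0f ? 0 : (RR > 255.0f ? 255 : (int) RR);
    *G = GG < 0.0f ? 0 : (GG > 255.0f ? 255 : (int) GG);
    *B = BB < 0.0f ? 0 : (BB > 255.0f ? 255 : (int) BB);
}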

// Convert CIELAB to CIE XYZ.
void Lab2XYZ(float L, float a, float b, float *X, float *Y, float *Z) {
    float fX, fY, fZ;

    fY = (L + 16.0f) / 116.0;
    fX = a / 500.0f + fY;
    fZ = fY - b / 200.0f;

    // Each channel uses the cubic branch above the threshold, the linear branch below it.
    if (powf(fY, 3.0) > 0.008856)
        *Y = powf(fY, 3.0);
    else
        *Y = (fY - param_16116) / 7.787f;

    if (powf(fX, 3) > 0.008856)
        *X = fX * fX * fX;
    else
        *X = (fX - param_16116) / 7.787f;

    if (powf(fZ, 3.0) > 0.008856)
        *Z = fZ * fZ * fZ;
    else
        *Z = (fZ - param_16116) / 7.787f;

    // Scale by the reference white.
    (*X) *= (Xn);
    (*Y) *= (Yn);
    (*Z) *= (Zn);
}

// Convert 8-bit sRGB to CIE XYZ.
void RGB2XYZ(int R, int G, int B, float *X, float *Y, float *Z) {
    float RR = gamma((float) R / 255.0f);
    float GG = gamma((float) G / 255.0f);
    float BB = gamma((float) B / 255.0f);

    *X = 0.4124564f * RR + 0.3575761f * GG + 0.1804375f * BB;
    *Y = 0.2126729f * RR + 0.7151522f * GG + 0.0721750f * BB;
    *Z = 0.0193339f * RR + 0.1191920f * GG + 0.9503041f * BB;
}

// Convert CIE XYZ to CIELAB.
void XYZ2Lab(float X, float Y, float Z, float *L, float *a, float *b) {
    float fX, fY, fZ;

    // Normalize by the reference white.
    X /= Xn;
    Y /= Yn;
    Z /= Zn;

    if (Y > 0.008856f)
        fY = pow(Y, param_13);
    else
        fY = 7.787f * Y + param_16116;

    *L = 116.0f * fY - 16.0f;
    *L = *L > 0.0f ? *L : 0.0f;

    if (X > 0.008856f)
        fX = pow(X, param_13);
    else
        fX = 7.787f * X + param_16116;

    if (Z > 0.008856)
        fZ = pow(Z, param_13);
    else
        fZ = 7.787f * Z + param_16116;

    *a = 500.0f * (fX - fY);
    *b = 200.0f * (fY - fZ);
}

// Convert 8-bit sRGB to CIELAB via XYZ.
void RGB2Lab(int R, int G, int B, float *L, float *a, float *b) {
    float X, Y, Z;
    RGB2XYZ(R, G, B, &X, &Y, &Z);
    XYZ2Lab(X, Y, Z, L, a, b);
}

// Convert CIELAB to 8-bit sRGB via XYZ.
void Lab2RGB(float L, float a, float b, int *R, int *G, int *B) {
    float X, Y, Z;
    Lab2XYZ(L, a, b, &X, &Y, &Z);
    XYZ2RGB(X, Y, Z, R, G, B);
}

int main() {
    Mat raw_image = imread("../pic6.jpg");
    if (raw_image.empty()) {
        cout << "read error" << endl;
        return 0;
    }
    vector<vector<vector<float>>> image;  // indexed [row][col][channel], channels are (L, a, b)

    int rows = raw_image.rows;
    int cols = raw_image.cols;
    cout << "rows:" << rows << " cols:" << cols << endl;
    int N = rows * cols;
    int K = 150;  // desired number of superpixels
    cout << "cluster num:" << K << endl;
    int M = 40;  // compactness weight m
    int S = (int) sqrt((double) N / K);  // grid interval
    if (S < 1)
        S = 1;
    cout << "S:" << S << endl;

    // Convert every pixel from BGR (OpenCV channel order) to CIELAB.
    for (int i = 0; i < rows; i++) {
        vector<vector<float>> line;
        for (int j = 0; j < cols; j++) {
            vector<float> pixel;
            float L;
            float a;
            float b;

            RGB2Lab(raw_image.at<Vec3b>(i, j)[2], raw_image.at<Vec3b>(i, j)[1], raw_image.at<Vec3b>(i, j)[0], &L, &a,
                    &b);
            pixel.push_back(L);
            pixel.push_back(a);
            pixel.push_back(b);
            line.push_back(pixel);
        }
        image.push_back(line);
    }
    cout << "RGB2Lab is finished" << endl;

    // Cluster centers, each stored as [row, col, L, a, b]
    vector<vector<float>> Cluster;

    // Sample the initial centers on a regular grid with interval S.
    for (int i = S / 2; i < rows; i += S) {
        for (int j = S / 2; j < cols; j += S) {
            vector<float> c;
            c.push_back((float) i);
            c.push_back((float) j);
            c.push_back(image[i][j][0]);
            c.push_back(image[i][j][1]);
            c.push_back(image[i][j][2]);
            Cluster.push_back(c);
        }
    }
    int cluster_num = Cluster.size();
    cout << "init cluster is finished" << endl;

    // Forward-difference gradient summed over the three Lab channels.
    // Absolute values keep it non-negative, so the lowest value is the flattest spot.
    // Requires r + 1 < rows and c + 1 < cols (image at least 2 x 2).
    auto gradient_at = [&](int r, int c) {
        float g = 0.0f;
        for (int ch = 0; ch < 3; ch++)
            g += fabs(image[r + 1][c][ch] - image[r][c][ch]) + fabs(image[r][c + 1][ch] - image[r][c][ch]);
        return g;
    };

    // Move each center to the lowest-gradient position in its 3 x 3 neighborhood.
    for (int c = 0; c < cluster_num; c++) {
        int c_row = (int) Cluster[c][0];
        int c_col = (int) Cluster[c][1];
        if (c_row + 1 >= rows)
            c_row = rows - 2;
        if (c_col + 1 >= cols)
            c_col = cols - 2;

        float c_gradient = gradient_at(c_row, c_col);

        for (int i = -1; i <= 1; i++) {
            for (int j = -1; j <= 1; j++) {
                int tmp_row = c_row + i;
                int tmp_col = c_col + j;

                // Clamp so that the forward differences stay inside the image.
                if (tmp_row < 0)
                    tmp_row = 0;
                if (tmp_col < 0)
                    tmp_col = 0;
                if (tmp_row + 1 >= rows)
                    tmp_row = rows - 2;
                if (tmp_col + 1 >= cols)
                    tmp_col = cols - 2;

                float tmp_gradient = gradient_at(tmp_row, tmp_col);

                if (tmp_gradient < c_gradient) {
                    Cluster[c][0] = (float) tmp_row;
                    Cluster[c][1] = (float) tmp_col;
                    Cluster[c][2] = image[tmp_row][tmp_col][0];
                    Cluster[c][3] = image[tmp_row][tmp_col][1];
                    Cluster[c][4] = image[tmp_row][tmp_col][2];
                    c_gradient = tmp_gradient;
                }
            }
        }
    }
    cout << "move cluster is finished";

    // Distance matrix, initialized to "infinity" for every pixel.
    // It is not reset between iterations, so a pixel only changes cluster
    // when a strictly closer center appears.
    vector<vector<double>> distance(rows, vector<double>(cols, DBL_MAX));

    // Label matrix, initialized to -1 (unassigned) for every pixel.
    vector<vector<int>> label(rows, vector<int>(cols, -1));

    // pixel[c] lists the (row, col) coordinates currently assigned to cluster c.
    vector<vector<vector<int>>> pixel(Cluster.size());

    for (int t = 0; t < 10; t++) {
        cout << endl << "iteration num:" << t + 1 << "  ";
        int c_num = 0;
        for (int c = 0; c < cluster_num; c++) {
            // Print a progress mark roughly every tenth of the clusters.
            if (c - c_num >= (cluster_num / 10)) {
                cout << "+";
                c_num = c;
            }
            int c_row = (int) Cluster[c][0];
            int c_col = (int) Cluster[c][1];
            float c_L = Cluster[c][2];
            float c_a = Cluster[c][3];
            float c_b = Cluster[c][4];
            // Scan pixels within 2S of the center in each direction (a 4S x 4S window,
            // wider than the 2S x 2S region described in the paper).
            for (int i = c_row - 2 * S; i <= c_row + 2 * S; i++) {
                if (i < 0 || i >= rows)
                    continue;

                for (int j = c_col - 2 * S; j <= c_col + 2 * S; j++) {
                    if (j < 0 || j >= cols)
                        continue;

                    float tmp_L = image[i][j][0];
                    float tmp_a = image[i][j][1];
                    float tmp_b = image[i][j][2];

                    double Dc = sqrt((tmp_L - c_L) * (tmp_L - c_L) + (tmp_a - c_a) * (tmp_a - c_a) +
                                     (tmp_b - c_b) * (tmp_b - c_b));
                    double Ds = sqrt((double) ((i - c_row) * (i - c_row) + (j - c_col) * (j - c_col)));
                    double D = sqrt((Dc / (double) M) * (Dc / (double) M) + (Ds / (double) S) * (Ds / (double) S));

                    if (D < distance[i][j]) {
                        int old_cluster = label[i][j];
                        if (old_cluster != -1) {
                            // Remove the pixel from the cluster it belonged to before.
                            vector<vector<int>>::iterator iter;
                            for (iter = pixel[old_cluster].begin(); iter != pixel[old_cluster].end(); iter++) {
                                if ((*iter)[0] == i && (*iter)[1] == j) {
                                    pixel[old_cluster].erase(iter);
                                    break;
                                }
                            }
                        }
                        label[i][j] = c;
                        pixel[c].push_back({i, j});
                        distance[i][j] = D;
                    }
                }
            }
        }
        cout << " start update cluster";

        // Move each center to the mean position of its member pixels.
        // Note: the color is read from the pixel at the new position, not
        // averaged over the cluster as in the paper.
        for (int c = 0; c < cluster_num; c++) {
            int sum_i = 0;
            int sum_j = 0;
            int number = 0;
            for (size_t p = 0; p < pixel[c].size(); p++) {
                sum_i += pixel[c][p][0];
                sum_j += pixel[c][p][1];
                number++;
            }
            if (number == 0)
                continue;  // empty cluster: keep the previous center

            int tmp_i = (int) ((double) sum_i / (double) number);
            int tmp_j = (int) ((double) sum_j / (double) number);

            Cluster[c][0] = (float) tmp_i;
            Cluster[c][1] = (float) tmp_j;
            Cluster[c][2] = image[tmp_i][tmp_j][0];
            Cluster[c][3] = image[tmp_i][tmp_j][1];
            Cluster[c][4] = image[tmp_i][tmp_j][2];
        }
    }

    // Paint every pixel with the color of its cluster and mark each center in black.
    vector<vector<vector<float>>> out_image = image;  // [row][col][(L, a, b)]
    for (int c = 0; c < cluster_num; c++) {
        for (size_t p = 0; p < pixel[c].size(); p++) {
            out_image[pixel[c][p][0]][pixel[c][p][1]][0] = Cluster[c][2];
            out_image[pixel[c][p][0]][pixel[c][p][1]][1] = Cluster[c][3];
            out_image[pixel[c][p][0]][pixel[c][p][1]][2] = Cluster[c][4];
        }
        out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][0] = 0;
        out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][1] = 0;
        out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][2] = 0;
    }
    cout << endl << "export image mat finished" << endl;

    // Convert the result back to BGR for display.
    Mat print_image = raw_image.clone();
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            float L = out_image[i][j][0];
            float a = out_image[i][j][1];
            float b = out_image[i][j][2];

            int R, G, B;
            Lab2RGB(L, a, b, &R, &G, &B);
            Vec3b vec3b;
            vec3b[0] = B;
            vec3b[1] = G;
            vec3b[2] = R;
            print_image.at<Vec3b>(i, j) = vec3b;
        }
    }

    imshow("print_image", print_image);
    waitKey(0);  // keep the window open until a key is pressed
    return 0;
}

