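Graph-Based Segmentation is a classic image segmentation algorithm. Its author, Felzenszwalb, is also the researcher behind the DPM algorithm. The method is a greedy, graph-based clustering algorithm: it is simple to implement, fairly fast, and reasonably accurate. Few people use it directly for segmentation today, since it dates back to the late 1990s, but many later algorithms use it as a stepping stone. For example, the pioneering object proposal paper "Segmentation as Selective Search for Object Recognition" uses it to produce an oversegmentation, and some semantic segmentation algorithms use it to generate superpixels.
A good segmentation algorithm should generally have the following two properties: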
Capture perceptually important groupings or regions, which often reflect global aspects of the image. Two central issues are to provide precise characterizations of what is perceptually important, and to be able to specify what a given segmentation technique does. We believe that there should be precise definitions of the properties of a resulting segmentation, in order to better understand the method as well as to facilitate the comparison of different approaches.
Be highly efficient, running in time nearly linear in the number of image pixels. In order to be of practical use, we believe that segmentation methods should run at speeds similar to edge detection or other low-level visual processing techniques, meaning nearly linear time and with low constant factors. For example, a segmentation technique that runs at several frames per second can be used in video processing applications
In contrast, the method described in this paper has been used in large-scale image database applications as described in
The segmentation technique developed here both captures certain perceptually important non-local image characteristics and is computationally efficient – running in O(n log n) time for n image pixels and with low constant factors, and can run in practice at video rates.
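The algorithm proposed in the paper segments an image based mainly on two points:
The algorithm applies not only to clustering points in a feature space but also to image segmentation. When it is applied to an image, the edge weight is based on the intensity difference between two pixels; when it is applied to point clustering, the edge weight is the distance between the points.
The image is converted into an undirected graph whose vertices are the pixels of the image. Each pixel is joined by an edge to each of its neighboring pixels, and the edge weight can be a distance, a measure of dissimilarity, and so on.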
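The graph construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_grid_graph`, the 8-connected neighborhood, the row-major vertex numbering, and the absolute intensity difference as weight are all assumptions made for the example.

```python
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, vertex_a, vertex_b)


def build_grid_graph(image: List[List[float]]) -> List[Edge]:
    """Build an undirected grid graph from a 2D grayscale image.

    Assumptions for this sketch: 8-connected neighbors, row-major vertex
    ids, and absolute intensity difference as the edge weight.
    """
    height, width = len(image), len(image[0])
    edges: List[Edge] = []
    # Only look "forward" (right, down-left, down, down-right) so that each
    # undirected edge is created exactly once.
    offsets = [(0, 1), (1, -1), (1, 0), (1, 1)]
    for y in range(height):
        for x in range(width):
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    weight = abs(image[y][x] - image[ny][nx])
                    edges.append((weight, y * width + x, ny * width + nx))
    return edges
```

For a 2x2 image this yields 6 edges (every pair of pixels is adjacent in the 8-connected sense); a 4-connected variant would keep only the `(0, 1)` and `(1, 0)` offsets.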
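The algorithm partitions the graph into components: elements within the same component are similar, while elements in different components are dissimilar. The set S contains all of the components produced by the segmentation.
The algorithm first defines a predicate D that decides whether there is evidence for a boundary between two regions; its definition is shown in the figure on the right.
By comparing the difference within a component (internal difference) with the difference between components, the algorithm arrives at a segmentation that accounts for both global and local characteristics of the image.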
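A threshold function τ(C) = k/|C| is usually introduced as well. It is needed because a component of size 1 has an internal difference of 0, so without the threshold two single-pixel components joined by an edge of nonzero weight could never be merged.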
The larger the value of k, the larger the resulting components tend to be. Note, however, that k is not a minimum component size.
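For reference, the quantities discussed above can be written out in the notation of the paper: Int for the internal difference of a component, Dif for the difference between two components, MInt for the minimum internal difference, τ for the threshold function, and MST(C, E) for the minimum spanning tree of component C.

```latex
\begin{align*}
\mathrm{Int}(C) &= \max_{e \in \mathrm{MST}(C, E)} w(e) \\
\mathrm{Dif}(C_1, C_2) &= \min_{v_i \in C_1,\; v_j \in C_2,\; (v_i, v_j) \in E} w(v_i, v_j) \\
\tau(C) &= \frac{k}{|C|} \\
\mathrm{MInt}(C_1, C_2) &= \min\bigl(\mathrm{Int}(C_1) + \tau(C_1),\; \mathrm{Int}(C_2) + \tau(C_2)\bigr) \\
D(C_1, C_2) &=
  \begin{cases}
    \text{true}  & \text{if } \mathrm{Dif}(C_1, C_2) > \mathrm{MInt}(C_1, C_2) \\
    \text{false} & \text{otherwise}
  \end{cases}
\end{align*}
```

As a small worked case: for two single-pixel components, Int is 0 and τ is k for both, so MInt = k, and the two pixels are merged exactly when the edge between them has weight at most k.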
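Algorithm steps. Step 0: Convert the image into a graph.
Step 1: Sort the edges of the graph by weight in non-decreasing order.
Step 2: Initialize S so that every vertex of the graph is its own component.
Step 3: Go through the sorted edges one by one. For each edge, if its two endpoints lie in different components and the edge weight is no greater than MInt of those two components, merge the two components; otherwise skip the edge.
Step 4: Return S.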
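The steps above can be sketched in a few dozen lines. This is a minimal sketch under stated assumptions, not the authors' reference implementation: the disjoint-set (union-find) structure is one possible way to represent S, and the names `DisjointSet` and `segment_graph` are hypothetical. It takes an edge list of `(weight, u, v)` tuples as input.

```python
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, vertex_a, vertex_b)


class DisjointSet:
    """Union-find that also tracks each component's size and Int(C)."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n  # Int(C); 0 for a single vertex

    def find(self, v: int) -> int:
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, a: int, b: int, weight: float) -> None:
        """Merge two distinct roots a and b joined by an edge of `weight`."""
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        # Edges are processed in non-decreasing order, so the merging edge
        # is the largest edge in the merged component's spanning tree.
        self.internal[a] = weight


def segment_graph(num_vertices: int, edges: List[Edge], k: float) -> List[int]:
    """Return a component label (the root vertex id) for every vertex."""
    ds = DisjointSet(num_vertices)           # every vertex starts alone
    for weight, u, v in sorted(edges):       # non-decreasing weight order
        a, b = ds.find(u), ds.find(v)
        if a == b:
            continue                         # already in the same component
        m_int = min(ds.internal[a] + k / ds.size[a],
                    ds.internal[b] + k / ds.size[b])
        if weight <= m_int:                  # no evidence for a boundary
            ds.union(a, b, weight)
    return [ds.find(v) for v in range(num_vertices)]
```

On a chain of six vertices with edges `(1,0,1), (1,1,2), (10,2,3), (1,3,4), (1,4,5)` and `k = 2`, the heavy middle edge is rejected and two components remain; with `k = 100` the same edge is accepted and everything merges into one component, which illustrates how k sets the scale of observation.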
In Step 3 of the algorithm, when considering edge o_q, if two distinct components are considered and not merged then one of these two components will be in the final segmentation.
Property 1 For any (finite) graph G = (V, E) there exists some segmentation S that is neither too coarse nor too fine.
The segmentation produced by Algorithm 1 does not depend on which non-decreasing weight order of the edges is used
The segmentation S produced by Algorithm 1 is not too coarse according to Definition 2, using the region comparison predicate D defined in >>
We always use a Gaussian with σ = 0.8, which does not produce any visible change to the image but helps remove artifacts.
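The pre-smoothing step can be illustrated with a small separable Gaussian filter in pure Python. This is only a sketch: truncating the kernel at three standard deviations and clamping coordinates at the image border are assumptions of this example, not details stated in the quoted text, and the function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def gaussian_kernel(sigma: float = 0.8) -> List[float]:
    """Normalized 1D Gaussian kernel, truncated at 3 sigma (an assumption)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image: Image, kernel: List[float]) -> Image:
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    result = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for i, w in enumerate(kernel):
                xx = min(max(x + i - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        result.append(new_row)
    return result


def transpose(image: Image) -> Image:
    return [list(column) for column in zip(*image)]


def gaussian_smooth(image: Image, sigma: float = 0.8) -> Image:
    """Separable smoothing: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(horizontal), kernel))
```

In practice an image library would normally do this step; the point here is only that σ = 0.8 gives a very small kernel (7 taps with the truncation assumed above), which is why the smoothing is barely visible.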
For color images we run the algorithm three times, once for each of the red, green and blue color planes, and then intersect these three sets of components
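Intersecting the three per-channel segmentations can be sketched as below, assuming each channel has already been segmented into a flat list of per-pixel labels. The function name `intersect_labelings` is hypothetical, and this simple version groups pixels purely by their label triple without re-checking that the resulting regions are connected in the image.

```python
from typing import Dict, List, Tuple


def intersect_labelings(red: List[int], green: List[int],
                        blue: List[int]) -> List[int]:
    """Two pixels share an output label only if they share a label in all
    three per-channel segmentations."""
    combined: Dict[Tuple[int, int, int], int] = {}
    result: List[int] = []
    for triple in zip(red, green, blue):
        if triple not in combined:
            combined[triple] = len(combined)  # next unused label
        result.append(combined[triple])
    return result
```

For example, with red labels `[0, 0, 0, 3]`, green labels `[0, 0, 2, 2]` and blue labels `[0, 0, 0, 0]`, only the first two pixels agree in every channel, so the result has three components.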
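For a three-channel RGB image, we can alternatively use a distance between the two color vectors directly as the edge weight, in place of the intensity difference.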
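One possible distance for this purpose is the Euclidean distance between the two RGB vectors. The choice of Euclidean distance and the name `color_weight` are assumptions of this sketch; other color distances would work the same way.

```python
import math
from typing import Tuple

Color = Tuple[float, float, float]  # (r, g, b)


def color_weight(a: Color, b: Color) -> float:
    """Edge weight between two pixels as the L2 distance of their colors."""
    return math.dist(a, b)  # math.dist requires Python 3.8+
```

With this weight a single pass over the color image replaces the three per-channel passes.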
k effectively sets a scale of observation, in that a larger k causes a preference for larger components
The output of the algorithm is quite sensitive to the value of k.
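Another approach is to map each pixel to a point in a feature space and then cluster in that feature space. The only difference between the Nearest Neighbor Graph (NNG) and the Grid Graph is the way pixels are mapped into the graph.
With the NNG, each pixel is mapped to a five-dimensional vector (x, y, r, g, b), and the L2 distance is used as the distance measure.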
For the experiments shown here we map each pixel to the feature point (x, y, r, g, b), where (x, y) is the location of the pixel in the image and (r, g, b) is the color value of the pixel. We use the L2 (Euclidean) distance between points as the edge weights, although other distance functions are possible.
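Finding the 10 nearest neighbors of every point exactly is expensive, so the ANN (approximate nearest neighbor) algorithm is used here; it is fairly fast and finds approximate nearest neighbors.
The NNG differs from the Grid Graph described earlier in that the former searches for similar points across the whole feature space, whereas the latter only looks among a pixel's adjacent pixels in the image.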
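The nearest neighbor graph can be sketched as follows. To keep the example self-contained it uses an exact brute-force search, which is quadratic in the number of pixels and only suitable for tiny inputs; it is a stand-in for the ANN library, not a reproduction of it. The function names and the default of 10 neighbors as a parameter are choices made for this illustration.

```python
import math
from typing import List, Sequence, Tuple

Edge = Tuple[float, int, int]  # (weight, vertex_a, vertex_b)
Feature = Tuple[float, ...]    # (x, y, r, g, b)


def pixel_features(image: Sequence[Sequence[Tuple[float, float, float]]]
                   ) -> List[Feature]:
    """Map each pixel of an RGB image to the point (x, y, r, g, b)."""
    return [(float(x), float(y), *map(float, rgb))
            for y, row in enumerate(image)
            for x, rgb in enumerate(row)]


def build_nn_graph(features: List[Feature],
                   num_neighbors: int = 10) -> List[Edge]:
    """Connect every point to its nearest neighbors in feature space.

    Brute-force exact search (an assumption of this sketch); edges are
    deduplicated so each undirected edge appears once.
    """
    edges = set()
    for i, p in enumerate(features):
        distances = sorted((math.dist(p, q), j)
                           for j, q in enumerate(features) if j != i)
        for distance, j in distances[:num_neighbors]:
            edges.add((distance, min(i, j), max(i, j)))
    return sorted(edges)
```

On a one-row image with pixels black, white, black, white and a single neighbor per point, the graph links pixel 0 to pixel 2 and pixel 1 to pixel 3: pixels that are not adjacent in the image become neighbors because their colors match, which cannot happen in the grid graph.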
One of the key differences from the previous section, where the image grid was used to define the graph, is that the nearest neighbors in feature space capture more spatially non-local properties of the image.
Here, points can be far apart in the image and still be among a handful of nearest neighbors (if their color is highly similar and intervening image pixels are of dissimilar color). For instance, this can result in segmentations with regions that are disconnected in the image, which did not happen in the grid-graph case.