Paper Notes: Efficient Graph-Based Image Segmentation

Preface

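Graph-based segmentation is a classic image segmentation algorithm; its author, Felzenszwalb, is also the researcher behind the DPM algorithm. The method is a greedy graph-based clustering algorithm that is simple to implement, fairly fast, and reasonably accurate. It is rarely used on its own for segmentation today, since its roots go back to the late 1990s (the journal version appeared in 2004), but many later algorithms use it as a stepping stone. For example, the pioneering object-proposal paper "Segmentation as Selective Search for Object Recognition" uses it to produce an oversegmentation, and some semantic segmentation algorithms use it to generate superpixels.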

Introduction

A good segmentation algorithm should generally have the following two properties:

  1. The algorithm must capture regions that reflect global aspects of the image, and it should be clear what the algorithm does and why.

Capture perceptually important groupings or regions, which often reflect global aspects of the image. Two central issues are to provide precise characterizations of what is perceptually important, and to be able to specify what a given segmentation technique does. We believe that there should be precise definitions of the properties of a resulting segmentation, in order to better understand the method as well as to facilitate the comparison of different approaches.

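  2. The algorithm must be highly efficient, running in time close to linear in the number of pixels, so that it can be used in real-time applications such as video.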

Be highly efficient, running in time nearly linear in the number of image pixels. In order to be of practical use, we believe that segmentation methods should run at speeds similar to edge detection or other low-level visual processing techniques, meaning nearly linear time and with low constant factors. For example, a segmentation technique that runs at several frames per second can be used in video processing applications

  • The algorithm proposed in the paper is efficient enough to be used on large-scale image databases.

In contrast, the method described in this paper has been used in large-scale image database applications as described in

  • The algorithm runs in roughly O(n log n) time with low constant factors, where n is the number of image pixels.

The segmentation technique developed here both captures certain perceptually important non-local image characteristics and is computationally efficient – running in O(n log n) time for n image pixels and with low constant factors, and can run in practice at video rates.

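  • The basic idea is to represent an image as an undirected graph: each pixel is a vertex, and the weight of an edge is the difference between the two pixels it connects. The algorithm follows a greedy strategy.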

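When segmenting an image, the algorithm relies on two measurements:

  1. the pixel differences along the boundary between two regions;
  2. the differences between neighboring pixels within each region.

If the differences across the boundary are large relative to the differences inside at least one of the regions, the boundary is considered significant.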

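The algorithm applies both to clustering points in a feature space and to image segmentation. For images, edge weights are based on the intensity difference between two pixels; for point clustering, the weight is the distance between the points.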

Graph-Based Segmentation

Algorithm Overview

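  • The image is converted into an undirected graph whose vertices are the image pixels. Each pixel is connected to its adjacent pixels by an edge, and the edge weight can be a distance, a dissimilarity measure, and so on.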

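  • The algorithm partitions the graph into components: elements within a component are similar, and elements in different components are dissimilar. The set S contains all of the resulting components.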
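The graph construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_grid_graph`, the list-of-rows image format, and the choice of 8-connectivity with absolute intensity difference as the weight are assumptions made for the example.

```python
def build_grid_graph(image):
    """Return (num_vertices, edges) for a 2-D grayscale image.

    `image` is a list of rows of intensities. Each edge is a tuple
    (weight, u, v), where u and v are flat pixel indices.
    NOTE: 8-connectivity and |I(p) - I(q)| weights are assumptions here.
    """
    height, width = len(image), len(image[0])
    edges = []
    # These four offsets visit every 8-neighbor pair exactly once.
    offsets = [(0, 1), (1, -1), (1, 0), (1, 1)]
    for y in range(height):
        for x in range(width):
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    weight = abs(image[y][x] - image[ny][nx])
                    edges.append((weight, y * width + x, ny * width + nx))
    return height * width, edges


image = [
    [10, 10, 200],
    [10, 12, 200],
]
num_vertices, edges = build_grid_graph(image)
```

Each pixel becomes one vertex, and edges inside the dark area get small weights while edges crossing into the bright column get large ones, which is what the later merging step exploits.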

The algorithm first defines a predicate D that decides whether there is evidence for a boundary between two regions, that is, whether they should be kept separate.

By comparing the difference within components against the difference between components, the algorithm obtains a segmentation that takes both global and local characteristics into account.
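Since the figures with the formulas did not survive, here are the quantities behind this comparison as they are usually stated for this method. Treat the notation as a reconstruction that may differ slightly from the original figures; τ is the threshold function discussed next.

```latex
\begin{aligned}
\mathrm{Int}(C) &= \max_{e \in \mathrm{MST}(C, E)} w(e) \\
\mathrm{Dif}(C_1, C_2) &= \min_{v_i \in C_1,\ v_j \in C_2,\ (v_i, v_j) \in E} w(v_i, v_j) \\
\mathrm{MInt}(C_1, C_2) &= \min\bigl(\mathrm{Int}(C_1) + \tau(C_1),\ \mathrm{Int}(C_2) + \tau(C_2)\bigr) \\
D(C_1, C_2) &=
\begin{cases}
\text{true} & \text{if } \mathrm{Dif}(C_1, C_2) > \mathrm{MInt}(C_1, C_2) \\
\text{false} & \text{otherwise}
\end{cases}
\end{aligned}
```

Here Int(C) is the largest edge weight in the minimum spanning tree of a component (the within-component difference), and Dif(C1, C2) is the smallest weight of an edge connecting the two components (the between-component difference).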

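A threshold function is also introduced to handle small components: in the extreme case where a component has size 1, its internal difference is 0, which is not a useful estimate of the local variation.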

The larger k is, the larger the resulting components tend to be. Note, however, that k is not a minimum component size.
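A small worked example may make the role of k clearer. It assumes the threshold function has the form τ(C) = k / |C|, as in the paper; the helper names `tau` and `mint` and the sample numbers are made up for illustration.

```python
def tau(size, k):
    """Threshold function tau(C) = k / |C| (assumed form)."""
    return k / size


def mint(int1, size1, int2, size2, k):
    """Minimum internal difference of two components, including the threshold."""
    return min(int1 + tau(size1, k), int2 + tau(size2, k))


k = 300
# Two single-pixel components: Int = 0, so the threshold alone decides.
single = mint(0, 1, 0, 1, k)      # 0 + 300 / 1
# Two 100-pixel components with internal differences 5 and 8.
large = mint(5, 100, 8, 100, k)   # min(5 + 3, 8 + 3)
```

For single pixels the bound is k itself, so almost any edge can merge them; for 100-pixel components the threshold adds only 3, so the measured internal difference dominates. This is why a larger k favors larger components without fixing a minimum size.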

Algorithm


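Algorithm steps:
Step 0: convert the image into a graph.
Step 1: sort the edges by weight in non-decreasing order.
Step 2: initialize the segmentation S so that each vertex is its own component.
Step 3: traverse the sorted edges one by one.
Step 4: for each edge, if its two endpoints lie in different components and the edge weight is no larger than MInt of those two components, merge the two components; otherwise skip the edge.
Step 5: return S.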
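The steps above can be sketched with a union-find structure. This is a minimal sketch rather than the authors' implementation: the function name `segment_graph`, the `(weight, u, v)` edge format, and the threshold k / |C| are assumptions of the example.

```python
def segment_graph(num_vertices, edges, k):
    """Greedy graph-based segmentation; returns one component label per vertex.

    `edges` is an iterable of (weight, u, v) tuples.
    """
    parent = list(range(num_vertices))   # union-find forest
    size = [1] * num_vertices            # |C|, valid at each root
    internal = [0.0] * num_vertices      # Int(C), valid at each root

    def find(v):
        # Find the root of v, shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Process edges in non-decreasing order of weight.
    for weight, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue  # endpoints already in the same component
        bound = min(internal[a] + k / size[a], internal[b] + k / size[b])
        if weight <= bound:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
            # Edges arrive sorted, so this edge is the largest one
            # in the merged component's spanning tree.
            internal[a] = weight
    return [find(v) for v in range(num_vertices)]


# A chain of six vertices with one expensive edge in the middle.
chain = [(1, 0, 1), (1, 1, 2), (50, 2, 3), (2, 3, 4), (2, 4, 5)]
labels_small_k = segment_graph(6, chain, k=3)
labels_large_k = segment_graph(6, chain, k=200)
```

With k = 3 the heavy edge is rejected and the chain splits into two components; with k = 200 the threshold is loose enough that everything merges into one.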

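  • Step 4 has an important property: if an edge is considered and its two components are not merged, then at least one of those two components is already final and will appear unchanged in the result.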

In Step 3 of the algorithm, when considering edge o_q, if two distinct components are considered and not merged then one of these two components will be in the final segmentation

  • For any graph, there exists a segmentation that is neither too coarse nor too fine.

Property 1 For any (finite) graph G = (V, E) there exists some segmentation S that is neither too coarse nor too fine.

  • If several edges have the same weight, the order in which they are processed does not affect the final result.

The segmentation produced by Algorithm 1 does not depend on which non-decreasing weight order of the edges is used

  • The segmentation produced by the algorithm is neither too fine nor too coarse.

The segmentation S produced by Algorithm 1 is not too coarse according to Definition 2, using the region comparison predicate D defined in >>

Results for Grid Graphs

  • Before computing edge weights, the image is usually smoothed with a Gaussian filter with σ = 0.8.

We always use a Gaussian with σ = 0.8, which does not produce any visible change to the image but helps remove artifacts.

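  • For a three-channel color image, the algorithm is run three times, once per channel, and the three segmentations are intersected: two pixels end up in the same component only if they are in the same component in all three runs.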

For color images we run the algorithm three times, once for each of the red, green and blue color planes, and then intersect these three sets of components

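  • Alternatively, for an RGB image the algorithm can be run once, with the intensity difference replaced by a distance between color values as the edge weight.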

  • The segmentation result depends strongly on the value of k.

k effectively sets a scale of observation, in that a larger k causes a preference for larger components
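The per-channel intersection used for color images can be sketched as follows. The function name `intersect_labelings` and the flat label lists are assumptions for illustration, and this sketch only intersects labels; it does not re-check that the resulting components are connected in the image.

```python
def intersect_labelings(labels_r, labels_g, labels_b):
    """Combine three per-channel labelings into one.

    Two pixels share a final label only if they share a label
    in all three input labelings.
    """
    ids = {}
    combined = []
    for triple in zip(labels_r, labels_g, labels_b):
        # Give each distinct (r, g, b) label triple a fresh integer id.
        combined.append(ids.setdefault(triple, len(ids)))
    return combined


# Hypothetical per-channel results for four pixels.
labels_r = [0, 0, 0, 3]
labels_g = [0, 0, 2, 2]
labels_b = [0, 0, 0, 0]
combined = intersect_labelings(labels_r, labels_g, labels_b)
```

Only the first two pixels agree in every channel, so they are the only ones that stay together in the combined result.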

Results for Nearest Neighbor Graphs

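Another approach maps each pixel to a point in a feature space and clusters the points there. The only difference between the nearest neighbor graph (NNG) and the grid graph is how the graph is constructed from the pixels.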

With the NNG, each pixel is mapped to the five-dimensional vector (x, y, r, g, b), and the L2 distance is used as the edge weight.

For the experiments shown here we map each pixel to the feature point (x, y, r, g, b), where (x, y) is the location of the pixel in the image and (r, g, b) is the color value of the pixel. We use the L2(Euclidean) distance between points as the edge weights, although other distance functions are possible

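  • Finding the exact 10 nearest neighbors of every point is expensive, so an approximate nearest neighbor (ANN) algorithm is used; it is fast and returns approximate neighbors.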

  • Unlike the grid graph, which only looks at pixels adjacent in the image, the NNG looks for similar points across the whole feature space.

One of the key differences from the previous section, where the image grid was used to define the graph, is that the nearest neighbors in feature space capture more

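  • One advantage of the NNG is that similar regions need not be adjacent in the image; they can be separated by other regions and still be grouped together.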

Here, points can be far apart in the image and still be among a handful of nearest neighbors (if their color is highly similar and intervening image pixels are of dissimilar color). For instance, this can result segmentations with regions that are disconnected in the image, which did not happen in the grid-graph case

  • Nevertheless, the segmentations produced by the two approaches turn out to be broadly similar.
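The nearest neighbor graph construction can be sketched as below. Note the swap: this example uses a brute-force exact search instead of the approximate nearest neighbor method the notes mention, so it is only suitable for tiny inputs. The function name `build_nn_graph` and the sample pixels are made up for illustration.

```python
import math


def build_nn_graph(pixels, num_neighbors):
    """Connect each pixel to its nearest neighbors in feature space.

    `pixels` is a list of (x, y, r, g, b) tuples. Returns a sorted list of
    (weight, u, v) edges with u < v and no duplicates.
    """
    edges = set()
    for u, p in enumerate(pixels):
        # Brute-force exact search (O(n^2) overall), not the ANN method.
        others = sorted(
            (math.dist(p, q), v) for v, q in enumerate(pixels) if v != u
        )
        for weight, v in others[:num_neighbors]:
            edges.add((weight, min(u, v), max(u, v)))
    return sorted(edges)


pixels = [
    (0, 0, 255, 0, 0),   # red
    (1, 0, 0, 0, 255),   # blue, sits between the two red pixels
    (2, 0, 255, 0, 0),   # red
]
nn_edges = build_nn_graph(pixels, num_neighbors=1)
```

The cheapest edge links the two red pixels even though a blue pixel lies between them in the image, which is how this graph can produce regions that are not spatially connected.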
