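The previous post gave an overview of GrabCut. The GrabCut implementation in OpenCV follows the paper "GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts. I have annotated the source code so that we can understand the algorithm in more depth. I have always felt there is a sizable gap between a paper and its code: reading the paper without the code, I think you can understand about 70% at most; another 20% or more has to come from reading the code; and the remaining 10% depends on each reader's background and knowledge.
I have only worked with this for a short time, so if there are mistakes, corrections are welcome. Thanks. Some brief notes on the original paper are in the previous post: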
http://blog.csdn.net/zouxy09/article/details/8534954
1. Using the grabCut function
The samples folder of the OpenCV source tree contains a usage example for grabCut; see:
opencv\samples\cpp\grabcut.cpp
The API of the grabCut function is as follows:
void cv::grabCut( InputArray _img, InputOutputArray _mask, Rect rect,
InputOutputArray _bgdModel, InputOutputArray _fgdModel,
int iterCount, int mode )
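/*
**** Parameters:
img: the source image to segment; it must be an 8-bit, 3-channel (CV_8UC3) image and is not modified during processing;
mask: the mask image. When initializing from a mask, it holds the initial labels; when running the segmentation, the foreground and background marked by the user can also be stored in the mask before it is passed to grabCut. After processing, the mask holds the result. Each mask element may take only one of these four values:
GC_BGD (=0): background;
GC_FGD (=1): foreground;
GC_PR_BGD (=2): probable background;
GC_PR_FGD (=3): probable foreground.
If no pixel is hand-marked as GC_BGD or GC_FGD, the pixels inside the rectangle end up as GC_PR_BGD or GC_PR_FGD only (with GC_INIT_WITH_RECT, everything outside the rectangle is set to GC_BGD);
rect: the rectangle containing the object to segment; it is used only with GC_INIT_WITH_RECT. Pixels outside it are marked as definite background, so only pixels inside it can be labeled as foreground;
bgdModel: the background model; if it is empty, the function creates it internally. Otherwise it must be a single-channel double-precision (CV_64FC1) matrix with exactly 1 row and 13x5 columns, as checked in the GMM constructor below;
fgdModel: the foreground model; if it is empty, the function creates it internally. Otherwise it must be a single-channel double-precision (CV_64FC1) matrix with exactly 1 row and 13x5 columns;
iterCount: the number of iterations; if it is not greater than 0, the function returns after initialization without running the segmentation;
mode: selects the operation grabCut performs. The possible values are:
GC_INIT_WITH_RECT (=0): initialize GrabCut from the rectangle;
GC_INIT_WITH_MASK (=1): initialize GrabCut from the mask image;
GC_EVAL (=2): run the segmentation.
*/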
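As a small illustration of the four mask labels above, here is a minimal sketch in plain C++ (no OpenCV needed) that collapses a result mask into a binary foreground mask. The enum values mirror the GC_* labels listed above; `MaskLabel`, `isForeground` and `toBinary` are hypothetical names introduced only for this example. Both foreground labels are odd, so testing the lowest bit is enough.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values mirror the GC_* mask labels described above.
enum MaskLabel : std::uint8_t { BGD = 0, FGD = 1, PR_BGD = 2, PR_FGD = 3 };

// Hypothetical helper: true for definite or probable foreground.
inline bool isForeground(std::uint8_t label) { return (label & 1u) != 0; }

// Hypothetical helper: 255 for foreground pixels, 0 for background pixels.
inline std::vector<std::uint8_t> toBinary(const std::vector<std::uint8_t>& mask) {
    std::vector<std::uint8_t> out(mask.size());
    for (std::size_t i = 0; i < mask.size(); ++i)
        out[i] = isForeground(mask[i]) ? 255 : 0;
    return out;
}
```

With a real `cv::Mat` mask the same idea is usually written as a bitwise AND with 1, which is also what the grabcut.cpp sample does when it builds its display mask.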
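2. Reading the GrabCut source code
The source includes gcgraph.hpp, which implements the graph construction and the max-flow/min-cut algorithm. I have not annotated that file yet and will update this post later.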
- #include "precomp.hpp"
- #include "gcgraph.hpp"
- #include <limits>
-
- using namespace cv;
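-
- // GMM: Gaussian mixture model over 3-channel colours with componentsCount (5) components.
- // The parameters live in one 1 x 65 CV_64FC1 row: 5 weights, then 5 means (3 values each),
- // then 5 covariance matrices (9 values each).
-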
- class GMM
- {
- public:
- static const int componentsCount = 5;
-
- GMM( Mat& _model );
- double operator()( const Vec3d color ) const;
- double operator()( int ci, const Vec3d color ) const;
- int whichComponent( const Vec3d color ) const;
-
- void initLearning();
- void addSample( int ci, const Vec3d color );
- void endLearning();
-
- private:
- void calcInverseCovAndDeterm( int ci );
- Mat model;
- double* coefs;
- double* mean;
- double* cov;
-
- double inverseCovs[componentsCount][3][3];
- double covDeterms[componentsCount];
-
- double sums[componentsCount][3];
- double prods[componentsCount][3][3];
- int sampleCounts[componentsCount];
- int totalSampleCount;
- };
-
-
- GMM::GMM( Mat& _model )
- {
-
-
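- // Covariance measures how two random variables vary together: a positive value means
- // they are positively correlated, a negative value means they are negatively correlated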
- const int modelSize = 3 + 9 + 1;
- if( _model.empty() )
- {
-
- _model.create( 1, modelSize*componentsCount, CV_64FC1 );
- _model.setTo(Scalar(0));
- }
- else if( (_model.type() != CV_64FC1) || (_model.rows != 1) || (_model.cols != modelSize*componentsCount) )
- CV_Error( CV_StsBadArg, "_model must have CV_64FC1 type, rows == 1 and cols == 13*componentsCount" );
-
- model = _model;
-
-
-
- coefs = model.ptr<double>(0);
- mean = coefs + componentsCount;
- cov = mean + 3*componentsCount;
-
- for( int ci = 0; ci < componentsCount; ci++ )
- if( coefs[ci] > 0 )
-
-
- calcInverseCovAndDeterm( ci );
- }
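The pointer arithmetic in the constructor implies a flat layout for the 1 x 65 model row: 5 weights first, then 5 means of 3 values each, then 5 covariance matrices of 9 values each. The offsets can be sketched as follows; `kComponents`, `kModelSize` and the three `...Offset` helpers are hypothetical names used only for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kComponents = 5;          // GMM::componentsCount
constexpr std::size_t kModelSize  = 3 + 9 + 1;  // mean + covariance + weight

// Hypothetical helpers: column of the first value belonging to component ci.
constexpr std::size_t coefOffset(std::size_t ci) { return ci; }
constexpr std::size_t meanOffset(std::size_t ci) { return kComponents + 3 * ci; }
constexpr std::size_t covOffset(std::size_t ci)  { return kComponents + 3 * kComponents + 9 * ci; }

// The last covariance matrix ends exactly at column 13 * 5 = 65.
static_assert(covOffset(kComponents - 1) + 9 == kModelSize * kComponents,
              "model row must hold 65 values");
```

Because the GMM object only keeps pointers into the matrix, the learned parameters are written straight into bgdModel and fgdModel, which is why a later call with GC_EVAL can continue from them.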
-
- // Mixture value of a colour: the weighted sum of the component densities.
- double GMM::operator()( const Vec3d color ) const
- {
- double res = 0;
- for( int ci = 0; ci < componentsCount; ci++ )
- res += coefs[ci] * (*this)(ci, color );
- return res;
- }
-
-
-
- double GMM::operator()( int ci, const Vec3d color ) const
- {
- double res = 0;
- if( coefs[ci] > 0 )
- {
- CV_Assert( covDeterms[ci] > std::numeric_limits<double>::epsilon() );
- Vec3d diff = color;
- double* m = mean + 3*ci;
- diff[0] -= m[0]; diff[1] -= m[1]; diff[2] -= m[2];
- double mult = diff[0]*(diff[0]*inverseCovs[ci][0][0] + diff[1]*inverseCovs[ci][1][0] + diff[2]*inverseCovs[ci][2][0])
- + diff[1]*(diff[0]*inverseCovs[ci][0][1] + diff[1]*inverseCovs[ci][1][1] + diff[2]*inverseCovs[ci][2][1])
- + diff[2]*(diff[0]*inverseCovs[ci][0][2] + diff[1]*inverseCovs[ci][1][2] + diff[2]*inverseCovs[ci][2][2]);
- res = 1.0f/sqrt(covDeterms[ci]) * exp(-0.5f*mult);
- }
- return res;
- }
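For reference, the two operators above evaluate the quantities below, written as a sketch in the notation of the code: \pi_k is `coefs[k]`, \mu_k is the component mean and \Sigma_k its covariance. Note that the code leaves out the constant factor (2\pi)^{-3/2} of a 3-D Gaussian density. Since the same factor is missing from both the background and the foreground model, this should not change which cut is minimal.

```latex
p(z \mid k) = \frac{1}{\sqrt{\det \Sigma_k}}
  \exp\!\left(-\tfrac{1}{2}\,(z-\mu_k)^{\top}\,\Sigma_k^{-1}\,(z-\mu_k)\right),
\qquad
p(z) = \sum_{k=1}^{5} \pi_k \, p(z \mid k)
```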
-
-
- int GMM::whichComponent( const Vec3d color ) const
- {
- int k = 0;
- double max = 0;
-
- for( int ci = 0; ci < componentsCount; ci++ )
- {
- double p = (*this)( ci, color );
- if( p > max )
- {
- k = ci;
- max = p;
- }
- }
- return k;
- }
-
-
- void GMM::initLearning()
- {
- for( int ci = 0; ci < componentsCount; ci++)
- {
- sums[ci][0] = sums[ci][1] = sums[ci][2] = 0;
- prods[ci][0][0] = prods[ci][0][1] = prods[ci][0][2] = 0;
- prods[ci][1][0] = prods[ci][1][1] = prods[ci][1][2] = 0;
- prods[ci][2][0] = prods[ci][2][1] = prods[ci][2][2] = 0;
- sampleCounts[ci] = 0;
- }
- totalSampleCount = 0;
- }
-
-
-
-
-
- void GMM::addSample( int ci, const Vec3d color )
- {
- sums[ci][0] += color[0]; sums[ci][1] += color[1]; sums[ci][2] += color[2];
- prods[ci][0][0] += color[0]*color[0]; prods[ci][0][1] += color[0]*color[1]; prods[ci][0][2] += color[0]*color[2];
- prods[ci][1][0] += color[1]*color[0]; prods[ci][1][1] += color[1]*color[1]; prods[ci][1][2] += color[1]*color[2];
- prods[ci][2][0] += color[2]*color[0]; prods[ci][2][1] += color[2]*color[1]; prods[ci][2][2] += color[2]*color[2];
- sampleCounts[ci]++;
- totalSampleCount++;
- }
-
-
-
- void GMM::endLearning()
- {
- const double variance = 0.01;
- for( int ci = 0; ci < componentsCount; ci++ )
- {
- int n = sampleCounts[ci];
- if( n == 0 )
- coefs[ci] = 0;
- else
- {
-
- coefs[ci] = (double)n/totalSampleCount;
-
-
- double* m = mean + 3*ci;
- m[0] = sums[ci][0]/n; m[1] = sums[ci][1]/n; m[2] = sums[ci][2]/n;
-
-
- double* c = cov + 9*ci;
- c[0] = prods[ci][0][0]/n - m[0]*m[0]; c[1] = prods[ci][0][1]/n - m[0]*m[1]; c[2] = prods[ci][0][2]/n - m[0]*m[2];
- c[3] = prods[ci][1][0]/n - m[1]*m[0]; c[4] = prods[ci][1][1]/n - m[1]*m[1]; c[5] = prods[ci][1][2]/n - m[1]*m[2];
- c[6] = prods[ci][2][0]/n - m[2]*m[0]; c[7] = prods[ci][2][1]/n - m[2]*m[1]; c[8] = prods[ci][2][2]/n - m[2]*m[2];
-
-
- double dtrm = c[0]*(c[4]*c[8]-c[5]*c[7]) - c[1]*(c[3]*c[8]-c[5]*c[6]) + c[2]*(c[3]*c[7]-c[4]*c[6]);
- if( dtrm <= std::numeric_limits<double>::epsilon() )
- {
-
-
-
- c[0] += variance;
- c[4] += variance;
- c[8] += variance;
- }
-
-
- calcInverseCovAndDeterm(ci);
- }
- }
- }
-
-
- void GMM::calcInverseCovAndDeterm( int ci )
- {
- if( coefs[ci] > 0 )
- {
-
- double *c = cov + 9*ci;
- double dtrm =
- covDeterms[ci] = c[0]*(c[4]*c[8]-c[5]*c[7]) - c[1]*(c[3]*c[8]-c[5]*c[6])
- + c[2]*(c[3]*c[7]-c[4]*c[6]);
-
-
-
-
-
-
-
- CV_Assert( dtrm > std::numeric_limits<double>::epsilon() );
-
- inverseCovs[ci][0][0] = (c[4]*c[8] - c[5]*c[7]) / dtrm;
- inverseCovs[ci][1][0] = -(c[3]*c[8] - c[5]*c[6]) / dtrm;
- inverseCovs[ci][2][0] = (c[3]*c[7] - c[4]*c[6]) / dtrm;
- inverseCovs[ci][0][1] = -(c[1]*c[8] - c[2]*c[7]) / dtrm;
- inverseCovs[ci][1][1] = (c[0]*c[8] - c[2]*c[6]) / dtrm;
- inverseCovs[ci][2][1] = -(c[0]*c[7] - c[1]*c[6]) / dtrm;
- inverseCovs[ci][0][2] = (c[1]*c[5] - c[2]*c[4]) / dtrm;
- inverseCovs[ci][1][2] = -(c[0]*c[5] - c[2]*c[3]) / dtrm;
- inverseCovs[ci][2][2] = (c[0]*c[4] - c[1]*c[3]) / dtrm;
- }
- }
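-
- // Computes beta, the contrast parameter of the smoothness term:
- // beta = 1 / (2 * average squared colour difference between 8-connected neighbours).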
- static double calcBeta( const Mat& img )
- {
- double beta = 0;
- for( int y = 0; y < img.rows; y++ )
- {
- for( int x = 0; x < img.cols; x++ )
- {
-
-
- Vec3d color = img.at<Vec3b>(y,x);
- if( x>0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y,x-1);
- beta += diff.dot(diff);
- }
- if( y>0 && x>0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x-1);
- beta += diff.dot(diff);
- }
- if( y>0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x);
- beta += diff.dot(diff);
- }
- if( y>0 && x<img.cols-1)
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x+1);
- beta += diff.dot(diff);
- }
- }
- }
- if( beta <= std::numeric_limits<double>::epsilon() )
- beta = 0;
- else
- beta = 1.f / (2 * beta/(4*img.cols*img.rows - 3*img.cols - 3*img.rows + 2) );
-
- return beta;
- }
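The denominator `4*img.cols*img.rows - 3*img.cols - 3*img.rows + 2` in calcBeta is the number of neighbouring pixel pairs in an 8-connected grid, so beta becomes 1 / (2 * mean squared colour difference). The small check below counts the pairs the same way the loop visits them and compares the count with the closed form; `countNeighbourPairs` and `closedForm` are hypothetical names.

```cpp
// Hypothetical helper: count the pairs visited by calcBeta
// (left, up-left, up and up-right neighbour of every pixel).
inline long countNeighbourPairs(int cols, int rows) {
    long count = 0;
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x) {
            if (x > 0) ++count;                  // left
            if (y > 0 && x > 0) ++count;         // up-left
            if (y > 0) ++count;                  // up
            if (y > 0 && x < cols - 1) ++count;  // up-right
        }
    return count;
}

// Closed form used in calcBeta (and, doubled, as edgeCount in constructGCGraph).
inline long closedForm(int cols, int rows) {
    return 4L * cols * rows - 3L * cols - 3L * rows + 2;
}
```

The count splits into (cols-1)*rows horizontal pairs, cols*(rows-1) vertical pairs and 2*(cols-1)*(rows-1) diagonal pairs, which sum to the closed form.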
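-
- // Computes the weights of the edges between neighbouring pixels (left, up-left, up, up-right).
- // Diagonal neighbours are at distance sqrt(2), so their weights use gamma / sqrt(2).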
- static void calcNWeights( const Mat& img, Mat& leftW, Mat& upleftW, Mat& upW,
- Mat& uprightW, double beta, double gamma )
- {
-
-
-
- const double gammaDivSqrt2 = gamma / std::sqrt(2.0f);
-
- leftW.create( img.rows, img.cols, CV_64FC1 );
- upleftW.create( img.rows, img.cols, CV_64FC1 );
- upW.create( img.rows, img.cols, CV_64FC1 );
- uprightW.create( img.rows, img.cols, CV_64FC1 );
- for( int y = 0; y < img.rows; y++ )
- {
- for( int x = 0; x < img.cols; x++ )
- {
- Vec3d color = img.at<Vec3b>(y,x);
- if( x-1>=0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y,x-1);
- leftW.at<double>(y,x) = gamma * exp(-beta*diff.dot(diff));
- }
- else
- leftW.at<double>(y,x) = 0;
- if( x-1>=0 && y-1>=0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x-1);
- upleftW.at<double>(y,x) = gammaDivSqrt2 * exp(-beta*diff.dot(diff));
- }
- else
- upleftW.at<double>(y,x) = 0;
- if( y-1>=0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x);
- upW.at<double>(y,x) = gamma * exp(-beta*diff.dot(diff));
- }
- else
- upW.at<double>(y,x) = 0;
- if( x+1<img.cols && y-1>=0 )
- {
- Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x+1);
- uprightW.at<double>(y,x) = gammaDivSqrt2 * exp(-beta*diff.dot(diff));
- }
- else
- uprightW.at<double>(y,x) = 0;
- }
- }
- }
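In formula form, the weight stored for a pixel m and its neighbour n is the smoothness term below. This is a sketch using the names from the code: \gamma is `gamma` (50 in `grabCut`), \beta is the value returned by `calcBeta`, z is the pixel colour, and dist(m,n) is 1 for left/up neighbours and \sqrt{2} for diagonal ones.

```latex
w(m,n) = \frac{\gamma}{\operatorname{dist}(m,n)}\,
         \exp\!\left(-\beta\,\lVert z_m - z_n\rVert^{2}\right),
\qquad
\beta = \frac{1}{2\,\bigl\langle \lVert z_m - z_n\rVert^{2} \bigr\rangle}
```

Similar neighbours get a weight close to \gamma, so cutting between them is expensive, while a strong colour edge drives the weight towards 0 and makes the cut cheap there.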
-
- // Checks the mask: non-empty, CV_8UC1, same size as img, and every element one of the four GC_* labels.
- static void checkMask( const Mat& img, const Mat& mask )
- {
- if( mask.empty() )
- CV_Error( CV_StsBadArg, "mask is empty" );
- if( mask.type() != CV_8UC1 )
- CV_Error( CV_StsBadArg, "mask must have CV_8UC1 type" );
- if( mask.cols != img.cols || mask.rows != img.rows )
- CV_Error( CV_StsBadArg, "mask must have as many rows and cols as img" );
- for( int y = 0; y < mask.rows; y++ )
- {
- for( int x = 0; x < mask.cols; x++ )
- {
- uchar val = mask.at<uchar>(y,x);
- if( val!=GC_BGD && val!=GC_FGD && val!=GC_PR_BGD && val!=GC_PR_FGD )
- CV_Error( CV_StsBadArg, "mask element value must be equel"
- "GC_BGD or GC_FGD or GC_PR_BGD or GC_PR_FGD" );
- }
- }
- }
-
- // Initializes the mask from the rectangle: GC_BGD everywhere, GC_PR_FGD inside the (clipped) rect.
- static void initMaskWithRect( Mat& mask, Size imgSize, Rect rect )
- {
- mask.create( imgSize, CV_8UC1 );
- mask.setTo( GC_BGD );
-
- rect.x = max(0, rect.x);
- rect.y = max(0, rect.y);
- rect.width = min(rect.width, imgSize.width-rect.x);
- rect.height = min(rect.height, imgSize.height-rect.y);
-
- (mask(rect)).setTo( Scalar(GC_PR_FGD) );
- }
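-
- // Initializes the background and foreground GMMs: k-means splits each sample set into 5 clusters,
- // and each cluster supplies the samples of one Gaussian component.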
- static void initGMMs( const Mat& img, const Mat& mask, GMM& bgdGMM, GMM& fgdGMM )
- {
- const int kMeansItCount = 10;
- const int kMeansType = KMEANS_PP_CENTERS;
-
- Mat bgdLabels, fgdLabels;
- vector<Vec3f> bgdSamples, fgdSamples;
- Point p;
- for( p.y = 0; p.y < img.rows; p.y++ )
- {
- for( p.x = 0; p.x < img.cols; p.x++ )
- {
-
- if( mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD )
- bgdSamples.push_back( (Vec3f)img.at<Vec3b>(p) );
- else
- fgdSamples.push_back( (Vec3f)img.at<Vec3b>(p) );
- }
- }
- CV_Assert( !bgdSamples.empty() && !fgdSamples.empty() );
-
-
-
- Mat _bgdSamples( (int)bgdSamples.size(), 3, CV_32FC1, &bgdSamples[0][0] );
- kmeans( _bgdSamples, GMM::componentsCount, bgdLabels,
- TermCriteria( CV_TERMCRIT_ITER, kMeansItCount, 0.0), 0, kMeansType );
- Mat _fgdSamples( (int)fgdSamples.size(), 3, CV_32FC1, &fgdSamples[0][0] );
- kmeans( _fgdSamples, GMM::componentsCount, fgdLabels,
- TermCriteria( CV_TERMCRIT_ITER, kMeansItCount, 0.0), 0, kMeansType );
-
-
- bgdGMM.initLearning();
- for( int i = 0; i < (int)bgdSamples.size(); i++ )
- bgdGMM.addSample( bgdLabels.at<int>(i,0), bgdSamples[i] );
- bgdGMM.endLearning();
-
- fgdGMM.initLearning();
- for( int i = 0; i < (int)fgdSamples.size(); i++ )
- fgdGMM.addSample( fgdLabels.at<int>(i,0), fgdSamples[i] );
- fgdGMM.endLearning();
- }
-
- // For every pixel, picks the component with the highest density in the GMM selected by its mask label.
- static void assignGMMsComponents( const Mat& img, const Mat& mask, const GMM& bgdGMM,
- const GMM& fgdGMM, Mat& compIdxs )
- {
- Point p;
- for( p.y = 0; p.y < img.rows; p.y++ )
- {
- for( p.x = 0; p.x < img.cols; p.x++ )
- {
- Vec3d color = img.at<Vec3b>(p);
-
- compIdxs.at<int>(p) = mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD ?
- bgdGMM.whichComponent(color) : fgdGMM.whichComponent(color);
- }
- }
- }
-
- // Re-estimates the GMM parameters from the pixels currently assigned to each component.
- static void learnGMMs( const Mat& img, const Mat& mask, const Mat& compIdxs, GMM& bgdGMM, GMM& fgdGMM )
- {
- bgdGMM.initLearning();
- fgdGMM.initLearning();
- Point p;
- for( int ci = 0; ci < GMM::componentsCount; ci++ )
- {
- for( p.y = 0; p.y < img.rows; p.y++ )
- {
- for( p.x = 0; p.x < img.cols; p.x++ )
- {
- if( compIdxs.at<int>(p) == ci )
- {
- if( mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD )
- bgdGMM.addSample( ci, img.at<Vec3b>(p) );
- else
- fgdGMM.addSample( ci, img.at<Vec3b>(p) );
- }
- }
- }
- }
- bgdGMM.endLearning();
- fgdGMM.endLearning();
- }
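-
- // Builds the graph: one vertex per pixel, terminal links from the GMMs or from the hard labels,
- // and neighbour links from the precomputed weights.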
- static void constructGCGraph( const Mat& img, const Mat& mask, const GMM& bgdGMM, const GMM& fgdGMM, double lambda,
- const Mat& leftW, const Mat& upleftW, const Mat& upW, const Mat& uprightW,
- GCGraph<double>& graph )
- {
- int vtxCount = img.cols*img.rows;
- int edgeCount = 2*(4*vtxCount - 3*(img.cols + img.rows) + 2);
-
- graph.create(vtxCount, edgeCount);
- Point p;
- for( p.y = 0; p.y < img.rows; p.y++ )
- {
- for( p.x = 0; p.x < img.cols; p.x++)
- {
-
- int vtxIdx = graph.addVtx();
- Vec3b color = img.at<Vec3b>(p);
-
-
-
-
- double fromSource, toSink;
- if( mask.at<uchar>(p) == GC_PR_BGD || mask.at<uchar>(p) == GC_PR_FGD )
- {
-
- fromSource = -log( bgdGMM(color) );
- toSink = -log( fgdGMM(color) );
- }
- else if( mask.at<uchar>(p) == GC_BGD )
- {
-
- fromSource = 0;
- toSink = lambda;
- }
- else
- {
- fromSource = lambda;
- toSink = 0;
- }
-
- graph.addTermWeights( vtxIdx, fromSource, toSink );
-
-
-
-
- if( p.x>0 )
- {
- double w = leftW.at<double>(p);
- graph.addEdges( vtxIdx, vtxIdx-1, w, w );
- }
- if( p.x>0 && p.y>0 )
- {
- double w = upleftW.at<double>(p);
- graph.addEdges( vtxIdx, vtxIdx-img.cols-1, w, w );
- }
- if( p.y>0 )
- {
- double w = upW.at<double>(p);
- graph.addEdges( vtxIdx, vtxIdx-img.cols, w, w );
- }
- if( p.x<img.cols-1 && p.y>0 )
- {
- double w = uprightW.at<double>(p);
- graph.addEdges( vtxIdx, vtxIdx-img.cols+1, w, w );
- }
- }
- }
- }
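The terminal-link rule in constructGCGraph can be isolated into a few lines. The sketch below mirrors that branch in plain C++; `Label`, `terminalWeights` and its parameters are hypothetical names, and `pBgd` / `pFgd` stand for the values returned by `bgdGMM(color)` and `fgdGMM(color)`.

```cpp
#include <cmath>
#include <utility>

// Values mirror the GC_* mask labels.
enum Label { BGD = 0, FGD = 1, PR_BGD = 2, PR_FGD = 3 };

// Hypothetical helper: returns (fromSource, toSink) for one pixel.
inline std::pair<double, double> terminalWeights(int label, double pBgd,
                                                 double pFgd, double lambda) {
    if (label == PR_BGD || label == PR_FGD)   // undecided: use the colour models
        return std::make_pair(-std::log(pBgd), -std::log(pFgd));
    if (label == BGD)                         // hard background: tie to the sink
        return std::make_pair(0.0, lambda);
    return std::make_pair(lambda, 0.0);       // hard foreground: tie to the source
}
```

The source side is read as foreground (see `inSourceSegment` in estimateSegmentation). A colour that the background model explains well gets a small `fromSource`, so the link to the source is cheap to cut and the pixel tends to land on the sink (background) side. With `lambda = 9*gamma`, a hard label outweighs the eight neighbour links of a pixel, each of which is at most `gamma`.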
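-
- // Runs max-flow and relabels the undecided (GC_PR_*) pixels:
- // the source side becomes GC_PR_FGD, the sink side GC_PR_BGD.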
- static void estimateSegmentation( GCGraph<double>& graph, Mat& mask )
- {
-
- graph.maxFlow();
- Point p;
- for( p.y = 0; p.y < mask.rows; p.y++ )
- {
- for( p.x = 0; p.x < mask.cols; p.x++ )
- {
-
-
- if( mask.at<uchar>(p) == GC_PR_BGD || mask.at<uchar>(p) == GC_PR_FGD )
- {
- if( graph.inSourceSegment( p.y*mask.cols+p.x ) )
- mask.at<uchar>(p) = GC_PR_FGD;
- else
- mask.at<uchar>(p) = GC_PR_BGD;
- }
- }
- }
- }
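-
- // Entry point: initializes the mask and the GMMs according to mode, then repeats
- // component assignment, GMM learning, graph construction and min-cut iterCount times.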
- void cv::grabCut( InputArray _img, InputOutputArray _mask, Rect rect,
- InputOutputArray _bgdModel, InputOutputArray _fgdModel,
- int iterCount, int mode )
- {
- Mat img = _img.getMat();
- Mat& mask = _mask.getMatRef();
- Mat& bgdModel = _bgdModel.getMatRef();
- Mat& fgdModel = _fgdModel.getMatRef();
-
- if( img.empty() )
- CV_Error( CV_StsBadArg, "image is empty" );
- if( img.type() != CV_8UC3 )
- CV_Error( CV_StsBadArg, "image mush have CV_8UC3 type" );
-
- GMM bgdGMM( bgdModel ), fgdGMM( fgdModel );
- Mat compIdxs( img.size(), CV_32SC1 );
-
- if( mode == GC_INIT_WITH_RECT || mode == GC_INIT_WITH_MASK )
- {
- if( mode == GC_INIT_WITH_RECT )
- initMaskWithRect( mask, img.size(), rect );
- else
- checkMask( img, mask );
- initGMMs( img, mask, bgdGMM, fgdGMM );
- }
-
- if( iterCount <= 0)
- return;
-
- if( mode == GC_EVAL )
- checkMask( img, mask );
-
- const double gamma = 50;
- const double lambda = 9*gamma;
- const double beta = calcBeta( img );
-
- Mat leftW, upleftW, upW, uprightW;
- calcNWeights( img, leftW, upleftW, upW, uprightW, beta, gamma );
-
- for( int i = 0; i < iterCount; i++ )
- {
- GCGraph<double> graph;
- assignGMMsComponents( img, mask, bgdGMM, fgdGMM, compIdxs );
- learnGMMs( img, mask, compIdxs, bgdGMM, fgdGMM );
- constructGCGraph(img, mask, bgdGMM, fgdGMM, lambda, leftW, upleftW, upW, uprightW, graph );
- estimateSegmentation( graph, mask );
- }
- }
GrabCut resources:
https://mywebspace.wisc.edu/pwang6/personal/
A very high-quality implementation, although the source code is not released. The user cannot change parameter settings such as the "k" in the GMM. It runs smoothly on both Vista and XP, 32-bit and 64-bit.
http://www.cs.cmu.edu/~mohitg/segmentation.htm
Provides source code in C++ and Matlab. However, the quality is not as good: it can crash at times, and the parameter settings are not straightforward.
http://research.justintalbot.org/papers/GrabCut.zip
Provides C++ source code for a GrabCut implementation. The code is of very high quality and steps through the GMM learning and the graph cutting, so the user can watch the energy and the color modes change.
To change the "k" in the GMM, the user needs to modify the source code.
To compile this code, both OpenCV (http://opencv.willowgarage.com/wiki/) and GLUT (http://www.opengl.org/resources/libraries/glut/) need to be installed.
The source code compiles and runs with both VC and Linux.
The details of this implementation are described in:
http://students.cs.byu.edu/~jtalbot/research/Grabcut.pdf
Have fun!