
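I recently looked into whether OpenCV can use a SoC's built-in GPU to accelerate image processing. The SoC in question is NXP's i.MX6Q, and its GPU is a Vivante GC2000. For a SoC of its generation, the GPU is reasonably well specified:

  1. 1 GPGPU core
  2. 4 shader cores
  3. 32 GFLOPS

However, the GPU only supports the OpenCL standard up to OpenCL 1.1 EP (Embedded Profile).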


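#include <iostream>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

/* Defined: use the Transparent API (UMat). Comment this out to use plain Mat. */
#define USE_T_API   1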

/*
 * ===  FUNCTION  ======================================================================
 *         Name:  example
 *  Description:  Convert to grayscale, blur, run Canny, time the three calls, show results.
 * =====================================================================================
 */
#ifndef USE_T_API
void example( Mat & img )
#else
void example( UMat & img )
#endif
{
#ifndef USE_T_API
    Mat _gray, _canny;
#else
    UMat _gray, _canny;
#endif
    double t = (double)getTickCount();

    cvtColor( img, _gray, COLOR_BGR2GRAY );
    GaussianBlur( _gray, _gray, Size(5, 5), 3, 3 );
    Canny( _gray, _canny, 10, 100, 3, true );

    t = ((double)getTickCount() - t)/getTickFrequency();
    cout << "Times passed in seconds: " << t << endl;

    namedWindow( "Gray", WINDOW_AUTOSIZE );
    namedWindow( "Canny", WINDOW_AUTOSIZE );

    imshow( "Gray", _gray );
    imshow( "Canny", _canny );
}       /* -----  end of function example  ----- */
/*
 * ===  FUNCTION  ======================================================================
 *         Name:  main
 *  Description:  Load the image named on the command line and pass it to example().
 * =====================================================================================
 */
int main ( int argc, char *argv[] )
{
    if( argc < 2 ) return -1;   /* an image path is required */

#ifndef USE_T_API
    Mat _img    =   imread( argv[1], -1 );
#else
    UMat _img;

    imread( argv[1], -1 ).copyTo( _img );
#endif

    if( _img.empty() ) return -1;

    example( _img );

    waitKey( 0 );

    return 0;
}               /* ----------  end of function main  ---------- */
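
The listing above times a single pass through the three calls, so any one-time start-up cost in the first call ends up in the reported figure. A minimal sketch of a timing helper that separates warm-up runs from timed runs could look like the following. The name `averageSeconds` and its parameters are hypothetical and not part of OpenCV. The helper only illustrates the measurement pattern and makes no claim about why the GPU path is slower.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not an OpenCV API): call `work` a few times without
// timing it, then time `timedRuns` further calls and return the average
// duration of one call in seconds.
template <typename Fn>
double averageSeconds(Fn&& work, int warmupRuns, int timedRuns) {
    assert(timedRuns > 0);

    // Warm-up runs absorb one-time costs so they are not counted.
    for (int i = 0; i < warmupRuns; ++i) {
        work();
    }

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timedRuns; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return elapsed.count() / timedRuns;
}
```

In the program above, `work` would be a lambda wrapping the `cvtColor`, `GaussianBlur` and `Canny` calls. With `UMat`, the OpenCL work may still be queued when those calls return, so the lambda would probably also need to wait for completion (for example with `cv::ocl::finish()`) before the clock is read.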


Some tidbits:

 1. OpenCL version should be larger than 1.1 with FULL PROFILE.
 2. Currently there’s only one OpenCL context and command queue. We hope to implement multi device and multi queue support in the future.
 3. Many kernels use 256 as their workgroup size if possible, so the max work group size of the device must be larger than 256. All GPU devices we are aware of do support 256 work-items in a workgroup, but non-GPU devices may not. This will be improved in the future.
 4. ...
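
The first and third points can be turned into a small check against the values a device reports. The sketch below assumes the strings follow the formats OpenCL defines for `CL_DEVICE_VERSION` ("OpenCL <major>.<minor> ...") and `CL_DEVICE_PROFILE` ("FULL_PROFILE" or "EMBEDDED_PROFILE"). The `DeviceInfo` struct and both function names are hypothetical. It also reads "larger than 1.1" as "1.1 or later" and "larger than 256" as "at least 256", which is an interpretation rather than something the list states.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical container for values that would normally be obtained with
// clGetDeviceInfo (CL_DEVICE_VERSION, CL_DEVICE_PROFILE,
// CL_DEVICE_MAX_WORK_GROUP_SIZE).
struct DeviceInfo {
    std::string version;           // e.g. "OpenCL 1.1 <vendor info>"
    std::string profile;           // "FULL_PROFILE" or "EMBEDDED_PROFILE"
    std::size_t maxWorkGroupSize;  // largest work-group the device accepts
};

// Extract <major>.<minor> from "OpenCL <major>.<minor> ...".
// Returns false if the string does not have that shape.
bool parseVersion(const std::string& text, int& major, int& minor) {
    return std::sscanf(text.c_str(), "OpenCL %d.%d", &major, &minor) == 2;
}

// Assumed reading of the list: version 1.1 or later, full profile,
// and a work-group size of at least 256.
bool meetsListedRequirements(const DeviceInfo& dev) {
    int major = 0;
    int minor = 0;
    if (!parseVersion(dev.version, major, minor)) {
        return false;
    }
    const bool versionOk = (major > 1) || (major == 1 && minor >= 1);
    const bool profileOk = (dev.profile == "FULL_PROFILE");
    const bool groupOk = (dev.maxWorkGroupSize >= 256);
    return versionOk && profileOk && groupOk;
}
```

A device that reports version 1.1 with `EMBEDDED_PROFILE`, as the GC2000 does according to this post, fails the profile test whatever its work-group size is.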



Some OpenCV functions can run on the i.MX6 on top of OpenCL 1.1 EP, but they are slower than the same functions running on the CPU. The reason is unknown.
