A Frustrating Experience with OpenCL in OpenCV 3.x

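Today I came across a slide deck, "OpenCV3_CVPR_2015_Speed", and saw the set of numbers below, so I decided to dig into OpenCV's OpenCL support.

[Image 1: slide from OpenCV3_CVPR_2015_Speed with the speedup numbers]

As the slide shows, OpenCL speeds the algorithms up noticeably!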

So I downloaded the OpenCV 3.2 source and built it with VS2013 on Windows 10 (64-bit). The CMake configuration was:

[Image 2: CMake configuration]

Test environment: Windows 10 (64-bit) + AMD GPU + OpenCV 3.2

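Two slides from the same deck give a rough idea of how OpenCV's OpenCL support is meant to be used.

[Image 3: slide with OpenCL usage sample]

[Image 4: slide with OpenCL usage sample]

OK, with the environment set up I was pretty excited: following the sample code above, a small program with a plain version and a CL version should be enough to measure the speedup!

demo1 (runs grayscale conversion, Gaussian blur, and Canny once):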

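// The header names were stripped from the original listing;
// these are the headers the code below needs.
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>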

using namespace cv;
using namespace cv::ocl;
using namespace std;

int main(int argc, char **argv){

	// Test Normal OpenCV
	cv::Mat f = cv::imread("C:\\Users\\KayChan\\Desktop\\kj_color_detect\\4.bmp", 1);
	cv::Mat gray;
	double t = 0.0;
	t = (double)cv::getTickCount();
	
	for (int i = 0; i < 1; i++){

		cv::cvtColor(f, gray, cv::COLOR_BGR2GRAY);
		cv::GaussianBlur(gray, gray, cv::Size(7, 7), 1.5);
		cv::Canny(gray, gray, 0, i);
	}
	t = ((double)cv::getTickCount() - t) / cv::getTickFrequency();
	std::cout << "cpu time:" << t << std::endl;
	
	// Test OpenCL
	cv::UMat uf = cv::imread("C:\\Users\\KayChan\\Desktop\\kj_color_detect\\4.bmp", 1).getUMat(cv::ACCESS_READ);
	cv::UMat ugray;
	t = 0.0;
	t = (double)cv::getTickCount();
	for (int i = 0; i < 1; i++){

		cv::cvtColor(uf, ugray, cv::COLOR_BGR2GRAY);
		cv::GaussianBlur(ugray, ugray, cv::Size(7, 7), 1.5);
		cv::Canny(ugray, ugray, 0, i);
	}
	t = ((double)cv::getTickCount() - t) / cv::getTickFrequency();
	std::cout << "gpu time:" << t << std::endl;

	return 0;
}
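Output:
[Image 5: demo1 output, 1 iteration]

With the loop count in the code above changed to 100, the output is:

[Image 6: demo1 output, 100 iterations]

With the loop count changed to 500, the output is:

[Image 7: demo1 output, 500 iterations]

Here is the problem: something looks off. One CL iteration takes 0.724482 s and 100 iterations take 0.779403 s, yet 500 iterations take 6.68255 s. What is going on? That shouldn't happen; the jump seems far too big!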
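To put a number on that jump, the three timings quoted above can be turned into a cost per extra iteration. The sketch below is plain arithmetic on those figures and nothing more; the helper names `marginalSecondsPerIteration` and `printDemo1Marginals` are made up for illustration and are not part of OpenCV.

```cpp
#include <cassert>
#include <cstdio>

// Average extra time per iteration when going from n0 to n1 iterations.
// t0 and t1 are the total wall-clock times (in seconds) of the two runs.
// Hypothetical helper, not an OpenCV function.
double marginalSecondsPerIteration(int n0, double t0, int n1, double t1) {
    assert(n1 > n0);  // the second run must have more iterations
    return (t1 - t0) / static_cast<double>(n1 - n0);
}

// Prints the marginal cost between the consecutive UMat runs of demo1,
// using the timings reported in the post.
void printDemo1Marginals() {
    const double a = marginalSecondsPerIteration(1, 0.724482, 100, 0.779403);
    const double b = marginalSecondsPerIteration(100, 0.779403, 500, 6.68255);
    std::printf("1 -> 100 iterations:   %.6f s per extra iteration\n", a);
    std::printf("100 -> 500 iterations: %.6f s per extra iteration\n", b);
}
```

Going from 1 to 100 iterations adds roughly 0.00055 s per iteration, while going from 100 to 500 adds roughly 0.0148 s per iteration, more than 25 times as much. The first iteration alone accounts for almost all of the 100-iteration total, so whatever is happening, the per-iteration cost is clearly not constant across these runs.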

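demo2 (tests OpenCV's grayscale template matching):

// The header names were stripped from the original listing;
// these are the headers the code below needs.
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>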

using namespace cv;
using namespace cv::ocl;
using namespace std;

void runMatchGrayUseCpu(int method);
void runMatchGrayUseGpu(int method);

int main(int argc, char **argv){
	
	int method = CV_TM_SQDIFF_NORMED;

	runMatchGrayUseCpu(method);
	runMatchGrayUseGpu(method);
	
	return 0;
}

void runMatchGrayUseCpu(int method){

	double t = 0.0;

	t = (double)cv::getTickCount();
	// 1.get src image
	cv::Mat src = cv::imread("C:\\Users\\KayChan\\Desktop\\kj_color_detect\\11.bmp", 1);

	// 2.get template image
	cv::Mat tmp = cv::imread("C:\\Users\\KayChan\\Desktop\\testimage\\tmp.png", 1);

	// 3.gray image (imread returns BGR data, so convert with COLOR_BGR2GRAY)
	cv::Mat gray_src, gray_tmp;
	if (src.channels() == 1) gray_src = src;
	else cv::cvtColor(src, gray_src, cv::COLOR_BGR2GRAY);
	if (tmp.channels() == 1) gray_tmp = tmp;
	else cv::cvtColor(tmp, gray_tmp, cv::COLOR_BGR2GRAY);

	// 4.match (the cv::Mat constructor takes rows first, then cols)
	int result_cols = gray_src.cols - gray_tmp.cols + 1;
	int result_rows = gray_src.rows - gray_tmp.rows + 1;
	cv::Mat result = cv::Mat(result_rows, result_cols, CV_32FC1);
	cv::matchTemplate(gray_src, gray_tmp, result, method);

	cv::Point point;
	double minVal, maxVal;
	cv::Point minLoc, maxLoc;
	cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc, cv::Mat());
	switch (method){

	case CV_TM_SQDIFF:
		point = minLoc;
		break;
	case CV_TM_SQDIFF_NORMED:
		point = minLoc;
		break;
	case CV_TM_CCORR:
	case CV_TM_CCOEFF:
		point = maxLoc;
		break;
	case CV_TM_CCORR_NORMED:
	case CV_TM_CCOEFF_NORMED:
	default:
		point = maxLoc;
		break;
	}
	t = ((double)cv::getTickCount() - t) / cv::getTickFrequency();

	std::cout << "======Test Match Template Use CPU======" << std::endl;
	std::cout << "CPU time :" << t << " second" << std::endl;
	std::cout << "obj.x :" << point.x << " obj.y :" << point.y << std::endl;
	std::cout << " " << std::endl;
}

void runMatchGrayUseGpu(int method){

	double t = 0.0;

	t = (double)cv::getTickCount();
	// 1.get src image
	cv::UMat src = cv::imread("C:\\Users\\KayChan\\Desktop\\kj_color_detect\\11.bmp", 1).getUMat(cv::ACCESS_RW);

	// 2.get template image
	cv::UMat tmp = cv::imread("C:\\Users\\KayChan\\Desktop\\testimage\\tmp.png", 1).getUMat(cv::ACCESS_RW);

	// 3.gray image (imread returns BGR data, so convert with COLOR_BGR2GRAY)
	cv::UMat gray_src, gray_tmp;
	if (src.channels() == 1) gray_src = src;
	else cv::cvtColor(src, gray_src, cv::COLOR_BGR2GRAY);
	if (tmp.channels() == 1) gray_tmp = tmp;
	else cv::cvtColor(tmp, gray_tmp, cv::COLOR_BGR2GRAY);

	// 4.match (the cv::UMat constructor takes rows first, then cols)
	int result_cols = gray_src.cols - gray_tmp.cols + 1;
	int result_rows = gray_src.rows - gray_tmp.rows + 1;
	cv::UMat result = cv::UMat(result_rows, result_cols, CV_32FC1);
	cv::matchTemplate(gray_src, gray_tmp, result, method);

	cv::Point point;
	double minVal, maxVal;
	cv::Point minLoc, maxLoc;
	cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc, cv::Mat());
	switch (method){

	case CV_TM_SQDIFF:
		point = minLoc;
		break;
	case CV_TM_SQDIFF_NORMED:
		point = minLoc;
		break;
	case CV_TM_CCORR:
	case CV_TM_CCOEFF:
		point = maxLoc;
		break;
	case CV_TM_CCORR_NORMED:
	case CV_TM_CCOEFF_NORMED:
	default:
		point = maxLoc;
		break;
	}
	t = ((double)cv::getTickCount() - t) / cv::getTickFrequency();


	std::cout << "======Test Match Template Use OpenCL======" << std::endl;
	std::cout << "OpenCL time :" << t << " second" << std::endl;
	std::cout << "obj.x :" << point.x << " obj.y :" << point.y << std::endl;
	std::cout << " " << std::endl;
}
Output:

[Image 8: demo2 output, CPU vs. OpenCL]

All I can say is WTF: it is actually slower than the CPU version, and not even in the same order of magnitude.
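A side note on step 4 of demo2, where the score map is sized by hand: `matchTemplate` produces one score per valid top-left placement of the template, and `cv::Mat` / `cv::UMat` constructors take rows before columns. The sketch below spells out that arithmetic in plain C++. `Size2D` and `matchResultSize` are hypothetical names introduced here for illustration; they are not OpenCV types or functions.

```cpp
#include <cassert>

// Plain stand-in for an image size. Hypothetical helper type,
// not an OpenCV class.
struct Size2D {
    int rows;
    int cols;
};

// Size of the matchTemplate score map for a source image and a template:
// one score per position where the template fits entirely inside the source.
Size2D matchResultSize(Size2D src, Size2D tmpl) {
    assert(tmpl.rows <= src.rows && tmpl.cols <= src.cols);
    return Size2D{src.rows - tmpl.rows + 1, src.cols - tmpl.cols + 1};
}
```

For a 480-row by 640-column source and a 100-row by 50-column template this gives 381 rows by 591 columns, which would be passed to the constructor in that order. As far as I know, `matchTemplate` reallocates its output when the size or type does not match, so the pre-allocation should not affect the timings either way.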

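Then I thought maybe OCL simply wasn't enabled. According to the docs, UMat shouldn't need to be switched on explicitly the way 2.x required, but I enabled it 2.x-style anyway by adding a bit of code: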

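// The header names were stripped from the original listing;
// these are the headers the code below needs.
#include <iostream>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>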

using namespace cv;
using namespace cv::ocl;
using namespace std;

void runMatchGrayUseCpu(int method);
void runMatchGrayUseGpu(int method);

int main(int argc, char **argv){

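	// Newly added
	// launch OpenCL environment...
	std::vector<cv::ocl::PlatformInfo> plats;
	cv::ocl::getPlatfomsInfo(plats);
	const cv::ocl::PlatformInfo *platform = &plats[0];
	cout << "Platform name: " << platform->name().c_str() << endl;
	cv::ocl::Device current_device;
	platform->getDevice(current_device, 0);
	cout << "Device name: " << ...
[The rest of this listing is cut off in the source.]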
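Output:

[Image 9: demo2 output after adding the OpenCL setup code]

Damn, look at that: it sped up the Mat version nearly 10x (from over 1 second to 0.1 second) and did nothing for the UMat version.

What a pain. I searched online and found a lady who had run into the same problem:

[Image 10: forum post describing the same problem]

I'd like to hear what the experts out there think. Since the official claim is that UMat supports OpenCL, it shouldn't be this bad; more likely I'm doing something wrong somewhere.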
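One way to narrow this down is to time the first call separately from the calls that follow, since demo2 measures everything (image loading, upload, and a single match) in one lump. The sketch below is a generic stdlib-only harness for that. `SplitTiming` and `timeFirstAndSteady` are hypothetical names, and the idea that one-time OpenCL setup is what hurts demo2 is only an assumption here, not something the measurements above establish.

```cpp
#include <cassert>
#include <chrono>

// Timing split into the first call and the average of the following calls.
// Hypothetical helper type for illustration.
struct SplitTiming {
    double firstSeconds;
    double steadySecondsPerCall;
};

// Runs work() once on its own, then `repeats` more times, timing both parts.
template <typename Work>
SplitTiming timeFirstAndSteady(Work work, int repeats) {
    assert(repeats > 0);
    using Clock = std::chrono::steady_clock;

    const auto t0 = Clock::now();
    work();  // first call: may include one-time setup cost
    const auto t1 = Clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto t2 = Clock::now();

    const std::chrono::duration<double> first = t1 - t0;
    const std::chrono::duration<double> rest = t2 - t1;
    return SplitTiming{first.count(), rest.count() / repeats};
}
```

In demo2 the callable would wrap just the `cv::matchTemplate` and `cv::minMaxLoc` pair, with the images loaded beforehand. Keeping `minMaxLoc` inside the timed work matters because it returns values to the host, so the measurement cannot end before the result actually exists. If the steady-state figure for UMat is still far above the Mat figure, then first-call overhead is not the explanation.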

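That said, this evening I found a few possible causes. I'm not sure yet whether they are really responsible and am still thinking it over; I'll share once I've figured out what's going on.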



