Understanding the parallelism of OpenCL work-items

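I have been reading OpenCL programs recently and did not really understand how work-items execute. So I wrote a few small programs to look at it directly, borrowing the approach I use to test OpenMP code: print each work-item together with the data it processed. I found this very helpful for understanding the execution model. The programs follow: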

Host program: main.cpp

/*
   Project: matrix multiplication with OpenCL
   Author:  Liu Rong (刘荣)
   Date:    2012.11.20
*/
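// Header names were lost from the original listing; the ones below are
// an assumption covering what the code shown here requires.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <CL/cl.h>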
using namespace std;
// Kernel source loader
std::string
convertToString(const char *filename) // Reads the kernel source (the parallel function you wrote) into a string
{
    size_t size;
    char*  str;
    std::string s;

    std::fstream f(filename, (std::fstream::in | std::fstream::binary));

    if(f.is_open())
    {
        size_t fileSize;
        f.seekg(0, std::fstream::end);
        size = fileSize = (size_t)f.tellg();
        f.seekg(0, std::fstream::beg);

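        str = new char[size+1];
        if(!str) // defensive only: plain new throws std::bad_alloc on failure
        {
            f.close();
            std::cout << "Memory allocation failed";
            return std::string(); // building a std::string from NULL is undefined behavior
        }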

        f.read(str, fileSize);
        f.close();
        str[size] = '\0';
    
        s = str;
        delete[] str;
        return s;
    }
    else
    {
        std::cout << "\nFile containing the kernel code(\".cl\") not found. Please copy the required file in the folder containing the executable.\n";
        exit(1);
    }
    return std::string(); // not reached; avoids constructing a std::string from NULL
}

int main()
{
	double start,end,time1,time2;
	// Query the platform
	cl_int ciErrNum;
	cl_platform_id platform;
	ciErrNum = clGetPlatformIDs(1, &platform, NULL);
	if(ciErrNum != CL_SUCCESS)
	{
		cout<<"Failed to get platform"<
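
The `convertToString` helper above only needs to return the whole `.cl` file as one string. As a minimal sketch of the same idea without manual buffer management, the function below reads the file through stream iterators. The name `readKernelSource` and the choice to throw instead of calling `exit(1)` are assumptions made for illustration; they are not part of the original program.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical alternative to convertToString: returns the full contents of
// the kernel file, or throws std::runtime_error if it cannot be opened.
std::string readKernelSource(const std::string& filename)
{
    std::ifstream f(filename.c_str(), std::ios::in | std::ios::binary);
    if (!f.is_open()) {
        throw std::runtime_error("Kernel file not found: " + filename);
    }
    // Copy every byte from the stream into the string; no new[]/delete[] needed.
    return std::string(std::istreambuf_iterator<char>(f),
                       std::istreambuf_iterator<char>());
}
```

The returned string can then be handed to the OpenCL program-creation step in the same way as the result of `convertToString`.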

Kernel function: simpleMultiply.cl

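// Enter your kernel in this window
__kernel
void vecadd(__global float* B, __global float* C)
{
    // id is this work-item's global index in dimension 0.
    int id = get_global_id(0);
    // barrier(CLK_LOCAL_MEM_FENCE);
    B[id] = id;
    // Each work-item also fills its own two consecutive slots of C.
    for (int i = 0; i < 2; i++)
    {
        C[id*2 + i] = i;
    }
    // barrier(CLK_LOCAL_MEM_FENCE);
}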


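Results:

From the results above, we can see that each work-item runs independently: it writes only B[id] and its own two elements C[id*2] and C[id*2+1], so no work-item touches data that belongs to another.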
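
The indexing behind this observation can be checked on the host without an OpenCL device. The sketch below is an emulation, not the original host program: `vecadd_work_item`, `EmulationResult`, and `emulate_ndrange` are hypothetical names, and the work-items are visited one at a time in reverse order simply to show that the outcome does not depend on the order in which they run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side stand-in for the vecadd kernel body. `id` plays the role of
// get_global_id(0); nothing here calls the OpenCL runtime.
void vecadd_work_item(std::size_t id, std::vector<float>& B, std::vector<float>& C)
{
    assert(id < B.size() && id * 2 + 1 < C.size());
    B[id] = static_cast<float>(id);
    for (int i = 0; i < 2; ++i) {
        C[id * 2 + i] = static_cast<float>(i);
    }
}

struct EmulationResult {
    std::vector<float> B;
    std::vector<float> C;
    std::vector<std::size_t> ownerOfC; // work-item id that wrote each C slot
};

// Runs `globalSize` emulated work-items and records who wrote which C slot.
EmulationResult emulate_ndrange(std::size_t globalSize)
{
    EmulationResult r;
    r.B.assign(globalSize, -1.0f);
    r.C.assign(globalSize * 2, -1.0f);
    r.ownerOfC.assign(globalSize * 2, globalSize); // globalSize means "unwritten"

    // Reverse order on purpose: each item writes disjoint slots, so any order works.
    for (std::size_t id = globalSize; id-- > 0;) {
        vecadd_work_item(id, r.B, r.C);
        r.ownerOfC[id * 2] = id;
        r.ownerOfC[id * 2 + 1] = id;
    }
    return r;
}
```

With `emulate_ndrange(4)`, B holds 0 1 2 3, C holds 0 1 0 1 0 1 0 1, and `ownerOfC` holds 0 0 1 1 2 2 3 3: every slot has exactly one writer, which is why the kernel needs no barrier.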
