Configuring a CUDA Environment on Windows Vista 64-bit with VS2008

 

1. Software preparation

1.1   cudadriver_2.3_winvista_64_190.38_general

1.2   cudatoolkit_2.3_win_64

1.3   cudasdk_2.3_win_64

1.4   VS2008

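Before installing, uninstall any previously installed SDK, toolkit, and driver, then install the packages above one after another. If the development machine has no CUDA-capable graphics card, cudadriver_2.3_winvista_64_190.38_general does not need to be installed.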

 

2. Installation check

2.1 Run nvcc -V in cmd to see the current version number

       nvcc: NVIDIA (R) Cuda compiler driver

       Copyright (c) 2005-2009 NVIDIA Corporation

       Built on Mon_Aug__3_19:43:55_PDT_2009

       Cuda compilation tools, release 2.3, V0.2.1221
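If this check needs to be scripted, the release number can be pulled out of the text with a regular expression. The following is only a minimal sketch: the function name `parse_nvcc_release` is made up for illustration, and it assumes the output keeps the "release X.Y" wording shown above.

```python
import re

def parse_nvcc_release(output: str):
    """Return the toolkit release (e.g. "2.3") found in `nvcc -V` output, or None."""
    # Assumption: the version appears as "release <digits>.<digits>" as in the sample.
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None

# Sample text copied from the nvcc -V output shown above.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2009 NVIDIA Corporation
Built on Mon_Aug__3_19:43:55_PDT_2009
Cuda compilation tools, release 2.3, V0.2.1221
"""

release = parse_nvcc_release(sample)
print(release)
```

On a machine with the toolkit installed, the sample string could be replaced by the live output, for example `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout`.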

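2.2 Run bandwidthTest to check that the configuration works

       Change to the directory /ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/bin/win64/Release and run

       ./bandwidthTest.exe --memory=pinned --mode=range --start=10240000 --end=10240000 --increment=10240000

       If everything is working, output similar to the following appears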

Running on......                                                                                   

      device 0:Quadro FX 580                                                            

Range Mode                                                                                   

Host to Device Bandwidth for Pinned memory                                                             

Transfer Size (Bytes)   Bandwidth(MB/s)                                                            

 10240000               5101.1                                                            

Range Mode                                                                                    

Device to Host Bandwidth for Pinned memory                                                            

Transfer Size (Bytes)   Bandwidth(MB/s)                                                             

 10240000               4650.8                                                            

Range Mode                                                                                   

Device to Device Bandwidth                                                             

Transfer Size (Bytes)   Bandwidth(MB/s)                                                            

 10240000               14812.5                                                            

&&&& Test PASSED                                                              

Press ENTER to exit...                                                            
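To compare these numbers across machines or driver versions, the three bandwidth figures can be collected from the text. This is a rough sketch that assumes the output layout stays exactly as shown above; `parse_bandwidth` is a hypothetical helper, not part of the SDK.

```python
import re

def parse_bandwidth(output: str):
    """Map each transfer direction to its measured bandwidth in MB/s."""
    results = {}
    direction = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Transfer Size"):
            continue  # column header row, carries no data
        header = re.match(r"(.+?) Bandwidth", line)
        if header:
            direction = header.group(1)  # e.g. "Host to Device"
            continue
        row = re.match(r"(\d+)\s+([\d.]+)$", line)
        if row and direction:
            results[direction] = float(row.group(2))
    return results

# Sample text copied from the bandwidthTest output shown above.
sample = """Running on......
      device 0:Quadro FX 580
Range Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 10240000               5101.1
Range Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 10240000               4650.8
Range Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
 10240000               14812.5
&&&& Test PASSED
"""

results = parse_bandwidth(sample)
passed = "Test PASSED" in sample
print(results, passed)
```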

2.3 Run deviceQuery.exe to see the exact graphics card model

       ./deviceQuery.exe

       If everything is working, output similar to the following appears

CUDA Device Query (Runtime API) version (CUDART static linking)                       

There is 1 device supporting CUDA                                                     

Device 0: "Quadro FX 580"                                                           

  CUDA Driver Version:                           2.30                                  

  CUDA Runtime Version:                          2.30                             

  CUDA Capability Major revision number:         1                                            

  CUDA Capability Minor revision number:         1                                            

  Total amount of global memory:                 536870912 bytes                             

  Number of multiprocessors:                     4                                       

  Number of cores:                               32                             

  Total amount of constant memory:               65536 bytes                             

  Total amount of shared memory per block:       16384 bytes                             

  Total number of registers available per block: 8192                                         

  Warp size:                                     32                             

  Maximum number of threads per block:           512                                     

  Maximum sizes of each dimension of a block:    512 x 512 x 64                             

  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1                             

  Maximum memory pitch:                          262144 bytes                        

  Texture alignment:                             256 bytes                             

  Clock rate:                                    1.13 GHz                             

  Concurrent copy and execution:                 Yes                                      

  Run time limit on kernels:                     No                                     

  Integrated:                                    No                             

  Support host page-locked memory mapping:       No                                       

  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

Test PASSED                                                                          

Press ENTER to exit...                                                                    

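       From this information, the card's peak single-precision floating-point performance can be estimated as 3*32*1.13=108.48 GFLOPS, where 32 is the number of cores, 1.13 is the clock rate in GHz, and 3 is presumably the number of floating-point operations each core can issue per clock cycle.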
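The same estimate can be written as a small helper so it is easy to redo for another card. This is only a sketch: the factor of 3 floating-point operations per core per cycle is taken from the formula above (it would correspond to one multiply-add plus one multiply per cycle, which is an assumption about this GPU generation), and `peak_gflops` is a name invented here.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_core_per_cycle: int = 3) -> float:
    """Theoretical peak single-precision throughput in GFLOPS."""
    # Assumption: every core sustains `flops_per_core_per_cycle` operations each cycle.
    return flops_per_core_per_cycle * cores * clock_ghz

# Values read from the deviceQuery output: 32 cores, 1.13 GHz clock.
peak = peak_gflops(cores=32, clock_ghz=1.13)
print(f"{peak:.2f} GFLOPS")  # prints "108.48 GFLOPS"
```

This is a theoretical upper bound; real kernels will normally reach only part of it.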

 

3. Setting system environment variables

3.1 Add the installed CUDA SDK paths to the system environment variables:

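For example, under C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/bin/win64, the directories

├─Debug

├─EmuDebug

└─EmuRelease

should all be added to the system PATH environment variable, so that the required DLLs can be found when programs run.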
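A quick way to confirm the PATH change took effect is to compare the required directories against the PATH string. The sketch below uses a hypothetical helper `missing_from_path` and a made-up sample PATH value; it only compares strings and does not touch the system settings.

```python
def missing_from_path(required_dirs, path_value, sep=";"):
    """Return the required directories that are not listed in a PATH string."""
    def norm(p):
        # Compare case-insensitively and treat / and \ as the same separator.
        return p.strip().rstrip("/\\").replace("/", "\\").lower()
    present = {norm(p) for p in path_value.split(sep) if p.strip()}
    return [d for d in required_dirs if norm(d) not in present]

base = "C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/bin/win64"
required = [f"{base}/{name}" for name in ("Debug", "EmuDebug", "EmuRelease")]

# Hypothetical PATH value in which only the Debug directory has been added so far.
sample_path = "C:\\Windows\\system32;" + required[0].replace("/", "\\")

missing = missing_from_path(required, sample_path)
print(missing)
```

On the real machine, `os.environ.get("PATH", "")` could be passed in place of `sample_path`; an empty result would mean all three directories are present.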

3.2 Make the header files needed for compilation available to VS2008

Copy the directory C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/common to C:/Users/dawning/Documents/Visual Studio 2008

 

4. Creating a simple CUDA project in VS2008

4.1 Copy the template project C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK/C/src/template to the VS2008 projects directory C:/Users/dawning/Documents/Visual Studio 2008/Projects

4.2 Open VS2008 and open the template project template_vc90

4.3 Right-click template.cu and choose the custom build options
