转载请注明:http://blog.csdn.net/stdcoutzyx/article/details/39677727
首先,我装的系统是Ubuntu14.04.1。
按照参考链接1中所示,检查系统。
执行命令:
:~$ lspci | grep -i nvidia 03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1) 04:00.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro K4000] (rev a1) 04:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)
发现有K20和K4000两块GPU,还有一块Audio的应该是声卡。
然后,执行命令检查系统版本:
~$ uname -m && cat /etc/*release x86_64 DISTRIB_ID=Ubuntu DISTRIB_RELEASE=14.04 DISTRIB_CODENAME=trusty DISTRIB_DESCRIPTION="Ubuntu 14.04.1 LTS" NAME="Ubuntu" VERSION="14.04.1 LTS, Trusty Tahr" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 14.04.1 LTS" VERSION_ID="14.04" HOME_URL="http://www.ubuntu.com/" SUPPORT_URL="http://help.ubuntu.com/" BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
可以看到,机器是ubuntu14.04的版本。
然后,使用gcc --version检查gcc版本是否符合链接1中的要求:
~$ gcc --version gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2 Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
检查完毕,就去nvidia的官网(参考链接3)上下载驱动,为下载的是ubuntu14.04的deb包。
Deb包安装较为简单,但是安装过程中提示不稳定,不过用着也没啥出错的地方。
先按照参考链接2安装必要的库。
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
还是按照官网上的流程来。
$ sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb $ sudo apt-get update $ sudo apt-get install cuda
可能需要下载较长时间,但是没关系,放在那等着就是。
没啥问题就算安装好了。
安装过程中提示:
*** Please reboot your computer and verify that the nvidia graphics driver is loaded. *** *** If the driver fails to load, please use the NVIDIA graphics driver .run installer *** *** to get into a stable state.
我没管,提示使用.run安装比较稳定,但我现在用着没问题。
我的系统是64位的,因此配置环境时在.bashrc中加入
$ export PATH=/usr/local/cuda-6.5/bin:$PATH $ export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH
配置完环境后,执行命令
~$ source .bashrc
使其立刻生效。
配置好环境后,可以执行如下命令:
$ cuda-install-samples-6.5.sh <dir>
这样,就将cuda的sample拷贝到dir文件夹下了。该命令只是一个拷贝操作。
然后进入该文件夹,执行make命令进行编译,编译时间较长,需要等待。
首先,验证nvidia的驱动是否安装成功。
~$ cat /proc/driver/nvidia/version NVRM version: NVIDIA UNIX x86_64 Kernel Module 340.29 Thu Jul 31 20:23:19 PDT 2014 GCC version: gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
验证cuda toolkit是否成功。
~$ nvcc -V nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2014 NVIDIA Corporation Built on Thu_Jul_17_21:41:27_CDT_2014 Cuda compilation tools, release 6.5, V6.5.12
使用cuda sample已经编译好的deviceQuery来验证。deviceQuery在<cuda_sample_install_path>/bin/x_86_64/linux/release目录下。我的结果如下,检测出了两块GPU来。
~/install/NVIDIA_CUDA-6.5_Samples/bin/x86_64/linux/release$ ./deviceQuery ./deviceQuery Starting... CUDA Device Query (Runtime API) version (CUDART static linking) Detected 2 CUDA Capable device(s) Device 0: "Tesla K20c" CUDA Driver Version / Runtime Version 6.5 / 6.5 CUDA Capability Major/Minor version number: 3.5 Total amount of global memory: 4800 MBytes (5032706048 bytes) (13) Multiprocessors, (192) CUDA Cores/MP: 2496 CUDA Cores GPU Clock rate: 706 MHz (0.71 GHz) Memory Clock rate: 2600 Mhz Memory Bus Width: 320-bit L2 Cache Size: 1310720 bytes Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096) Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 2048 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 2 copy engine(s) Run time limit on kernels: No Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Enabled Device supports Unified Addressing (UVA): Yes Device PCI Bus ID / PCI location ID: 3 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > Device 1: "Quadro K4000" CUDA Driver Version / Runtime Version 6.5 / 6.5 CUDA Capability Major/Minor version number: 3.0 Total amount of global memory: 3071 MBytes (3220504576 bytes) ( 4) Multiprocessors, (192) CUDA Cores/MP: 768 CUDA Cores GPU Clock rate: 811 MHz (0.81 GHz) Memory Clock rate: 2808 Mhz Memory Bus Width: 192-bit L2 Cache Size: 393216 bytes Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096) Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 2048 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 1 copy engine(s) Run time limit on kernels: Yes Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Disabled Device supports Unified Addressing (UVA): Yes Device PCI Bus ID / PCI location ID: 4 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > > Peer access from Tesla K20c (GPU0) -> Quadro K4000 (GPU1) : No > Peer access from Quadro K4000 (GPU1) -> Tesla K20c (GPU0) : No deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 2, Device0 = Tesla K20c, Device1 = Quadro K4000 Result = PASS
1.http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#axzz3EiJkLjAq
2.http://blog.csdn.net/abcjennifer/article/details/23016583
3.https://developer.nvidia.com/cuda-downloads