Installing CUDA 9.0 and cuDNN 7.0.5

Two combinations that work together: CUDA 9.0 + TensorFlow 1.8 + cuDNN 7.0.5

and CUDA 8.0 + TensorFlow 1.4 + cuDNN 6.0
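As a small convenience, the two combinations above can be encoded in a lookup so an install script can refuse anything else. This is only a sketch: the function name `stack_for_tf` is made up for illustration, and it knows nothing beyond the two pairings listed in this post.

```shell
# Map a TensorFlow version to the CUDA/cuDNN pair listed above.
# Only the two combinations from this post are covered; any other version
# returns a non-zero status so the caller can look it up elsewhere.
stack_for_tf() {
    case "$1" in
        1.8|1.8.*) echo "cuda=9.0 cudnn=7.0.5" ;;
        1.4|1.4.*) echo "cuda=8.0 cudnn=6.0" ;;
        *) echo "unknown TensorFlow version: $1" >&2; return 1 ;;
    esac
}

stack_for_tf 1.8.0
```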

Check the GPU model:

lspci | grep -i vga

Check which distribution the machine runs, Ubuntu or CentOS:

lsb_release -a
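If `lsb_release` is not installed, the distribution can usually be read from `/etc/os-release` instead. The sketch below assumes that file exists (it does on recent Ubuntu and on CentOS 7, but not on every older release); `detect_distro` is a hypothetical helper name.

```shell
# Print the distribution ID (for example "ubuntu" or "centos") from an
# os-release file. The path argument is optional and defaults to
# /etc/os-release.
detect_distro() {
    os_release_file="${1:-/etc/os-release}"
    [ -r "$os_release_file" ] || return 1
    # Source the file in a subshell so its variables do not leak out.
    ( . "$os_release_file" && printf '%s\n' "$ID" )
}

detect_distro || echo "no readable os-release file found"
```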

Check the gcc version:

gcc --version

Make sure the kernel headers are installed:

sudo apt-get install linux-headers-$(uname -r)

Disable nouveau

lsmod | grep nouveau

If the command prints anything:

sudo vi /etc/modprobe.d/blacklist-nouveau.conf

Add these lines:

blacklist nouveau

options nouveau modeset=0

sudo reboot
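The same steps can be done without opening an editor. The sketch below is one possible way; the helper names are made up, and the `update-initramfs -u` step is an assumption that applies to Ubuntu (other distributions rebuild the initramfs with different tools). The root-only commands are left as comments.

```shell
# Print the two blacklist lines used above.
nouveau_blacklist_conf() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Succeeds when the lsmod-style text on stdin lists the nouveau module itself
# (modules that merely depend on nouveau do not start the line with its name).
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Usage on the target machine (needs root, so it is left commented out):
#   nouveau_blacklist_conf | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u   # assumption: Ubuntu; rebuilds the initramfs
#   sudo reboot
#   lsmod | nouveau_loaded && echo "nouveau is still loaded"
```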

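Run the .run installer again (note: when it asks whether to install OpenGL, answer no).

Build the CUDA samples (the directory name carries the version of the toolkit you installed; 9.1 is shown here):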

cd NVIDIA_CUDA-9.1_Samples

make
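Once the build finishes, running the `deviceQuery` sample is a quick way to confirm that the toolkit can see the GPU; it prints `Result = PASS` on success. The binary path in the usage comment is an assumption about where the samples' Makefile puts its output, and `device_query_passed` is a hypothetical helper.

```shell
# Succeeds when the deviceQuery output on stdin contains "Result = PASS".
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage from inside the samples directory after `make` (path is an assumption):
#   ./bin/x86_64/linux/release/deviceQuery | device_query_passed \
#       && echo "CUDA can see the GPU"
```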

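Install cuDNN 7

Download the cuDNN 7 release that matches your CUDA version, then unpack it and copy the files into the CUDA directory: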

tar -xzvf cudnn-9.1-linux-x64-v7.1.tgz

sudo cp cuda/include/cudnn.h /usr/local/cuda/include

sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h

Check the cuDNN version:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
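The grep output shows the three `#define` lines separately. If a single version string is more convenient, a small awk filter can join them. This is a sketch that assumes the header uses the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` macros, as cuDNN 7 does; `cudnn_version` is a hypothetical helper.

```shell
# Print "major.minor.patch" from a cudnn.h header.
# The path argument is optional and defaults to the location used above.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "${1:-/usr/local/cuda/include/cudnn.h}"
}

# Usage:
#   cudnn_version
```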

Check the CUDA version:

cat /usr/local/cuda/version.txt

Uninstalling CUDA:

sudo /usr/local/cuda-9.1/bin/uninstall_cuda_9.1.pl

sudo rm -rf /usr/local/cuda-9.1/
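To confirm that nothing from the old toolkit is left behind, it can help to list the versioned directories that remain. The sketch below assumes the runfile installer's default layout of `/usr/local/cuda-X.Y`; `installed_cuda_dirs` is a hypothetical helper.

```shell
# List the cuda-X.Y directories under a prefix (default /usr/local).
# Prints nothing when no such directory exists.
installed_cuda_dirs() {
    prefix="${1:-/usr/local}"
    for dir in "$prefix"/cuda-*; do
        # An unmatched glob stays literal, so test before printing.
        [ -d "$dir" ] && basename "$dir"
    done
    return 0
}

# Usage:
#   installed_cuda_dirs
#   readlink -f /usr/local/cuda   # shows which version the symlink points to
```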

After uninstalling, do not install the driver again when you install the new version. Answer n at this prompt:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?

(y)es/(n)o/(q)uit: n

Install tensorflow-gpu:

pip install tensorflow-gpu==1.8.0

View the NVIDIA driver and GPU information:

nvidia-smi
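`nvidia-smi` can also print just the driver version, which makes it easy to compare against the minimum a toolkit needs. In the sketch below the threshold 384.81 for CUDA 9.0 is an assumption taken from NVIDIA's release notes and should be confirmed for your exact toolkit build; `version_ge` is a hypothetical helper that relies on GNU `sort -V`.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils) for version-aware ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: minimum Linux driver for CUDA 9.0 per NVIDIA's release notes.
MIN_DRIVER_CUDA_9_0="384.81"

# Usage on a machine with the driver installed:
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$driver" "$MIN_DRIVER_CUDA_9_0" || echo "driver $driver is too old"
```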

Problems when installing CUDA:

Error:

It appears that an X server is running. Please exit X before installation. If you're sure that X is not running, but are getting this error, please delete any X lock files in /tmp.

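Cause: X server lock files in /tmp (for example /tmp/.X0-lock), which an X server creates while it runs. Stop X and delete them as the message suggests.

Fix: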

sudo init 3

sudo rm -rf /tmp/.X*

Error:

The driver installation is unable to locate the kernel source. Please make sure that the kernel source packages are installed and set up correctly.

If you know that the kernel source packages are installed and set up correctly, you may pass the location of the kernel source with the '--kernel-source-path' flag.

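Fix: this is a version mismatch between your kernel and the CUDA release, so switch to another CUDA version. I switched to CUDA 9.0. Do not use 9.1 or 9.2: the prebuilt tensorflow-gpu 1.8 looks for the CUDA 9.0 libraries, so the TensorFlow version becomes the next pitfall.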
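Before switching versions, it may be worth checking that the headers for the running kernel are really in place, since the error message points at the kernel source. The sketch below assumes the headers are exposed at `/lib/modules/<release>/build`, which is the usual location on Ubuntu and CentOS; `kernel_build_dir_present` is a hypothetical helper.

```shell
# Succeeds when the build directory for a kernel release exists.
# $1: kernel release (default: the running kernel)
# $2: filesystem root (default: empty, meaning the real /), handy for dry runs
kernel_build_dir_present() {
    release="${1:-$(uname -r)}"
    root="${2:-}"
    [ -e "$root/lib/modules/$release/build" ]
}

kernel_build_dir_present || echo "kernel headers for $(uname -r) look missing"
```

If the directory is missing, installing the headers for the running kernel (the `linux-headers-$(uname -r)` step earlier) is the first thing to try.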

Error:

nvcc: command not found

Cause: the CUDA bin directory has not been added to the PATH environment variable. Fix:

export PATH=$PATH:/usr/local/cuda/bin    

export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
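The two exports only last for the current shell session. One way to make them permanent is to append them to `~/.bashrc`, taking care not to add the same line twice. The helper below is a sketch with a made-up name (`append_once`), and `~/.bashrc` is an assumption about which startup file your shell reads.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep the variables unexpanded until the file is read):
#   append_once 'export PATH=$PATH:/usr/local/cuda/bin' ~/.bashrc
#   append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' ~/.bashrc
#   . ~/.bashrc
```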

Error:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

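Cause: a CUDA version mismatch. The installed TensorFlow loads the CUDA 9.0 libraries, so if your CUDA is not version 9.0, uninstall it and install 9.0.

Another possible cause is the one above: the PATH and LD_LIBRARY_PATH exports have not been added to the environment.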
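To tell the two causes apart, check whether the library file TensorFlow asks for exists at all. The sketch below assumes the toolkit lives under `/usr/local/cuda/lib64`, as in the rest of this post; `has_cublas` is a hypothetical helper.

```shell
# Succeeds when libcublas.so.<version> exists in the given library directory.
# $1: version suffix, for example 9.0
# $2: library directory (default: /usr/local/cuda/lib64)
has_cublas() {
    version="$1"
    libdir="${2:-/usr/local/cuda/lib64}"
    [ -e "$libdir/libcublas.so.$version" ]
}

# Usage:
#   has_cublas 9.0 || ls /usr/local/cuda/lib64/libcublas.so.*
```

If the file exists but the import still fails, the environment variables are the more likely culprit; if only another version such as 9.1 is listed, the toolkit itself is the wrong version.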
