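I recently worked through the GPU side of TensorFlow and collected these notes from the pitfalls I hit before the install finally succeeded. The versions of CUDA, cuDNN and tensorflow-gpu must match one another (and the driver version must match the CUDA version too!). The driver deserves the most attention: in my experience, the order that works is to prepare the system environment for the driver first, install the driver together with CUDA, and then install cuDNN and tensorflow-gpu.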
## Check whether the system has an NVIDIA device
# lspci | grep -i nvidia
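Comment out the nvidiafb entry in the modprobe blacklist file (for example /lib/modprobe.d/dist-blacklist.conf on CentOS 7):
#blacklist nvidiafb
Then add two lines to disable the nouveau driver:
blacklist nouveau
options nouveau modeset=0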
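In /etc/default/grub, append the following to GRUB_CMDLINE_LINUX, then regenerate the GRUB configuration so the change takes effect:
rd.driver.blacklist=nouveau nouveau.modeset=0
Rebuild the initramfs image: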
# mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
# dracut /boot/initramfs-$(uname -r).img $(uname -r)
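After rebooting, it can be worth confirming that nouveau really is disabled before running the installer. The following is a minimal Python 3 sketch, not part of the original procedure: the helper names `missing_kernel_args`, `module_loaded` and `read_text` are made up for illustration, and it assumes the standard Linux files `/proc/cmdline` and `/proc/modules`.

```python
def missing_kernel_args(cmdline, required):
    """Return the required kernel arguments that are absent from cmdline."""
    present = set(cmdline.split())
    return [arg for arg in required if arg not in present]


def module_loaded(modules_text, name):
    """Return True if `name` appears as a module in /proc/modules content."""
    # Each line of /proc/modules starts with the module name.
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def read_text(path):
    """Read a small text file such as /proc/cmdline."""
    with open(path) as handle:
        return handle.read()


# Hypothetical usage on the target machine (after a reboot):
#   args = ["rd.driver.blacklist=nouveau", "nouveau.modeset=0"]
#   print(missing_kernel_args(read_text("/proc/cmdline"), args))
#   print(module_loaded(read_text("/proc/modules"), "nouveau"))
```

An empty list from the first call and `False` from the second suggest that the blacklist took effect.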
To set up a local yum repository, see https://blog.csdn.net/Sunny_Future/article/details/78420508 , or search for a network yum mirror yourself; the details are not repeated here.
# yum install gcc kernel-devel kernel-headers -y
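TensorFlow with GPU support requires CUDA and cuDNN, and the pitfall here is mismatched versions, so be sure to pick the matching ones. I installed TensorFlow 1.4.0, for which the matching CUDA version is 8.0 and the cuDNN version is 6.0.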
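The version pairing can be written down as a small lookup so that a mismatch is caught before anything is installed. This is only a sketch in Python 3: the 1.4.0 row comes from the text above, while the 1.5.0 row is my recollection of TensorFlow's published tested build configurations and should be checked against that table. The names `TESTED_CONFIGS` and `check_versions` are invented for this example.

```python
# TensorFlow GPU release -> (CUDA version, cuDNN major version).
TESTED_CONFIGS = {
    "1.4.0": ("8.0", "6"),   # from this article
    "1.5.0": ("9.0", "7"),   # assumption: verify against the official table
}


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of human-readable mismatch messages (empty if OK)."""
    if tf_version not in TESTED_CONFIGS:
        return ["no known configuration for TensorFlow " + tf_version]
    want_cuda, want_cudnn = TESTED_CONFIGS[tf_version]
    problems = []
    if cuda_version != want_cuda:
        problems.append("CUDA %s found, %s expected" % (cuda_version, want_cuda))
    # Compare only the cuDNN major version, e.g. "6.0.21" -> "6".
    if cudnn_version.split(".")[0] != want_cudnn:
        problems.append("cuDNN %s found, %s.x expected" % (cudnn_version, want_cudnn))
    return problems
```

For the setup described here, `check_versions("1.4.0", "8.0", "6.0.21")` reports no problems, while passing a CUDA 9.0 toolkit would be flagged.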
# chmod +x cuda_8.0.61_375.26_linux.run
# init 3
# sh cuda_8.0.61_375.26_linux.run
# init 5
# vim /etc/profile
## Append the following at the end of the file
PATH=$PATH:/usr/local/cuda-8.0/bin
export PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib:/usr/local/cuda-8.0/lib64
# source /etc/profile
# vim ~/.bashrc
## Append the following at the end of the file
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
# source ~/.bashrc
# nvcc --version
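To check the toolkit version from a script rather than by eye, the output of `nvcc --version` can be parsed. A minimal Python 3 sketch follows; it assumes the output contains a line of the form `Cuda compilation tools, release 8.0, V8.0.61`, and the function names are invented here.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract the release number (e.g. "8.0") from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc and return its release, or None if nvcc cannot be run."""
    try:
        output = subprocess.check_output(["nvcc", "--version"])
    except (OSError, subprocess.CalledProcessError):
        # nvcc is missing from PATH or exited with an error.
        return None
    return parse_nvcc_release(output.decode("utf-8", "replace"))
```

If `installed_cuda_release()` returns `None`, the PATH entries added above are the first thing to re-check.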
# mv cudnn-8.0-linux-x64-v6.0.solitairetheme8 cudnn-8.0-linux-x64-v6.0.tar.gz
# tar -zxf cudnn-8.0-linux-x64-v6.0.tar.gz -C /tmp/
# cp /tmp/cuda/include/* /usr/local/cuda-8.0/include/
# cp /tmp/cuda/lib64/lib* /usr/local/cuda-8.0/lib64/
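A quick way to confirm which cuDNN headers were copied is to read the version macros from `cudnn.h`. The sketch below is Python 3 and assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as plain integers, as cuDNN 6 does; the function name is illustrative.

```python
import re


def parse_cudnn_version(header_text):
    """Return "major.minor.patch" from cudnn.h content, or None if absent."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % macro
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


# Hypothetical usage, with the include directory used in this article:
#   with open("/usr/local/cuda-8.0/include/cudnn.h") as handle:
#       print(parse_cudnn_version(handle.read()))
```

For the TensorFlow 1.4.0 setup described here, the reported major version should be 6.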
# wget https://bootstrap.pypa.io/get-pip.py
# python get-pip.py
# yum install -y gcc python-devel
# pip install tensorflow-gpu==1.4
Or, to install from a wheel file downloaded in advance:
# pip install tensorflow_gpu-1.4.0-cp27-cp27mu-manylinux1_x86_64.whl
# python
>>> import tensorflow as tf
>>> hello = tf.constant("Hello TensorFlow-GPU!!")
>>> sess = tf.Session()
>>> print(sess.run(hello))
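The hello-world session above also runs on a CPU-only build, so it does not by itself show that the GPU is in use. A minimal sketch of an extra check follows, assuming TensorFlow 1.x, where `tensorflow.python.client.device_lib.list_local_devices()` returns objects with `name` and `device_type` fields; the helper names are invented for this example.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_names():
    """List the GPUs TensorFlow can see (requires tensorflow-gpu installed)."""
    # Imported lazily so the helper above can be used without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Calling `local_gpu_names()` in the interpreter should return at least one entry; an empty list suggests TensorFlow fell back to the CPU, which usually points at a driver, CUDA or cuDNN mismatch.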
If the CUDA installer aborts with an error like the following:
Extraction failed.
Ensure there is enough space in /tmp
Signal caught, cleaning up
Solution:
1. Set SELinux to disabled and reboot.
2. Install the NVIDIA driver and reboot.
3. Run the CUDA installer again.
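Before re-running the installer, step 1 can be double-checked by reading the SELinux configuration. The sketch below is Python 3 and assumes the usual `/etc/selinux/config` layout with a `SELINUX=` line; `selinux_mode` is a hypothetical helper name.

```python
def selinux_mode(config_text):
    """Return the SELINUX= value from /etc/selinux/config content, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not the SELINUX= setting
        # (for example the separate SELINUXTYPE= line).
        if line.startswith("#") or not line.startswith("SELINUX="):
            continue
        return line.split("=", 1)[1].strip()
    return None


# Hypothetical usage on the target machine:
#   with open("/etc/selinux/config") as handle:
#       print(selinux_mode(handle.read()))
```

The value only reflects the configuration file; the change takes effect after the reboot mentioned in step 1.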