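Meituan's knowledge base already has a detailed tutorial on installing TF on CentOS 7, but there are still many pitfalls you only discover by hitting them. This post records the problems encountered during installation.
Installation methods:
Knowledge base reference: https://www.mtyun.com/library/45/how-to-install-tensorflow-on-centos7/
Official installation guide (recommended): https://www.tensorflow.org/install/install_linux
Note:
Perform the installation steps strictly in order! The TF and cuDNN versions must match; otherwise you will get all kinds of errors, such as missing files or the GPU not being found.
For TensorFlow's cuDNN version compatibility, see the release notes: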
1.3.0:
All our prebuilt binaries have been built with cuDNN 6. We anticipate releasing TensorFlow 1.4 with cuDNN 7.
1.2.0:
TensorFlow 1.2 may be the last time we build with cuDNN 5.1. Starting with TensorFlow 1.3, we will try to build all our prebuilt binaries with cuDNN 6.0. While we will try to keep our source code compatible with cuDNN 5.1, it will be best effort.
1.1.0:
TensorFlow 1.1.0 will be the last time we release a binary with Mac GPU support. Going forward, we will stop testing on Mac GPU systems. We continue to welcome patches that maintain Mac GPU support, and we will try to keep the Mac GPU build working.
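The release notes above can be condensed into a small lookup. The following is a minimal sketch, not an official compatibility table: the names CUDNN_FOR_TF and cudnn_matches are hypothetical, and the entry for 1.1.0 is an assumption taken from the software list in this post (cuDNN v5.1 alongside tensorflow_gpu-1.1.0), since the 1.1.0 release note quoted here does not mention cuDNN.

```python
# cuDNN version that each prebuilt TensorFlow GPU release in this post is paired with.
CUDNN_FOR_TF = {
    "1.3.0": "6",    # "built with cuDNN 6" (1.3.0 release notes)
    "1.2.0": "5.1",  # "may be the last time we build with cuDNN 5.1" (1.2.0 release notes)
    "1.1.0": "5.1",  # assumption: from the software list in this post, not the release notes
}


def cudnn_matches(tf_version, installed_cudnn):
    """Return True if the installed cuDNN version satisfies the given TF release.

    Only as many dotted components as the requirement has are compared,
    so a requirement of "6" accepts "6.0" or "6.0.21".
    """
    required = CUDNN_FOR_TF.get(tf_version)
    if required is None:
        raise KeyError("no cuDNN requirement recorded for TensorFlow " + tf_version)
    required_parts = required.split(".")
    installed_parts = installed_cudnn.split(".")
    return installed_parts[:len(required_parts)] == required_parts
```

For example, cudnn_matches("1.3.0", "5.1") returns False, which corresponds to the libcudnn.so.6 error described later in this post.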
Software to prepare:
Nvidia driver: sh NVIDIA-Linux-x86_64-375.66.run
TensorFlow: https://www.tensorflow.org/install/install_linux#the_url_of_the_tensorflow_python_package
cuDNN v5.1 for CUDA 8.0
tensorflow_gpu-1.1.0-cp27-none-linux_x86_64.whl
S3 download links: http://tonydong-49061403.mtmssdn0.com/NVIDIA.375.66.tar and http://tonydong-49061403.mtmssdn0.com/NVIDIA-Linux-x86_64-384.66.run
Official download links:
Nvidia driver download: http://www.nvidia.cn/Download/index.aspx?lang=cn
CUDA Toolkit Download:https://developer.nvidia.com/cuda-downloads
CUDA installation and compatibility: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
TensorFlow download: https://www.tensorflow.org/install/install_linux#the_url_of_the_tensorflow_python_package
To download an older release, take the link given on the official site and change only the version number, for example from 1.3.0 to 1.1.0. That is, https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0-cp27-none-linux_x86_64.whl becomes https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.1.0-cp27-none-linux_x86_64.whl
Check the TF version and install path:
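The version substitution can be written as a small helper. This is only a sketch: the function name wheel_url is hypothetical, and it covers only the Python 2.7 GPU wheel pattern shown in this post. Wheels for other Python versions use different file names, and nothing here checks that a given version was actually published.

```python
# URL pattern copied from the two example links in this post (Python 2.7, GPU build).
WHEEL_URL_TEMPLATE = (
    "https://storage.googleapis.com/tensorflow/linux/gpu/"
    "tensorflow_gpu-{version}-cp27-none-linux_x86_64.whl"
)


def wheel_url(version):
    """Build the download URL of the cp27 GPU wheel for a TensorFlow version."""
    return WHEEL_URL_TEMPLATE.format(version=version)


if __name__ == "__main__":
    # Print the URL for the older release used in this post.
    print(wheel_url("1.1.0"))
```

The printed URL can then be passed directly to pip install.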
python
>>> import tensorflow as tf
>>> tf.__version__
>>> tf.__path__
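Errors encountered during installation and how to fix them:
The GPU driver does not match cuDNN/CUDA: after starting python, import tensorflow as tf fails with the errors below. Downgrade the driver or upgrade cuDNN/CUDA. The first error means the loader cannot find the CUDA 8.0 cuSOLVER library, so also check that CUDA 8.0 is installed and on the library path.
ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory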
ImportError: /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: cudnnConvolutionBiasActivationForward
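sess = tf.Session() cannot find the GPU: tensorflow-gpu is not installed, or something went wrong during installation. Reinstall it.
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()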
2017-09-13 18:10:26.267041: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-13 18:10:26.267091: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-13 18:10:26.267107: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-13 18:10:26.267119: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-13 18:10:26.267132: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
>>> print(sess.run(hello))
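In the log above only CPU warnings appear and no GPU device is reported. A more direct check is to list the devices TensorFlow can see. The sketch below assumes a TensorFlow 1.x install, where tensorflow.python.client.device_lib.list_local_devices() is available; the helper names gpu_device_names and list_gpus are hypothetical.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_gpus():
    """List the GPU devices visible to the installed TensorFlow.

    The import is kept inside the function so that this file can be loaded
    even on a machine where TensorFlow itself fails to import.
    """
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Calling list_gpus() in the python shell and getting an empty list suggests that the CPU-only package is installed or that the GPU build could not initialize the card.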
The TensorFlow version is too new for the installed cuDNN: downgrade TF or upgrade cuDNN.
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory
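The ImportError messages in this section follow two patterns: a shared library that cannot be opened, or a symbol that is missing from a library that was found. The following sketch classifies a message accordingly. The function name classify_import_error and the returned labels are hypothetical, and the regular expressions are written only against the messages quoted in this post.

```python
import re

# "lib<name>.so.<version>: cannot open shared object file"
MISSING_LIB = re.compile(
    r"lib([A-Za-z_]+)\.so\.(\d+(?:\.\d+)*): cannot open shared object file"
)
# "undefined symbol: <symbol>"
UNDEFINED_SYMBOL = re.compile(r"undefined symbol: (\w+)")


def classify_import_error(message):
    """Return (kind, name, version) for a TensorFlow ImportError message.

    kind is "missing_library", "undefined_symbol" or "unknown".
    version is only set for missing libraries.
    """
    match = MISSING_LIB.search(message)
    if match:
        return ("missing_library", match.group(1), match.group(2))
    match = UNDEFINED_SYMBOL.search(message)
    if match:
        return ("undefined_symbol", match.group(1), None)
    return ("unknown", None, None)
```

A missing_library result names the library and version that TensorFlow was built against, for example cudnn 6, which is the version that then has to be installed or matched by changing the TF version.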