The downloaded file is named Anaconda2-5.0.1-Linux-x86_64.sh.
Make it executable and run the installer:
$ chmod a+x ./Anaconda2-5.0.1-Linux-x86_64.sh
$ ./Anaconda2-5.0.1-Linux-x86_64.sh
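Points to note during installation:
1) When "In order to continue the installation process, please review the license agreement." appears, press Enter to read the license, scroll to the bottom, and answer yes.
2) Press Enter to accept the default install location.
3) When asked "Do you wish the installer to prepend the Anaconda<2 or 3> install location to PATH in your /home//.bashrc ?", answer yes.
4) The message "Thank you for installing Anaconda2!" means the installation succeeded.
5) The installer adds the Anaconda location to PATH in ~/.bashrc automatically. To make it take effect in the current terminal, run: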
$ source ~/.bashrc
6) Launch Anaconda Navigator:
$ anaconda-navigator
7) Upgrade setuptools:
$ pip install --upgrade -I setuptools
If this step is skipped, running import tensorflow as tf after TensorFlow is installed fails with:
ImportError: No module named platform
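Before moving on, it can help to confirm that the python found first on PATH really is the Anaconda one. The following is only a sketch: it assumes the default install location ~/anaconda2, and the helper names find_on_path and is_inside are made up for this example.

```python
import os


def find_on_path(name, path=None):
    """Return the first executable called `name` on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def is_inside(path, prefix):
    """Return True if `path` lies inside the directory `prefix`."""
    prefix = os.path.join(os.path.abspath(prefix), "")
    return os.path.abspath(path).startswith(prefix)


# Assumption: Anaconda was installed to the default location, ~/anaconda2.
anaconda_home = os.path.expanduser("~/anaconda2")
python_path = find_on_path("python")
if python_path is None:
    print("no python executable found on PATH")
elif is_inside(python_path, anaconda_home):
    print("python on PATH comes from Anaconda: %s" % python_path)
else:
    print("python on PATH is NOT from Anaconda: %s" % python_path)
```

If the last message is printed, source ~/.bashrc has probably not been run in this terminal yet.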
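6. Install Bazel
Reference: https://docs.bazel.build/versions/master/install-ubuntu.html
1) Install JDK 8
Ubuntu 16.04 ships with openjdk-8, but apt still could not find a JDK, so I installed it again without uninstalling anything, and apt installed OpenJDK 9 automatically.
Command: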
$ sudo apt-get install openjdk-8-jdk
Although JDK 8 was requested above, JDK 9 was installed automatically. I have not found out why.
2) Temporarily add the Bazel distribution URI as a package source
Run in bash:
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
3) Install and update Bazel
$ sudo apt-get update && sudo apt-get install bazel
$ sudo apt-get upgrade bazel
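To check which Bazel actually ended up installed, the output of bazel version can be inspected. The sketch below assumes that output contains a line starting with "Build label:"; the function names are hypothetical helpers for this example, not part of Bazel.

```python
import re
import subprocess


def parse_build_label(version_output):
    """Extract the 'Build label' value from `bazel version` output, or None."""
    match = re.search(r"^Build label:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1) if match else None


def installed_bazel_version():
    """Run `bazel version`; return the build label, or None if bazel is unavailable."""
    try:
        output = subprocess.check_output(["bazel", "version"])
    except (OSError, subprocess.CalledProcessError):
        # bazel is not on PATH, or it exited with an error.
        return None
    return parse_build_label(output.decode("utf-8", "replace"))


print("bazel build label: %s" % installed_bazel_version())
```

A result of None suggests that bazel is not on PATH in the current terminal.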
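7. Install the Python libraries TensorFlow depends on
The following Python packages must be installed before TensorFlow:
- numpy, a numerical computing library.
- dev, the headers and libraries needed to build Python extensions.
- pip, the Python package installer.
- wheel, used to handle the .whl archive format.
For Python 2.7, use the following command: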
$ sudo apt-get install python-numpy python-dev python-pip python-wheel
For Python 3.n, use the following command:
$ sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
Because this is the GPU build, CUDA's libcupti-dev library is also required:
$ sudo apt-get install libcupti-dev
8. Build and install TensorFlow
1) Configure
Enter the tensorflow directory downloaded with git and set the build options:
$ cd tensorflow
$ ./configure
Please specify the location of python. [Default is /home/ceiec/anaconda2/envs/tensorflow/bin/python]:
Found possible Python library paths:
/home/ceiec/anaconda2/envs/tensorflow/lib/python2.7/site-packages
Please input the desired Python library path to use. Default is [/home/ceiec/anaconda2/envs/tensorflow/lib/python2.7/site-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: N
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: N
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL support? [y/N]:N
No OpenCL support will be enabled for TensorFlow.
If OpenCL is selected, configure checks for OpenCL-related files. That is not needed here, so answer N.
Do you wish to build TensorFlow with CUDA support? [y/N]:Y (required for the GPU build)
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]:7
Enter 7 above, not 7.0; otherwise configure reports an error.
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]3.0
3.0 was entered as the compute capability here.
Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
If you make a wrong choice, interrupt configure and run it again.
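Since the cuDNN prompt wants only the major version, it may be useful to read that number from the installed header. The sketch below assumes the header is at /usr/local/cuda/include/cudnn.h (the default location offered by configure) and that it defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; parse_cudnn_version is a hypothetical helper.

```python
import os
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+%s\s+(\d+)" % macro
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Assumption: cuDNN was copied into the default CUDA location.
header = "/usr/local/cuda/include/cudnn.h"
if os.path.isfile(header):
    with open(header) as handle:
        version = parse_cudnn_version(handle.read())
    if version is None:
        print("no version macros found in %s" % header)
    else:
        print("cuDNN %d.%d.%d: answer the prompt with %d" % (version + (version[0],)))
else:
    print("%s not found" % header)
```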
2) Build
Build the pip package for the GPU version:
$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
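This step takes a long time, possibly more than half an hour. When bazel finishes, it has generated a build_pip_package script under bazel-bin/.... Running that script produces the .whl install file in /tmp/tensorflow_pkg (another folder can be used instead):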
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
The generated install file is tensorflow-1.4.0-cp27-cp27mu-linux_x86_64.whl
3) Install the pip package
Install the .whl file generated in the previous step:
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.4.0-cp27-cp27mu-linux_x86_64.whl
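The exact wheel file name depends on the TensorFlow version and the Python build, so it can be easier to look it up than to type it. This is a small sketch assuming the output folder /tmp/tensorflow_pkg used above; newest_wheel is a hypothetical helper.

```python
import glob
import os


def newest_wheel(directory, package="tensorflow"):
    """Return the most recently modified <package>-*.whl in `directory`, or None."""
    pattern = os.path.join(directory, "%s-*.whl" % package)
    wheels = glob.glob(pattern)
    if not wheels:
        return None
    return max(wheels, key=os.path.getmtime)


wheel = newest_wheel("/tmp/tensorflow_pkg")
if wheel is None:
    print("no tensorflow wheel found in /tmp/tensorflow_pkg")
else:
    print("install with: pip install %s" % wheel)
```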
9. Verify the installation
1) Simple check
Open a terminal and cd to a directory outside the tensorflow source tree:
$ python
At the interactive prompt, enter this small test program:
# Python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
If TensorFlow is installed correctly, it prints:
Hello, TensorFlow!
If import tensorflow as tf fails with
ImportError: No module named platform
upgrade setuptools and then reinstall tensorflow from the .whl file:
$ pip install --upgrade -I setuptools
$ pip install --ignore-installed --upgrade tensorflow-1.4.0-cp27-cp27mu-linux_x86_64.whl
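A slightly fuller check can also report the installed version and whether TensorFlow sees a GPU. The sketch below relies on device_lib.list_local_devices() from tensorflow.python.client; gpu_names is a hypothetical helper, and the script only prints a message if TensorFlow cannot be imported.

```python
def gpu_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [device.name for device in devices if device.device_type == "GPU"]


try:
    import tensorflow as tf
    from tensorflow.python.client import device_lib
except ImportError as error:
    print("TensorFlow could not be imported: %s" % error)
else:
    print("TensorFlow version: %s" % tf.__version__)
    gpus = gpu_names(device_lib.list_local_devices())
    print("GPU devices: %s" % (gpus if gpus else "none found"))
```

As with the interactive test, run it from a directory outside the tensorflow source tree.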
2) Verify that GPU computation works
Download the MNIST training data:
https://storage.googleapis.com/cvdf-datasets/mnist/train-images-idx3-ubyte.gz
https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz
https://storage.googleapis.com/cvdf-datasets/mnist/t10k-images-idx3-ubyte.gz
https://storage.googleapis.com/cvdf-datasets/mnist/t10k-labels-idx1-ubyte.gz
Put the files in the ~/Downloads/MNIST-data folder.
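The four downloads can also be scripted. This is a sketch only: the URLs and the target folder are the ones listed above, while mnist_targets and download_missing are hypothetical helper names. The block defines the functions without starting any download.

```python
import os

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

BASE_URL = "https://storage.googleapis.com/cvdf-datasets/mnist/"
FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def mnist_targets(dest_dir):
    """Return (url, local_path) pairs for the four MNIST archives."""
    dest_dir = os.path.expanduser(dest_dir)
    return [(BASE_URL + name, os.path.join(dest_dir, name)) for name in FILES]


def download_missing(targets):
    """Download every (url, path) pair whose file is not on disk yet."""
    fetched = []
    for url, path in targets:
        if os.path.exists(path):
            continue
        directory = os.path.dirname(path)
        if not os.path.isdir(directory):
            os.makedirs(directory)
        urlretrieve(url, path)
        fetched.append(path)
    return fetched
```

Calling download_missing(mnist_targets("~/Downloads/MNIST-data")) fetches only the archives that are not already present and returns the paths it wrote.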
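Run GPU training with the MNIST example that ships with the tutorials:
$ python /tensorflow/examples/tutorials/mnist/mnist_deep.py \
    --data_dir ~/Downloads/MNIST-data
A normal run produces output like the following. The /device:GPU:0 entry (highlighted in red) shows that the GPU is being used for training.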