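1. First, install the NVIDIA graphics driver:
System Settings -> Software & Updates -> Additional Drivers -> select the latest NVIDIA driver (the first entry) -> Apply Changes
Switching drivers on Ubuntu 16.04 is that simple. Reboot once the change has been applied.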
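2. Download CUDA 8.0 from https://developer.nvidia.com/cuda-release-candidate-download (login required)
First confirm your GPU model and check that it supports GPU acceleration: https://developer.nvidia.com/cuda-gpus
Download the .run file, change into the directory that contains it, and run the installer:
sudo sh ./cuda_8.0.61_375.26_linux.run (adjust the path and file name to match the file you downloaded)
Answer the prompts with y or Enter as shown below. You can decline the bundled graphics driver at this point, since the driver was installed in step 1.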
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?
(y)es/(n)o/(q)uit: n
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
When the installer finishes, set the following environment variables by appending them to the end of ~/.bashrc:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
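If the guide is followed more than once, pasting these lines by hand tends to leave duplicates in ~/.bashrc. The sketch below appends each export only when it is not already present. The helper names append_once and write_cuda_env are hypothetical, chosen for illustration; the two export lines are the ones given above.

```shell
# append_once FILE LINE: add LINE to FILE only if that exact line is not there yet.
append_once() {
    local file="$1" line="$2"
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# write_cuda_env FILE: append both CUDA exports to FILE (for example ~/.bashrc).
# Single quotes keep $LD_LIBRARY_PATH unexpanded, so it is evaluated when the file is sourced.
write_cuda_env() {
    append_once "$1" 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"'
    append_once "$1" 'export CUDA_HOME=/usr/local/cuda'
}
```

A typical call would be write_cuda_env ~/.bashrc followed by source ~/.bashrc; running it a second time leaves the file unchanged.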
3. Test the installation (optional)
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
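deviceQuery prints a long report and ends with a line reading "Result = PASS" when the runtime can talk to the GPU. The sketch below reduces that report to a yes/no answer; check_device_query is a hypothetical helper name, and the "Device N:" line format is taken from the sample's usual output.

```shell
# check_device_query: reads deviceQuery output on stdin.
# Prints the 'Device N: "name"' lines it finds and returns 0 only if the report says PASS.
check_device_query() {
    local out
    out="$(cat)"
    # Show which devices were detected (no match is not an error here).
    printf '%s\n' "$out" | grep -E '^Device [0-9]+:' || true
    # The function's exit status is the status of this final check.
    printf '%s\n' "$out" | grep -q 'Result = PASS'
}
```

For example: ./deviceQuery | check_device_query && echo "CUDA runtime looks fine"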
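4. Install cuDNN
To accelerate TensorFlow on the GPU you need cuDNN in addition to CUDA. Like CUDA, it comes from NVIDIA's website, but it cannot be downloaded directly: you have to register and fill in a short survey first, which is a bit of a hassle. I used "cuDNN v5.1 Library for Linux". Do not pick "cuDNN v5.1 Developer Library for Ubuntu16.04 Power8 (Deb)": that build is for POWER8 processors, not amd64.
Download: https://developer.nvidia.com/cudnn (login required)
After downloading, extract the archive and copy its contents into the CUDA directory /usr/local/cuda/:
tar xvzf cudnn-8.0-linux-x64-v5.1-ga.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn* (can be skipped when working as root)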
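To double-check which cuDNN release was actually copied over, the version can be read from the #define lines in cudnn.h. This is a sketch under the assumption that the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the v5 headers do; cudnn_version is a hypothetical helper name.

```shell
# cudnn_version [HEADER]: print MAJOR.MINOR.PATCHLEVEL as defined in cudnn.h.
# HEADER defaults to the location used in this guide.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            # Fail when the header does not look like a cuDNN header at all.
            if (major == "") exit 1
            print major "." minor "." patch
        }
    ' "$header"
}
```

Calling cudnn_version with no argument inspects /usr/local/cuda/include/cudnn.h; for the archive used here the output should start with 5.1.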
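The TensorFlow GitHub page lists four installation methods. This tutorial uses the fourth one, building from source:
See https://github.com/tensorflow/tensorflow/blob/master/README.md
https://github.com/tensorflow/tensorflow (download)
Notes:
(1) Open the README.md page and scroll down to this passage: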
See Installing TensorFlow for instructions on how to install our release binaries or how to build from source.
People who are a little more adventurous can also try our nightly binaries:
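Finally, put all the files downloaded in steps 1.2-1.4 into the appropriate folders so they are ready for the installation.
Ubuntu's default GCC is 5.4.0, but CUDA 8.0 does not support compilers newer than 5.0, so the compiler has to be downgraded to 4.9.
Run the following in a terminal: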
sudo apt-get install g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
sudo update-alternatives --set cc /usr/bin/gcc
sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
sudo update-alternatives --set c++ /usr/bin/g++
Run gcc -v and check that the reported version is 4.9.3.
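The same check can be scripted, which helps when the alternatives are switched back and forth. A minimal sketch: host_compiler_ok is a hypothetical helper, and it accepts any 4.9.x release rather than 4.9.3 only, which is an assumption on my part.

```shell
# host_compiler_ok VERSION: succeed only for the GCC 4.9 series this guide switches to.
host_compiler_ok() {
    case "$1" in
        4.9|4.9.*) return 0 ;;   # e.g. 4.9 or 4.9.3
        *)         return 1 ;;   # e.g. 5.4.0
    esac
}
```

For example: host_compiler_ok "$(gcc -dumpversion)" || echo "gcc is not 4.9, check update-alternatives"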
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md
We follow the instructions on TensorFlow's official GitHub page at the address above.
Enter the following command in the terminal:
sudo apt-get install python-pip python-dev
Because this tutorial builds and installs TensorFlow from source, Bazel is needed for the build.
Link: https://www.bazel.io/versions/master/docs/install.html
Enter the following commands in the terminal in order:
Install JDK 8 by using:
sudo apt-get install openjdk-8-jdk
On Ubuntu 14.04 LTS you'll have to use a PPA:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update && sudo apt-get install oracle-java8-installer
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
If you want to install the testing version of Bazel, replace stable with testing.
sudo apt-get update && sudo apt-get install bazel
Once installed, you can upgrade to a newer version of Bazel with:
sudo apt-get upgrade bazel
If you ran the Bazel installer with the --user flag, the Bazel executable is installed in your $HOME/bin directory. It's a good idea to add this directory to your default paths, as follows:
export PATH="$PATH:$HOME/bin"
You can also add this command to your ~/.bashrc file.
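Enter the following commands in the terminal:
sudo apt-get install python-numpy swig python-dev python-wheel # install third-party libraries
sudo apt-get install git
git clone https://github.com/numpy/numpy.git numpy # or download the ZIP archive from that URL in a browser
Enter the following command in the terminal: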
git clone https://github.com/tensorflow/tensorflow
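Important: I used TensorFlow 0.11, which requires CUDA 7.5 or later and cuDNN v5.
Run from a freshly opened terminal, the clone ends up under /home.
The URL is the same as before: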
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md
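3) Build and install TensorFlow:
First clone the latest TensorFlow code from GitHub (the git clone command above).
Once the download has finished, change into the tensorflow top-level directory and run ./configure.
A series of prompts follows: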
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
Google Cloud Platform support will be enabled for TensorFlow
ERROR: It appears that the development version of libcurl is not available. Please install the libcurl3-dev package.
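Answering y to the second prompt (Google Cloud Platform support) produced an error: libcurl is required. Install it with apt-get (sudo apt-get install libcurl3-dev). Given network conditions in China, you can also simply answer no to this prompt.
After installing it, run ./configure again.
Apart from the two yes/no questions, just press Enter at every prompt: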
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]:
Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Setting up CUPTI include
Setting up CUPTI lib64
Configuration finished
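The compute-capability prompt defaults to "3.5,5.2", while a GTX 1080 reports capability 6.1, so it may be worth entering the value for your own card instead of pressing Enter. One way to look it up locally, sketched below, is to pull it from the deviceQuery report built in step 3. The helper name compute_capabilities is hypothetical, and the report line it matches is assumed to read "CUDA Capability Major/Minor version number: X.Y".

```shell
# compute_capabilities: read deviceQuery output on stdin and print the distinct
# compute capabilities as a comma-separated list, in order of first appearance.
compute_capabilities() {
    awk -F: '
        /CUDA Capability Major\/Minor version number/ {
            gsub(/[[:space:]]/, "", $2)          # strip the padding around the number
            if (!($2 in seen)) {
                seen[$2] = 1
                list = list (list == "" ? "" : ",") $2
            }
        }
        END { print list }
    '
}
```

For example: /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery | compute_capabilities prints a value that can be typed at the prompt.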
Finally, build and install with Bazel:
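The build command itself is not shown at this point. The sketch below is a guess at it based on the target name that appears in the build output (//tensorflow/cc:tutorials_example_trainer). The flags -c opt --config=cuda and --use_gpu are assumptions taken from TensorFlow's build-from-source instructions of that period, and run, build_trainer, run_trainer and DRY_RUN are hypothetical names. With DRY_RUN=1 the commands are only printed, which makes it easy to review them first.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_trainer: build the C++ example target with CUDA enabled (flags are assumed).
# Must be called from the tensorflow source directory after ./configure.
build_trainer() {
    run bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
}

# run_trainer: start the built example on the GPU (the --use_gpu flag is assumed).
run_trainer() {
    run bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
}
```

For example, DRY_RUN=1 build_trainer prints the bazel command, and build_trainer && run_trainer performs the build and then starts the example.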
During this step git is used to download google protobuf and boringssl, which are then built:
INFO: Cloning https://github.com/google/protobuf: Receiving objects
INFO: Cloning https://github.com/google/boringssl.git: Receiving objects
....
The first build attempt, however, failed with:
configure: error: zlib not installed
Target //tensorflow/cc:tutorials_example_trainer failed to build
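A quick search showed that zlib1g-dev needs to be installed (sudo apt-get install zlib1g-dev).
After that, rebuilding TensorFlow went through without problems, although it takes a while.
A successful TensorFlow build ends with output like this: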
......
Target //tensorflow/cc:tutorials_example_trainer up-to-date:
bazel-bin/tensorflow/cc/tutorials_example_trainer
INFO: Elapsed time: 897.845s, Critical Path: 533.72s
Run the example from the official TensorFlow documentation to see whether the GTX 1080 is picked up:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.65GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
000003/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000006/000007 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000009/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000009/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000000/000005 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000000/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
......
No problems, so building TensorFlow from source with GPU support has worked. Next, import TensorFlow from Python:
This fails with:
ImportError: cannot import name pywrap_tensorflow
The TensorFlow build from source works, but the Python package is not ready yet, so continue with the pip package installation:
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/dist-packages (from protobuf==3.0.0b2->tensorflow==0.9.0)
Installing collected packages: six, funcsigs, pbr, mock, protobuf, tensorflow
Successfully installed funcsigs-1.0.2 mock-2.0.0 pbr-1.10.0 protobuf-3.0.0b2 six-1.10.0 tensorflow-0.9.0
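Open ipython again and try the official TensorFlow sample:
It finally works, and from here on you can enjoy TensorFlow on the GTX 1080 GPU.
Run a test at this point. If you see the same output as I did, congratulations: the GPU version of TensorFlow is configured correctly.
The example prints a lot of messages. If you spot your GPU model in the output, see messages that the CUDA libraries were opened successfully, and each step takes less than 100 ms, the setup works. Otherwise, check where things went wrong.
Now go and have fun with TensorFlow!
Here is a link to a fun tutorial: http://m.blog.csdn.net/article/details?hmsr=toutiao.io&id=52658965&utm_medium=toutiao.io&utm_source=toutiao.io
It uses TensorFlow to paint in the style of Van Gogh.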
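After installing the NVIDIA driver on Ubuntu 14.04, the login screen may fail to appear, or you may get stuck in a login loop. This is mainly a graphics card compatibility problem; for possible fixes, search Google for "ubuntu login loop".
In my tests none of the tutorials I found online worked, so I gave up and switched to Ubuntu 16.04.
I wrote this tutorial after a successful installation, so I have probably forgotten some prerequisites such as Git and pip. Installing them is simple; search online if needed.
If ./configure or the TensorFlow environment setup cannot find the path of some library, check that the CUDA environment variables are set correctly; see section 4.1.
This problem can be fixed by downgrading gcc. Related link: http://m.blog.csdn.net/article/details?id=51999566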
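While testing TensorFlow, running
python convolutional.py
may raise an IOError, because convolutional.py has to download the MNIST dataset from the internet. If that happens, run python convolutional.py again, or download the dataset manually and put it in the expected folder.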
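Re-running the command by hand can be wrapped in a small retry loop. This is a generic sketch, not something the TensorFlow tooling provides; retry is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: re-run COMMAND until it succeeds
# or the given number of attempts is used up.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        echo "attempt $i/$attempts failed: $*" >&2
        # Wait before the next try, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example: retry 3 5 python convolutional.py tries the script up to three times, pausing five seconds between attempts.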
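Installing a clean, single-OS Ubuntu 16.04 from a USB drive
Installing the official NVIDIA graphics driver on Ubuntu 16.04
1. Click the search icon in the top-left corner of the desktop and find "Additional Drivers".
2. In "Additional Drivers" the system searches for NVIDIA drivers automatically and lists the latest official driver for your card. For example, my card is a GT730, so I chose the first entry, 361.42.
3. Finally click "Apply Changes" and wait for the installation to finish.
Installing CUDA (Debian package)
1. Download and install
Change into the directory that contains the downloaded file and run: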
$ sudo dpkg --install cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda
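To confirm which toolkit release ended up installed, one option is to parse the output of nvcc --version, which contains a line of the form "Cuda compilation tools, release 8.0, V8.0.44". A minimal sketch; cuda_release is a hypothetical helper name, and the usage line calls nvcc by its full path on the assumption that /usr/local/cuda/bin has not been added to PATH.

```shell
# cuda_release: read `nvcc --version` output on stdin and print the release number, e.g. 8.0.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}
```

For example: /usr/local/cuda/bin/nvcc --version | cuda_release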
Installing cuDNN
1. Download and install cuDNN v5.1 (https://developer.nvidia.com/cudnn)
Change into the directory that contains the downloaded file and run:
$ tar xvzf cudnn-8.0-linux-x64-v5.1.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
Then run the following (the paths are absolute, so the current directory does not matter):
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
2. Configure the environment variables:
Enter the following command in the terminal:
$ gedit ~/.bash_profile
Then add the following at the end of the file that opens:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
Back in the terminal, enter:
$ source ~/.bash_profile
Installing pip
$ sudo apt-get install python-pip python-dev
$ sudo apt-get install python-numpy swig python-dev python-wheel
Installing TensorFlow
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7
# Requires CUDA toolkit 8.0 and CuDNN v5. For other versions, see "Install from sources" below.
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
$ sudo -H pip install --upgrade $TF_BINARY_URL