Installing CUDA 8.0, cuDNN v5.1, TensorFlow 1.2.1, and Python 2.7.5 on CentOS 7.4.1708: A Tutorial

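First, a word on why TensorFlow is installed this way: after trying several versions, and given what my current project needs, I settled on the versions listed below.

Configuration

TensorFlow version: 1.2.1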

[beer@localhost ~]$ python
Python 2.7.5 (default, Aug  4 2017, 00:39:18) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'1.2.1'
>>> 

Linux version
CentOS 7.4.1708 on a machine with P100 GPUs

[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)

GPU and GPU memory information

[beer@localhost ~]$ nvidia-smi
Fri Dec 29 01:13:55 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.66                 Driver Version: 384.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:04:00.0 Off |                    0 |
| N/A   34C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:05:00.0 Off |                    0 |
| N/A   30C    P0    30W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla P100-PCIE...  Off  | 00000000:08:00.0 Off |                    0 |
| N/A   31C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla P100-PCIE...  Off  | 00000000:09:00.0 Off |                    0 |
| N/A   32C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla P100-PCIE...  Off  | 00000000:84:00.0 Off |                    0 |
| N/A   33C    P0    32W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla P100-PCIE...  Off  | 00000000:85:00.0 Off |                    0 |
| N/A   33C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla P100-PCIE...  Off  | 00000000:88:00.0 Off |                    0 |
| N/A   32C    P0    31W / 250W |      0MiB / 16276MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla P100-PCIE...  Off  | 00000000:89:00.0 Off |                    0 |
| N/A   33C    P0    29W / 250W |      0MiB / 16276MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
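
The same information can be read programmatically. The following is a minimal sketch, not part of the original setup: it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` and `--format=csv` options. The function names `parse_gpu_csv` and `query_gpus` are hypothetical helpers introduced for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function

import subprocess


def parse_gpu_csv(text):
    """Parse lines of 'index, name, memory_mib' into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, name, memory = [field.strip() for field in line.split(",")]
        gpus.append({"index": int(index), "name": name, "memory_mib": int(memory)})
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (requires the NVIDIA driver)."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,name,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_csv(output.decode("utf-8"))
```

On a machine with the driver installed, `print(len(query_gpus()))` should report the number of GPUs that the table above lists.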

CUDA version: 8.0

[beer@localhost ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
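
If a script needs the toolkit version, the release line printed by `nvcc --version` can be parsed. This is only a sketch under the assumption that the output keeps the `release X.Y, VX.Y.Z` shape shown above; `parse_nvcc_release` is a hypothetical helper name.

```python
import re


def parse_nvcc_release(text):
    """Return (major, minor, full_version) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+), V(\S+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)
```

Feeding it the last line of the transcript above gives the major and minor release numbers as integers, which is convenient when comparing against the CUDA version a TensorFlow build expects.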

cuDNN version: v5.1
cuDNN download (Baidu Netdisk):
Link: https://pan.baidu.com/s/1i5KmPPr Password: 2qnt

Python version: 2.7.5

[beer@localhost ~]$ python
Python 2.7.5 (default, Aug  4 2017, 00:39:18) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Steps:

1. Install the required packages first

yum install gcc gcc-c++
yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
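
The driver build needs kernel sources that match the running kernel exactly. As a rough check, and assuming the kernel-devel package places its files under /usr/src/kernels/<release> as it normally does on CentOS, something like the following can confirm the match. The helper names are hypothetical.

```python
import os
import platform


def kernel_source_dir(release=None):
    """Directory where kernel-devel is expected to live for a kernel release."""
    if release is None:
        release = platform.release()  # same string that `uname -r` prints
    return os.path.join("/usr/src/kernels", release)


def kernel_devel_matches(release=None):
    """True if a kernel-devel tree exists for the given (or running) kernel."""
    return os.path.isdir(kernel_source_dir(release))
```

If `kernel_devel_matches()` returns False after the yum commands, the installed kernel-devel package probably belongs to a different kernel version than the one currently booted.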

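2. Download the matching CUDA release from the official CUDA website; here we use cuda-8.0

The package is cuda_8.0.61_375.26_linux.run
Baidu Netdisk mirror:
Link: https://pan.baidu.com/s/1kV1lVeB Password: mxl8

3. Disable the nouveau driver that ships with the system
Use the su command to switch to the root user: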

su root

Open /lib/modprobe.d/dist-blacklist.conf
and comment out the nvidiafb line:

# blacklist nvidiafb

Then add the following lines:

blacklist nouveau
options nouveau modeset=0
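
Before rebuilding the initramfs it can help to confirm the file really contains these edits. The sketch below is an illustration rather than part of the original procedure: it treats lines starting with `#` as comments and checks for the three conditions described in this step. `active_directives` and `check_blacklist` are hypothetical names.

```python
def active_directives(conf_text):
    """Return non-blank, non-comment lines with internal whitespace normalised."""
    directives = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        directives.append(" ".join(line.split()))
    return directives


def check_blacklist(conf_text):
    """Report whether the edits from this step are present in the file text."""
    active = active_directives(conf_text)
    return {
        "nouveau_blacklisted": "blacklist nouveau" in active,
        "nouveau_modeset_off": "options nouveau modeset=0" in active,
        "nvidiafb_commented_out": "blacklist nvidiafb" not in active,
    }
```

A typical use would be `check_blacklist(open("/lib/modprobe.d/dist-blacklist.conf").read())`; all three values should be True before moving on.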

4. Back up and rebuild the initramfs image
Back up the original image:

mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak

Create the new image:

dracut /boot/initramfs-$(uname -r).img $(uname -r)
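
Both commands rely on `$(uname -r)` expanding to the running kernel release. To see which files that touches, and to confirm the backup exists before rebooting, a small sketch such as this may help. The function names are hypothetical, and `platform.release()` is used as the Python equivalent of `uname -r`.

```python
import os
import platform


def initramfs_paths(release=None):
    """Return (image_path, backup_path) for the given or running kernel release."""
    if release is None:
        release = platform.release()  # same string that `uname -r` prints
    image = "/boot/initramfs-{0}.img".format(release)
    return image, image + ".bak"


def backup_and_image_present(release=None):
    """True only if both the rebuilt image and its .bak backup exist."""
    image, backup = initramfs_paths(release)
    return os.path.isfile(image) and os.path.isfile(backup)
```

If `backup_and_image_present()` is False after running the two commands, it is safer to investigate before rebooting, since a missing image can leave the system unbootable.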

5. Switch the default boot target to text mode

systemctl set-default multi-user.target

6. Reboot and log in as root

reboot

7. Check that nouveau is disabled

lsmod | grep nouveau

If this prints nothing, nouveau is disabled.
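
The same check can be scripted by reading /proc/modules, which is the file `lsmod` formats. The sketch below is an assumption-labelled illustration with hypothetical helper names; it only looks at the first column (the module name), so a module that merely lists nouveau among its dependants does not count as a match.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def nouveau_loaded(path="/proc/modules"):
    """True if the nouveau kernel module is currently loaded."""
    with open(path) as handle:
        return "nouveau" in loaded_modules(handle.read())
```

After the reboot in step 6, `nouveau_loaded()` should return False.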

8. Go to the directory that contains the CUDA installer, then install CUDA and the driver

chmod +x cuda_8.0.61_375.26_linux.run
sh cuda_8.0.61_375.26_linux.run

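Note: when installing CUDA, do NOT install the OpenGL libraries. If you do, the graphical desktop will fail to start after the installation.
The installer asks the following questions (this sample transcript is from the CUDA 7.0 installer, so the driver and toolkit version numbers it shows differ from those of the 8.0.61 package used here):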

Do you accept the previously read EULA? (accept/decline/quit): accept     
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 346.46? ((y)es/(n)o/(q)uit): y
Do you want to install the OpenGL libraries? ((y)es/(n)o/(q)uit) [ default is yes ]: n
Install the CUDA 7.0 Toolkit? ((y)es/(n)o/(q)uit): y
Enter Toolkit Location [ default is /usr/local/cuda-7.0 ]:
Do you want to install a symbolic link at /usr/local/cuda? ((y)es/(n)o/(q)uit): y
Install the CUDA 7.0 Samples? ((y)es/(n)o/(q)uit): y
...

9. Install cuDNN

tar -zxf cudnn-8.0-linux-x64-v5.1.tgz
cd cuda
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
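
To confirm which cuDNN version ended up under /usr/local/cuda, the version macros in the copied header can be read. This is a sketch that assumes cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL`, as the v5.x headers do; the function names are hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if not found."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(path="/usr/local/cuda/include/cudnn.h"):
    """Read the header copied in this step and return its version tuple."""
    with open(path) as handle:
        return parse_cudnn_version(handle.read())
```

For the cuDNN v5.1 archive used here, the first two numbers returned by `installed_cudnn_version()` should be 5 and 1.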

10. Set the CUDA environment variables by appending the following lines to the end of the user's .bashrc file

# cuda
export CUDA_HOME=/usr/local/cuda-8.0
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$CUDA_HOME/lib:$LD_LIBRARY_PATH
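
After opening a new shell (or running `source ~/.bashrc`), it is worth checking that the variables actually contain the CUDA directories. The following sketch assumes the /usr/local/cuda-8.0 location used above; `missing_cuda_paths` is a hypothetical helper.

```python
import os


def missing_cuda_paths(environ, cuda_home="/usr/local/cuda-8.0"):
    """Return (variable, directory) pairs that are expected but not present."""
    expected = {
        "PATH": [cuda_home + "/bin"],
        "LD_LIBRARY_PATH": [cuda_home + "/lib64"],
    }
    missing = []
    for variable, directories in sorted(expected.items()):
        present = environ.get(variable, "").split(":")
        for directory in directories:
            if directory not in present:
                missing.append((variable, directory))
    return missing
```

Calling `missing_cuda_paths(os.environ)` should return an empty list once the .bashrc lines have taken effect; anything it returns points at the variable that still needs fixing.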

11. Switch the default boot target back to graphical mode

systemctl set-default graphical.target

12. Reboot again and check whether the installation succeeded

nvidia-smi

If it prints a GPU table like the nvidia-smi output in the configuration section above, the installation succeeded.

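The steps above install CUDA and cuDNN. Next comes the TensorFlow installation.
We install with pip.
The official documentation gives the commands below (they target the old 0.8.0rc0 wheels); we install with the version-pinned pip commands that follow them: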

# Ubuntu/Linux 64-bit, CPU only, Python 2.7:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py2-none-any.whl

CPU version

sudo pip install tensorflow==<version>
We use version 1.2.1 here:
sudo pip install tensorflow==1.2.1

GPU version

sudo pip install tensorflow-gpu==<version>
We use version 1.2.1 here:
sudo pip install tensorflow-gpu==1.2.1

Once pip reports success, run the following commands to verify that TensorFlow works:

[beer@localhost ~]$ python
Python 2.7.5 (default, Aug  4 2017, 00:39:18) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'1.2.1'
>>> 
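
Importing TensorFlow only proves the package loads; it does not show that the GPU build can see the cards. One way to check, sketched here under the assumption that `tensorflow.python.client.device_lib` is available in the installed release (it is an internal module rather than a documented public API), is to list the local devices and keep the GPU entries. The helper names are hypothetical.

```python
def gpu_device_names(devices):
    """Keep the names of entries whose device_type is 'GPU'."""
    return [device.name for device in devices if device.device_type == "GPU"]


def list_tensorflow_gpus():
    """Ask TensorFlow for its local devices and return the GPU names."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

On the machine described in this post, `len(list_tensorflow_gpus())` would be expected to match the number of GPUs that nvidia-smi reports. An empty list usually means the CPU-only package is installed or the CUDA/cuDNN libraries could not be loaded.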

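Various errors may appear at this point. The main cause is that the GPU build of TensorFlow needs the cuDNN library for acceleration, and the installed library version does not match. The combination used here, cuDNN v5.1 with CUDA 8.0, works without problems. That completes the TensorFlow installation; if you run into trouble, you can contact me through the blog. From here we can run our own models on TensorFlow. Running a model will of course produce errors of its own, and they have to be solved one at a time. For example, when a module is reported as missing, we usually fix it with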

sudo pip install 库名

If you cannot work out the package name, a quick search (for example on Baidu) turns up many similar questions.
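
Rather than discovering missing modules one crash at a time, the imports a model needs can be checked up front. This is a minimal sketch using the standard `importlib` module; `missing_modules` is a hypothetical helper, and the module list passed to it is whatever your own model imports.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Keep in mind that the name used with pip is not always the import name. For example, the `cv2` module is provided by the `opencv-python` package, so a missing import sometimes needs a short search before the right `pip install` command is clear.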
