Ubuntu 18.04: setting up the NVIDIA driver and tensorflow-gpu 1.15.0 (summary)

Installing the graphics driver

1. Disable Secure Boot

This step is important: if Secure Boot is left enabled, the later steps will fail with an error.

First, enter the BIOS (F12 or F10, depending on your machine).
Set the Secure Boot Option to Disabled.

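I use a Thunderobot (雷神) laptop, and after I changed this setting and rebooted, it reverted to Enabled. Other machines may behave the same way. In that case, switch to custom mode: enable the option just below it, Change to Customization, and Secure Boot will automatically change to Disabled.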
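Because the BIOS setting can silently revert, it may help to confirm the state from inside Ubuntu after rebooting. The following is a minimal sketch, not part of the original procedure: it assumes the system booted in UEFI mode with efivarfs mounted at the usual path, and the function names are made up for illustration.

```python
from pathlib import Path

# Standard EFI global-variable GUID. The file exists only on UEFI boots
# with efivarfs mounted (assumption: the default Ubuntu layout).
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def parse_secure_boot(raw: bytes) -> bool:
    """Return True if the raw efivarfs contents say Secure Boot is on.

    efivarfs prefixes the value with a 4-byte attribute field, so the
    state is the byte that follows (1 = enabled, 0 = disabled).
    """
    if len(raw) < 5:
        raise ValueError("unexpected SecureBoot variable length")
    return raw[4] == 1


def secure_boot_enabled(var_path: Path = SECURE_BOOT_VAR):
    """Return True/False, or None when the variable cannot be read
    (for example on a legacy BIOS boot)."""
    try:
        return parse_secure_boot(var_path.read_bytes())
    except OSError:
        return None


if __name__ == "__main__":
    print("Secure Boot enabled:", secure_boot_enabled())
```

If this prints True after you disabled the option, the firmware has probably reverted the setting as described above.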

2. Disable nouveau

Edit the file blacklist.conf:

sudo vim /etc/modprobe.d/blacklist.conf

Append the following two lines to the end of the file:

blacklist nouveau
options nouveau modeset=0
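If you repeat this setup on several machines, the manual edit can be replaced by a small script that appends the two lines only when they are missing. This is a rough sketch under assumptions: the helper name ensure_lines is hypothetical, and the demo runs on a scratch file because the real /etc/modprobe.d/blacklist.conf needs root.

```python
import tempfile
from pathlib import Path

# The two lines the step above adds by hand.
NOUVEAU_LINES = ("blacklist nouveau", "options nouveau modeset=0")


def ensure_lines(conf_path, wanted=NOUVEAU_LINES):
    """Append any of `wanted` that conf_path lacks; return the lines added."""
    text = conf_path.read_text() if conf_path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [line for line in wanted if line not in present]
    if missing:
        # Start on a fresh line in case the file lacks a trailing newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with conf_path.open("a") as fh:
            fh.write(prefix + "\n".join(missing) + "\n")
    return missing


if __name__ == "__main__":
    # Demo on a scratch file, not the real configuration file.
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "blacklist.conf"
        print(ensure_lines(scratch))  # first run appends both lines
        print(ensure_lines(scratch))  # second run finds nothing to add
```

Running it twice leaves the file unchanged the second time, so the blacklist entries are never duplicated.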

Regenerate the initramfs so the change takes effect at boot:

sudo update-initramfs -u

Reboot the system.

Verify that nouveau is disabled:

lsmod | grep nouveau

If nothing is printed, nouveau is disabled and you can go on to install the NVIDIA graphics driver.
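The same check can be scripted by reading /proc/modules, which is the data lsmod formats. A minimal sketch, with a hypothetical helper name; note that it matches the module name exactly, whereas grep nouveau matches any line containing that substring.

```python
from pathlib import Path


def module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is the first field of any /proc/modules line."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        text = proc_modules.read_text()
        print("nouveau loaded:", module_loaded("nouveau", text))
    else:
        print("/proc/modules not found; is this a Linux system?")
```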

3. Look up your graphics card model on the NVIDIA website and download the matching driver. URL: http://www.nvidia.cn

Copy the downloaded .run file to your home directory.

4. Switch to a text console in Ubuntu

On my machine the shortcut is Ctrl+Alt+F3; it differs between machines.

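First switch to the root user (a default Ubuntu install sets no root password, so if su root is refused, use sudo -i instead):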

su root

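Stop the graphical interface; the installer fails if it is left running. The command below is for lightdm. If your system runs a different display manager, such as gdm3 (the Ubuntu 18.04 default), stop that service instead.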

service lightdm stop 

Then remove any existing driver:

apt-get remove 'nvidia-*'

Make the driver .run file executable:

chmod a+x [NVIDIA run file]

Install:

./[NVIDIA run file] -no-x-check -no-nouveau-check -no-opengl-files

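-no-x-check: skip the check for a running X server during installation
-no-nouveau-check: skip the check for the nouveau driver during installation
-no-opengl-files: install only the driver files, not the OpenGL files;
this avoids the login-loop problem.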

Options to choose during installation:

  1. Continue installation
  2. Install without signing

For every other prompt, choose OK or Yes.

Load the NVIDIA kernel module:

modprobe nvidia

Check that the driver installed successfully:

nvidia-smi
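For a scripted check, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options. The sketch below is one way to wrap that; the function names and the choice of fields are illustrative assumptions, not something from the original walkthrough.

```python
import csv
import subprocess

# --query-gpu and --format=csv,noheader are standard nvidia-smi options.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_rows(csv_text):
    """Turn nvidia-smi CSV output into a list of dicts, one per GPU."""
    rows = []
    for rec in csv.reader(csv_text.splitlines(), skipinitialspace=True):
        if len(rec) == 3:
            name, driver, memory = (field.strip() for field in rec)
            rows.append(
                {"name": name, "driver_version": driver, "memory_total": memory}
            )
    return rows


def query_gpus():
    """Run nvidia-smi; return [] if it is missing or exits with an error."""
    try:
        result = subprocess.run(
            QUERY,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    print(gpus if gpus else "No GPU reported; the driver may not be loaded.")
```

An empty result means either that nvidia-smi is not on the PATH or that it could not talk to the driver, which are the same failures you would see when running the command by hand.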


Installing tensorflow-gpu 1.15.0 with conda

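I chose this version because it is a bridge release between the 1.x and 2.x lines: it can also run 2.0.0-style code (through the tf.compat.v2 module).
Installing through conda also sets up a matching cuda and cudnn automatically.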

conda install tensorflow-gpu=1.15.0

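Troubleshooting:
First, downloads may fail because of a slow connection. In that case, configure conda to use the Tsinghua mirror.
Run the following commands: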

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes

Second, I ran into the following error:

Verifying transaction: failed

RemoveError: 'setuptools' is a dependency of conda and cannot be removed from
conda's operating environment.

At first I tried:

conda install -c anaconda setuptools

but the error persisted.

I suspected that conda itself needed updating:

conda update --force conda

That solved the problem.

Verifying the GPU

import tensorflow as tf
a = tf.test.is_built_with_cuda()  # True if this TensorFlow build has CUDA support
b = tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
)                                  # True if a GPU device is usable
print(a)
print(b)

The output is:
True
True
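which means CUDA and the GPU are both available. As a second check, create a session that logs device placement: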

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

The output is as follows:

2020-04-13 22:44:58.936998: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-13 22:44:58.968713: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2799925000 Hz
2020-04-13 22:44:58.969389: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab2112f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:58.969426: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-04-13 22:44:58.972287: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-13 22:44:59.320078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320520: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aab1df0a10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-13 22:44:59.320539: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
2020-04-13 22:44:59.320701: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.320951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2020-04-13 22:44:59.357052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.361052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-04-13 22:44:59.400897: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-04-13 22:44:59.445225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-04-13 22:44:59.446472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-04-13 22:44:59.497395: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-04-13 22:44:59.528163: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-13 22:44:59.528302: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528658: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.528860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-04-13 22:44:59.528901: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-13 22:44:59.529559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-13 22:44:59.529571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-04-13 22:44:59.529576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-04-13 22:44:59.529651: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.529887: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-13 22:44:59.530106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2020-04-13 22:44:59.530773: I tensorflow/core/common_runtime/direct_session.cc:359] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
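The log is long, and the part that matters is the final "Device mapping" section. If you save the output to a file, a few lines of Python can pull out the devices and confirm that a physical GPU (not only XLA_GPU) is listed. This is a sketch that assumes the line format shown above; the function names are hypothetical.

```python
import re

# Matches lines such as:
# /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: ...
MAPPING_RE = re.compile(r"^(/job:\S*/device:([A-Z_]+):(\d+)) -> (.+)$")


def parse_device_mapping(log_text):
    """Map each device path to (device_type, description).

    The mapping is printed twice in the log above, so repeated
    entries simply collapse into one key.
    """
    devices = {}
    for line in log_text.splitlines():
        match = MAPPING_RE.match(line.strip())
        if match:
            devices[match.group(1)] = (match.group(2), match.group(4))
    return devices


def has_physical_gpu(log_text):
    """Return True if a plain GPU device (not XLA_GPU) appears in the mapping."""
    return any(kind == "GPU" for kind, _ in parse_device_mapping(log_text).values())


if __name__ == "__main__":
    sample = (
        "/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device\n"
        "/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, "
        "name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1\n"
    )
    for path, (kind, description) in parse_device_mapping(sample).items():
        print(kind, path, description)
    print("physical GPU present:", has_physical_gpu(sample))
```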
